Impossibility of Agreement in Asynchronous Systems


Julius Hubbard
6 years ago

1 Consensus protocol P
- a schedule from C is a finite or infinite sequence σ of events that can be applied, in turn, starting from C
- a run is the sequence of steps associated with a schedule
- if σ is finite, then C' = σ(C) denotes the configuration that is reached from C by applying σ, in turn, to C
  - we then say that C' is reachable from C
  - if C is an initial configuration and C' is reachable from C, then C' is accessible
    * in the sequel, we assume all configurations to be accessible
Distributed Systems - Fall 2001 IV - 90 Stefan Leue 2001


3 Lemma 1: Commutativity of Independent Schedules
- let C be some configuration, and let σ_1 and σ_2 be finite schedules that lead from C to configurations C_1 and C_2, respectively
- two schedules are independent if the sets of processes taking steps in them are disjoint
- if σ_1 and σ_2 are independent, then σ_2 can be applied to C_1 and σ_1 can be applied to C_2, and both applications lead to the same configuration C_3

Diagram: C --σ_1--> C_1 --σ_2--> C_3 and C --σ_2--> C_2 --σ_1--> C_3

Proof of Lemma 1
- follows immediately from the construction of a consensus protocol given above and from the fact that σ_1 and σ_2 do not interact, since they are independent
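Lemma 1 can be checked on a small example. The following is a minimal sketch, assuming a toy model that is not part of the slides: a configuration is a pair of local states and a message buffer kept as a multiset, and the transition function in `step` (append the received message to the local state, send one message to the right-hand neighbour) is purely hypothetical. The names `step`, `apply_schedule`, `sigma_1`, and `sigma_2` are illustrative.

```python
from collections import Counter

N = 3  # number of processes in this toy system (an assumption)


def step(config, event):
    """Apply one event (p, m) to a configuration; m is None for 'no message'."""
    states, buffer = config
    p, m = event
    pending = Counter(dict(buffer))  # message buffer as a multiset
    if m is not None:
        if pending[(p, m)] == 0:
            raise ValueError(f"event {event!r} is not applicable")
        pending[(p, m)] -= 1
    # Hypothetical deterministic transition: p appends what it received to
    # its local state and sends one message to its right-hand neighbour.
    new_state = states[p] + (m,)
    pending[((p + 1) % N, f"from{p}#{len(new_state)}")] += 1
    new_states = states[:p] + (new_state,) + states[p + 1:]
    return new_states, frozenset((+pending).items())


def apply_schedule(config, schedule):
    """Apply the events of a finite schedule in turn."""
    for event in schedule:
        config = step(config, event)
    return config


# Configuration C: empty local states, two messages in the buffer.
C = (((), (), ()), frozenset({((0, "a"), 1), ((1, "b"), 1)}))
sigma_1 = [(0, "a"), (0, None)]  # only process 0 takes steps
sigma_2 = [(1, "b")]             # only process 1 takes steps

C_3_via_C_1 = apply_schedule(apply_schedule(C, sigma_1), sigma_2)
C_3_via_C_2 = apply_schedule(apply_schedule(C, sigma_2), sigma_1)
print(C_3_via_C_1 == C_3_via_C_2)
```

Because the two schedules touch disjoint processes and the buffer is an unordered multiset, both orders of application end in the same configuration in this model.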

4 Consensus protocol P
- a configuration C has decision value v if some process p is in a decision state and y_p = v
- a consensus protocol is partially correct if
  1. no accessible configuration has more than one distinct decision value, and
  2. for every v ∈ {0, 1}, there exists some configuration that has decision value v
- a process p is non-faulty (correct) in a run if it takes infinitely many steps; otherwise it is faulty (crashed)
- a run is admissible if at most one process is faulty and all messages sent to non-faulty processes are eventually received
- a run is deciding if some process reaches a decision state in that run
- a consensus protocol P is totally correct in spite of one process fault (crash) if it is partially correct and every admissible run of P is also a deciding run

5 Consensus protocol P
- let C be a configuration and V the set of decision values of configurations reachable from C
  - C is bivalent if |V| = 2
  - C is univalent if |V| = 1
    * C is 0-valent if V = {0}
    * C is 1-valent if V = {1}
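The valency definitions amount to a reachability computation. The sketch below assumes a finite, explicitly given transition graph, which a real protocol would not normally provide; the graph `toy_graph`, the map `toy_decided`, and the function names are hypothetical and serve only to illustrate how V and the valency of a configuration are determined.

```python
def decision_values(graph, decided, start):
    """Return V: the decision values of all configurations reachable from start."""
    seen, stack, values = {start}, [start], set()
    while stack:
        config = stack.pop()
        if config in decided:
            values.add(decided[config])
        for successor in graph.get(config, ()):
            if successor not in seen:
                seen.add(successor)
                stack.append(successor)
    return values


def valency(graph, decided, start):
    """Classify a configuration by the size and content of V."""
    values = decision_values(graph, decided, start)
    if values == {0, 1}:
        return "bivalent"
    if values == {0}:
        return "0-valent"
    if values == {1}:
        return "1-valent"
    return "no decision reachable"  # V is empty


# Hypothetical toy graph: from C the system can still go either way.
toy_graph = {"C": ["A", "B"], "A": ["A0"], "B": ["B1"]}
toy_decided = {"A0": 0, "B1": 1}

for name in ("C", "A", "B"):
    print(name, valency(toy_graph, toy_decided, name))
```

In this toy graph, C is bivalent, while its successors A and B are already committed to 0 and 1, respectively.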

6 Theorem
- no consensus protocol is totally correct in spite of one fault
Note
- the theorem claims that every partially correct consensus protocol has some admissible run that is not deciding
Proof of Theorem
- by contradiction: assume that P is a consensus protocol that is totally correct in spite of one fault
- then show that P can remain indecisive forever, i.e., that it has an admissible run that is never deciding, by arguing the following:
  1. there is some initial configuration in which the decision is not already determined, and
  2. it is possible to construct an admissible run that never takes a step committing the system to a particular decision
- note
  - because P is assumed to be totally correct and admissible runs always exist, every configuration is at least univalent, i.e., V is never empty

7 Lemma 2
- P has a bivalent initial configuration
Proof of Lemma 2
- by contradiction
- assume the contrary, i.e., P has only univalent initial configurations
  - due to the assumed partial correctness, P must have both 0-valent and 1-valent initial configurations
  - two initial configurations are adjacent if they differ only in the initial value x_p^0 of a single process p
  - between any two initial configurations C_i and C_m one can form a chain of initial configurations such that every pair of neighbouring configurations in the chain is adjacent

8 Proof of Lemma 2 (cont.)
- this implies that there must be a pair of adjacent initial configurations C_k and C_l such that C_k is 0-valent and C_l is 1-valent
  - let p be the process in whose initial value C_k and C_l differ
  - consider an admissible deciding run from C_k in which p takes no steps, and let σ be the associated schedule
  - since p takes no steps in σ and the two configurations differ only in p, σ can also be applied to C_l (cf. Lemma 1)
  - corresponding configurations in the two runs are identical except for the internal state of p
  - hence both runs eventually reach the same decision value
    * if this decision value is 1, then C_k also reaches decision value 1 and cannot be 0-valent,
    * otherwise, C_l also reaches decision value 0 and cannot be 1-valent,
    so one of the two is bivalent, which contradicts the assumption that there are no bivalent initial configurations
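The chain argument can be made concrete. The sketch below represents an initial configuration as a tuple of input bits, builds a chain of adjacent configurations by changing one process's input at a time, and locates the first adjacent pair whose valencies differ. The function `toy_valency` is a made-up stand-in for "the valency of an initial configuration under the assumption that all of them are univalent"; it is not derived from any real protocol.

```python
def adjacency_chain(c_start, c_end):
    """Chain of initial configurations in which neighbours differ in one process."""
    chain = [tuple(c_start)]
    current = list(c_start)
    for p, target in enumerate(c_end):
        if current[p] != target:
            current[p] = target
            chain.append(tuple(current))
    return chain


def first_flip(chain, valency_of):
    """Return (C_k, C_l, p): the first adjacent pair with different valency."""
    for c_k, c_l in zip(chain, chain[1:]):
        if valency_of(c_k) != valency_of(c_l):
            p = next(i for i, (a, b) in enumerate(zip(c_k, c_l)) if a != b)
            return c_k, c_l, p
    return None


def toy_valency(config):
    """Hypothetical valency assignment: 1-valent iff at least two inputs are 1."""
    return 1 if sum(config) >= 2 else 0


chain = adjacency_chain((0, 0, 0), (1, 1, 1))
print(chain)
print(first_flip(chain, toy_valency))
```

Whenever the two ends of a chain have different valencies, some adjacent pair in between must flip, and that pair differs in exactly one process p, which is the process that is then kept silent in the proof.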

9 Lemma 3
- let C be a bivalent configuration and e = (p, m) an event applicable to C
  - 𝒞: the set of configurations reachable from C without applying e
  - Γ = e(𝒞) = { e(E) | E ∈ 𝒞 and e is applicable to E }
- then Γ contains a bivalent configuration
Proof of Lemma 3
- e is applicable to every E ∈ 𝒞, since e is applicable to C
  - the definition of 𝒞 implies that e is never executed in any run leading from C to some E ∈ 𝒞
  - messages can be delayed arbitrarily, i.e., e remains enabled throughout any such run
- assume Γ contains no bivalent configurations, i.e., every D ∈ Γ is univalent
  - goal: derive a contradiction

10 Proof of Lemma 3 (cont.)
- let E_i be an i-valent configuration reachable from C, i ∈ {0, 1}
  - note that both E_0 and E_1 exist since C is bivalent
- is E_i ∈ 𝒞, i.e., is E_i reachable from C without applying e?
  - if E_i ∈ 𝒞, then let F_i be the successor configuration of E_i under e, i.e., F_i = e(E_i) ∈ Γ
  - if E_i ∉ 𝒞, then e was applied in reaching E_i from C; hence there is a configuration F_i ∈ Γ from which E_i is reachable
  - in either case, F_i is i-valent, since Γ contains no bivalent configurations and either E_i is reachable from F_i or F_i is reachable from E_i
- since F_i ∈ Γ for both i ∈ {0, 1}, Γ contains both 0-valent and 1-valent configurations

11 Proof of Lemma 3 (cont.)
- let C and C' be configurations; if C' is the result of applying a single step to C, then we call C and C' neighbours
- assumption: there exist neighbours C_0, C_1 such that D_i = e(C_i) is i-valent for both i ∈ {0, 1}, where e = (p, m)
  - to be proven by an inductive argument over p
- without loss of generality, we consider C_1 = e'(C_0), where e' = (p', m')
  - case 1: if p' ≠ p, then D_1 = e'(D_0) (by Lemma 1)
    * impossible, since any successor of a 0-valent configuration is 0-valent; hence a contradiction

Diagram: C_0 --e'--> C_1, C_0 --e--> D_0, C_1 --e--> D_1, D_0 --e'--> D_1

12 Proof of Lemma 3 (cont.)
  - case 2: if p' = p, then consider any finite deciding run from C_0 in which p takes no steps
    * let σ be the corresponding schedule, and let A = σ(C_0)
    * σ is applicable to D_i (by Lemma 1), and it leads to an i-valent configuration E_i = σ(D_i) for both i ∈ {0, 1}
    * by Lemma 1, e(A) = E_0 and e(e'(A)) = E_1
    * hence, A is bivalent
    * this contradicts the assumption that the run leading to A is deciding, i.e., that A must be univalent
- contradiction in both cases, hence Γ contains a bivalent configuration

Diagram: C_0 --e'--> C_1, C_0 --e--> D_0, C_1 --e--> D_1, C_0 --σ--> A, D_0 --σ--> E_0, D_1 --σ--> E_1, A --e--> E_0, A --e'--> e'(A) --e--> E_1

13 Proof of Theorem (cont.)
- any deciding run from a bivalent initial configuration must contain a single step that goes from a bivalent to a univalent configuration
  - this step determines the decision value
- proceed to show that it is always possible to construct an admissible run in stages, starting from an initial configuration, that avoids taking such a deciding step
- admissibility is ensured by the following construction of a run
  - maintain a queue of processes, initially in arbitrary order
  - order the messages in the message buffer according to their time of sending, earliest first
  - a stage comprises one or more process steps
    * a stage ends when the first process in the queue takes a step in which, if its message queue was not empty at the start of the stage, its earliest message is received (which will eventually happen, according to our assumptions regarding the message system)
    * the process is then moved to the back of the queue
  - this construction means that in any infinite sequence of stages
    * every process takes infinitely many steps
    * every message is eventually delivered
    i.e., the run is admissible
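The queue discipline that ensures admissibility can be sketched as follows. This is a simplification under stated assumptions: each stage here consists only of its final step (the head of the queue receives its earliest pending message, if any), whereas in the proof a stage may contain further steps before that. The function `run_stages` and the process and message names are hypothetical.

```python
from collections import deque


def run_stages(processes, buffer, num_stages):
    """Simulate the final step of each stage.

    buffer is a list of (destination, payload) pairs in order of sending.
    Returns the list of events (p, m) taken and the messages still pending.
    """
    queue = deque(processes)
    pending = list(buffer)
    events = []
    for _ in range(num_stages):
        p = queue[0]
        # Earliest pending message addressed to p, if there is one.
        index = next(
            (i for i, (dest, _payload) in enumerate(pending) if dest == p),
            None,
        )
        m = pending.pop(index)[1] if index is not None else None
        events.append((p, m))
        queue.rotate(-1)  # move p to the back of the queue
    return events, pending


events, left = run_stages(
    ["p0", "p1", "p2"],
    [("p1", "a"), ("p0", "b"), ("p1", "c")],
    num_stages=6,
)
print(events)
print(left)
```

After k·n stages with n processes, every process has taken k steps, and a message that was pending at the start of a stage is delivered at the latest when its destination has been at the head of the queue often enough to consume all earlier messages addressed to it.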

14 Proof of Theorem (cont.)
- proceed to show that it is always possible to construct a run in stages, starting from an initial configuration, that avoids taking such a deciding step
- avoidance of decision making is ensured by the following reasoning
  - let C_0 be a bivalent initial configuration
    * existence is assured by Lemma 2
  - the first stage of the run begins in C_0
  - the following construction ensures that every subsequent stage of the run starts in a bivalent configuration
  - sequence of steps from C to C' that forms a stage:
    * let C be bivalent, and let p be at the head of the process queue
    * let m be the earliest message to p in the message buffer, if any, and ∅ otherwise
    * let e = (p, m)
    * there is a bivalent configuration C' reachable from C by a schedule in which e is the last event applied (by Lemma 3)
  - each stage leads to a bivalent configuration; hence an infinite schedule consisting of stages as described above is possible
- hence, there is an admissible infinite run in which no decision is ever reached, and hence P is not totally correct (contradiction)

15 Some Remarks
- the theorem applies even to a weak form of consensus in which only some of the correct processes need to have made a decision (cf. definition of decision value)
- the naive solution in which the same decision value is always chosen is ruled out (cf. definition of partial correctness)

16 Consensus: Impossibility of Agreement in Asynchronous Systems
- consequences
  - in asynchronous systems, there is no solution to BG, IC, or TOR-multicast
- of course, in practice consensus can often be reached, but a residual probability that consensus cannot be reached remains
- possible approaches to reaching consensus by weakening the system assumptions
  - partial synchrony
  - masking faults
  - modified failure detectors
  - randomized algorithms

17 Consensus: Impossibility of Agreement in Asynchronous Systems
- partial synchrony
  - message delays are bounded, but the bound is unknown, or
  - the bound is known, but transmission delays may be longer during some finite initial period of time
- masking faults
  - design the system so that failures appear like an intermittent slowdown in the processing of messages
    * store the system state on persistent storage before a crash
    * restart the system in that state after recovery

18 Consensus: Impossibility of Agreement in Asynchronous Systems
- modified failure detectors
  - in the ISIS system (Birman, 1993)
    * deem a process that has not responded to have failed
    * treat this process as fail-safe, i.e., discard any subsequent messages from this process
    * problems:
      - long timeouts are necessary
      - false negatives are possible and reduce the effectiveness of the system
  - eventually weak failure detector (Chandra and Toueg, 1996)
    * consensus can be solved, even with a weak failure detector, if fewer than N/2 processes crash and communication is reliable
    * an eventually weak failure detector is
      - eventually weakly complete: each faulty process is eventually suspected permanently by some correct process
      - eventually weakly accurate: after some time, at least one correct process is never suspected by any correct process
    * an eventually weak failure detector cannot be implemented in an asynchronous system based on message passing alone; however, failure detectors that adapt their timeout values can come close to eventually weak failure detectors
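A failure detector that adapts its timeout values might look like the sketch below. It is an illustration under assumptions, not the ISIS or Chandra and Toueg construction: the class `AdaptiveTimeoutDetector`, the heartbeat-based interface, the initial timeout, and the doubling rule are all hypothetical. Time is passed in explicitly so that the behaviour is deterministic.

```python
class AdaptiveTimeoutDetector:
    """Suspect a process when no heartbeat arrived within its current timeout."""

    def __init__(self, processes, initial_timeout=1.0, backoff=2.0):
        self.timeout = {p: initial_timeout for p in processes}
        self.last_seen = {p: 0.0 for p in processes}
        self.suspected = set()
        self.backoff = backoff

    def heartbeat(self, p, now):
        """Record a heartbeat from p received at time now."""
        if p in self.suspected:
            # The suspicion was wrong: p was only slow. Stop suspecting it
            # and be more patient with it from now on.
            self.suspected.discard(p)
            self.timeout[p] *= self.backoff
        self.last_seen[p] = now

    def check(self, now):
        """Update and return the set of currently suspected processes."""
        for p, seen in self.last_seen.items():
            if now - seen > self.timeout[p]:
                self.suspected.add(p)
        return set(self.suspected)


detector = AdaptiveTimeoutDetector(["a", "b"])
detector.heartbeat("a", 0.5)
detector.heartbeat("b", 0.5)
print(detector.check(1.0))   # nobody has exceeded the timeout yet
print(detector.check(2.0))   # both have been silent for longer than 1.0
detector.heartbeat("a", 2.1)  # "a" was merely slow; its timeout doubles
print(detector.check(4.0))
```

Each wrong suspicion lengthens the timeout for that process, so a correct but slow process is suspected less and less often, while a crashed process stays suspected because it never sends another heartbeat.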
