BYZANTINE GENERALS (1)
A fable.
Michał Szychowiak, 2002, Dependability of Distributed Systems (Byzantine agreement)


BYZANTINE GENERALS (2)
Byzantine Generals Problem:
  Condition 1: All loyal generals decide upon the same plan of action.
  Condition 2: A small number of traitors cannot cause the loyal generals to adopt a bad plan.
Condition 2 needs a more formal statement, but it is hard to formalize because it requires expressing precisely what a bad plan is. Instead, we consider how the generals reach a decision.

BYZANTINE GENERALS (3)
Each general $G_i$ makes a decision $v_i$ and broadcasts it. If general $G_i$ is loyal, then every other general uses his $v_i$.
Let's trust the loyal generals and try majority-based voting:
  - If most of the generals independently decide ATTACK, then attack.
  - If most of the generals independently decide RETREAT, then retreat.
  - If they are divided, then... hmm... well, uh... who cares?
What's the problem? Traitors! If you receive $v_j$ from $G_j$, you can't simply trust him. You must be careful, because you don't know whom to trust.

BYZANTINE GENERALS (4)
[Figure: generals receive conflicting ATTACK and RETREAT messages; one asks "ATTACK?" while a traitor laughs "heh-heh".]
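Why plain majority voting breaks down can be seen in a minimal sketch. The scenario is hypothetical (four loyal generals split 2-2 plus one traitor), and the names `naive_decision` and `loyal_decisions` are illustrative only, not part of the original slides.

```python
from collections import Counter

ATTACK, RETREAT = 1, 0

def naive_decision(votes):
    """Attack only on a strict majority of ATTACK votes; otherwise retreat."""
    counts = Counter(votes)
    return ATTACK if counts[ATTACK] > counts[RETREAT] else RETREAT

def loyal_decisions(loyal_votes, traitor_messages):
    """Each loyal general tallies every loyal vote plus what the traitor told him."""
    shared = list(loyal_votes.values())
    return {g: naive_decision(shared + [traitor_messages[g]]) for g in loyal_votes}

# Hypothetical scenario: four loyal generals split 2-2, one traitor.
loyal_votes = {"G1": ATTACK, "G2": ATTACK, "G3": RETREAT, "G4": RETREAT}
# The traitor tells each loyal general what that general already prefers.
traitor_messages = dict(loyal_votes)

decisions = loyal_decisions(loyal_votes, traitor_messages)
```

In this scenario G1 and G2 end up attacking while G3 and G4 retreat, so Condition 1 is violated even though every loyal general applied the same voting rule.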

BYZANTINE GENERALS (5)
Byzantine failures
Failure types:
  - crash failures
  - omission failures
  - fail-stop model
  - Byzantine (malicious) failures
Omissions in the Byzantine Generals model:
Since a faulty general (a traitor) can refuse to send a message, a nonfaulty general may never receive an expected message. In such a situation, we assume that the nonfaulty (loyal) general simply chooses an arbitrary value and acts as if the expected message had been received. This requires that the omission can be detected by the respective receiver. In synchronous systems, where the duration of each round is known, this detection is simple: any expected message not received by the end of a round was not sent (omitted).
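The end-of-round rule can be sketched as follows. This is only an illustration: the function name `close_round` and the choice of 0 as the substituted value are assumptions, and the sketch presumes that the round deadline has already passed.

```python
DEFAULT_VALUE = 0  # assumed substitute for a missing message

def close_round(expected_senders, received, default=DEFAULT_VALUE):
    """Return one value per expected sender at the end of a synchronous round.

    `received` maps sender id -> value for the messages that arrived in time.
    Any expected sender without an entry is treated as having omitted its message.
    """
    values = {}
    omitted = []
    for sender in expected_senders:
        if sender in received:
            values[sender] = received[sender]
        else:
            values[sender] = default  # act as if this value had been received
            omitted.append(sender)
    return values, omitted

values, omitted = close_round(["G1", "G2", "G3"], {"G1": 1, "G3": 0})
```

Here `omitted` lists the generals whose messages were missing, and `values` is complete, so the receiver can continue with the protocol.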

BYZANTINE AGREEMENT (1)
A commanding general $G_C$ must send an order to $N-1$ lieutenant generals such that:
  BA1: All loyal lieutenants obey the same order.
  BA2: If the commander is loyal, then every loyal lieutenant obeys the order he sends.
Note that if $G_C$ is loyal, then BA1 follows directly from BA2.

IMPOSSIBILITY OF BYZANTINE AGREEMENT (1)
Loyalty is very important:
Claim: The Byzantine Generals Problem cannot be solved if $f \ge \tfrac{1}{3}N$ of the $N$ generals are traitors (there is no $f$-resilient Byzantine Agreement algorithm for $f \ge \tfrac{1}{3}N$).
First we will show that no solution for 3 generals can handle a single traitor.

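IMPOSSIBILITY OF BYZANTINE AGREEMENT (2)
[Figure: 3 generals. The commander sends 1 (ATTACK) to the first lieutenant and 0 (RETREAT) to the second. The second lieutenant reports "he said 0 (RETREAT)", and the first lieutenant cannot tell what to do (???).]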

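IMPOSSIBILITY OF BYZANTINE AGREEMENT (3)
[Figure: 3 generals. The commander sends 1 (ATTACK) to both lieutenants. The second lieutenant nevertheless reports "he said 0 (RETREAT)", so the first lieutenant sees the same messages as on the previous slide (???).]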

IMPOSSIBILITY OF BYZANTINE AGREEMENT (4)
Theorem: No solution for fewer than $3t + 1$ generals can cope with $t$ traitors.
Proof strategy: Assume we have an algorithm $A_t$ for $3t$ generals with $t > 1$ traitors. We use it to construct a 3-general, 1-traitor algorithm $A_1$:
  - each general simulates $t$ generals in $A_t$
  - the 1 traitor simulates $t$ traitors
  - the 2 loyal generals simulate $2t$ loyal generals
  - both BA1 and BA2 are satisfied
But this is impossible, so our assumption is wrong!
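The bound in the theorem can be turned into a quick arithmetic check. The helper names below are illustrative and not taken from the slides; they only restate the inequality "number of generals at least 3t + 1".

```python
def max_traitors(n):
    """Largest t with n >= 3t + 1, i.e. the most traitors n generals can cope with."""
    if n < 1:
        raise ValueError("need at least one general")
    return (n - 1) // 3

def min_generals(t):
    """Smallest number of generals that can cope with t traitors."""
    if t < 0:
        raise ValueError("t must be non-negative")
    return 3 * t + 1
```

For instance, 3 generals cannot cope with any traitor, 4 generals can cope with one, and coping with 2 traitors needs at least 7 generals.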

APPROXIMATE AGREEMENT (1)
Maybe the difficulty is requiring exact agreement?
New problem: $G_C$ sends an attack TIME to $N-1$ lieutenant generals:
  AA1: All loyal lieutenants attack within 10 minutes of each other.
  AA2: If the commander is loyal, then every loyal lieutenant attacks within 10 minutes of the commander's order.
Question: Is this agreement problem any easier?

APPROXIMATE AGREEMENT (2) Theorem: No Approximate Agreement algorithm for fewer than 3t + 1 generals can cope with t traitors. Proof strategy: Assume we have a 3-general 1-traitor Approximate Agreement algorithm. We will transform it into a 3-general 1-traitor Byzantine Agreement algorithm. But this is impossible, so our assumption is wrong!

APPROXIMATE AGREEMENT (3)
Transformation to Byzantine Agreement
Suppose $G_C$ sends:
  1:00 to mean ATTACK
  2:00 to mean RETREAT
Each lieutenant does:
  Phase 1: run the Approximate Agreement protocol
    - if the time is before 1:10, then decide ATTACK
    - if the time is after 1:50, then decide RETREAT
  Phase 2: if you haven't come to any decision, ask the other lieutenant "Have you decided?"
    - if so, do the same thing
    - if not, RETREAT
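The two phases of the transformation can be sketched as plain decision functions. This is a minimal illustration that assumes the Approximate Agreement protocol has already produced an attack time for each lieutenant; the helper names (`minutes`, `phase1`, `phase2`) and the use of `None` for "undecided" are assumptions of the sketch.

```python
ATTACK, RETREAT = 1, 0

def minutes(clock):
    """Convert an "H:MM" string to minutes, e.g. "1:10" -> 70."""
    hours, mins = clock.split(":")
    return int(hours) * 60 + int(mins)

def phase1(agreed_time):
    """Decide from the time produced by Approximate Agreement, or stay undecided."""
    if agreed_time < minutes("1:10"):
        return ATTACK
    if agreed_time > minutes("1:50"):
        return RETREAT
    return None  # between 1:10 and 1:50: no decision yet

def phase2(own_decision, other_decision):
    """An undecided lieutenant copies the other's decision, or retreats if there is none."""
    if own_decision is not None:
        return own_decision
    if other_decision is not None:
        return other_decision
    return RETREAT
```

Because AA1 keeps the two loyal lieutenants within 10 minutes of each other, one cannot see a time before 1:10 while the other sees a time after 1:50, which is why the thresholds leave a 40-minute gap.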

APPROXIMATE AGREEMENT (4)
Claim: If the Approximate Agreement protocol works, so does the Byzantine Agreement protocol.
  - If $G_C$ is loyal, then AA2 ensures that the loyal lieutenant gets the right attack time, so BA2 is satisfied.
  - If $G_C$ is a traitor, both lieutenants are loyal. AA1 ensures they cannot make contradictory decisions in Phase 1. If one decides in Phase 1, the other agrees in Phase 2. If neither decides in Phase 1, they both retreat in Phase 2.
Thus Approximate Agreement is impossible for 3 generals with 1 traitor. The simulation method extends this result to $3t$ generals and $t$ traitors.

ORAL MESSAGES (1)
BYZANTINE AGREEMENT ALGORITHMS
To reach agreement, processes have to exchange their values and relay the received values to other processes several times. The ability of a faulty process to distort what it receives from the others depends greatly on the type of messages.
Two types of messages:
  - oral messages (non-authenticated): a faulty process can forge a message and claim to have received it from another process, or change the contents of a received message before it relays this message to other processes. There is no way for a process to verify the authenticity of a received message.
  - signed messages (authenticated): a faulty process cannot forge a message or change the contents of a received one before relaying it to other processes. Each process can verify the authenticity of a received message, so faulty processes do less damage.

ORAL MESSAGES (2)
This is a simple solution using communication by oral messages to cope with $t$ traitors, where $N \ge 3t + 1$.
Assumptions:
  A1: Every message sent is delivered correctly.
  A2: The receiver of a message knows who sent it.
  A3: The absence of a message can be detected.
A1 and A2 prevent a traitor from interfering with communication between others (A2 = no spurious messages). A3 means that a traitor can't block progress simply by remaining quiet.

ORAL MESSAGES (3)
Details:
More requirements:
  - Generals can communicate directly with each other (however, this can be relaxed).
  - If a message is missing, assume it says 0 (the default order, RETREAT).
Majority function:
  - if no majority value exists among the $v_i$, then $\mathrm{majority}(v_1, \dots, v_n) = 0$
  - if the majority of the $v_i$ equal $v$, then $\mathrm{majority}(v_1, \dots, v_n) = v$
  - $n$ is the number of processes currently reaching agreement
  - alternatively, you can use an arbitrary member of the set $\{v_i\}$ (for ordered sets, the median is a good choice)
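The majority function described above might look like this in code. It is a sketch that assumes "majority" means a strict majority (more than half of the values); `median_choice` illustrates the median alternative for ordered values using the standard library's `statistics.median_low`, which always returns one of the given values.

```python
from collections import Counter
from statistics import median_low

DEFAULT = 0  # the default order, RETREAT

def majority(values, default=DEFAULT):
    """Return the value held by more than half of `values`, else `default`."""
    if not values:
        return default
    value, count = Counter(values).most_common(1)[0]
    return value if 2 * count > len(values) else default

def median_choice(values):
    """Alternative for ordered values: the (low) median, always a member of `values`."""
    return median_low(values)
```

With binary orders the two rules can differ on ties: `majority([1, 0])` falls back to the default 0, while the median rule always returns one of the received values.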

ORAL MESSAGES (4)
Algorithm Lamport-Shostak-Pease [1]
OM(0):
  1. $G_C$ sends his value to every $G_i$.
  2. Each lieutenant uses that value (or 0 if none).
OM(m), m > 0:
  1. $G_C$ sends his value to every $G_i$.
  2. Let $v_i$ be the value received by $G_i$. $G_i$ now:
     - acts as commander for OM(m-1) with the $n-2$ other lieutenants
     - acts as participant for $n-2$ instances of OM(m-1), one with each $G_j$, $j \ne i$, as commander, deciding $v_j$ (or RETREAT)
  3. Decide $\mathrm{majority}(v_1, \dots, v_{n-1})$.
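The recursion can be made concrete with a small single-process simulation. This is a sketch under stated assumptions, not the authors' implementation: traitor behaviour is modelled by a hypothetical `send` channel built with `make_send`, every message is assumed to arrive (so the "or 0 if none" case does not occur), and the general identifiers are arbitrary.

```python
from collections import Counter

RETREAT = 0  # default order, also used when no strict majority exists

def majority(values, default=RETREAT):
    value, count = Counter(values).most_common(1)[0]
    return value if 2 * count > len(values) else default

def make_send(traitors):
    """Build a channel. `traitors` maps a traitor's id to a function
    (dst, value) -> value choosing what he actually sends to dst."""
    def send(src, dst, value):
        return traitors[src](dst, value) if src in traitors else value
    return send

def om(m, commander, lieutenants, value, send):
    """Simulate OM(m); return {lieutenant: value decided for this commander}."""
    # Step 1: the commander sends his value to every lieutenant.
    received = {i: send(commander, i, value) for i in lieutenants}
    if m == 0:
        return received
    # Step 2: each lieutenant j relays his value as commander of OM(m-1).
    relayed = {}
    for j in lieutenants:
        others = [i for i in lieutenants if i != j]
        relayed[j] = om(m - 1, j, others, received[j], send)
    # Step 3: each lieutenant takes the majority of his own and the relayed values.
    return {
        i: majority([received[i]] + [relayed[j][i] for j in lieutenants if j != i])
        for i in lieutenants
    }

generals = [1, 2, 3]
# Loyal commander orders 1 (ATTACK); traitor G3 relays the opposite value.
loyal_case = om(1, "C", generals, 1, make_send({3: lambda dst, v: 1 - v}))
# Traitor commander sends 1 to G1 and 0 to G2 and G3.
traitor_case = om(1, "C", generals, 1, make_send({"C": lambda dst, v: 1 if dst == 1 else 0}))
```

In `loyal_case` the loyal lieutenants G1 and G2 both decide 1, as BA2 requires; in `traitor_case` all three lieutenants decide 0, as BA1 requires. The entry computed for a traitor lieutenant is meaningless and can be ignored.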

ORAL MESSAGES (5)
Description:
  - the algorithm for $N$ processes starts with OM(t)
  - execution of OM(t) invokes $N-1$ separate executions of OM(t-1), each of which invokes $N-2$ executions of OM(t-2), and so on
  - processes are successively divided into smaller and smaller groups of $n$ members (initially $n = N$), and Byzantine Agreement is recursively achieved within each group in step 2 of OM(m)
  - in total, there are $(N-1)(N-2)\cdots(N-k)$ separate executions of OM(t-k), for $k = 1, \dots, t$
  - the message complexity is $O(N^{t+1})$, since each of the $(N-1)(N-2)\cdots(N-t)$ executions of OM(0) sends $N-t-1$ messages
  - the algorithm requires $t+1$ rounds of message exchanges
  - $t+1$ is the lower bound on the number of rounds to reach Byzantine Agreement in a fully connected network with process failures only, using oral messages
  - however, using signed messages, this bound is relaxed
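The recursive structure described above can be turned into a small counter. The sketch assumes that every general sends every message the algorithm prescribes; the function names are illustrative and not taken from the slides.

```python
from math import prod

def om_messages(n, m):
    """Messages sent by OM(m) among n generals (1 commander, n-1 lieutenants)."""
    sent = n - 1  # step 1: the commander sends to each lieutenant
    if m > 0:
        # step 2: each lieutenant commands an OM(m-1) among the other n-1 generals
        sent += (n - 1) * om_messages(n - 1, m - 1)
    return sent

def om_executions(n, k):
    """Executions of OM(t-k) started by OM(t) among n generals: (n-1)(n-2)...(n-k)."""
    return prod(range(n - k, n))

def om_rounds(t):
    """Rounds of message exchange needed by OM(t)."""
    return t + 1
```

For $N = 4$ and $t = 1$ this gives 9 messages (3 in the first round and 3 x 2 in the second); for $N = 7$ and $t = 2$ it gives 156 messages in 3 rounds.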

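ORAL MESSAGES (6)
Example 1
[Figure: $N = 4$. Step 1, OM(1): $G_C$ sends 0 to $G_1$, $G_2$ and $G_3$. Step 2, OM(0): the lieutenants relay the received value to one another; one of the relayed values is 1 instead of 0. Step 3: the lieutenants decide $\mathrm{majority}(0,0,0) = 0$ and $\mathrm{majority}(0,1,0) = 0$.]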

ORAL MESSAGES (7)
Example 2
[Figure: $N = 4$. $G_C$ sends 1 to $G_1$, 0 to $G_2$ and nothing (null, read as 0) to $G_3$. After the lieutenants relay these values, each of them decides $\mathrm{majority}(1,0,0) = 0$.]

ORAL MESSAGES (8)
Correctness proof
Lemma: For any $m$ and $t$, OM(m) satisfies BA2 if there are more than $2t + m$ generals and at most $t$ traitors.
Recall BA2: If the commander is loyal, then every loyal lieutenant obeys the order he sends.
Lemma proof:
  - assume the commander is loyal
  - OM(0) trivially satisfies BA2 (by assumption A1)
  - proceed by induction: assume the lemma for OM(m-1) and prove it for OM(m), $m > 0$

ORAL MESSAGES (9)
Induction step:
  - the loyal commander sends $v$ to $n-1$ lieutenants
  - each loyal lieutenant calls OM(m-1) with the $n-2$ other lieutenants
  - by hypothesis, $n > 2t + m$, so $(n-1) > 2t + (m-1)$
  - apply the induction hypothesis: each loyal lieutenant gets $v_j = v$ from each loyal $G_j$
  - there are at most $t$ traitors and $(n-1) > 2t + (m-1) \ge 2t$, so a majority of the $n-1$ lieutenants are loyal
  - for each loyal lieutenant, $\mathrm{majority}(v_1, \dots, v_{n-1}) = v$
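The counting behind the last two bullets can be written out explicitly. This is a worked restatement of the inequality from the induction step, using the same symbols $n$, $t$ and $m$ (with $m \ge 1$):

```latex
\begin{align*}
n > 2t + m \;&\Longrightarrow\; n - 1 > 2t + (m - 1) \ge 2t \\
&\Longrightarrow\; t < \tfrac{n-1}{2} \\
&\Longrightarrow\; \underbrace{(n - 1) - t}_{\text{loyal lieutenants, at least}} > \tfrac{n-1}{2}
\end{align*}
```

So more than half of the $n-1$ values $v_1, \dots, v_{n-1}$ come from loyal lieutenants and equal $v$, which is why the majority is $v$ no matter what the traitors send.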

ORAL MESSAGES (10)
Theorem: For any $t$, OM(t) satisfies BA1 and BA2 if there are more than $3t$ generals and at most $t$ traitors.
Theorem proof (by induction on $t$):
  - if there are no traitors: easy
  - assume the claim for OM(t-1)
  - assume the commander is loyal:
      take $m = t$ in the previous lemma, so OM(t) satisfies BA2
      BA2 implies BA1 if the commander is loyal
  - assume the commander is a traitor...

ORAL MESSAGES (11)
If the commander is a traitor:
  - there are at most $t$ traitors, including the commander, so at most $t-1$ traitors among the lieutenants
  - the number of lieutenants is more than $3t - 1 > 3(t-1)$
  - apply the induction hypothesis: OM(t-1) satisfies BA1 and BA2
  - for all $G_j$, any 2 loyal lieutenants get the same $v_j$
  - any 2 loyal lieutenants get the same vector $v_1, \dots, v_{n-1}$
  - any 2 loyal lieutenants get the same $\mathrm{majority}(v_1, \dots, v_{n-1})$

BIBLIOGRAPHY (1)
[1] L. Lamport, R. Shostak, M. Pease, "The Byzantine Generals Problem", ACM Transactions on Programming Languages and Systems, 1982.
[2] B. Bhargava (ed.), "Concurrency Control and Reliability in Distributed Systems", Van Nostrand Reinhold Co., New York, 1987, ch. 12: The Byzantine Generals (D. Dolev, L. Lamport, M. Pease, R. Shostak).
[3] R. Chow, T. Johnson, "Distributed Operating Systems & Algorithms", Addison Wesley Longman, 1997, ch. 11.