Distributed Algorithms Practical Byzantine Fault Tolerance
1 Distributed Algorithms Practical Byzantine Fault Tolerance Alberto Montresor Università di Trento 2018/12/06 This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
2 Table of contents 1 Introduction 2 Byzantine generals 3 Practical Byzantine Fault Tolerance 4 Beyond PBFT Overview 5 Zyzzyva Introduction Three cases The case of the missing phase View changes 6 Aardvark 7 UpRight
3 Introduction: Motivation
Processes may exhibit arbitrary (Byzantine) behavior:
- Malicious attacks: they lie, they collude
- Software errors: arbitrary states, arbitrary messages
Examples:
- Amazon outage (2008): the root cause was a single bit flip in internal state messages
- Shuttle Mission STS-124 (2008): 3-1 disagreement on sensors during fuel loading (on Earth!)
Alberto Montresor (UniTN) DS - BFT 2018/12/06 1 / 80
4 Introduction: History
State of the art at the end of the 90s:
- Algorithms tolerating Byzantine failures were theoretically feasible, but inefficient in practice
- They assumed synchrony: known bounds on message delays and processing speed
- Most importantly, the synchrony assumption was needed for correctness: what about DoS?
Bibliography: L. Lamport, R. Shostak, and M. Pease. The Byzantine generals problem. ACM Transactions on Programming Languages and Systems (TOPLAS), 4(3).
6 Byzantine generals
[Figure: generals shouting conflicting orders ("Attack!", "Wait", "No, wait!", "Surrender!"). From cs4410 fall 08 lecture.]
7 Byzantine generals: Specification
A commanding general must send an order to his $n - 1$ lieutenant generals such that:
- IC1: All loyal lieutenants obey the same order
- IC2: If the commanding general is loyal, then every loyal lieutenant obeys the order he sends
Assumptions ("oral messages"):
- Every message that is sent is received correctly
- The receiver of a message knows who sent it
- The absence of a message can be detected
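8 Byzantine generals: Impossibility results
Under the oral messages assumption, no solution with three generals can handle even a single traitor.
[Figure: two scenarios that Lieutenant 1 cannot tell apart. In the first, the commanding general sends "Attack!" to both lieutenants, and Lieutenant 2 tells Lieutenant 1 "He said Retreat!". In the second, the commanding general sends "Attack!" to Lieutenant 1 and "Retreat!" to Lieutenant 2, who tells Lieutenant 1 "He said Retreat!".]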
9 Byzantine generals: Oral Message algorithm OM(m)
Algorithm OM(0)
1. The commander sends its value to every lieutenant
2. Each lieutenant uses the value he received from the commander, or uses retreat if he received no value
Algorithm OM(m), $m > 0$
1. The commander sends its value to every lieutenant
2. $\forall i$, let $v_i$ be the value lieutenant $i$ receives from the commander, or retreat if it has received no value. Lieutenant $i$ acts as the commander of algorithm OM($m - 1$) to send the value $v_i$ to each of the $n - 2$ other lieutenants
3. $\forall j \neq i$, let $v_j$ be the value received by $i$ from $j$ in Step 2 through algorithm OM($m - 1$), or retreat if no value was received. Lieutenant $i$ uses the value majority($v_1, \ldots, v_{n-1}$), where majority is a deterministic function
10 Byzantine generals: Oral Message algorithm, example OM(1)
[Figure: example run of OM(1) with commander C and lieutenants L1, L2, L3. Values collected: L1: A, A, A; L2: A, A, R; L3: A, A, A.]
11 Byzantine generals: Oral messages
Theorem: For any m, Algorithm OM(m) satisfies conditions IC1 and IC2 if there are more than 3m generals and at most m traitors.
Problems:
- Message paths of length up to m + 1 (expensive)
- The absence of messages must be detected via time-out (vulnerable to DoS)
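The recursion in OM(m) is easier to follow when executed. The following is a minimal sketch, not Lamport's original formulation: the `send` function encodes a hypothetical traitor that flips the order for odd-numbered receivers, and `majority` falls back to retreat when there is no strict majority. Both choices are assumptions made for illustration.

```python
from collections import Counter

RETREAT = "retreat"

def majority(values):
    """Deterministic majority; RETREAT when no value has a strict majority."""
    value, count = Counter(values).most_common(1)[0]
    return value if count > len(values) / 2 else RETREAT

def send(sender, receiver, value, traitors):
    """Hypothetical traitor model: flip the order for odd-numbered receivers."""
    if sender in traitors and receiver % 2 == 1:
        return "attack" if value == RETREAT else RETREAT
    return value

def om(m, commander, lieutenants, value, traitors):
    """Run OM(m) and return {lieutenant: value it decides to use}."""
    # Step 1: the commander sends its value to every lieutenant.
    received = {i: send(commander, i, value, traitors) for i in lieutenants}
    if m == 0:
        return received
    # Step 2: each lieutenant j relays its value as commander of OM(m - 1).
    relayed = {}
    for j in lieutenants:
        others = [i for i in lieutenants if i != j]
        relayed[j] = om(m - 1, j, others, received[j], traitors)
    # Step 3: each lieutenant takes the majority of its n - 1 values.
    decisions = {}
    for i in lieutenants:
        votes = [received[i]] + [relayed[j][i] for j in lieutenants if j != i]
        decisions[i] = majority(votes)
    return decisions
```

With four generals and one traitor (`om(1, 0, [1, 2, 3], "attack", traitors={3})`), the loyal lieutenants 1 and 2 follow the commander's order. With three generals and a traitorous lieutenant 2, the loyal lieutenant 1 ends up disobeying a loyal commander, which illustrates why more than 3m generals are required.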
13 Practical Byzantine Fault Tolerance: A Byzantine renaissance
Bibliography: M. Castro and B. Liskov. Practical Byzantine fault tolerance and proactive recovery. ACM Trans. Comput. Syst., 20, Nov.
Contributions:
- First state machine replication protocol that survives Byzantine faults in asynchronous networks
- Live under weak Byzantine assumptions: a Byzantine Paxos/Raft!
- Implementation of a Byzantine fault-tolerant distributed file system
- Experiments measuring the cost of the replication technique
14 Practical Byzantine Fault Tolerance: Assumptions
System model:
- Asynchronous distributed system with N processes
- Unreliable channels
Unbreakable cryptography:
- Message $m$ is signed by its sender $i$, written $\langle m \rangle_{\sigma(i)}$, through public/private key pairs or message authentication codes (MACs)
- A digest $d(m)$ of message $m$ is produced through collision-resistant hash functions
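As an illustration of the two cryptographic primitives named above, here is a small sketch using Python's standard `hashlib` and `hmac` modules. The choice of SHA-256 and the helper names `digest`, `mac`, and `verify` are assumptions of this example; the slides do not prescribe specific algorithms here.

```python
import hashlib
import hmac

def digest(message: bytes) -> bytes:
    """d(m): digest of a message (SHA-256 is an assumption of this sketch)."""
    return hashlib.sha256(message).digest()

def mac(key: bytes, message: bytes) -> bytes:
    """Authenticate a message with a secret shared by sender and receiver."""
    return hmac.new(key, message, hashlib.sha256).digest()

def verify(key: bytes, message: bytes, tag: bytes) -> bool:
    """Constant-time check that the tag matches the message."""
    return hmac.compare_digest(mac(key, message), tag)
```

A receiver that shares `key` with the sender accepts a message only if `verify` returns `True`; any change to the message invalidates the tag.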
15 Practical Byzantine Fault Tolerance: Assumptions
Failure model:
- Up to f Byzantine servers
- N > 3f total servers
- (Potentially Byzantine clients)
Independent failures:
- Different implementations of the service
- Different operating systems
- Different root passwords, different administrators
16 Practical Byzantine Fault Tolerance: Specification
State machine replication:
- Replicated service with a state and deterministic operations operating on it
- Clients issue a request and block waiting for the reply
Safety:
- The system satisfies linearizability, provided that $N \geq 3f + 1$
- This holds regardless of faulty clients: all operations performed by faulty clients are observed in a consistent way by non-faulty clients
- The algorithm does not rely on synchrony to provide safety
Liveness:
- The algorithm relies on synchrony to provide liveness
- It assumes that delay(t) does not grow faster than t indefinitely
- This is a weak assumption if network faults are eventually repaired
- It circumvents the FLP impossibility result
20 Practical Byzantine Fault Tolerance: Optimality
Theorem: To tolerate up to f malicious nodes, N must be at least 3f + 1.
Proof:
- It must be possible to proceed after communicating with $N - f$ replicas, because the faulty replicas may not respond
- But the f replicas not responding may be just slow, so f of those that responded might be faulty
- The correct replicas who responded (at least $N - 2f$) must outnumber the faulty replicas, so $N - 2f > f \Rightarrow N > 3f$
21 Practical Byzantine Fault Tolerance: Optimality
- So $N > 3f$, to ensure that at least one correct replica is present in the reply set
- N = 3f + 1 is enough; more replicas are useless: they mean more and larger messages without improving resiliency
22 Practical Byzantine Fault Tolerance: Processes and views
- Replica IDs: $0, \ldots, N - 1$
- Replicas move through a sequence of configurations called views
- During view $v$, the primary is replica $i = v \bmod N$; the others are backups
- View changes are carried out when the primary appears to have failed
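The rotation of the primary across views can be written in two lines. This is only a sketch of the rule stated above; the function names `primary_of` and `backups_of` are hypothetical.

```python
def primary_of(view: int, n_replicas: int) -> int:
    """The primary of view v is replica v mod N."""
    return view % n_replicas

def backups_of(view: int, n_replicas: int) -> list:
    """All replicas other than the primary of the given view."""
    p = primary_of(view, n_replicas)
    return [i for i in range(n_replicas) if i != p]
```

With N = 4, view 5 has replica 1 as primary, so each view change simply moves leadership to the next replica ID.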
23 Practical Byzantine Fault Tolerance: The algorithm
- To invoke an operation, the client sends a request to the primary
- The primary multicasts the request to the backups
- Quorums are employed to guarantee ordering on operations
- When an order has been agreed, replicas execute the request and send a reply to the client
- When the client receives at least f + 1 identical replies, it is satisfied
[Figure: timeline with Client, Primary, Backup 1, Backup 2, Backup 3]
24 Practical Byzantine Fault Tolerance: Problems
The primary could be faulty!
- It could ignore commands, assign the same sequence number to different requests, skip sequence numbers, etc.
- Backups monitor the primary's behavior and trigger view changes to replace a faulty primary
Backups could be faulty!
- They could incorrectly store commands forwarded by a correct primary
- Use dissemination Byzantine quorum systems
Faulty replicas could incorrectly respond to the client!
- The client waits for f + 1 matching replies before accepting a response
25 Practical Byzantine Fault Tolerance: The general idea
Algorithm steps are justified by certificates:
- Sets (quorums) of signed messages from distinct replicas proving that a property of interest holds
With quorums of size at least 2f + 1:
- Any two quorums intersect in at least one correct replica
- There is always one quorum that contains only non-faulty replicas
[Figure: four servers and their states as clients issue write A and then write B; one participant is marked faulty (X).]
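The two quorum properties follow from simple counting, which the sketch below spells out for N = 3f + 1 replicas. The function names are hypothetical; the worst case assumed is that all f faulty replicas sit inside the intersection of the two quorums.

```python
def quorum_size(f: int) -> int:
    """Quorum size used by the certificates: 2f + 1."""
    return 2 * f + 1

def min_intersection(n: int, q: int) -> int:
    """Smallest possible overlap of two quorums of size q among n replicas."""
    return max(0, 2 * q - n)

def check(f: int) -> dict:
    n = 3 * f + 1
    q = quorum_size(f)
    overlap = min_intersection(n, q)
    return {
        "n": n,
        "quorum": q,
        "min_overlap": overlap,
        # Worst case: every faulty replica lies in the overlap.
        "correct_in_overlap": overlap - f,
        # The n - f correct replicas alone are enough to form a quorum.
        "all_correct_quorum_exists": n - f >= q,
    }
```

For f = 1 this gives four replicas, quorums of three, an overlap of at least two replicas, and therefore at least one correct replica in every intersection.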
27 Practical Byzantine Fault Tolerance: Protocol schema
- Normal operation: how the protocol works in the absence of failures (hopefully, the common case)
- View changes: how to depose a faulty primary and elect a new one
- Garbage collection: how to reclaim the storage used to keep certificates
- Recovery: how to make a faulty replica behave correctly again (not covered here)
28 Practical Byzantine Fault Tolerance: State
The internal state of each replica includes:
- the state of the actual service
- a message log containing all the messages the replica has accepted
- an integer denoting the replica's current view
29 Practical Byzantine Fault Tolerance: Client request
[Figure: message diagram among Primary and Backups 1-3; phase shown: Request]
$\langle \text{request}, o, t, c \rangle_{\sigma(c)}$
- $o$: state machine operation
- $t$: timestamp (used to ensure exactly-once semantics)
- $c$: client id
- $\sigma(c)$: client signature
30 Practical Byzantine Fault Tolerance: Pre-prepare phase
[Figure: message diagram among Primary and Backups 1-3; phases shown: Request, Pre-prepare]
$\langle \langle \text{pre-prepare}, v, n, d(m) \rangle_{\sigma(p)}, m \rangle$
- $v$: current view
- $n$: sequence number
- $d(m)$: digest of the client message
- $\sigma(p)$: primary signature
- $m$: client message
31 Practical Byzantine Fault Tolerance: Pre-prepare phase
$\langle \langle \text{pre-prepare}, v, n, d(m) \rangle_{\sigma(p)}, m \rangle$
A correct replica $i$ accepts the pre-prepare if:
- the pre-prepare message is well-formed
- the current view of $i$ is $v$
- $i$ has not accepted another pre-prepare for $v, n$ with a different digest
- $n$ is between two water-marks $L$ and $H$ (to avoid sequence number exhaustion caused by faulty primaries)
Each accepted pre-prepare message is stored in the accepting replica's message log (including the primary's).
Non-accepted pre-prepare messages are just discarded.
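The acceptance conditions for a pre-prepare can be sketched as a single method. The `Replica` class and its fields are hypothetical, signature checking is reduced to a `well_formed` flag, and the water-mark test is assumed to be `L < n <= H` (the slide only says "between").

```python
from dataclasses import dataclass, field

@dataclass
class Replica:
    view: int
    low: int    # water-mark L
    high: int   # water-mark H
    accepted: dict = field(default_factory=dict)  # (view, n) -> digest

    def accept_pre_prepare(self, v, n, digest, well_formed=True):
        """Return True and log the message if all four conditions hold."""
        if not well_formed or v != self.view:
            return False
        if not (self.low < n <= self.high):   # assumed inclusivity of H
            return False
        # Reject a different digest for a (view, sequence number) already used.
        if self.accepted.get((v, n), digest) != digest:
            return False
        self.accepted[(v, n)] = digest
        return True
```

A faulty primary that reuses sequence number 1 for a second request is caught by the digest comparison, and one that jumps far ahead is caught by the water-marks.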
32 Practical Byzantine Fault Tolerance: Prepare phase
[Figure: message diagram among Primary and Backups 1-3; phases shown: Request, Pre-prepare, Prepare]
$\langle \text{prepare}, v, n, d(m) \rangle_{\sigma(i)}$
Accepted by a correct replica $j$ if:
- the prepare message is well-formed
- the current view of $j$ is $v$
- $n$ is between two water-marks $L$ and $H$
Replicas that send prepare accept the sequence number $n$ for $m$ in view $v$.
Each accepted prepare message is stored in the accepting replica's message log.
34 Practical Byzantine Fault Tolerance: Prepare certificate (P-certificate)
Replica $i$ produces a prepare certificate prepared($m, v, n, i$) iff its log holds:
- the request $m$
- a pre-prepare for $m$ in view $v$ with sequence number $n$
- $2f$ prepare messages from different backups that match the pre-prepare
prepared($m, v, n, i$) means that a quorum of $2f + 1$ replicas agrees with assigning sequence number $n$ to $m$ in view $v$.
Theorem: There are no two non-faulty replicas $i, j$ such that prepared($m, v, n, i$) and prepared($m', v, n, j$), with $m \neq m'$.
Proof? (Hint: any two quorums of size $2f + 1$ intersect in at least one correct replica, and a correct replica never sends prepares for two different digests with the same $v, n$.)
35 Practical Byzantine Fault Tolerance: Commit phase
[Figure: message diagram among Primary and Backups 1-3; phases shown: Request, Pre-prepare, Prepare, Commit]
$\langle \text{commit}, v, n, d(m), i \rangle_{\sigma(i)}$
After having collected a P-certificate prepared($m, v, n, i$), replica $i$ sends a commit message.
A commit message is accepted if:
- it is well-formed
- the current view of the receiving replica is $v$
- $n$ is between two water-marks $L$ and $H$
36 Practical Byzantine Fault Tolerance: Commit certificate (C-certificate)
Commit certificates ensure total order across views: they guarantee that prepare certificates cannot be missed during a view change.
A replica has a certificate committed($m, v, n, i$) if:
- it had a P-certificate prepared($m, v, n, i$)
- its log contains 2f + 1 matching commit messages from different replicas (possibly including its own)
A replica executes a request after it gets a commit certificate for it and has executed all requests with smaller sequence numbers.
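The P-certificate and C-certificate conditions reduce to counting distinct senders in the message log. The sketch below assumes a hypothetical log representation: a set of tuples `(kind, view, n, digest, sender)`, with signatures already verified and the request itself assumed to be present.

```python
def senders(log, kind, v, n, d):
    """Distinct senders of messages of the given kind matching (v, n, d)."""
    return {msg[4] for msg in log if msg[:4] == (kind, v, n, d)}

def prepared(log, v, n, d, f):
    """Pre-prepare plus 2f matching prepares from replicas other than its sender."""
    primary = senders(log, "pre-prepare", v, n, d)
    backups = senders(log, "prepare", v, n, d) - primary
    return bool(primary) and len(backups) >= 2 * f

def committed(log, v, n, d, f):
    """P-certificate plus 2f + 1 matching commits from distinct replicas."""
    return prepared(log, v, n, d, f) and len(senders(log, "commit", v, n, d)) >= 2 * f + 1
```

Using sets of sender IDs means that a faulty replica repeating the same message many times still counts only once toward a certificate.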
37 Practical Byzantine Fault Tolerance: Reply phase
[Figure: message diagram among Primary and Backups 1-3; phases shown: Request, Pre-prepare, Prepare, Commit, Reply]
$\langle \text{reply}, v, t, c, i, r \rangle_{\sigma(i)}$
- $r$ is the reply
- The client waits for f + 1 replies with the same $t$ and $r$
- If the client does not receive replies soon enough, it broadcasts the request to all replicas
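On the client side, the rule "wait for f + 1 replies with the same t and r" can be sketched as follows. The reply representation `(replica_id, timestamp, result)` and the function name are assumptions of this example.

```python
from collections import defaultdict

def accepted_result(replies, t, f):
    """Return the result backed by f + 1 distinct replicas for timestamp t, else None."""
    voters = defaultdict(set)
    for replica, ts, result in replies:
        if ts == t:
            voters[result].add(replica)
    for result, ids in voters.items():
        if len(ids) >= f + 1:
            return result
    return None
```

Since at most f replicas are faulty, f + 1 matching replies from distinct replicas include at least one from a correct replica, so the result can be trusted.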
38 Practical Byzantine Fault Tolerance: View change
An unsatisfied backup $i$ mutinies:
- it stops accepting messages (except view-change and new-view)
- it multicasts $\langle \text{view-change}, v + 1, P, i \rangle_{\sigma(i)}$
- $P$ contains a P-certificate $P_m$ for each request $m$ (up to a given number, see garbage collection)
The mutiny succeeds if the new primary collects a new-view certificate $V$: a set containing 2f + 1 view-change messages, indicating that 2f + 1 distinct replicas (including itself) support the change of leadership.
39 Practical Byzantine Fault Tolerance: View change
The primary elect $p'$ (replica $(v + 1) \bmod N$):
- extracts from the new-view certificate $V$ the highest sequence number $h$ of any message for which $V$ contains a P-certificate
- creates a new pre-prepare message for every sequence number $n \leq h$ and adds it to the set $O$:
  - if there is a P-certificate for $n, m$ in $V$: $O \leftarrow O \cup \{\langle \text{pre-prepare}, v + 1, n, d(m) \rangle_{\sigma(p')}\}$
  - otherwise: $O \leftarrow O \cup \{\langle \text{pre-prepare}, v + 1, n, d(\text{null}) \rangle_{\sigma(p')}\}$
- multicasts $\langle \text{new-view}, v + 1, V, O \rangle_{\sigma(p')}$
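The construction of the set O by the primary elect can be sketched as below. This is a simplification under stated assumptions: each view-change message is reduced to a dict mapping sequence numbers to request digests, signatures are omitted, conflicting certificates are not handled, and `low` is a hypothetical parameter for the highest sequence number already garbage-collected.

```python
NULL_DIGEST = "d(null)"

def build_new_view_set(view_changes, new_view, low=0):
    """Re-issue pre-prepares for every sequence number in (low, h]."""
    certified = {}
    for p_certs in view_changes:          # one dict {n: digest} per view-change
        certified.update(p_certs)
    h = max(certified, default=low)       # highest n with a P-certificate
    return [
        ("pre-prepare", new_view, n, certified.get(n, NULL_DIGEST))
        for n in range(low + 1, h + 1)
    ]
```

Sequence numbers without a P-certificate are filled with a null request, so that the new view has no gaps in the sequence.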
40 Practical Byzantine Fault Tolerance: View change
A backup accepts a $\langle \text{new-view}, v + 1, V, O \rangle_{\sigma(p')}$ message for $v + 1$ if:
- it is signed properly by $p'$
- $V$ contains valid view-change messages for $v + 1$
- the correctness of $O$ can be locally verified (repeating the primary's computation)
Actions:
- It adds all entries in $O$ to its log (so did $p'$!)
- It multicasts a prepare for each message in $O$
- It adds all prepares to the log and enters the new view
41 Practical Byzantine Fault Tolerance: Garbage collection
A correct replica keeps in its log the messages about request $o$ until:
- $o$ has been executed by a majority of correct replicas, and
- this fact can be proven during a view change
Truncate the log with stable checkpoints:
- Each replica $i$ periodically (after processing $k$ requests) checkpoints its state and multicasts $\langle \text{checkpoint}, n, d, i \rangle$
  - $n$: last executed request
  - $d$: state digest
- A set $S$ containing 2f + 1 equivalent checkpoint messages from distinct processes is a proof of the checkpoint's correctness (stable checkpoint certificate)
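Detecting a stable checkpoint is again a matter of counting matching messages from distinct replicas. In this sketch, checkpoint messages are represented as hypothetical tuples `(n, digest, replica_id)` and are assumed to be already authenticated.

```python
from collections import defaultdict

def stable_checkpoint(messages, f):
    """Return the highest (n, digest) backed by 2f + 1 distinct replicas, or None."""
    support = defaultdict(set)
    for n, d, replica in messages:
        support[(n, d)].add(replica)
    stable = [key for key, ids in support.items() if len(ids) >= 2 * f + 1]
    return max(stable, default=None)
```

Once a checkpoint at sequence number n is stable, log entries with sequence numbers up to n can be discarded.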
42 Practical Byzantine Fault Tolerance: View change, revisited
Message $\langle \text{view-change}, v + 1, n, S, C, P, i \rangle_{\sigma(i)}$:
- $n$: the sequence number of the last stable checkpoint
- $S$: the last stable checkpoint
- $C$: the checkpoint certificate (2f + 1 checkpoint messages)
Message $\langle \text{new-view}, v + 1, n, V, O \rangle_{\sigma(p')}$:
- $n$: the sequence number of the last stable checkpoint
- $V, O$: contain only requests with sequence number larger than $n$
43 Practical Byzantine Fault Tolerance: Optimizations
Reducing replies:
- One replica is designated to send the reply to the client
- The other replicas send a digest of the reply
Lower latency for writes (4 messages):
- Replicas respond at the Prepare phase (tentative execution)
- The client waits for 2f + 1 matching responses
Fast reads (one round trip):
- The client sends to all; they respond immediately
- The client waits for 2f + 1 matching responses
44 Practical Byzantine Fault Tolerance: Optimizations: cryptography
Reducing overhead:
- Public-key cryptography only for view changes
- MACs (message authentication codes) for all other messages
To give an idea (Pentium 200 MHz):
- Generating a 1024-bit RSA signature of an MD5 digest: 43 ms
- Generating a MAC of the same message: 10 µs
45 Practical Byzantine Fault Tolerance: Application: Byzantine NFS server
[Table 3: Andrew benchmark: BFS vs NFS-std. Columns: phase, BFS strict, BFS r/o lookup, NFS-std. Times are in seconds.]
The time spent waiting for lookup operations to complete in BFS-strict is at least 20% of the elapsed time for the first four phases, whereas it is less than 5% of the elapsed time for the last phase.
These results show that BFS can be used in practice: BFS-strict takes only 3% more time to run the complete benchmark.
47 Practical Byzantine Fault Tolerance: Reality check
Examples of systems that have adopted Byzantine Fault Tolerance:
- Boeing 777 Aircraft Information Management System
- Boeing 777/787 flight control system
- SpaceX Dragon flight control system
- Bitcoin
48 Distributed Algorithms Practical Byzantine Fault Tolerance Alberto Montresor Università di Trento 2018/12/06 Acknowledgments: Lorenzo Alvisi This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
51 Beyond PBFT: Overview
After PBFT, several other papers started to appear:
- HQ: J. Cowling, D. Myers, B. Liskov, R. Rodrigues, and L. Shrira. HQ replication: A hybrid quorum protocol for Byzantine fault tolerance. In Proc. of the Symposium on Operating Systems Design and Implementation, OSDI 06, Oct.
- Q/U: M. Abd-El-Malek, G. Ganger, G. Goodson, M. Reiter, and J. Wylie. Fault-scalable Byzantine fault-tolerant services. In Proc. of the ACM Symposium on Operating Systems Principles, SOSP 05, Oct.
The end result has been to complicate the adoption of Byzantine solutions.
52 Beyond PBFT: Overview
In the regions we studied (up to f = 5), if contention is low and low latency is the main issue, then if it is acceptable to use 5f + 1 replicas, Q/U is the best choice, else HQ is the best since it outperforms PBFT with a batch size of 1. Otherwise, PBFT is the best choice in this region: it can handle high contention workloads, and it can beat the throughput of both HQ and Q/U through its use of batching. Outside of this region, we expect HQ will scale best: HQ's throughput decreases more slowly than Q/U's (because of the latter's larger message and processing costs) and PBFT's (where eventually batching cannot compensate for the quadratic number of messages).
54 Zyzzyva: Introduction
Zyzzyva (SOSP 07)
R. Kotla, A. Clement, E. Wong, L. Alvisi, and M. Dahlin. Zyzzyva: Speculative Byzantine fault tolerance. In Proc. of the ACM Symposium on Operating Systems Principles (SOSP 07), Stevenson, WA, Oct. ACM.
One protocol to rule them all! Zyzzyva is the last word on BFT! (Is it?)
Footnote: Zyzzyva is the last word of the English dictionary (apart from Zyzzyzus).
55 Zyzzyva: Replica coordination
All correct replicas execute the same sequence of commands. For each received command c, correct replicas:
- agree on c's position in the sequence
- execute c in the agreed-upon order
- reply to the client
56 Zyzzyva: How it is done now
[Figure: PBFT message diagram among Primary and Backups 1-3, with phases Request, Pre-prepare, Prepare, Commit, Reply]
57 Zyzzyva: The engineer's rule of thumb
"Handle normal and worst case separately as a rule, because the requirements for the two are quite different: the normal case must be fast; the worst case must make some progress." (Butler Lampson, Hints for Computer System Design)
58 Zyzzyva: How Zyzzyva does it
[Figure: message diagram among Primary and Replicas 1-3, with a single Request phase]
59 Zyzzyva: Specification for State Machine Replication (SMR)
- Stability: a command is stable at a replica once its position in the sequence cannot change
- Safety: correct clients only process replies to stable commands
- Liveness: all commands issued by correct clients eventually become stable and elicit a reply
60 Zyzzyva: Enforcing safety
Safety requires:
- Correct clients only process replies to stable commands
...but SMR implementations enforce instead:
- Correct replicas only execute and reply to commands that are stable
- The service performs an output commit with each reply
61 Zyzzyva: Speculative BFT ("Trust, but verify")
Replicas execute and reply to a command without knowing whether it is stable:
- they trust the order provided by the primary
- there is no explicit replica agreement!
A correct client, before processing a reply, verifies that it corresponds to a stable command:
- if not, the client takes action to ensure liveness
62 Zyzzyva: Verifying stability
Necessary condition for stability in Zyzzyva: a command c can become stable only if a majority of correct replicas agree on its position in the sequence.
A client can process a response for c iff:
- a majority of correct replicas agrees on c's position
- the set of replies is incompatible, for all possible future executions, with a majority of correct replicas agreeing on a different command holding c's current position
63 Zyzzyva: History
- History $H_{i,k}$ is the sequence of the first $k$ commands executed by replica $i$
- On receipt of a command c from the primary, a replica appends c to its command history
- The replica's reply for c includes the application-level response and the corresponding command history
- Additional detail: the history can be hashed through incremental hashing
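Incremental hashing lets a replica summarize its whole command history in a fixed-size value that is updated in constant time per command. The sketch below chains SHA-256 digests; the hash function, the all-zero initial value, and the function names are assumptions of this example.

```python
import hashlib

EMPTY_HISTORY = b"\x00" * 32  # assumed hash of the empty history

def extend_history(prev_hash: bytes, command: bytes) -> bytes:
    """h_k = H(h_{k-1} || d(c_k)): fold one more command into the history hash."""
    d = hashlib.sha256(command).digest()
    return hashlib.sha256(prev_hash + d).digest()

def history_hash(commands) -> bytes:
    """Hash of a full history, computed by repeated extension."""
    h = EMPTY_HISTORY
    for c in commands:
        h = extend_history(h, c)
    return h
```

Two replicas that executed the same commands in the same order produce the same hash, so a client can compare histories by comparing a single short value.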
64 Zyzzyva: Three cases. Case 1: Unanimity
[Figure: the client sends c to the primary, which forwards $\langle c, k \rangle$ to Replicas 1-3; all four replicas reply $\langle r_i, H_{i,k} \rangle$ to the client]
The client processes the response if all replies match: $r_1 = \ldots = r_4$ and $H_{1,k} = \ldots = H_{4,k}$
68 Zyzzyva: Three cases. Case 1: Unanimity
Some comments:
- Although the client has a proof that the request's position in the command history is irrevocably set, no server has such a proof
- The comparison of histories may be based on incremental hashes
- Three message hops complete the request in the good case
Is it safe to accept the reply in this case?
- All processes have agreed on the ordering
- Correct processes cannot change their mind later
- A new primary can ask $n - f$ replicas for their histories
69 Zyzzyva: Three cases. Case 2: A majority of correct replicas agree
[Figure: the primary forwards $\langle c, k \rangle$ to Replicas 1-3; the client receives only three matching replies $\langle r_1, H_{1,k} \rangle$, $\langle r_2, H_{2,k} \rangle$, $\langle r_3, H_{3,k} \rangle$]
- Is it safe to accept such replies? Consider this case...
- The client sends to all replicas a commit certificate $CC = \langle H_{1,k}, H_{2,k}, H_{3,k} \rangle$ containing 2f + 1 matching histories
- Replicas answer the commit certificate with an ack
- The client processes the response if it receives at least 2f + 1 acks
73 Zyzzyva: Three cases. Case 2: A majority of correct replicas agree
Safe?
- The certificate proves that a majority of correct processes agree on the command's position in the sequence
- This is incompatible with a majority backing a different command for that position
Stability:
- Stability depends on matching command histories
- Stability is prefix-closed: if a command with sequence number $k$ is stable, then so is every command with sequence number $k' < k$
74 Zyzzyva: Three cases. Case 3: None of the above
[Figure: the client receives replies $\langle r_1, H_{1,k} \rangle$ and $\langle r_2, H_{2,k} \rangle$ only; Replicas 2 and 3 do not reply]
- Fewer than 2f + 1 replies match
- The client retransmits c to all replicas, hinting that the primary may be faulty
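The three cases can be summarized as a client-side decision on the number of matching replies, assuming N = 3f + 1 replicas. In this sketch each reply is a hypothetical pair `(response, history_hash)`, and the returned strings are illustrative labels rather than protocol messages.

```python
from collections import Counter

def client_action(replies, f):
    """Decide what a Zyzzyva client does with the replies collected so far."""
    if not replies:
        return "retransmit"
    _, matching = Counter(replies).most_common(1)[0]
    if matching == 3 * f + 1:
        return "process"                   # case 1: unanimity
    if matching >= 2 * f + 1:
        return "send-commit-certificate"   # case 2: gather 2f + 1 acks first
    return "retransmit"                    # case 3: primary may be faulty
```

In case 2 the client still has to collect 2f + 1 acks for its commit certificate before processing the response.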
75 Zyzzyva: The case of the missing phase
[Figure: PBFT message diagram (Request, Pre-prepare, Prepare, Commit, Reply) next to the Zyzzyva case 2 diagram ($c$, $\langle c, k \rangle$, $\langle r_i, H_{i,k} \rangle$, $CC$, ack)]
- Where did the third phase go?
- Why was it there to begin with?
76 Zyzzyva: The missing phase commit
Consider this scenario:
- There are f malicious replicas, including the primary
- The primary stops communicating with f correct replicas
- These f correct replicas go on strike: they stop accepting messages in this view and ask for a view change
- f + f replicas have stopped accepting messages, while f + 1 replicas keep working
- The remaining f + 1 replicas are not enough to conclude the pre-prepare and prepare phases
- The f correct processes that are asking for a view change are not enough to conclude one, so there is no opportunity to regain liveness by electing a new primary
77 Zyzzyva: The missing phase commit
The third phase of PBFT breaks this stalemate:
- The remaining f + 1 replicas either gather the evidence necessary to complete the request, or determine that a view change is necessary
- The commit phase is needed for liveness
78 Zyzzyva: View changes. Where did the third phase go?
In PBFT: what compromises liveness in the previous scenario is that the PBFT view change protocol lets correct replicas commit to a view change and become silent in a view without any guarantee that their action will lead to the view change.
In Zyzzyva: a correct replica does not abandon view v unless it is guaranteed that every other correct replica will do the same, forcing a new view and a new primary.
79 Zyzzyva: View changes. View change
Two phases:
- Processes unsatisfied with the current primary send a message $\langle \text{i-hate-the-primary}, v \rangle$ to all
- If a process collects f + 1 i-hate-the-primary messages, it sends a message to all containing such messages and starts a new view change (similar to the traditional one)
The extra phase of the agreement protocol is moved to the view change protocol.
80 Zyzzyva: View changes. Optimizations
- Checkpoint protocol to garbage collect histories
- Replacing digital signatures with MACs
- Replicating application state at only 2f + 1 replicas
- Batching
81 Zyzzyva: View changes. Performance
[Figure from R. Kotla et al., Fig. 4: realized throughput (Kops/sec) for the 0/0 benchmark as the number of clients varies, for systems configured to tolerate f = 1 faults. Curves: Unreplicated, Zyzzyva, Zyzzyva (B=10), Zyzzyva5, Zyzzyva5 (B=10), PBFT, PBFT (B=10), HQ, Q/U max throughput.]
82 Zyzzyva: View changes. Discussion
- What have you learned?
- Do you agree on the principles?
84 Aardvark
Aardvark (footnote 4)
NSDI '09: A. Clement, E. Wong, L. Alvisi, M. Dahlin, and M. Marchetti. Making Byzantine fault tolerant systems tolerate Byzantine faults. In Proc. of the 6th USENIX Symposium on Networked Systems Design and Implementation, NSDI '09. USENIX Association.
A new beginning!
Footnote 4: Aardvark is the first word of the English dictionary ("oritteropo" in Italian).
85 Aardvark
From the article: surviving vs tolerating
"Although current BFT systems can survive Byzantine faults without compromising safety, we contend that a system that can be made completely unavailable by a simple Byzantine failure can hardly be said to tolerate Byzantine faults."
86 Aardvark
Conventional wisdom
Handle normal and worst case separately:
- remain safe in the worst case
- make progress in the normal case
Maximize performance when:
- the network is synchronous
- all clients and servers behave correctly
89 Aardvark
Conventional wisdom is:
- Misguided: it encourages systems that fail to deliver BFT
- Dangerous: it encourages fragile optimizations
- Futile: it yields diminishing returns on the common case
90 Aardvark
Blueprint
Build the system around an execution path that:
- provides acceptable performance across the broadest set of executions
- is easy to implement
- is robust against Byzantine attempts to push the system away from it
91 Aardvark
Revisiting conventional wisdom
"Signatures are expensive, use MACs"
- Faulty clients can use MACs to generate ambiguity (one node validating a MAC authenticator does not guarantee that any other node will validate that same authenticator)
- Aardvark requires clients to sign requests
"View changes are to be avoided"
- Aardvark uses regular view changes to maintain high throughput despite faulty primaries
"Hardware multicast is a boon"
- Aardvark uses separate work queues for clients and individual replicas
- Aardvark uses a fully connected topology among replicas (separate NICs)
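The ambiguity that MAC authenticators allow can be shown with a toy example. This is a sketch under stated assumptions, not Aardvark's or PBFT's wire format: the replica names, the toy pairwise keys, and the helper functions `mac`, `make_authenticator`, and `verifies` are all hypothetical. A faulty client builds an authenticator (one MAC per replica) whose entry for the primary is valid while the entries for the other replicas are not, so the primary accepts the request and the others reject it.

```python
import hashlib
import hmac


def mac(key: bytes, message: bytes) -> bytes:
    """MAC of a message under a pairwise client/replica key (HMAC-SHA256)."""
    return hmac.new(key, message, hashlib.sha256).digest()


def make_authenticator(keys: dict, message: bytes, corrupt=()) -> dict:
    """Build one MAC per replica. Entries for replicas listed in `corrupt`
    are deliberately invalid, as a faulty client could choose to do."""
    auth = {}
    for replica, key in keys.items():
        tag = mac(key, message)
        if replica in corrupt:
            tag = bytes(len(tag))  # all-zero tag, will not verify
        auth[replica] = tag
    return auth


def verifies(replica: str, keys: dict, message: bytes, auth: dict) -> bool:
    """Each replica can check only its own entry of the authenticator."""
    return hmac.compare_digest(auth[replica], mac(keys[replica], message))


# Toy keys for illustration only; real keys would be random secrets.
keys = {name: name.encode() * 4 for name in ("primary", "r1", "r2", "r3")}
request = b"op=transfer;seq=7"
auth = make_authenticator(keys, request, corrupt=("r1", "r2", "r3"))
results = {name: verifies(name, keys, request, auth) for name in keys}
```

Because each replica can verify only its own entry, the primary has no way to tell that the other replicas will reject the same request. A digital signature does not have this problem, since every replica checks the same value.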
92 Aardvark
MAC Attack
[Figure: message diagram with client c, the primary, and replicas 1, 2 and 3; messages are labeled <c,k>]
94 Aardvark
Throughput

System     Best case   Faulty client   Client flood   Faulty primary   Faulty replica
PBFT       62K         0               crash          1k               250
QU         24K         0               crash          NA               19k
HQ         15K         NA              4.5K           NA               crash
Zyzzyva    80K         0               crash          crash            0
Aardvark   39K         39K             7.8K           37K              11K
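One way to read the throughput figures on this slide is to ask how much of the best-case throughput each system retains in its worst measured faulty scenario. The sketch below only rearranges the numbers from the slide; the helper names are hypothetical, "crash" is treated as zero throughput, and "NA" entries are skipped, both of which are assumptions made for this illustration.

```python
def parse(cell: str):
    """Convert a table cell to a number: 'crash' -> 0.0, 'NA' -> None."""
    cell = cell.strip().lower()
    if cell == "na":
        return None
    if cell == "crash":
        return 0.0
    if cell.endswith("k"):
        return float(cell[:-1]) * 1000
    return float(cell)


# Columns: best case, faulty client, client flood, faulty primary, faulty replica
TABLE = {
    "PBFT":     ["62K", "0", "crash", "1k", "250"],
    "QU":       ["24K", "0", "crash", "NA", "19k"],
    "HQ":       ["15K", "NA", "4.5K", "NA", "crash"],
    "Zyzzyva":  ["80K", "0", "crash", "crash", "0"],
    "Aardvark": ["39K", "39K", "7.8K", "37K", "11K"],
}


def worst_case_fraction(row):
    """Fraction of best-case throughput kept in the worst measured scenario."""
    best, *faulty = [parse(cell) for cell in row]
    measured = [value for value in faulty if value is not None]
    return min(measured) / best


fractions = {name: worst_case_fraction(row) for name, row in TABLE.items()}
```

Under these assumptions, Aardvark is the only system in the table whose worst-case fraction is above zero (about one fifth of its best case), which is the trade it makes for a lower best-case figure.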
96 UpRight
UpRight
Bibliography: A. Clement, M. Kapritsos, S. Lee, Y. Wang, L. Alvisi, M. Dahlin, and T. Riche. UpRight cluster services. In Proc. of the ACM Symposium on Operating Systems Principles, SOSP '09, Oct.
A new (B)FT replication library:
- Minimal intrusiveness for existing apps
- Adequate performance
Goal: ease BFT deployment
- make explicit the incremental cost of BFT
- switching to BFT: a simple change in a config file
97 UpRight
UpRight
u = max number of failures to ensure liveness
r = max number of commission failures to preserve safety
[Figure: failure hierarchy. Byzantine failures comprise omission failures (which include crashes) and commission failures]
r = u = f: BFT
r = 0: CFT
98 UpRight
UpRight
Exposes the incremental cost of BFT:
- Byzantine agreement: if r << u, BFT ≈ CFT in replication cost
Allows richer design options:
- Byzantine faults are rare: u > r
- Safety more critical than liveness: r > u
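The effect of separating u and r on replication cost can be illustrated with a one-line formula. The slides do not state it; the sketch below assumes the 2u + r + 1 replica count that the UpRight paper reports for its agreement stage, and the function name is hypothetical.

```python
def agreement_replicas(u: int, r: int) -> int:
    """Replicas needed for agreement, ASSUMING the 2u + r + 1 bound from the
    UpRight paper (u: failures tolerated for liveness, r: commission
    failures tolerated for safety)."""
    if u < 0 or r < 0:
        raise ValueError("u and r must be non-negative")
    return 2 * u + r + 1


bft = agreement_replicas(u=1, r=1)   # r = u = f: the familiar 3f + 1
cft = agreement_replicas(u=1, r=0)   # r = 0: the familiar 2f + 1
mixed = agreement_replicas(u=3, r=1) # many omissions, few commissions
```

Under this assumption, setting r = u = f gives 3f + 1 and setting r = 0 gives 2u + 1, and a configuration with r much smaller than u costs only slightly more than crash fault tolerance.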
99 UpRight
Reality check
- UpRight (Java; latest update Oct. 2009)
- ArchiStar-BFT (Java; latest update May 2015)
- BFT-SMaRt (Java; latest update Apr. 2016)
100 UpRight
For (far in the) future lectures
S. Gaertner, M. Bourennane, C. Kurtsiefer, A. Cabello, and H. Weinfurter. Experimental demonstration of a quantum protocol for Byzantine agreement and liar detection. Physical Review Letters, 100(7), Feb.
101 UpRight
Reading material
- M. Castro and B. Liskov. Practical Byzantine fault tolerance. In Proc. of the 3rd Symposium on Operating Systems Design and Implementation, OSDI '99, New Orleans, Louisiana, USA. USENIX Association.
- R. Kotla, A. Clement, E. Wong, L. Alvisi, and M. Dahlin. Zyzzyva: Speculative Byzantine fault tolerance. In Proc. of the ACM Symposium on Operating Systems Principles (SOSP '07), Stevenson, WA, Oct. ACM.
- A. Clement, E. Wong, L. Alvisi, M. Dahlin, and M. Marchetti. Making Byzantine fault tolerant systems tolerate Byzantine faults. In Proc. of the 6th USENIX Symposium on Networked Systems Design and Implementation, NSDI '09. USENIX Association.