BYZANTINE GENERALS
Michał Szychowiak, 2002
Dependability of Distributed Systems (Byzantine agreement)


BYZANTINE GENERALS (1)
A fable:

BYZANTINE GENERALS (2)
Byzantine Generals Problem:
Condition 1: All loyal generals decide upon the same plan of action.
Condition 2: A small number of traitors cannot cause the loyal generals to adopt a bad plan.
Condition 2 needs a more formal statement, but it is hard to formalize because it requires expressing precisely what a bad plan is. Instead, we consider how the generals reach a decision.

BYZANTINE GENERALS (3)
Each general G_i makes a decision v_i and broadcasts it. If general G_i is loyal, then every other general uses his v_i.
Let's trust the loyal generals and try majority-based voting:
  If most of the generals independently decide ATTACK, then attack.
  If most of the generals independently decide RETREAT, then retreat.
  If they are divided, then... hmm... well, uh... hmm... who cares?
What's the problem? ...traitors!
  If you receive v_j from G_j, you can't trust him.
  You must be careful, because you don't know whom to trust.
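Why naive voting breaks can be seen in a minimal sketch. It assumes four loyal generals split evenly and one traitor who tells each side something different; the function name `naive_decision` and the tie rule (retreat unless ATTACK has strictly more votes) are illustrative assumptions, not part of the slides.

```python
from collections import Counter

def naive_decision(received):
    """Decide by simple majority over the values one general has collected.

    Assumption for this sketch: ATTACK needs strictly more votes than RETREAT.
    """
    counts = Counter(received)
    return "ATTACK" if counts["ATTACK"] > counts["RETREAT"] else "RETREAT"

# Votes of the four loyal generals, seen identically by everyone.
loyal_votes = ["ATTACK", "ATTACK", "RETREAT", "RETREAT"]

# The traitor sends a different vote to general A than to general B.
view_a = loyal_votes + ["ATTACK"]
view_b = loyal_votes + ["RETREAT"]

decision_a = naive_decision(view_a)
decision_b = naive_decision(view_b)
```

Here the two loyal generals end up with different plans, which violates Condition 1 even though only one general lied.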

BYZANTINE GENERALS (4)
[Figure: generals exchanging conflicting ATTACK and RETREAT messages; the traitor ("heh-heh") leaves a loyal general wondering "ATTACK?"]

BYZANTINE GENERALS (5)
Byzantine failures
Failure types:
  crash failures
  omission failures
  fail-stop model
  Byzantine (malicious) failures
Omissions in the Byzantine Generals model:
Since a faulty general (a traitor) can refuse to send a message, a nonfaulty general may never receive an expected message. In such a situation, we assume that the nonfaulty (loyal) general simply chooses an arbitrary value and acts as if the expected message had been received.
Obviously, we require that such an omission can be detected by the respective receiver. In synchronous systems, where the duration of each round is known, this detection is simple: all expected messages not received by the end of a round were not sent (omitted).
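The omission rule for synchronous rounds might be sketched as follows. The helper `close_round`, the dictionary-based inbox, and the choice of 0 as the substituted value are assumptions made for illustration; the slide only requires that some arbitrary value be used.

```python
DEFAULT_ORDER = 0  # assumed substitute value for an omitted message

def close_round(expected_senders, inbox, default=DEFAULT_ORDER):
    """At the end of a round, return one value per expected sender.

    Any sender whose message did not arrive before the round ended is
    treated as having omitted it, and the default value is used instead.
    """
    return {sender: inbox.get(sender, default) for sender in expected_senders}

# G2 stayed silent during this round.
values = close_round(["G1", "G2", "G3"], {"G1": 1, "G3": 1})
```

The receiver never blocks on the silent general: once the round's known duration has elapsed, every slot has a value.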

BYZANTINE AGREEMENT (1)
A commanding general G_C must send an order to N-1 lieutenant generals such that:
BA1: All loyal lieutenants obey the same order.
BA2: If the commander is loyal, then every loyal lieutenant obeys the order he sends.
Note that if G_C is loyal, then BA1 follows directly from BA2.
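The two conditions can be read as predicates over the outcome of a run. The following is only a sketch: the function `check_ba` and its argument layout are hypothetical, chosen to make BA1 and BA2 testable on small examples.

```python
def check_ba(decisions, loyal_lieutenants, commander_loyal, order):
    """Return (ba1, ba2) for one finished run.

    decisions: mapping lieutenant -> order it obeyed
    loyal_lieutenants: the lieutenants that are not traitors
    commander_loyal: whether the commander is loyal
    order: the order sent by the commander (only meaningful if loyal)
    """
    obeyed = {decisions[g] for g in loyal_lieutenants}
    ba1 = len(obeyed) <= 1                          # all loyal lieutenants agree
    ba2 = (not commander_loyal) or obeyed <= {order}  # they follow a loyal commander
    return ba1, ba2

# Loyal commander ordered 1; G3 is a traitor, so its decision is ignored.
ok = check_ba({"G1": 1, "G2": 1, "G3": 0}, {"G1", "G2"}, True, 1)

# Traitorous commander; the loyal lieutenants disagree, so BA1 fails.
split = check_ba({"G1": 1, "G2": 0}, {"G1", "G2"}, False, None)
```

Note that BA2 holds vacuously when the commander is a traitor, which is why BA1 is needed as a separate condition.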

IMPOSSIBILITY OF BYZANTINE AGREEMENT (1)
Loyalty is very important:
Claim: The Byzantine Generals Problem is unsolvable if f ≥ N/3 generals are traitors (there is no f-resilient Byzantine Agreement algorithm for f ≥ N/3).
First we will show that no solution for 3 generals can handle a single traitor.

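IMPOSSIBILITY OF BYZANTINE AGREEMENT (2)
[Figure: 3 generals. The commander sends 1 (ATTACK) to one lieutenant and 0 (RETREAT) to the other; the second lieutenant reports "he said 0 (RETREAT)", and the first lieutenant cannot tell who is lying (???).]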

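IMPOSSIBILITY OF BYZANTINE AGREEMENT (3)
[Figure: 3 generals. The commander sends 1 (ATTACK) to both lieutenants; one lieutenant nevertheless reports "he said 0 (RETREAT)", and the other lieutenant sees exactly the same messages as in the previous figure (???).]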

IMPOSSIBILITY OF BYZANTINE AGREEMENT (4)
Theorem: No solution for fewer than 3t + 1 generals can cope with t traitors.
Proof strategy:
Assume we have an algorithm A_t for 3t generals with t > 1 traitors.
We construct a 3-general, 1-traitor algorithm A_1:
  each general simulates t generals in A_t
  1 traitor simulates t traitors
  2 loyal generals simulate 2t loyal generals
  both BA1 and BA2 are satisfied
But this is impossible, so our assumption is wrong!
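The theorem translates into simple arithmetic: N generals can cope with t traitors only if N ≥ 3t + 1. A small sketch of that bound, with the helper name `max_traitors` chosen here for illustration:

```python
def max_traitors(n_generals):
    """Largest t with n_generals >= 3*t + 1, i.e. floor((N - 1) / 3)."""
    return (n_generals - 1) // 3

# Tolerated traitors for a few army sizes.
table = {n: max_traitors(n) for n in (3, 4, 6, 7, 10)}
```

Three generals tolerate no traitor at all, four tolerate one, and a second traitor only becomes tolerable at seven generals.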

APPROXIMATE AGREEMENT (1)
Maybe the difficulty is requiring exact agreement?
New problem: G_C sends an attack TIME to N-1 lieutenant generals:
AA1: All loyal lieutenants attack within 10 minutes of each other.
AA2: If the commander is loyal, then every loyal lieutenant attacks within 10 minutes of the commander's order.
Question: Is this agreement problem any easier?

APPROXIMATE AGREEMENT (2)
Theorem: No Approximate Agreement algorithm for fewer than 3t + 1 generals can cope with t traitors.
Proof strategy:
Assume we have a 3-general, 1-traitor Approximate Agreement algorithm.
We will transform it into a 3-general, 1-traitor Byzantine Agreement algorithm.
But this is impossible, so our assumption is wrong!

APPROXIMATE AGREEMENT (3)
Transformation to Byzantine Agreement
Suppose G_C sends:
  1:00 to mean ATTACK
  2:00 to mean RETREAT
Each lieutenant does:
Phase 1: run the Approximate Agreement protocol
  if the time is before 1:10, then decide ATTACK
  if the time is after 1:50, then decide RETREAT
Phase 2: if you have not come to a decision, ask the other lieutenant "Have you decided?"
  if so, do the same thing
  if not, retreat
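The two phases of the transformation can be sketched as plain functions. Times are represented as minutes after midnight (so 1:00 is 60 and 2:00 is 120); that encoding and the names `phase1` and `phase2` are assumptions of this sketch, and the Approximate Agreement protocol itself is treated as a black box that has already produced a time.

```python
ATTACK, RETREAT = "ATTACK", "RETREAT"

def phase1(agreed_minutes):
    """Map the time produced by Approximate Agreement to a decision, if any."""
    if agreed_minutes < 70:     # before 1:10
        return ATTACK
    if agreed_minutes > 110:    # after 1:50
        return RETREAT
    return None                 # undecided after Phase 1

def phase2(own_decision, other_decision):
    """Resolve an undecided lieutenant by asking the other lieutenant."""
    if own_decision is not None:
        return own_decision
    # Copy the other lieutenant's Phase 1 decision, or retreat if it has none.
    return other_decision if other_decision is not None else RETREAT

# Two loyal lieutenants whose times differ by at most 10 minutes (AA1).
d1, d2 = phase1(65), phase1(75)
final1, final2 = phase2(d1, d2), phase2(d2, d1)
```

Because AA1 keeps the two times within 10 minutes of each other, one lieutenant cannot land before 1:10 while the other lands after 1:50, so Phase 1 never yields contradictory decisions.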

APPROXIMATE AGREEMENT (4)
Claim: If the Approximate Agreement protocol works, so does the Byzantine Agreement protocol.
If G_C is loyal, then AA2 ensures that each loyal lieutenant gets the right attack time, so BA2 is satisfied.
If G_C is a traitor, both lieutenants are loyal:
  AA1 ensures they cannot make contradictory decisions in Phase 1.
  If one decides in Phase 1, the other agrees in Phase 2.
  If neither decides in Phase 1, they both retreat in Phase 2.
Thus Approximate Agreement is impossible for 3 generals with 1 traitor.
The simulation method extends this to 3t generals and t traitors.

ORAL MESSAGES (1)
BYZANTINE AGREEMENT ALGORITHMS
To reach agreement, processes have to exchange their values and relay the received values to other processes several times. The ability of a faulty process to distort what it receives from the others depends greatly on the type of messages.
Two types of messages:
oral messages (non-authenticated): a faulty process can forge a message and claim to have received it from another process, or change the contents of a received message before it relays this message to other processes. There is no way for a process to verify the authenticity of a received message.
signed messages (authenticated): a faulty process cannot forge a message or change the contents of a received one before relaying it to other processes. Each process can verify the authenticity of a received message. Faulty processes do less damage.

ORAL MESSAGES (2)
This is a simple solution using communication by oral messages to cope with t traitors, where N ≥ 3t + 1.
Assumptions:
A1: Every message sent is delivered correctly.
A2: The receiver of a message knows who sent it.
A3: The absence of a message can be detected.
A1 and A2 prevent a traitor from interfering with communication between others (A2 = no spurious messages).
A3 means that a traitor can't simply remain quiet to prevent progress.

ORAL MESSAGES (3)
Details:
More requirements:
  Generals can communicate directly with each other (however, this can be relaxed).
  If a message is missing, assume it says 0 (default order RETREAT).
majority function:
  if a majority value among the v_i does not exist, then majority(v_1, ..., v_n) = 0
  if the majority of the v_i equals v, then majority(v_1, ..., v_n) = v
  n is the number of processes currently reaching agreement
  alternatively, you can use an arbitrary member of the {v_i} set (for ordered sets, the median is a good choice)
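A minimal sketch of the majority function described above, assuming "majority" means a value held by strictly more than half of the inputs and using 0 (RETREAT) as the fallback; the `default` parameter is an addition of this sketch.

```python
from collections import Counter

def majority(values, default=0):
    """Return the value held by more than half of `values`, else `default`."""
    if not values:
        return default
    value, count = Counter(values).most_common(1)[0]
    return value if count > len(values) / 2 else default

results = [majority([0, 1, 0]), majority([1, 1, 0]), majority([1, 0])]
```

The last call has no strict majority, so it falls back to the default order 0.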

ORAL MESSAGES (4)
Algorithm Lamport-Shostak-Pease [1]
OM(0):
1. G_C sends his value to every G_i
2. Each lieutenant uses that value (or 0 if none)
OM(m), m > 0:
1. G_C sends his value to every G_i
2. Let v_i be the value received by G_i. G_i now:
   acts as commander for OM(m-1) with the n-2 other lieutenants
   acts as participant in n-2 instances of OM(m-1), one with each G_j, j ≠ i, as commander, deciding v_j (or RETREAT)
3. decide majority(v_1, ..., v_{n-1})
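The recursion can be made concrete with a small single-process simulation. This is a sketch under stated assumptions, not the authors' code: traitors misbehave only when they act as a sender, their behaviour is supplied by a hypothetical callback `lie(sender, receiver, value)`, and messages are never omitted, so the "0 if none" rule is not exercised.

```python
from collections import Counter

def majority(values, default=0):
    """Value held by more than half of `values`, else the default order 0."""
    value, count = Counter(values).most_common(1)[0]
    return value if count > len(values) / 2 else default

def om(m, commander, lieutenants, value, traitors, lie):
    """Simulate OM(m); return {lieutenant: value it decides for this commander}."""
    # Step 1: the commander sends its value; a traitor may send anything.
    received = {
        g: lie(commander, g, value) if commander in traitors else value
        for g in lieutenants
    }
    if m == 0:
        return received  # OM(0): each lieutenant uses the value it received

    # Step 2: every lieutenant j relays its value as commander of OM(m-1).
    # relayed[j][i] is the value lieutenant i ends up attributing to j.
    relayed = {}
    for j in lieutenants:
        others = [g for g in lieutenants if g != j]
        relayed[j] = om(m - 1, j, others, received[j], traitors, lie)

    # Step 3: each lieutenant takes the majority of its own value and the
    # values attributed to the other lieutenants.
    return {
        i: majority([received[i]] + [relayed[j][i] for j in lieutenants if j != i])
        for i in lieutenants
    }

def flip(sender, receiver, value):
    """Hypothetical traitor behaviour: always send the opposite bit."""
    return 1 - value

# N = 4, t = 1: loyal commander orders 1, lieutenant G3 is the traitor.
decisions = om(1, "GC", ["G1", "G2", "G3"], 1, {"G3"}, flip)
```

In this run the loyal lieutenants G1 and G2 each see the votes (1, 1, 0) and decide 1, so BA1 and BA2 both hold. The entry computed for G3 is not meaningful, since a traitor may act arbitrarily.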

ORAL MESSAGES (5)
Description:
  the algorithm for N processes starts with OM(t)
  execution of OM(t) invokes N-1 separate executions of OM(t-1), each of which invokes N-2 executions of OM(t-2), and so on
  processes are successively divided into smaller and smaller groups of n members (initially n = N), and Byzantine Agreement is recursively achieved within each group in step 2 of OM(m)
  in total, there are (N-1)(N-2)...(N-k) separate executions of OM(t-k), for k = 1, ..., t
  the message complexity is O(N^t)
  the algorithm requires t+1 rounds of message exchanges
  t+1 is the lower bound on the number of rounds needed to reach Byzantine Agreement in a fully connected network with process failures only, using oral messages
  however, using signed messages, this bound is relaxed
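The growth of the recursion can be tallied directly from the description above. The sketch below assumes that every execution with n participants costs n-1 messages (the commander's sends in step 1); the helper names `executions` and `total_messages` are illustrative.

```python
from math import prod

def executions(n_generals, k):
    """Number of OM(t-k) executions spawned by one OM(t) run: (N-1)...(N-k)."""
    return prod(n_generals - i for i in range(1, k + 1))

def total_messages(n_generals, t):
    """Messages sent over all recursion levels k = 0..t.

    An execution at level k involves N-k generals, so its commander
    sends N-k-1 messages.
    """
    return sum(
        executions(n_generals, k) * (n_generals - k - 1) for k in range(t + 1)
    )

small = total_messages(4, 1)   # N = 4, t = 1
larger = total_messages(7, 2)  # N = 7, t = 2
```

Going from one tolerated traitor to two raises the count from 9 to 156 messages under these assumptions, which illustrates how quickly the cost of oral messages grows with t.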

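ORAL MESSAGES (6)
Example 1
[Figure: commander G_C and lieutenants G_1, G_2, G_3. Step 1: G_C starts OM(1) and sends 0 to the lieutenants. Step 2: the lieutenants relay the received values in OM(0); one relayed value is 1 instead of 0. Step 3: the lieutenants decide, e.g. majority(0,1,0) = 0 and majority(0,0,0) = 0.]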

ORAL MESSAGES (7)
Example 2
[Figure: commander G_C sends 1, 0 and no message (null) to lieutenants G_1, G_2, G_3. After the lieutenants relay their values, each of them computes majority(1,0,0) = 0.]

ORAL MESSAGES (8)
Correctness proof
Lemma: For any m and t, OM(m) satisfies BA2 if there are more than 2t + m generals and at most t traitors.
Recall: BA2: If the commander is loyal, then every loyal lieutenant obeys the order he sends.
Lemma proof:
  assume the commander is loyal
  OM(0) trivially satisfies BA2 (by assumption A1)
  proceed by induction: assume the lemma for OM(m-1), prove it for OM(m), m > 0

ORAL MESSAGES (9)
Induction step:
  the loyal commander sends v to n-1 lieutenants
  each loyal lieutenant calls OM(m-1) with the n-2 other lieutenants
  by hypothesis, n > 2t + m, so (n-1) > 2t + (m-1)
  apply the induction hypothesis: each loyal lieutenant gets v_j = v from each loyal G_j
  there are at most t traitors and (n-1) > 2t + (m-1) ≥ 2t, so a majority of the n-1 lieutenants are loyal
  for each loyal lieutenant, majority(v_1, ..., v_{n-1}) = v
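The counting argument in the induction step can be checked by brute force over small parameters. The sketch assumes the worst case in which all t traitors are lieutenants; `loyal_majority` is a hypothetical helper name.

```python
def loyal_majority(n, t):
    """True if loyal lieutenants form a strict majority of the n-1 lieutenants.

    Worst case assumed: the commander is loyal and all t traitors are lieutenants.
    """
    lieutenants = n - 1
    loyal = lieutenants - t
    return 2 * loyal > lieutenants

# Check every n with n > 2t + m for small t and m >= 1.
holds = all(
    loyal_majority(n, t)
    for t in range(0, 6)
    for m in range(1, 6)
    for n in range(2 * t + m + 1, 2 * t + m + 20)
)

# With only n = 2t + 1 generals the majority is lost (here t = 1, n = 3).
counterexample = loyal_majority(3, 1)
```

The check succeeds for every tested n above the lemma's threshold and fails for three generals with one traitor, matching the impossibility result for that case.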

ORAL MESSAGES (10)
Theorem: For any t, OM(t) satisfies BA1 and BA2 if there are more than 3t generals and at most t traitors.
Theorem proof (by induction on t):
  if there are no traitors: easy
  assume the claim for OM(t-1)
  assume the commander is loyal:
    take m = t in the previous lemma
    OM(t) satisfies BA2
    BA2 implies BA1 if the commander is loyal
  assume the commander is a traitor...

ORAL MESSAGES (11)
if the commander is a traitor:
  there are at most t traitors, including the commander
  so there are at most t-1 traitors among the lieutenants
  there are more than 3t - 1 lieutenants, and 3t - 1 > 3(t-1)
  apply the induction hypothesis: OM(t-1) satisfies BA1 and BA2
  for all G_j, any 2 loyal lieutenants get the same v_j
  any 2 loyal lieutenants get the same vector v_1, ..., v_{n-1}
  any 2 loyal lieutenants get the same majority(v_1, ..., v_{n-1})

BIBLIOGRAPHY
[1] L. Lamport, R. Shostak, M. Pease, "The Byzantine Generals Problem", ACM Transactions on Programming Languages and Systems.
[2] Bharat Bhargava, ed., "Concurrency Control and Reliability in Distributed Systems", Van Nostrand Reinhold Co., New York, 1987, ch. 12: The Byzantine Generals (D. Dolev, L. Lamport, M. Pease, R. Shostak).
[3] R. Chow, T. Johnson, "Distributed Operating Systems & Algorithms", Addison Wesley Longman, 1997, ch. 11.


More information

A Correctness Proof for a Practical Byzantine-Fault-Tolerant Replication Algorithm

A Correctness Proof for a Practical Byzantine-Fault-Tolerant Replication Algorithm Appears as Technical Memo MIT/LCS/TM-590, MIT Laboratory for Computer Science, June 1999 A Correctness Proof for a Practical Byzantine-Fault-Tolerant Replication Algorithm Miguel Castro and Barbara Liskov

More information

Outline More Security Protocols CS 239 Computer Security February 4, 2004

Outline More Security Protocols CS 239 Computer Security February 4, 2004 Outline More Security Protocols CS 239 Computer Security February 4, 2004 Combining key distribution and authentication Verifying security protocols Page 1 Page 2 Combined Key Distribution and Authentication

More information

Parsimonious Asynchronous Byzantine-Fault-Tolerant Atomic Broadcast

Parsimonious Asynchronous Byzantine-Fault-Tolerant Atomic Broadcast Parsimonious Asynchronous Byzantine-Fault-Tolerant Atomic Broadcast HariGovind V. Ramasamy Christian Cachin August 19, 2005 Abstract Atomic broadcast is a communication primitive that allows a group of

More information

The Ripple Protocol Consensus Algorithm

The Ripple Protocol Consensus Algorithm Ripple Labs Inc, 2014 The Ripple Protocol Consensus Algorithm David Schwartz david@ripple.com Noah Youngs nyoungs@nyu.edu Arthur Britto arthur@ripple.com Abstract While several consensus algorithms exist

More information

G1 m G2 Attack at dawn? e e e e 1 S 1 = {0} End of round 1 End of round 2 2 S 2 = {1} {1} {0,1} decide -1 3 S 3 = {1} { 0,1} {0,1} decide -1 white hats are loyal or good guys black hats are traitor

More information

Dependability and real-time. TDDD07 Real-time Systems Lecture 7: Dependability & Fault tolerance. Early computer systems. Dependable systems

Dependability and real-time. TDDD07 Real-time Systems Lecture 7: Dependability & Fault tolerance. Early computer systems. Dependable systems TDDD7 Real-time Systems Lecture 7: Dependability & Fault tolerance Simin i Nadjm-Tehrani Real-time Systems Laboratory Department of Computer and Information Science Linköping university Dependability and

More information

Distributed Algorithms Reliable Broadcast

Distributed Algorithms Reliable Broadcast Distributed Algorithms Reliable Broadcast Alberto Montresor University of Trento, Italy 2016/04/26 This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. Contents

More information

Complexity of Multi-Value Byzantine Agreement

Complexity of Multi-Value Byzantine Agreement Complexity of Multi-Value Byzantine Agreement Guanfeng Liang and Nitin Vaidya Department of Electrical and Computer Engineering, and Coordinated Science Laboratory University of Illinois at Urbana-Champaign

More information

Fault Tolerance. Distributed Systems. September 2002

Fault Tolerance. Distributed Systems. September 2002 Fault Tolerance Distributed Systems September 2002 Basics A component provides services to clients. To provide services, the component may require the services from other components a component may depend

More information

Byzantine Clients Rendered Harmless Barbara Liskov, Rodrigo Rodrigues

Byzantine Clients Rendered Harmless Barbara Liskov, Rodrigo Rodrigues Computer Science and Artificial Intelligence Laboratory Technical Report MIT-CSAIL-TR-2005-047 MIT-LCS-TR-994 July 21, 2005 Byzantine Clients Rendered Harmless Barbara Liskov, Rodrigo Rodrigues massachusetts

More information

Cache-Oblivious Traversals of an Array s Pairs

Cache-Oblivious Traversals of an Array s Pairs Cache-Oblivious Traversals of an Array s Pairs Tobias Johnson May 7, 2007 Abstract Cache-obliviousness is a concept first introduced by Frigo et al. in [1]. We follow their model and develop a cache-oblivious

More information

Building TMR-Based Reliable Servers Despite Bounded Input Lifetimes

Building TMR-Based Reliable Servers Despite Bounded Input Lifetimes Building TMR-Based Reliable Servers Despite Bounded Input Lifetimes Paul Ezhilchelvan University of Newcastle Dept of Computing Science NE1 7RU, UK paul.ezhilchelvan@newcastle.ac.uk Jean-Michel Hélary

More information

Outline. More Security Protocols CS 239 Security for System Software April 22, Needham-Schroeder Key Exchange

Outline. More Security Protocols CS 239 Security for System Software April 22, Needham-Schroeder Key Exchange Outline More Security Protocols CS 239 Security for System Software April 22, 2002 Combining key distribution and authentication Verifying security protocols Page 1 Page 2 Combined Key Distribution and

More information

Ruminations on Domain-Based Reliable Broadcast

Ruminations on Domain-Based Reliable Broadcast Ruminations on Domain-Based Reliable Broadcast Svend Frølund Fernando Pedone Hewlett-Packard Laboratories Palo Alto, CA 94304, USA Abstract A distributed system is no longer confined to a single administrative

More information

Computer Science 236 Fall Nov. 11, 2010

Computer Science 236 Fall Nov. 11, 2010 Computer Science 26 Fall Nov 11, 2010 St George Campus University of Toronto Assignment Due Date: 2nd December, 2010 1 (10 marks) Assume that you are given a file of arbitrary length that contains student

More information

Secure Multi-Party Computation Without Agreement

Secure Multi-Party Computation Without Agreement Secure Multi-Party Computation Without Agreement Shafi Goldwasser Department of Computer Science The Weizmann Institute of Science Rehovot 76100, Israel. shafi@wisdom.weizmann.ac.il Yehuda Lindell IBM

More information

Yale University Department of Computer Science

Yale University Department of Computer Science Yale University Department of Computer Science The Consensus Problem in Unreliable Distributed Systems (A Brief Survey) Michael J. Fischer YALEU/DCS/TR-273 June 1983 Reissued February 2000 To be presented

More information

Distributed Systems (ICE 601) Fault Tolerance

Distributed Systems (ICE 601) Fault Tolerance Distributed Systems (ICE 601) Fault Tolerance Dongman Lee ICU Introduction Failure Model Fault Tolerance Models state machine primary-backup Class Overview Introduction Dependability availability reliability

More information

Today: Fault Tolerance. Fault Tolerance

Today: Fault Tolerance. Fault Tolerance Today: Fault Tolerance Agreement in presence of faults Two army problem Byzantine generals problem Reliable communication Distributed commit Two phase commit Three phase commit Paxos Failure recovery Checkpointing

More information

Byzantine Agreement with a Rational Adversary

Byzantine Agreement with a Rational Adversary Byzantine Agreement with a Rational Adversary Adam Groce, Jonathan Katz, Aishwarya Thiruvengadam, and Vassilis Zikas Department of Computer Science, University of Maryland {agroce,jkatz,aish,vzikas}@cs.umd.edu

More information

Dependability and real-time. TDDD07 Real-time Systems. Where to start? Two lectures. June 16, Lecture 8

Dependability and real-time. TDDD07 Real-time Systems. Where to start? Two lectures. June 16, Lecture 8 TDDD7 Real-time Systems Lecture 7 Dependability & Fault tolerance Simin Nadjm-Tehrani Real-time Systems Laboratory Department of Computer and Information Science Dependability and real-time If a system

More information

Byzantine Consensus. Definition

Byzantine Consensus. Definition Byzantine Consensus Definition Agreement: No two correct processes decide on different values Validity: (a) Weak Unanimity: if all processes start from the same value v and all processes are correct, then

More information

Notes from Reviews. Byzan&ne Generals. Two Generals Problem. Background on Failure. Mo&va&on. Example: Triple Modular Redundancy

Notes from Reviews. Byzan&ne Generals. Two Generals Problem. Background on Failure. Mo&va&on. Example: Triple Modular Redundancy S 739 Distributed Systems One paper: UNIVESITY of WISONSIN-MDISON omputer Sciences Department Byzan&ne Generals Michael Swift Notes (c) ndrea. rpaci-dusseau The Byzantine Generals Problem, by Lamport,

More information

Two-Phase Atomic Commitment Protocol in Asynchronous Distributed Systems with Crash Failure

Two-Phase Atomic Commitment Protocol in Asynchronous Distributed Systems with Crash Failure Two-Phase Atomic Commitment Protocol in Asynchronous Distributed Systems with Crash Failure Yong-Hwan Cho, Sung-Hoon Park and Seon-Hyong Lee School of Electrical and Computer Engineering, Chungbuk National

More information

Erez Petrank. Department of Computer Science. Haifa, Israel. Abstract

Erez Petrank. Department of Computer Science. Haifa, Israel. Abstract The Best of Both Worlds: Guaranteeing Termination in Fast Randomized Byzantine Agreement Protocols Oded Goldreich Erez Petrank Department of Computer Science Technion Haifa, Israel. Abstract All known

More information

Failures, Elections, and Raft

Failures, Elections, and Raft Failures, Elections, and Raft CS 8 XI Copyright 06 Thomas W. Doeppner, Rodrigo Fonseca. All rights reserved. Distributed Banking SFO add interest based on current balance PVD deposit $000 CS 8 XI Copyright

More information

The algorithms we describe here are more robust require that prior to applying the algorithm the and comprehensive than the other distributed particip

The algorithms we describe here are more robust require that prior to applying the algorithm the and comprehensive than the other distributed particip DISTRIBUTED COMMIT WITH BOUNDED WAITING D. Dolev H. R. Strong IBM Research Laboratory San Jose, CA 95193 ABSTRACT Two-Phase Commit and other distributed commit protocols provide a method to commit changes

More information

Reaching Consensus. Lecture The Problem

Reaching Consensus. Lecture The Problem Lecture 4 Reaching Consensus 4.1 The Problem In the previous lecture, we ve seen that consensus can t be solved in asynchronous systems. That means asking how can we solve consensus? is pointless. But

More information

Today: Fault Tolerance. Replica Management

Today: Fault Tolerance. Replica Management Today: Fault Tolerance Failure models Agreement in presence of faults Two army problem Byzantine generals problem Reliable communication Distributed commit Two phase commit Three phase commit Failure recovery

More information

Distributed Consensus Protocols and Algorithms

Distributed Consensus Protocols and Algorithms Chapter 1 Distributed Consensus Protocols and Algorithms Yang Xiao, Ning Zhang, Jin Li, Wenjing Lou, Y. Thomas Hou Edit: This manuscript was built with L A TEX documentclass[11pt]{book}. The titles marked

More information

15 212: Principles of Programming. Some Notes on Induction

15 212: Principles of Programming. Some Notes on Induction 5 22: Principles of Programming Some Notes on Induction Michael Erdmann Spring 20 These notes provide a brief introduction to induction for proving properties of ML programs. We assume that the reader

More information

CprE Fault Tolerance. Dr. Yong Guan. Department of Electrical and Computer Engineering & Information Assurance Center Iowa State University

CprE Fault Tolerance. Dr. Yong Guan. Department of Electrical and Computer Engineering & Information Assurance Center Iowa State University Fault Tolerance Dr. Yong Guan Department of Electrical and Computer Engineering & Information Assurance Center Iowa State University Outline for Today s Talk Basic Concepts Process Resilience Reliable

More information