Distributed Algorithms Introduction

Size: px
Start display at page:

Download "Distributed Algorithms Introduction"

Transcription

1 Distributed Algorithms Introduction Alberto Montresor University of Trento, Italy 2016/04/26 This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

2 Contents 1 Getting Started Two generals Common Knowledge Byzantine generals Summary 2 Themes of the course Impossible vs practical Classical vs extreme distributed systems Syllabus 3 Distributed Systems Informal definition Challenges 4 Conclusions

3 Getting Started Two generals Two generals A thought experiment: Alberto Montresor (UniTN) DS - Introduction 2016/04/26 1 / 35

4 Getting Started Two generals A potential solution General A: attack at dawn! General B: ack, attack at dawn! General A: ack ack, attack at dawn! General B: ack ack ack, attack at dawn!... Theorem Under this scenario, there is no solution for the Two Generals Problem Proof. By contradiction on the number of messages exchanged. Alberto Montresor (UniTN) DS - Introduction 2016/04/26 2 / 35

5 DS - Introduction Getting Started Two generals A potential solution A potential solution General A: attack at dawn! General B: ack, attack at dawn! General A: ack ack, attack at dawn! General B: ack ack ack, attack at dawn!... Theorem Under this scenario, there is no solution for the Two Generals Problem Proof. By contradiction on the number of messages exchanged. Let s assume (by contradiction) that there is at least one solution to this problem under this scenario. If there is one, there could be many. If there are many, we can find one which uses the minimum number of messages. Take the last message of this protocol: it can be received or it can be lost The protocol should work in both cases. So we could avoid sending it at all! The resulting protocol uses less messages than the minimum, which is a contradiction

6 Getting Started Two generals Reality Check Atomic Commit An atomic commit is an operation in which a set of distinct changes is applied as a single operation. Example: ATM s withdrawal You whitdraw 100 euro from an ATM in Trento Your balance should be decreased by 100 euro Alberto Montresor (UniTN) DS - Introduction 2016/04/26 3 / 35

7 Getting Started Two generals A pragmatic solution Probabilistic protocol Assuming messengers are caught independently of each other with probability p General A: Send n messengers Attacks no matter what General B: Attacks if receives at least one messenger p n is the probability that the attack will be uncoordinated Trade-off: We can decrease the probability of failure by increasing n... But at the additional cost of sending more messengers! Without be ever certain that the attack will be coordinated! Alberto Montresor (UniTN) DS - Introduction 2016/04/26 4 / 35

8 Getting Started Common Knowledge Muddy children n children go playing Children are truthful, perceptive, intelligent Mom says: Don t get muddy! A bunch (say, k) get mud on their forehead Daddy comes, looks around, and says Some of you got a muddy forehead Daddy repeatedly ask: Do you know whether you have a muddy forehead? What happens? Slides from Lorenzo Alvisi Alberto Montresor (UniTN) DS - Introduction 2016/04/26 5 / 35

9 Getting Started Common Knowledge Muddy children Theorem The first k 1 times Daddy asks, they children says No The k-th time, the k children say Yes. Proof. By induction on k Alberto Montresor (UniTN) DS - Introduction 2016/04/26 6 / 35

10 DS - Introduction Getting Started Common Knowledge Muddy children Muddy children Theorem The first k 1 times Daddy asks, they children says No The k-th time, the k children say Yes. Proof. By induction on k Let k = 1. The first time daddy asks, the child with mud on his forehead say yes. Because all the other have no mud, and someone has mud on his forehead, it must be him. Let k > 1 Every child with mud see k 1 children with mud on the forehead. If there were k 1 children with mud, they would have said yes at the (k 1)-th time daddy asks, but they didn t. So there are actually k children with mud, and they all say yes at the k-th time daddy asks.

11 Getting Started Common Knowledge Muddy children Variation 1 Suppose k > 1 Every one knows that someone has a dirty forehead before Dad announces it Does Dad still need to speak up? Let p = Someone s forehead is dirty Every one knows p But, unless the father speak, if k = 2 not every one knows that everyone knows p! Suppose A and B are dirty. Before the father speaks A does not know whether B knows p If k = 3, not every one knows that every one knows that every one knows p... Alberto Montresor (UniTN) DS - Introduction 2016/04/26 7 / 35

12 Getting Started Common Knowledge Muddy children Variation 2... the father took every child aside and told them individually (without others noticing) that someone s forehead is muddy? Variation 3... every child had (unknown to the other children) put a miniature microphone on every other child so they can hear what the father says in private to them? Alberto Montresor (UniTN) DS - Introduction 2016/04/26 8 / 35

13 Getting Started Common Knowledge Two generals, reloaded There is an entire logic that formalizes what knowledge participants acquire while running a protocol J. Halpern and Y. Moses. Knowledge and Common Knowledge in a Distributed Environment. E.W. Dijkstra Prize Solving the Two Generals Problem requires common knowledge everyone knows that everyone knows that everyone knows... But: Common knowledge cannot be achieved by communicating through unreliable channels Alberto Montresor (UniTN) DS - Introduction 2016/04/26 9 / 35

14 Getting Started Common Knowledge A common knowledge puzzle Albert and Bernard just become friends with Cheryl, and they want to know when her birthday is. Cheryl gives them a list of 10 dates: May 15,16,19 June 17,18 July 14,16 August 14,15,17 Cheryl then tells Albert and Bernard separately the month and the day of her birthday respectively. Albert: I don t know when Cheryl s birthday is, but I know that Bernard does not know too. Bernard: At first I didn t know when Cheryl s birthday is, but I know now. Albert: Then I also know when Cheryl s birthday is Alberto Montresor (UniTN) DS - Introduction 2016/04/26 10 / 35

15 Getting Started Byzantine generals Byzantine generals Scenario n Byzantine generals encircling a city They must decide whether to attack or retreat! Messengers are reliable and synchronous Generals may be traitors Nobody knows which generals are traitors Problem specification The generals require an algorithm to reach an agreement such that (i) all loyal generals decide on the same plan of action and (ii) a small number of traitorous generals cannot cause the loyal generals to adopt different plans. Alberto Montresor (UniTN) DS - Introduction 2016/04/26 11 / 35

16 Getting Started Byzantine generals A potential solution Wait for a majority of generals to agree WRONG! Possible scenario: 3 generals 1 vote attack, 1 vote retreat 1 traitorous general: sends a vote attack to the attack general sends a vote retreat to the retreat general Alberto Montresor (UniTN) DS - Introduction 2016/04/26 12 / 35

17 Getting Started Byzantine generals The problem is solvable Byzantine Fault Tolerance (1982) L. Lamport, R. Shostak, M. Pease, The Byzantine Generals Problem, ACM Trans. on Programming Languages and Systems, 4(3): , A protocol that given n processes,can tolerate up to t traitorous generals with n 3t + 1 Example: 4 generals can tolerate up to 1 byzantine general Practical Byzantine Fault Tolerance (2002) M. Castro and B. Liskov, Practical Byzantine Fault Tolerance and Proactive Recovery, ACM Trans. on Computer Systems, 20(4): , PBFT triggered a renaissance in BFT replication research Still going on... Alberto Montresor (UniTN) DS - Introduction 2016/04/26 13 / 35

18 Getting Started Byzantine generals Reality Check BFT sponsors in 1982 NASA The Ballistic Missile Defense System Command Army Research Office Nancy Lynch s book on Distributed Systems: The agreement problem is a simplified version of a problem that originally arose in the development of on-board aircraft control systems. BitCoin, a peer-to-peer digital currency system, is based on BFT. The 8-hour downtime of Amazon S3 in July 2008 is a well-known example of what happens when you don t use BFT. Alberto Montresor (UniTN) DS - Introduction 2016/04/26 14 / 35

19 Getting Started Summary Take-home lessons We need to properly model our distributed systems Reliable / unreliable communication Benign / malicious processes Solutions depend on the underlying model Approximate or probabilistic solutions Bounded solution No solution at all! Coordinating multiple processes is difficult Unexpected events: failures, malicious behavior Lack of common knowledge Alberto Montresor (UniTN) DS - Introduction 2016/04/26 15 / 35

20 Themes of the course Impossible vs practical Theory vs practice Yogi Berra says: In theory, theory and practice are the same. In practice, they are not. Yogi Berra Always go to other people s funerals, otherwise they won t go to yours I really didn t say everything I said Nobody goes there anymore; it s too crowded Alberto Montresor (UniTN) DS - Introduction 2016/04/26 16 / 35

21 Themes of the course Impossible vs practical First theme of the course Impossible vs practical Several papers about impossibility results: M. Fischer, N. Lynch, M. Paterson. Impossibility of Distributed Consensus with One Faulty Process. Journal of ACM, 32(2): , S. Gilbert, N. Lynch. Brewer s Conjecture and the Feasibility of Consistent, Available, Partition-Tolerant Web Services. ACM SIGACT News, 33(2):51-59, Yet, many of these problems have practical solutions: T. Chandra and S. Toueg. Unreliable failure detectors for reliable distributed systems. Journal of the ACM, 43(2): , L. Lamport. Paxos made simple. ACM SIGACT News, 32(4):18 25, A general tension in Computer Science: M. Vardi. Solving the Unsolvable. Comm. of the ACM, 54(7):5, Alberto Montresor (UniTN) DS - Introduction 2016/04/26 17 / 35

22 Themes of the course Classical vs extreme distributed systems Beyond technology The Spanish Flu, Pandemic: killed 20M people in a relatively short time, more than World War I Virus goal: spread itself as quickly as possible Unreliable environment: viruses may be killed transmission may fail Transmission network is a complex graph Alberto Montresor (UniTN) DS - Introduction 2016/04/26 18 / 35

23 Themes of the course Classical vs extreme distributed systems Beyond technology Flocks of birds Flying in a flock is good: probability of being killed by a predator is reduced Flying in a flock is bad: probability of finding (enough) food is reduced Birds self-organize themselves in a flock No central authority Alberto Montresor (UniTN) DS - Introduction 2016/04/26 19 / 35

24 Themes of the course Classical vs extreme distributed systems Second theme of the course Classical vs extreme distributed systems Classical distributed system problems include agreement, total order broadcast, atomic commit, replication, etc. Extreme distributed system problems include self-* properties, scalability, full decentralization, etc. Special issue on Springer Computing (Sept. 2012) Alberto Montresor, Gusz Eiben, Maarten van Steen, editors Modern distributed systems may nowadays consist of hundreds of thousands of computers, ranging from high-end powerful machines to low-end resource-constrained wireless devices. We label them as extreme distributed systems, as they push scalability and complexity well beyond traditional scenarios. Alberto Montresor (UniTN) DS - Introduction 2016/04/26 20 / 35

25 Themes of the course Syllabus Topics Introduction Models Time, clocks, events Reliable broadcast Epidemic protocols Impossibility of consensus Consensus and failure detectors Complex networks Replication P2P Epidemics: Beyond dissemination Atomic Commitment Rollback and Recovery Group Communication Paxos Practical Byzantine Fault Tolerance Blockchains Byzantine, altruistic, rational model Distributed frameworks Alberto Montresor (UniTN) DS - Introduction 2016/04/26 21 / 35

26 Distributed Systems Informal definition What is a distributed system? Definition (pragmatic) A collection of independent, autonomous hosts connected through a communication network. Host communicate via message passing to achieve some form of cooperation. Definition (by negation) A parallel system where there are no: shared, global clock shared memory accurate failure detection Alberto Montresor (UniTN) DS - Introduction 2016/04/26 22 / 35

27 Distributed Systems Informal definition What is a distributed system? Definition (optimistic) A collection of independent computers that appears to its users as a single coherent system [Andrew Tanenbaum] Definition (pessimistic) A distributed system is one in which the failure of a computer you did not even know existed can render your own computer unusable [Leslie Lamport] Alberto Montresor (UniTN) DS - Introduction 2016/04/26 23 / 35

28 Distributed Systems Informal definition Motivation Inherent distribution Applications which require sharing of resources or dissemination of information among geographically distant entities are natural distributed systems Examples: Bank with several branches Distributed file systems, shared devices, etc. Alberto Montresor (UniTN) DS - Introduction 2016/04/26 24 / 35

29 Distributed Systems Informal definition Motivation Distribution as an artifact Distribution may be an artifact of an engineering solution to satisfy some specific requirements such as: fault-tolerance increased performance / cost ratio minimum level of Quality of Service (QoS) Examples: Replicated servers Alberto Montresor (UniTN) DS - Introduction 2016/04/26 25 / 35

30 Distributed Systems Informal definition Independent failures From distributed systems to failures Dependencies between hosts augment the effect of failures Distributed systems make the life of developers more difficult Availability example: n dependent hosts, probability of failure p Availability: (1 p) n From failures to distributed systems Providing a replicated service via multiple independent processes enables fault tolerance Distributed systems should simplify the life of users n independent hosts, probability of failure p Availability: 1 p n Alberto Montresor (UniTN) DS - Introduction 2016/04/26 26 / 35

31 Distributed Systems Informal definition Parallel systems Definition Multiple processors that: share memory Communication based on shared memory and synchronization mechanisms share time Access to a common clock share fate No independent failures Alberto Montresor (UniTN) DS - Introduction 2016/04/26 27 / 35

32 Distributed Systems Informal definition The true spirit of distributed systems Parallel systems exploit determinism to achieve efficiency Distributed systems address uncertainty created by: multiplicity of control flows absence of shared memory and global time presence of failures Alberto Montresor (UniTN) DS - Introduction 2016/04/26 28 / 35

33 Distributed Systems Challenges Challenges blah, blah Heterogeneity Openness Security Failure handling Transparency Alberto Montresor (UniTN) DS - Introduction 2016/04/26 29 / 35

34 Distributed Systems Challenges Heterogeneity Levels: Network Computing hardware Operating systems Programming languages Multiple implementations Middleware Software layer that abstracts from the above providing a uniform computational model Alberto Montresor (UniTN) DS - Introduction 2016/04/26 30 / 35

35 Distributed Systems Challenges Openness The degree to which a distributed system can be extended and re-implemented Public interfaces Standardization Examples: Corba Java2EE.Net Web Services Alberto Montresor (UniTN) DS - Introduction 2016/04/26 31 / 35

36 Distributed Systems Challenges Security aspects Confidentiality: avoiding the disclosure of the content of a message to a party distinct from the intended receiver Integrity: avoiding the corruption of the transmitted contents by a third party Availability: the capability of providing a service in spite of malicious behavior Alberto Montresor (UniTN) DS - Introduction 2016/04/26 32 / 35

37 Distributed Systems Challenges Failure handling Failure detection (e.g., message checksum) Failure masking (e.g., retransmission) Failure tolerance (e.g., replicated servers) Failure recovery (e.g., log files) Distributed systems use a mix of these techniques Alberto Montresor (UniTN) DS - Introduction 2016/04/26 33 / 35

38 Distributed Systems Challenges Transparency Access transparency: Hides differences in data representation and invocation mechanisms Location transparency: enables resources to be accessed without knowledge of their location Concurrency transparency: enables several processes to operate concurrently using shared resources without interference Replication transparency: enables multiple resource instances to be used to increase reliability and performance without knowledge of the replicas by users Failure transparency: enables the concealment of faults, allowing users and application programs to complete their tasks despite the failure of hardware or software components. Migration transparency: allows the movement of resources and clients within a system without affecting operations Alberto Montresor (UniTN) DS - Introduction 2016/04/26 34 / 35

39 Conclusions What s next in the course? Distributed system modeling Which kind of failures exist? Which kind of failures we tolerate? Problem specification Formal description of the problem Algorithms, algorithms, algorithms Pseudo-code descriptions of algorithms Proofs Just writing the code is not enough! Sometimes, impossibility proofs Reality checks Learn about real systems where these protocols are applied Alberto Montresor (UniTN) DS - Introduction 2016/04/26 35 / 35

40 Reading Material A. Panconesi. The coordinated attack and the jealous amazons. A. Panconesi. Coordination and the fall of Eastern Roman Empire.

41 Reality Check: Interesting links The S3 incident Solving the unsolvable The rise and fall of Corba Notes on Distributed Systems for Young Bloods

Distributed Systems 2 Introduction

Distributed Systems 2 Introduction Distributed Systems 2 Introduction Alberto Montresor University of Trento, Italy 2018/09/13 This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. Contents 1 Getting

More information

Where I was born. Where I studied. Distributed Computing Honors. Distributed Computing. Lorenzo Alvisi. Lorenzo Alvisi Surbhi Goel Benny Renard

Where I was born. Where I studied. Distributed Computing Honors. Distributed Computing. Lorenzo Alvisi. Lorenzo Alvisi Surbhi Goel Benny Renard Distributed Computing Distributed Computing Honors Lorenzo Alvisi Lorenzo Alvisi Surbhi Goel Benny Renard Where I was born Bologna, Italy Surbhi Goel Benny Renard Where I studied Where I studied What is

More information

CSE 5306 Distributed Systems. Fault Tolerance

CSE 5306 Distributed Systems. Fault Tolerance CSE 5306 Distributed Systems Fault Tolerance 1 Failure in Distributed Systems Partial failure happens when one component of a distributed system fails often leaves other components unaffected A failure

More information

CSE 5306 Distributed Systems

CSE 5306 Distributed Systems CSE 5306 Distributed Systems Fault Tolerance Jia Rao http://ranger.uta.edu/~jrao/ 1 Failure in Distributed Systems Partial failure Happens when one component of a distributed system fails Often leaves

More information

Byzantine Fault Tolerance

Byzantine Fault Tolerance Byzantine Fault Tolerance CS 240: Computing Systems and Concurrency Lecture 11 Marco Canini Credits: Michael Freedman and Kyle Jamieson developed much of the original material. So far: Fail-stop failures

More information

Today: Fault Tolerance

Today: Fault Tolerance Today: Fault Tolerance Agreement in presence of faults Two army problem Byzantine generals problem Reliable communication Distributed commit Two phase commit Three phase commit Paxos Failure recovery Checkpointing

More information

Chapter 8 Fault Tolerance

Chapter 8 Fault Tolerance DISTRIBUTED SYSTEMS Principles and Paradigms Second Edition ANDREW S. TANENBAUM MAARTEN VAN STEEN Chapter 8 Fault Tolerance 1 Fault Tolerance Basic Concepts Being fault tolerant is strongly related to

More information

Today: Fault Tolerance. Fault Tolerance

Today: Fault Tolerance. Fault Tolerance Today: Fault Tolerance Agreement in presence of faults Two army problem Byzantine generals problem Reliable communication Distributed commit Two phase commit Three phase commit Paxos Failure recovery Checkpointing

More information

Distributed Algorithms Reliable Broadcast

Distributed Algorithms Reliable Broadcast Distributed Algorithms Reliable Broadcast Alberto Montresor University of Trento, Italy 2016/04/26 This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. Contents

More information

BYZANTINE CONSENSUS THROUGH BITCOIN S PROOF- OF-WORK

BYZANTINE CONSENSUS THROUGH BITCOIN S PROOF- OF-WORK Informatiemanagement: BYZANTINE CONSENSUS THROUGH BITCOIN S PROOF- OF-WORK The aim of this paper is to elucidate how Byzantine consensus is achieved through Bitcoin s novel proof-of-work system without

More information

A definition. Byzantine Generals Problem. Synchronous, Byzantine world

A definition. Byzantine Generals Problem. Synchronous, Byzantine world The Byzantine Generals Problem Leslie Lamport, Robert Shostak, and Marshall Pease ACM TOPLAS 1982 Practical Byzantine Fault Tolerance Miguel Castro and Barbara Liskov OSDI 1999 A definition Byzantine (www.m-w.com):

More information

Practical Byzantine Fault Tolerance. Miguel Castro and Barbara Liskov

Practical Byzantine Fault Tolerance. Miguel Castro and Barbara Liskov Practical Byzantine Fault Tolerance Miguel Castro and Barbara Liskov Outline 1. Introduction to Byzantine Fault Tolerance Problem 2. PBFT Algorithm a. Models and overview b. Three-phase protocol c. View-change

More information

Security (and finale) Dan Ports, CSEP 552

Security (and finale) Dan Ports, CSEP 552 Security (and finale) Dan Ports, CSEP 552 Today Security: what if parts of your distributed system are malicious? BFT: state machine replication Bitcoin: peer-to-peer currency Course wrap-up Security Too

More information

Distributed Systems Consensus

Distributed Systems Consensus Distributed Systems Consensus Amir H. Payberah amir@sics.se Amirkabir University of Technology (Tehran Polytechnic) Amir H. Payberah (Tehran Polytechnic) Consensus 1393/6/31 1 / 56 What is the Problem?

More information

Alternative Consensus

Alternative Consensus 1 Alternative Consensus DEEP DIVE Alexandra Tran, Dev Ojha, Jeremiah Andrews, Steven Elleman, Ashvin Nihalani 2 TODAY S AGENDA GETTING STARTED 1 INTRO TO CONSENSUS AND BFT 2 NAKAMOTO CONSENSUS 3 BFT ALGORITHMS

More information

BYZANTINE GENERALS BYZANTINE GENERALS (1) A fable: Michał Szychowiak, 2002 Dependability of Distributed Systems (Byzantine agreement)

BYZANTINE GENERALS BYZANTINE GENERALS (1) A fable: Michał Szychowiak, 2002 Dependability of Distributed Systems (Byzantine agreement) BYZANTINE GENERALS (1) BYZANTINE GENERALS A fable: BYZANTINE GENERALS (2) Byzantine Generals Problem: Condition 1: All loyal generals decide upon the same plan of action. Condition 2: A small number of

More information

Introduction to Distributed Systems Seif Haridi

Introduction to Distributed Systems Seif Haridi Introduction to Distributed Systems Seif Haridi haridi@kth.se What is a distributed system? A set of nodes, connected by a network, which appear to its users as a single coherent system p1 p2. pn send

More information

02 - Distributed Systems

02 - Distributed Systems 02 - Distributed Systems Definition Coulouris 1 (Dis)advantages Coulouris 2 Challenges Saltzer_84.pdf Models Physical Architectural Fundamental 2/58 Definition Distributed Systems Distributed System is

More information

Consensus Problem. Pradipta De

Consensus Problem. Pradipta De Consensus Problem Slides are based on the book chapter from Distributed Computing: Principles, Paradigms and Algorithms (Chapter 14) by Kshemkalyani and Singhal Pradipta De pradipta.de@sunykorea.ac.kr

More information

Global atomicity. Such distributed atomicity is called global atomicity A protocol designed to enforce global atomicity is called commit protocol

Global atomicity. Such distributed atomicity is called global atomicity A protocol designed to enforce global atomicity is called commit protocol Global atomicity In distributed systems a set of processes may be taking part in executing a task Their actions may have to be atomic with respect to processes outside of the set example: in a distributed

More information

Byzantine Fault Tolerance

Byzantine Fault Tolerance Byzantine Fault Tolerance CS6450: Distributed Systems Lecture 10 Ryan Stutsman Material taken/derived from Princeton COS-418 materials created by Michael Freedman and Kyle Jamieson at Princeton University.

More information

02 - Distributed Systems

02 - Distributed Systems 02 - Distributed Systems Definition Coulouris 1 (Dis)advantages Coulouris 2 Challenges Saltzer_84.pdf Models Physical Architectural Fundamental 2/60 Definition Distributed Systems Distributed System is

More information

Distributed Systems. Fault Tolerance. Paul Krzyzanowski

Distributed Systems. Fault Tolerance. Paul Krzyzanowski Distributed Systems Fault Tolerance Paul Krzyzanowski Except as otherwise noted, the content of this presentation is licensed under the Creative Commons Attribution 2.5 License. Faults Deviation from expected

More information

System Models for Distributed Systems

System Models for Distributed Systems System Models for Distributed Systems INF5040/9040 Autumn 2015 Lecturer: Amir Taherkordi (ifi/uio) August 31, 2015 Outline 1. Introduction 2. Physical Models 4. Fundamental Models 2 INF5040 1 System Models

More information

Distributed Systems Fault Tolerance

Distributed Systems Fault Tolerance Distributed Systems Fault Tolerance [] Fault Tolerance. Basic concepts - terminology. Process resilience groups and failure masking 3. Reliable communication reliable client-server communication reliable

More information

BYZANTINE AGREEMENT CH / $ IEEE. by H. R. Strong and D. Dolev. IBM Research Laboratory, K55/281 San Jose, CA 95193

BYZANTINE AGREEMENT CH / $ IEEE. by H. R. Strong and D. Dolev. IBM Research Laboratory, K55/281 San Jose, CA 95193 BYZANTINE AGREEMENT by H. R. Strong and D. Dolev IBM Research Laboratory, K55/281 San Jose, CA 95193 ABSTRACT Byzantine Agreement is a paradigm for problems of reliable consistency and synchronization

More information

Today: Fault Tolerance. Failure Masking by Redundancy

Today: Fault Tolerance. Failure Masking by Redundancy Today: Fault Tolerance Agreement in presence of faults Two army problem Byzantine generals problem Reliable communication Distributed commit Two phase commit Three phase commit Failure recovery Checkpointing

More information

Fault Tolerance Part I. CS403/534 Distributed Systems Erkay Savas Sabanci University

Fault Tolerance Part I. CS403/534 Distributed Systems Erkay Savas Sabanci University Fault Tolerance Part I CS403/534 Distributed Systems Erkay Savas Sabanci University 1 Overview Basic concepts Process resilience Reliable client-server communication Reliable group communication Distributed

More information

Recovering from a Crash. Three-Phase Commit

Recovering from a Crash. Three-Phase Commit Recovering from a Crash If INIT : abort locally and inform coordinator If Ready, contact another process Q and examine Q s state Lecture 18, page 23 Three-Phase Commit Two phase commit: problem if coordinator

More information

Distributed Systems 11. Consensus. Paul Krzyzanowski

Distributed Systems 11. Consensus. Paul Krzyzanowski Distributed Systems 11. Consensus Paul Krzyzanowski pxk@cs.rutgers.edu 1 Consensus Goal Allow a group of processes to agree on a result All processes must agree on the same value The value must be one

More information

Consensus and related problems

Consensus and related problems Consensus and related problems Today l Consensus l Google s Chubby l Paxos for Chubby Consensus and failures How to make process agree on a value after one or more have proposed what the value should be?

More information

Fault-Tolerant Distributed Consensus

Fault-Tolerant Distributed Consensus Fault-Tolerant Distributed Consensus Lawrence Kesteloot January 20, 1995 1 Introduction A fault-tolerant system is one that can sustain a reasonable number of process or communication failures, both intermittent

More information

Agreement in Distributed Systems CS 188 Distributed Systems February 19, 2015

Agreement in Distributed Systems CS 188 Distributed Systems February 19, 2015 Agreement in Distributed Systems CS 188 Distributed Systems February 19, 2015 Page 1 Introduction We frequently want to get a set of nodes in a distributed system to agree Commitment protocols and mutual

More information

The CAP theorem. The bad, the good and the ugly. Michael Pfeiffer Advanced Networking Technologies FG Telematik/Rechnernetze TU Ilmenau

The CAP theorem. The bad, the good and the ugly. Michael Pfeiffer Advanced Networking Technologies FG Telematik/Rechnernetze TU Ilmenau The CAP theorem The bad, the good and the ugly Michael Pfeiffer Advanced Networking Technologies FG Telematik/Rechnernetze TU Ilmenau 2017-05-15 1 / 19 1 The bad: The CAP theorem s proof 2 The good: A

More information

FAULT TOLERANCE. Fault Tolerant Systems. Faults Faults (cont d)

FAULT TOLERANCE. Fault Tolerant Systems. Faults Faults (cont d) Distributed Systems Fö 9/10-1 Distributed Systems Fö 9/10-2 FAULT TOLERANCE 1. Fault Tolerant Systems 2. Faults and Fault Models. Redundancy 4. Time Redundancy and Backward Recovery. Hardware Redundancy

More information

Distributed Algorithms Models

Distributed Algorithms Models Distributed Algorithms Models Alberto Montresor University of Trento, Italy 2016/04/26 This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. Contents 1 Taxonomy

More information

Distributed Systems COMP 212. Lecture 19 Othon Michail

Distributed Systems COMP 212. Lecture 19 Othon Michail Distributed Systems COMP 212 Lecture 19 Othon Michail Fault Tolerance 2/31 What is a Distributed System? 3/31 Distributed vs Single-machine Systems A key difference: partial failures One component fails

More information

TWO-PHASE COMMIT ATTRIBUTION 5/11/2018. George Porter May 9 and 11, 2018

TWO-PHASE COMMIT ATTRIBUTION 5/11/2018. George Porter May 9 and 11, 2018 TWO-PHASE COMMIT George Porter May 9 and 11, 2018 ATTRIBUTION These slides are released under an Attribution-NonCommercial-ShareAlike 3.0 Unported (CC BY-NC-SA 3.0) Creative Commons license These slides

More information

Failure models. Byzantine Fault Tolerance. What can go wrong? Paxos is fail-stop tolerant. BFT model. BFT replication 5/25/18

Failure models. Byzantine Fault Tolerance. What can go wrong? Paxos is fail-stop tolerant. BFT model. BFT replication 5/25/18 Failure models Byzantine Fault Tolerance Fail-stop: nodes either execute the protocol correctly or just stop Byzantine failures: nodes can behave in any arbitrary way Send illegal messages, try to trick

More information

A Case Study of Agreement Problems in Distributed Systems : Non-Blocking Atomic Commitment

A Case Study of Agreement Problems in Distributed Systems : Non-Blocking Atomic Commitment A Case Study of Agreement Problems in Distributed Systems : Non-Blocking Atomic Commitment Michel RAYNAL IRISA, Campus de Beaulieu 35042 Rennes Cedex (France) raynal @irisa.fr Abstract This paper considers

More information

Fault Tolerance. Distributed Systems. September 2002

Fault Tolerance. Distributed Systems. September 2002 Fault Tolerance Distributed Systems September 2002 Basics A component provides services to clients. To provide services, the component may require the services from other components a component may depend

More information

DISTRIBUTED SYSTEMS. Second Edition. Andrew S. Tanenbaum Maarten Van Steen. Vrije Universiteit Amsterdam, 7'he Netherlands PEARSON.

DISTRIBUTED SYSTEMS. Second Edition. Andrew S. Tanenbaum Maarten Van Steen. Vrije Universiteit Amsterdam, 7'he Netherlands PEARSON. DISTRIBUTED SYSTEMS 121r itac itple TAYAdiets Second Edition Andrew S. Tanenbaum Maarten Van Steen Vrije Universiteit Amsterdam, 7'he Netherlands PEARSON Prentice Hall Upper Saddle River, NJ 07458 CONTENTS

More information

Today: Fault Tolerance. Replica Management

Today: Fault Tolerance. Replica Management Today: Fault Tolerance Failure models Agreement in presence of faults Two army problem Byzantine generals problem Reliable communication Distributed commit Two phase commit Three phase commit Failure recovery

More information

Data Consistency and Blockchain. Bei Chun Zhou (BlockChainZ)

Data Consistency and Blockchain. Bei Chun Zhou (BlockChainZ) Data Consistency and Blockchain Bei Chun Zhou (BlockChainZ) beichunz@cn.ibm.com 1 Data Consistency Point-in-time consistency Transaction consistency Application consistency 2 Strong Consistency ACID Atomicity.

More information

DISTRIBUTED SYSTEMS Principles and Paradigms Second Edition ANDREW S. TANENBAUM MAARTEN VAN STEEN. Chapter 1. Introduction

DISTRIBUTED SYSTEMS Principles and Paradigms Second Edition ANDREW S. TANENBAUM MAARTEN VAN STEEN. Chapter 1. Introduction DISTRIBUTED SYSTEMS Principles and Paradigms Second Edition ANDREW S. TANENBAUM MAARTEN VAN STEEN Chapter 1 Introduction Modified by: Dr. Ramzi Saifan Definition of a Distributed System (1) A distributed

More information

Consensus in Distributed Systems. Jeff Chase Duke University

Consensus in Distributed Systems. Jeff Chase Duke University Consensus in Distributed Systems Jeff Chase Duke University Consensus P 1 P 1 v 1 d 1 Unreliable multicast P 2 P 3 Consensus algorithm P 2 P 3 v 2 Step 1 Propose. v 3 d 2 Step 2 Decide. d 3 Generalizes

More information

Byzantine Techniques

Byzantine Techniques November 29, 2005 Reliability and Failure There can be no unity without agreement, and there can be no agreement without conciliation René Maowad Reliability and Failure There can be no unity without agreement,

More information

Tradeoffs in Byzantine-Fault-Tolerant State-Machine-Replication Protocol Design

Tradeoffs in Byzantine-Fault-Tolerant State-Machine-Replication Protocol Design Tradeoffs in Byzantine-Fault-Tolerant State-Machine-Replication Protocol Design Michael G. Merideth March 2008 CMU-ISR-08-110 School of Computer Science Carnegie Mellon University Pittsburgh, PA 15213

More information

To do. Consensus and related problems. q Failure. q Raft

To do. Consensus and related problems. q Failure. q Raft Consensus and related problems To do q Failure q Consensus and related problems q Raft Consensus We have seen protocols tailored for individual types of consensus/agreements Which process can enter the

More information

System models for distributed systems

System models for distributed systems System models for distributed systems INF5040/9040 autumn 2010 lecturer: Frank Eliassen INF5040 H2010, Frank Eliassen 1 System models Purpose illustrate/describe common properties and design choices for

More information

CS5412: CONSENSUS AND THE FLP IMPOSSIBILITY RESULT

CS5412: CONSENSUS AND THE FLP IMPOSSIBILITY RESULT 1 CS5412: CONSENSUS AND THE FLP IMPOSSIBILITY RESULT Lecture XII Ken Birman Generalizing Ron and Hermione s challenge 2 Recall from last time: Ron and Hermione had difficulty agreeing where to meet for

More information

The Long March of BFT. Weird Things Happen in Distributed Systems. A specter is haunting the system s community... A hierarchy of failure models

The Long March of BFT. Weird Things Happen in Distributed Systems. A specter is haunting the system s community... A hierarchy of failure models A specter is haunting the system s community... The Long March of BFT Lorenzo Alvisi UT Austin BFT Fail-stop A hierarchy of failure models Crash Weird Things Happen in Distributed Systems Send Omission

More information

Fault Tolerance. Basic Concepts

Fault Tolerance. Basic Concepts COP 6611 Advanced Operating System Fault Tolerance Chi Zhang czhang@cs.fiu.edu Dependability Includes Availability Run time / total time Basic Concepts Reliability The length of uninterrupted run time

More information

Consensus. Chapter Two Friends. 2.3 Impossibility of Consensus. 2.2 Consensus 16 CHAPTER 2. CONSENSUS

Consensus. Chapter Two Friends. 2.3 Impossibility of Consensus. 2.2 Consensus 16 CHAPTER 2. CONSENSUS 16 CHAPTER 2. CONSENSUS Agreement All correct nodes decide for the same value. Termination All correct nodes terminate in finite time. Validity The decision value must be the input value of a node. Chapter

More information

Topics in Reliable Distributed Systems

Topics in Reliable Distributed Systems Topics in Reliable Distributed Systems 049017 1 T R A N S A C T I O N S Y S T E M S What is A Database? Organized collection of data typically persistent organization models: relational, object-based,

More information

Distributed Systems. Before We Begin. Advantages. What is a Distributed System? CSE 120: Principles of Operating Systems. Lecture 13.

Distributed Systems. Before We Begin. Advantages. What is a Distributed System? CSE 120: Principles of Operating Systems. Lecture 13. CSE 120: Principles of Operating Systems Lecture 13 Distributed Systems December 2, 2003 Before We Begin Read Chapters 15, 17 (on Distributed Systems topics) Prof. Joe Pasquale Department of Computer Science

More information

CS October 2017

CS October 2017 Atomic Transactions Transaction An operation composed of a number of discrete steps. Distributed Systems 11. Distributed Commit Protocols All the steps must be completed for the transaction to be committed.

More information

Dep. Systems Requirements

Dep. Systems Requirements Dependable Systems Dep. Systems Requirements Availability the system is ready to be used immediately. A(t) = probability system is available for use at time t MTTF/(MTTF+MTTR) If MTTR can be kept small

More information

Fault Tolerance. it continues to perform its function in the event of a failure example: a system with redundant components

Fault Tolerance. it continues to perform its function in the event of a failure example: a system with redundant components Fault Tolerance To avoid disruption due to failure and to improve availability, systems are designed to be fault-tolerant Two broad categories of fault-tolerant systems are: systems that mask failure it

More information

Distributed Systems Principles and Paradigms. Chapter 08: Fault Tolerance

Distributed Systems Principles and Paradigms. Chapter 08: Fault Tolerance Distributed Systems Principles and Paradigms Maarten van Steen VU Amsterdam, Dept. Computer Science Room R4.20, steen@cs.vu.nl Chapter 08: Fault Tolerance Version: December 2, 2010 2 / 65 Contents Chapter

More information

PRIMARY-BACKUP REPLICATION

PRIMARY-BACKUP REPLICATION PRIMARY-BACKUP REPLICATION Primary Backup George Porter Nov 14, 2018 ATTRIBUTION These slides are released under an Attribution-NonCommercial-ShareAlike 3.0 Unported (CC BY-NC-SA 3.0) Creative Commons

More information

Introduction to Distributed Systems

Introduction to Distributed Systems Introduction to Distributed Systems Minsoo Ryu Department of Computer Science and Engineering 2 Definition A distributed system is a collection of independent computers that appears to its users as a single

More information

Practical Byzantine Fault Tolerance Consensus and A Simple Distributed Ledger Application Hao Xu Muyun Chen Xin Li

Practical Byzantine Fault Tolerance Consensus and A Simple Distributed Ledger Application Hao Xu Muyun Chen Xin Li Practical Byzantine Fault Tolerance Consensus and A Simple Distributed Ledger Application Hao Xu Muyun Chen Xin Li Abstract Along with cryptocurrencies become a great success known to the world, how to

More information

Concepts. Techniques for masking faults. Failure Masking by Redundancy. CIS 505: Software Systems Lecture Note on Consensus

Concepts. Techniques for masking faults. Failure Masking by Redundancy. CIS 505: Software Systems Lecture Note on Consensus CIS 505: Software Systems Lecture Note on Consensus Insup Lee Department of Computer and Information Science University of Pennsylvania CIS 505, Spring 2007 Concepts Dependability o Availability ready

More information

CprE Fault Tolerance. Dr. Yong Guan. Department of Electrical and Computer Engineering & Information Assurance Center Iowa State University

CprE Fault Tolerance. Dr. Yong Guan. Department of Electrical and Computer Engineering & Information Assurance Center Iowa State University Fault Tolerance Dr. Yong Guan Department of Electrical and Computer Engineering & Information Assurance Center Iowa State University Outline for Today s Talk Basic Concepts Process Resilience Reliable

More information

Failures, Elections, and Raft

Failures, Elections, and Raft Failures, Elections, and Raft CS 8 XI Copyright 06 Thomas W. Doeppner, Rodrigo Fonseca. All rights reserved. Distributed Banking SFO add interest based on current balance PVD deposit $000 CS 8 XI Copyright

More information

Distributed Systems COMP 212. Revision 2 Othon Michail

Distributed Systems COMP 212. Revision 2 Othon Michail Distributed Systems COMP 212 Revision 2 Othon Michail Synchronisation 2/55 How would Lamport s algorithm synchronise the clocks in the following scenario? 3/55 How would Lamport s algorithm synchronise

More information

Consensus. Chapter Two Friends. 8.3 Impossibility of Consensus. 8.2 Consensus 8.3. IMPOSSIBILITY OF CONSENSUS 55

Consensus. Chapter Two Friends. 8.3 Impossibility of Consensus. 8.2 Consensus 8.3. IMPOSSIBILITY OF CONSENSUS 55 8.3. IMPOSSIBILITY OF CONSENSUS 55 Agreement All correct nodes decide for the same value. Termination All correct nodes terminate in finite time. Validity The decision value must be the input value of

More information

Distributed Systems (5DV147)

Distributed Systems (5DV147) Distributed Systems (5DV147) Fundamentals Fall 2013 1 basics 2 basics Single process int i; i=i+1; 1 CPU - Steps are strictly sequential - Program behavior & variables state determined by sequence of operations

More information

Consensus, impossibility results and Paxos. Ken Birman

Consensus, impossibility results and Paxos. Ken Birman Consensus, impossibility results and Paxos Ken Birman Consensus a classic problem Consensus abstraction underlies many distributed systems and protocols N processes They start execution with inputs {0,1}

More information

Fault-Tolerance & Paxos

Fault-Tolerance & Paxos Chapter 15 Fault-Tolerance & Paxos How do you create a fault-tolerant distributed system? In this chapter we start out with simple questions, and, step by step, improve our solutions until we arrive at

More information

Practical Byzantine Fault

Practical Byzantine Fault Practical Byzantine Fault Tolerance Practical Byzantine Fault Tolerance Castro and Liskov, OSDI 1999 Nathan Baker, presenting on 23 September 2005 What is a Byzantine fault? Rationale for Byzantine Fault

More information

Distributed Systems Conclusions & Exam. Brian Nielsen

Distributed Systems Conclusions & Exam. Brian Nielsen Distributed Systems Conclusions & Exam Brian Nielsen bnielsen@cs.aau.dk Study Regulations Purpose: That the student obtains knowledge about concepts in distributed systems, knowledge about their construction,

More information

Consensus a classic problem. Consensus, impossibility results and Paxos. Distributed Consensus. Asynchronous networks.

Consensus a classic problem. Consensus, impossibility results and Paxos. Distributed Consensus. Asynchronous networks. Consensus, impossibility results and Paxos Ken Birman Consensus a classic problem Consensus abstraction underlies many distributed systems and protocols N processes They start execution with inputs {0,1}

More information

Distributed Commit in Asynchronous Systems

Distributed Commit in Asynchronous Systems Distributed Commit in Asynchronous Systems Minsoo Ryu Department of Computer Science and Engineering 2 Distributed Commit Problem - Either everybody commits a transaction, or nobody - This means consensus!

More information

Fault Tolerance 1/64

Fault Tolerance 1/64 Fault Tolerance 1/64 Fault Tolerance Fault tolerance is the ability of a distributed system to provide its services even in the presence of faults. A distributed system should be able to recover automatically

More information

CMSC 858F: Algorithmic Game Theory Fall 2010 Achieving Byzantine Agreement and Broadcast against Rational Adversaries

CMSC 858F: Algorithmic Game Theory Fall 2010 Achieving Byzantine Agreement and Broadcast against Rational Adversaries CMSC 858F: Algorithmic Game Theory Fall 2010 Achieving Byzantine Agreement and Broadcast against Rational Adversaries Instructor: Mohammad T. Hajiaghayi Scribe: Adam Groce, Aishwarya Thiruvengadam, Ateeq

More information

Lecture 3. Introduction to Cryptocurrencies

Lecture 3. Introduction to Cryptocurrencies Lecture 3 Introduction to Cryptocurrencies Public Keys as Identities public key := an identity if you see sig such that verify(pk, msg, sig)=true, think of it as: pk says, [msg] to speak for pk, you must

More information

Introduction to Distributed Systems

Introduction to Distributed Systems Introduction to Distributed Systems Distributed Systems L-A Sistemi Distribuiti L-A Andrea Omicini andrea.omicini@unibo.it Ingegneria Due Alma Mater Studiorum Università di Bologna a Cesena Academic Year

More information

Byzantine fault tolerance. Jinyang Li With PBFT slides from Liskov

Byzantine fault tolerance. Jinyang Li With PBFT slides from Liskov Byzantine fault tolerance Jinyang Li With PBFT slides from Liskov What we ve learnt so far: tolerate fail-stop failures Traditional RSM tolerates benign failures Node crashes Network partitions A RSM w/

More information

Byzantine Consensus in Directed Graphs

Byzantine Consensus in Directed Graphs Byzantine Consensus in Directed Graphs Lewis Tseng 1,3, and Nitin Vaidya 2,3 1 Department of Computer Science, 2 Department of Electrical and Computer Engineering, and 3 Coordinated Science Laboratory

More information

Distributed Systems Conclusions & Exam. Brian Nielsen

Distributed Systems Conclusions & Exam. Brian Nielsen Distributed Systems Conclusions & Exam Brian Nielsen bnielsen@cs.aau.dk Definition A distributed system is the one in which hardware and software components at networked computers communicate and coordinate

More information

The Ripple Protocol Consensus Algorithm

The Ripple Protocol Consensus Algorithm Ripple Labs Inc, 2014 The Ripple Protocol Consensus Algorithm David Schwartz david@ripple.com Noah Youngs nyoungs@nyu.edu Arthur Britto arthur@ripple.com Abstract While several consensus algorithms exist

More information

Distributed Systems COMP 212. Lecture 1 Othon Michail

Distributed Systems COMP 212. Lecture 1 Othon Michail Distributed Systems COMP 212 Lecture 1 Othon Michail Course Information Lecturer: Othon Michail Office 2.14 Holt Building http://csc.liv.ac.uk/~michailo/teaching/comp2 12 Structure 30 Lectures + 10 lab

More information

Intuitive distributed algorithms. with F#

Intuitive distributed algorithms. with F# Intuitive distributed algorithms with F# Natallia Dzenisenka Alena Hall @nata_dzen @lenadroid A tour of a variety of intuitivedistributed algorithms used in practical distributed systems. and how to prototype

More information

CS6450: Distributed Systems Lecture 11. Ryan Stutsman

CS6450: Distributed Systems Lecture 11. Ryan Stutsman Strong Consistency CS6450: Distributed Systems Lecture 11 Ryan Stutsman Material taken/derived from Princeton COS-418 materials created by Michael Freedman and Kyle Jamieson at Princeton University. Licensed

More information

Basic vs. Reliable Multicast

Basic vs. Reliable Multicast Basic vs. Reliable Multicast Basic multicast does not consider process crashes. Reliable multicast does. So far, we considered the basic versions of ordered multicasts. What about the reliable versions?

More information

CS6450: Distributed Systems Lecture 15. Ryan Stutsman

CS6450: Distributed Systems Lecture 15. Ryan Stutsman Strong Consistency CS6450: Distributed Systems Lecture 15 Ryan Stutsman Material taken/derived from Princeton COS-418 materials created by Michael Freedman and Kyle Jamieson at Princeton University. Licensed

More information

Distributed Deadlock

Distributed Deadlock Distributed Deadlock 9.55 DS Deadlock Topics Prevention Too expensive in time and network traffic in a distributed system Avoidance Determining safe and unsafe states would require a huge number of messages

More information

Distributed Algorithms Practical Byzantine Fault Tolerance

Distributed Algorithms Practical Byzantine Fault Tolerance Distributed Algorithms Practical Byzantine Fault Tolerance Alberto Montresor University of Trento, Italy 2017/01/06 This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International

More information

Chapter 8 Fault Tolerance

Chapter 8 Fault Tolerance DISTRIBUTED SYSTEMS Principles and Paradigms Second Edition ANDREW S. TANENBAUM MAARTEN VAN STEEN Chapter 8 Fault Tolerance Fault Tolerance Basic Concepts Being fault tolerant is strongly related to what

More information

Chapter 5: Distributed Systems: Fault Tolerance. Fall 2013 Jussi Kangasharju

Chapter 5: Distributed Systems: Fault Tolerance. Fall 2013 Jussi Kangasharju Chapter 5: Distributed Systems: Fault Tolerance Fall 2013 Jussi Kangasharju Chapter Outline n Fault tolerance n Process resilience n Reliable group communication n Distributed commit n Recovery 2 Basic

More information

Fault Tolerance. Distributed Software Systems. Definitions

Fault Tolerance. Distributed Software Systems. Definitions Fault Tolerance Distributed Software Systems Definitions Availability: probability the system operates correctly at any given moment Reliability: ability to run correctly for a long interval of time Safety:

More information

Distributed algorithms

Distributed algorithms Distributed algorithms Prof R. Guerraoui lpdwww.epfl.ch Exam: Written Reference: Book - Springer Verlag http://lpd.epfl.ch/site/education/da - Introduction to Reliable (and Secure) Distributed Programming

More information

Exam 2 Review. Fall 2011

Exam 2 Review. Fall 2011 Exam 2 Review Fall 2011 Question 1 What is a drawback of the token ring election algorithm? Bad question! Token ring mutex vs. Ring election! Ring election: multiple concurrent elections message size grows

More information

Chapter 1: Distributed Systems: What is a distributed system? Fall 2013

Chapter 1: Distributed Systems: What is a distributed system? Fall 2013 Chapter 1: Distributed Systems: What is a distributed system? Fall 2013 Course Goals and Content n Distributed systems and their: n Basic concepts n Main issues, problems, and solutions n Structured and

More information

Two-Phase Atomic Commitment Protocol in Asynchronous Distributed Systems with Crash Failure

Two-Phase Atomic Commitment Protocol in Asynchronous Distributed Systems with Crash Failure Two-Phase Atomic Commitment Protocol in Asynchronous Distributed Systems with Crash Failure Yong-Hwan Cho, Sung-Hoon Park and Seon-Hyong Lee School of Electrical and Computer Engineering, Chungbuk National

More information

Distributed Algorithms Benoît Garbinato

Distributed Algorithms Benoît Garbinato Distributed Algorithms Benoît Garbinato 1 Distributed systems networks distributed As long as there were no machines, programming was no problem networks distributed at all; when we had a few weak computers,

More information

Practical Byzantine Fault Tolerance

Practical Byzantine Fault Tolerance Practical Byzantine Fault Tolerance Robert Grimm New York University (Partially based on notes by Eric Brewer and David Mazières) The Three Questions What is the problem? What is new or different? What

More information

International Journal of Advanced Research in Computer Science and Software Engineering

International Journal of Advanced Research in Computer Science and Software Engineering Volume 2, Issue 9, September 2012 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Backup Two

More information