Impossibility of Agreement in Asynchronous Systems


1 Consensus protocol P
- A schedule from C is a finite or infinite sequence σ of events that can be applied, in turn, starting from C.
  - A run is the sequence of steps associated with a schedule.
  - If σ is finite, then C' = σ(C) denotes the configuration reached from C by applying the events of σ in turn.
    - We then say that C' is reachable from C.
    - If C is an initial configuration and C' is reachable from C, then C' is accessible.
      * In the sequel, we assume all configurations to be accessible.
Distributed Systems - Fall 2001 IV - 90 Stefan Leue 2001

3 Lemma 1: Commutativity of Independent Schedules
- Let C be some configuration, and let σ_1 and σ_2 be finite schedules that lead from C to configurations C_1 and C_2, respectively.
- We say that two schedules are independent if the sets of processes taking steps in them are disjoint.
- If σ_1 and σ_2 are independent, then σ_2 can be applied to C_1 and σ_1 can be applied to C_2, and both applications lead to the same configuration C_3.
Diagram: C --σ_1--> C_1 --σ_2--> C_3 and C --σ_2--> C_2 --σ_1--> C_3.
Proof of Lemma 1
- Follows immediately from the construction of a consensus protocol given above and the fact that σ_1 and σ_2 do not interact, since they are independent.
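The commutativity claim of Lemma 1 can be checked on a small example. The following Python sketch is only an illustration under stated assumptions: the transition rule in `step`, the ring-style message sending, and all names (`N`, `freeze`, `step`, `apply_schedule`) are hypothetical and not part of the slides. A configuration is modelled as a tuple of process states plus a multiset of pending messages, and an event is a pair (p, m), where m is `None` for the null message.

```python
from collections import Counter

N = 3  # number of processes in this toy system (assumption)

def freeze(buf):
    """Turn a Counter of pending messages into a hashable, order-free tuple."""
    return tuple(sorted((msg, cnt) for msg, cnt in buf.items() if cnt > 0))

def step(config, event):
    """Apply one event (p, m) to a configuration; m is None for the null message."""
    states, buffer = config
    p, m = event
    buf = Counter(dict(buffer))
    if m is not None:
        if buf[(p, m)] == 0:
            raise ValueError(f"event {event} is not applicable")
        buf[(p, m)] -= 1  # the message is removed from the buffer
    # Hypothetical deterministic transition: update the state, then send
    # one message carrying the new state to the next process in a ring.
    new_state = states[p] + (m if m is not None else 0) + 1
    buf[((p + 1) % N, new_state)] += 1
    new_states = states[:p] + (new_state,) + states[p + 1:]
    return new_states, freeze(buf)

def apply_schedule(config, schedule):
    """Apply the events of a finite schedule in turn."""
    for event in schedule:
        config = step(config, event)
    return config

# Initial configuration C: all states 0, one message for p0 and one for p2.
C = ((0, 0, 0), freeze(Counter({(0, 5): 1, (2, 7): 1})))
sigma1 = [(0, 5), (0, None)]  # only process 0 takes steps
sigma2 = [(2, 7)]             # only process 2 takes steps

C1 = apply_schedule(C, sigma1)
C2 = apply_schedule(C, sigma2)
C3_via_C1 = apply_schedule(C1, sigma2)
C3_via_C2 = apply_schedule(C2, sigma1)
print(C3_via_C1 == C3_via_C2)
```

Because the buffer is stored as a sorted multiset, the order in which messages were sent does not influence the comparison, which matches the model in which the message buffer is unordered.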

4 Consensus protocol P
- A configuration C has decision value v if some process p is in a decision state with y_p = v.
- A consensus protocol is partially correct if
  1. no accessible configuration has more than one distinct decision value, and
  2. (∀ v ∈ {0, 1}) (there exists some configuration that has decision value v).
- A process p is non-faulty (correct) in a run if it takes infinitely many steps, and faulty (crashed) otherwise.
- A run is admissible if at most one process is faulty and all messages sent to a non-faulty process are eventually received.
- A run is deciding if some process reaches a decision state in that run.
- A consensus protocol P is totally correct in spite of one process fault (crash) if it is partially correct and every admissible run of P is also a deciding run.

5 Consensus protocol P
- Let C be a configuration and V the set of decision values of configurations reachable from C.
  - C is bivalent if |V| = 2.
  - C is univalent if |V| = 1.
    - C is 0-valent if V = {0}.
    - C is 1-valent if V = {1}.
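For a finite transition system, the set V and the resulting valence can be computed by a plain graph search. The sketch below assumes a hypothetical toy system given as a successor map (`successors`) and a map from deciding configurations to their decision value (`decided`); neither these names nor the example graph come from the slides.

```python
from collections import deque

def decision_values(start, successors, decided):
    """Return V: the decision values of all configurations reachable from start."""
    seen = {start}
    queue = deque([start])
    values = set()
    while queue:
        config = queue.popleft()
        if config in decided:
            values.add(decided[config])
        for nxt in successors.get(config, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return values

def valence(start, successors, decided):
    """Classify a configuration according to the size and content of V."""
    values = decision_values(start, successors, decided)
    if values == {0, 1}:
        return "bivalent"
    if values == {0}:
        return "0-valent"
    if values == {1}:
        return "1-valent"
    return "no decision reachable"  # V is empty

# Hypothetical toy system: from C, one step commits to 0 and another to 1.
successors = {"C": ["A", "B"], "A": ["A0"], "B": ["B1"]}
decided = {"A0": 0, "B1": 1}

labels = {c: valence(c, successors, decided) for c in ("C", "A", "B")}
print(labels)
```

In this toy system the step from C to A (or to B) is exactly the kind of step that moves from a bivalent to a univalent configuration.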

6 Theorem
- No consensus protocol is totally correct in spite of one fault.
Note
- The theorem claims that every partially correct consensus protocol has some admissible run that is not deciding.
Proof of Theorem
- By contradiction:
  - Assume that P is a consensus protocol that is totally correct in spite of one fault.
  - Then show that P can remain indecisive forever, i.e., has an admissible run that is never deciding, by arguing the following:
    1. there is some initial configuration in which the decision is not already determined, and
    2. it is possible to construct an admissible run that never takes a step that would commit the system to a particular decision.
  - Note: due to the assumed total correctness of P, and because admissible runs always exist, every configuration of P is at least univalent, i.e., V is never empty.

7 Lemma 2
- P has a bivalent initial configuration.
Proof of Lemma 2
- By contradiction.
- Assume the contrary (P has only univalent initial configurations).
  - Due to the assumed partial correctness, P must have both 0-valent and 1-valent initial configurations.
  - Two initial configurations are adjacent if they differ only in the initial value x_p^0 of a single process p.
  - Between any two initial configurations C_i and C_m one can form a chain of initial configurations such that every pair of neighbouring configurations in the chain is adjacent.

8 Proof of Lemma 2 (cont.)
  - This implies that there must be a pair of adjacent initial configurations C_k and C_l such that C_k is 0-valent and C_l is 1-valent.
    - Let p be the process in whose initial value C_k and C_l differ.
    - Consider an admissible deciding run from C_k in which p takes no steps, and let σ be the associated schedule.
    - Due to commutativity (cf. Lemma 1), σ can also be applied to C_l.
    - Corresponding configurations in the two runs are identical except for the internal state of p.
    - It is easy to see that both runs eventually reach the same decision value.
      * If the decision value is 1, then C_k (assumed 0-valent) can also reach decision value 1 and is therefore bivalent;
      * otherwise, C_l (assumed 1-valent) can also reach decision value 0 and is therefore bivalent,
      which contradicts the assumption that no bivalent initial configuration exists.
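The chain argument used in the proof of Lemma 2 can be sketched in a few lines of Python. Initial configurations are modelled as tuples of input bits, one per process. The labelling function `toy_valence` is a purely hypothetical stand-in (the real valence depends on the protocol P); it only serves to show that a chain whose endpoints carry different labels must contain an adjacent pair with different labels.

```python
def chain(start, end):
    """Build a chain of initial configurations from start to end in which
    neighbouring configurations differ in the input of exactly one process."""
    configs = [start]
    current = list(start)
    for i, (a, b) in enumerate(zip(start, end)):
        if a != b:
            current[i] = b
            configs.append(tuple(current))
    return configs

def first_valence_switch(configs, valence):
    """Return the first neighbouring pair whose valence labels differ, if any."""
    for left, right in zip(configs, configs[1:]):
        if valence(left) != valence(right):
            return left, right
    return None

def toy_valence(config):
    """Hypothetical labelling: 1-valent if at least two inputs are 1, else 0-valent."""
    return 1 if sum(config) >= 2 else 0

configs = chain((0, 0, 0), (1, 1, 1))
pair = first_valence_switch(configs, toy_valence)
print(configs)
print(pair)
```

The pair returned here plays the role of C_k and C_l in the proof, and the single position in which the two tuples differ identifies the process p.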

9 Lemma 3
- Let
  - C be a bivalent configuration,
  - e = (p, m) be an event applicable to C,
  - 𝒞 be the set of configurations reachable from C without applying e,
  - Γ = e(𝒞) = { e(E) | E ∈ 𝒞 and e is applicable to E }.
- Then Γ contains a bivalent configuration.
Proof of Lemma 3
- e is applicable to every E ∈ 𝒞, since
  - e is applicable to C,
  - the definition of 𝒞 implies that e is never executed in any run leading from C to E, and
  - messages can be delayed arbitrarily, i.e., e remains enabled for the length of any run leading to some E ∈ 𝒞.
- Assume that Γ contains no bivalent configuration, i.e., every D ∈ Γ is univalent.
  - Goal: derive a contradiction.

10 Proof of Lemma 3 (cont.)
- Let E_i be an i-valent configuration reachable from C, i ∈ {0, 1}.
  - Note that both E_0 and E_1 exist, since C is bivalent.
- Is E_i ∈ 𝒞, where 𝒞 is the set of configurations reachable from C without applying e?
  - If E_i ∈ 𝒞, then let F_i be the successor configuration of E_i under e, i.e., F_i = e(E_i) ∈ Γ.
  - If E_i ∉ 𝒞,
    - then e was applied in reaching E_i from C;
    - hence, there is a configuration F_i ∈ Γ from which E_i is reachable.
  - In either case, F_i is i-valent, since
    - Γ contains no bivalent configurations, and
    - either E_i is reachable from F_i, or F_i is reachable from E_i.
- Since F_i ∈ Γ for both i ∈ {0, 1}, Γ contains both 0-valent and 1-valent configurations.

11 Proof of Lemma 3 (cont.)
- Let C and C' be configurations. If C' is the result of applying a single step to C, then we call C and C' neighbours.
- Assumption: there exist neighbours C_0, C_1 such that D_i = e(C_i) is i-valent for both i ∈ {0, 1}, where e = (p, m).
  - To be proven by an inductive argument over p.
  - Without loss of generality, we consider C_1 = e'(C_0), where e' = (p', m').
    - Case 1: if p' ≠ p, then D_1 = e'(D_0) (by Lemma 1).
      * This is impossible, since any successor of a 0-valent configuration is 0-valent, whereas D_1 is 1-valent; hence a contradiction.
Diagram: C_0 --e--> D_0, C_0 --e'--> C_1, C_1 --e--> D_1, D_0 --e'--> D_1.

12 Proof of Lemma 3 (cont.)
    - Case 2: if p' = p, then consider any finite deciding run from C_0 in which p takes no steps.
      * Let σ be the corresponding schedule, and let A = σ(C_0).
      * σ is applicable to D_i (by Lemma 1), and it leads to an i-valent configuration E_i = σ(D_i) for both i ∈ {0, 1}.
      * By Lemma 1, e(A) = E_0 and e(e'(A)) = E_1.
      * Hence, A is bivalent.
      * This contradicts the assumption that the run leading to A is deciding, i.e., that A must be univalent.
  - We obtain a contradiction in both cases; hence Γ contains a bivalent configuration.
Diagram: C_0 --e--> D_0 and C_0 --e'--> C_1 --e--> D_1; σ leads from C_0 to A, from D_0 to E_0, and from D_1 to E_1; e leads from A to E_0, and e' followed by e leads from A to E_1.

13 Proof of Theorem (cont.)
- Any deciding run from a bivalent initial configuration must contain a single step that goes from a bivalent to a univalent configuration.
  - This step determines the decision value.
- We proceed to show that it is always possible to construct an admissible run in stages, starting from an initial configuration, that avoids taking such a deciding step.
  - Admissibility is ensured by the following construction of a run:
    - Maintain a queue of processes, initially in arbitrary order.
    - Order the messages in the message buffer according to their time of sending, earliest first.
    - A stage comprises one or more process steps.
      * A stage ends when the first process in the queue takes a step in which, if its message queue was not empty at the start of the stage, the earliest message addressed to it is received (which will eventually happen, according to our assumptions regarding the message system).
      * This process is then moved to the back of the queue.
    - This construction means that in any infinite sequence of stages
      * every process takes infinitely many steps, and
      * every message is eventually delivered,
      i.e., the run is admissible.
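The queue discipline behind the admissibility argument can be simulated directly. The sketch below is a simplification under stated assumptions: each stage consists of exactly one step (the extra steps that Lemma 3 may require to stay bivalent are omitted), no new messages are sent during the run, and the names `run_stages`, `pending`, and the process identifiers are hypothetical.

```python
from collections import deque

def run_stages(processes, buffer, num_stages):
    """Simulate the stage construction: the process at the head of the queue
    receives its earliest pending message (or the null message if it has
    none) and is then moved to the back of the queue."""
    queue = deque(processes)
    pending = list(buffer)  # (destination, payload) pairs in sending order
    steps = {p: 0 for p in processes}
    delivered = []
    for _stage in range(num_stages):
        p = queue.popleft()
        # Index of the earliest message addressed to p, if there is one.
        index = next((i for i, (dest, _payload) in enumerate(pending)
                      if dest == p), None)
        if index is not None:
            delivered.append(pending.pop(index))
        steps[p] += 1
        queue.append(p)
    return steps, delivered, pending

processes = ["p0", "p1", "p2"]
buffer = [("p1", "a"), ("p0", "b"), ("p1", "c")]
steps, delivered, pending = run_stages(processes, buffer, num_stages=6)
print(steps, delivered, pending)
```

After k rounds through the queue every process has taken k steps, and a message that was at position j among the messages for its destination has been delivered once j rounds have passed; in an infinite run this gives the two admissibility properties stated above.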

14 Proof of Theorem (cont.)
- We proceed to show that it is always possible to construct a run in stages, starting from an initial configuration, that avoids taking such a deciding step.
  - Avoidance of a decision is ensured by the following reasoning:
    - Let C_0 be a bivalent initial configuration.
      * Its existence is assured by Lemma 2.
    - The first stage of the run begins in C_0.
    - The following construction ensures that every subsequent stage of the run starts in a bivalent configuration.
    - Sequence of steps from C to C' that forms a stage:
      * Let C be bivalent, and let p be at the head of the process queue.
      * Let m be the earliest message to p in the message buffer, if any, and ∅ otherwise.
      * Let e = (p, m).
      * There is a bivalent configuration C' reachable from C by a schedule in which e is the last event applied (by Lemma 3).
    - Each stage leads to a bivalent configuration; hence an infinite schedule consisting of stages as described above is possible.
- Hence, there is an admissible infinite run in which no decision is ever reached, and therefore P is not totally correct (contradiction). ∎

15 Some Remarks
- The theorem applies to a weak form of consensus in which only some of the correct processes need to have made a decision (cf. the definition of decision value).
- The naive solution in which the same decision value is always chosen is ruled out (cf. the definition of partial correctness).

16 Consensus: Impossibility of Agreement in Asynchronous Systems
- Consequences
  - In asynchronous systems, there is no solution to BG, IC, or TOR-multicast.
- Of course, in practice consensus can often be reached, but a residual probability remains that consensus cannot be reached.
- Possible approaches to reaching consensus by weakening the system assumptions:
  - partial synchrony
  - masking faults
  - modified failure detectors
  - randomized algorithms

17 Consensus: Impossibility of Agreement in Asynchronous Systems
- Partial synchrony
  - Message delays are bounded, but the bound is unknown; or
  - the bound is known, but transmission delays may exceed it during some finite initial period of time.
- Masking faults
  - Design the system so that failures appear like an intermittent slowdown in the processing of messages:
    - store the system state on persistent storage before a crash;
    - restart the system in that state after recovery.

18 Consensus: Impossibility of Agreement in Asynchronous Systems
- Modified failure detectors
  - In the ISIS system (Birman, 1993):
    - Deem a process that has not responded as failed.
    - Treat this process as fail-safe, i.e., discard any subsequent messages from this process.
    - Problems:
      * Long timeouts are necessary.
      * False negatives (live processes deemed failed) are possible, which reduces the effectiveness of the system.
  - Eventually weak failure detector (Chandra and Toueg, 1996):
    - Consensus can be solved, even with a weak failure detector, if fewer than N/2 processes crash and communication is reliable.
    - Properties of an eventually weak failure detector:
      * eventually weakly complete: each faulty process is eventually suspected permanently;
      * eventually weakly accurate: after some time, at least one correct process is never suspected by any correct process.
    - An eventually weak failure detector cannot be implemented in an asynchronous system based on message passing; however, failure detectors that adapt their timeout values can come close to eventually weak failure detectors.
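A failure detector that adapts its timeout values might look roughly like the following Python sketch. It is not the ISIS mechanism and not the construction of Chandra and Toueg; the class name `AdaptiveTimeoutDetector`, its methods, and the rule of adding a fixed increment after each false suspicion are assumptions chosen for illustration. Time is passed in explicitly so that the behaviour is deterministic.

```python
class AdaptiveTimeoutDetector:
    """Heartbeat-based detector that enlarges a process's timeout whenever
    a suspicion of that process turns out to be wrong (hypothetical design)."""

    def __init__(self, processes, initial_timeout=1.0, increment=1.0):
        self.timeout = {p: initial_timeout for p in processes}
        self.last_seen = {p: 0.0 for p in processes}
        self.suspected = set()
        self.increment = increment

    def heartbeat(self, p, now):
        """Record a heartbeat from p received at time now."""
        if p in self.suspected:
            # The suspicion was false: trust p again and wait longer next time.
            self.suspected.discard(p)
            self.timeout[p] += self.increment
        self.last_seen[p] = now

    def check(self, now):
        """Suspect every process whose last heartbeat is older than its timeout."""
        for p, seen in self.last_seen.items():
            if now - seen > self.timeout[p]:
                self.suspected.add(p)
        return set(self.suspected)

detector = AdaptiveTimeoutDetector(["a", "b"])
detector.heartbeat("a", now=0.5)
detector.heartbeat("b", now=0.5)
first = detector.check(now=2.0)    # both heartbeats are older than 1.0
detector.heartbeat("a", now=2.1)   # "a" was only slow: its timeout grows to 2.0
second = detector.check(now=4.0)   # "a" is within its new timeout, "b" is not
print(first, second)
```

If message delays eventually stay below some bound, the timeout of a correct process grows past that bound after finitely many false suspicions, which is the intuition behind such detectors coming close to the eventually weak properties; in a fully asynchronous system no such bound is guaranteed.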

Consensus Problem. Pradipta De

Consensus Problem. Pradipta De Consensus Problem Slides are based on the book chapter from Distributed Computing: Principles, Paradigms and Algorithms (Chapter 14) by Kshemkalyani and Singhal Pradipta De pradipta.de@sunykorea.ac.kr

More information

Consensus. Chapter Two Friends. 8.3 Impossibility of Consensus. 8.2 Consensus 8.3. IMPOSSIBILITY OF CONSENSUS 55

Consensus. Chapter Two Friends. 8.3 Impossibility of Consensus. 8.2 Consensus 8.3. IMPOSSIBILITY OF CONSENSUS 55 8.3. IMPOSSIBILITY OF CONSENSUS 55 Agreement All correct nodes decide for the same value. Termination All correct nodes terminate in finite time. Validity The decision value must be the input value of

More information

Consensus. Chapter Two Friends. 2.3 Impossibility of Consensus. 2.2 Consensus 16 CHAPTER 2. CONSENSUS

Consensus. Chapter Two Friends. 2.3 Impossibility of Consensus. 2.2 Consensus 16 CHAPTER 2. CONSENSUS 16 CHAPTER 2. CONSENSUS Agreement All correct nodes decide for the same value. Termination All correct nodes terminate in finite time. Validity The decision value must be the input value of a node. Chapter

More information

Consensus, impossibility results and Paxos. Ken Birman

Consensus, impossibility results and Paxos. Ken Birman Consensus, impossibility results and Paxos Ken Birman Consensus a classic problem Consensus abstraction underlies many distributed systems and protocols N processes They start execution with inputs {0,1}

More information

The Wait-Free Hierarchy

The Wait-Free Hierarchy Jennifer L. Welch References 1 M. Herlihy, Wait-Free Synchronization, ACM TOPLAS, 13(1):124-149 (1991) M. Fischer, N. Lynch, and M. Paterson, Impossibility of Distributed Consensus with One Faulty Process,

More information

Distributed systems. Consensus

Distributed systems. Consensus Distributed systems Consensus Prof R. Guerraoui Distributed Programming Laboratory Consensus B A C 2 Consensus In the consensus problem, the processes propose values and have to agree on one among these

More information

CS505: Distributed Systems

CS505: Distributed Systems Department of Computer Science CS505: Distributed Systems Lecture 13: Distributed Transactions Outline Distributed Transactions Two Phase Commit and Three Phase Commit Non-blocking Atomic Commit with P

More information

Consensus a classic problem. Consensus, impossibility results and Paxos. Distributed Consensus. Asynchronous networks.

Consensus a classic problem. Consensus, impossibility results and Paxos. Distributed Consensus. Asynchronous networks. Consensus, impossibility results and Paxos Ken Birman Consensus a classic problem Consensus abstraction underlies many distributed systems and protocols N processes They start execution with inputs {0,1}

More information

Semi-Passive Replication in the Presence of Byzantine Faults

Semi-Passive Replication in the Presence of Byzantine Faults Semi-Passive Replication in the Presence of Byzantine Faults HariGovind V. Ramasamy Adnan Agbaria William H. Sanders University of Illinois at Urbana-Champaign 1308 W. Main Street, Urbana IL 61801, USA

More information

6.852: Distributed Algorithms Fall, Class 21

6.852: Distributed Algorithms Fall, Class 21 6.852: Distributed Algorithms Fall, 2009 Class 21 Today s plan Wait-free synchronization. The wait-free consensus hierarchy Universality of consensus Reading: [Herlihy, Wait-free synchronization] (Another

More information

The Relative Power of Synchronization Methods

The Relative Power of Synchronization Methods Chapter 5 The Relative Power of Synchronization Methods So far, we have been addressing questions of the form: Given objects X and Y, is there a wait-free implementation of X from one or more instances

More information

Point-Set Topology for Impossibility Results in Distributed Computing. Thomas Nowak

Point-Set Topology for Impossibility Results in Distributed Computing. Thomas Nowak Point-Set Topology for Impossibility Results in Distributed Computing Thomas Nowak Overview Introduction Safety vs. Liveness First Example: Wait-Free Shared Memory Message Omission Model Execution Trees

More information

Distributed Algorithms Benoît Garbinato

Distributed Algorithms Benoît Garbinato Distributed Algorithms Benoît Garbinato 1 Distributed systems networks distributed As long as there were no machines, programming was no problem networks distributed at all; when we had a few weak computers,

More information

A class C of objects is universal for some set E of classes if and only if any object in E can be implemented with objects of C (and registers)

A class C of objects is universal for some set E of classes if and only if any object in E can be implemented with objects of C (and registers) Universality A class C of objects is universal for some set E of classes if and only if any object in E can be implemented with objects of C (and registers) n-consensus is universal for n-objects (objects

More information

Research Report. (Im)Possibilities of Predicate Detection in Crash-Affected Systems. RZ 3361 (# 93407) 20/08/2001 Computer Science 27 pages

Research Report. (Im)Possibilities of Predicate Detection in Crash-Affected Systems. RZ 3361 (# 93407) 20/08/2001 Computer Science 27 pages RZ 3361 (# 93407) 20/08/2001 Computer Science 27 pages Research Report (Im)Possibilities of Predicate Detection in Crash-Affected Systems Felix C. Gärtner and Stefan Pleisch Department of Computer Science

More information

CSE 5306 Distributed Systems. Fault Tolerance

CSE 5306 Distributed Systems. Fault Tolerance CSE 5306 Distributed Systems Fault Tolerance 1 Failure in Distributed Systems Partial failure happens when one component of a distributed system fails often leaves other components unaffected A failure

More information

Asynchronous Models. Chapter Asynchronous Processes States, Inputs, and Outputs

Asynchronous Models. Chapter Asynchronous Processes States, Inputs, and Outputs Chapter 3 Asynchronous Models 3.1 Asynchronous Processes Like a synchronous reactive component, an asynchronous process interacts with other processes via inputs and outputs, and maintains an internal

More information

Initial Assumptions. Modern Distributed Computing. Network Topology. Initial Input

Initial Assumptions. Modern Distributed Computing. Network Topology. Initial Input Initial Assumptions Modern Distributed Computing Theory and Applications Ioannis Chatzigiannakis Sapienza University of Rome Lecture 4 Tuesday, March 6, 03 Exercises correspond to problems studied during

More information

Specifying and Proving Broadcast Properties with TLA

Specifying and Proving Broadcast Properties with TLA Specifying and Proving Broadcast Properties with TLA William Hipschman Department of Computer Science The University of North Carolina at Chapel Hill Abstract Although group communication is vitally important

More information

Self-stabilizing Byzantine Digital Clock Synchronization

Self-stabilizing Byzantine Digital Clock Synchronization Self-stabilizing Byzantine Digital Clock Synchronization Ezra N. Hoch, Danny Dolev and Ariel Daliot The Hebrew University of Jerusalem We present a scheme that achieves self-stabilizing Byzantine digital

More information

Asynchronous Reconfiguration for Paxos State Machines

Asynchronous Reconfiguration for Paxos State Machines Asynchronous Reconfiguration for Paxos State Machines Leander Jehl and Hein Meling Department of Electrical Engineering and Computer Science University of Stavanger, Norway Abstract. This paper addresses

More information

Consensus in the Presence of Partial Synchrony

Consensus in the Presence of Partial Synchrony Consensus in the Presence of Partial Synchrony CYNTHIA DWORK AND NANCY LYNCH.Massachusetts Institute of Technology, Cambridge, Massachusetts AND LARRY STOCKMEYER IBM Almaden Research Center, San Jose,

More information

CHAPTER 8. Copyright Cengage Learning. All rights reserved.

CHAPTER 8. Copyright Cengage Learning. All rights reserved. CHAPTER 8 RELATIONS Copyright Cengage Learning. All rights reserved. SECTION 8.3 Equivalence Relations Copyright Cengage Learning. All rights reserved. The Relation Induced by a Partition 3 The Relation

More information

Discharging and reducible configurations

Discharging and reducible configurations Discharging and reducible configurations Zdeněk Dvořák March 24, 2018 Suppose we want to show that graphs from some hereditary class G are k- colorable. Clearly, we can restrict our attention to graphs

More information

Byzantine Consensus in Directed Graphs

Byzantine Consensus in Directed Graphs Byzantine Consensus in Directed Graphs Lewis Tseng 1,3, and Nitin Vaidya 2,3 1 Department of Computer Science, 2 Department of Electrical and Computer Engineering, and 3 Coordinated Science Laboratory

More information

Failure Tolerance. Distributed Systems Santa Clara University

Failure Tolerance. Distributed Systems Santa Clara University Failure Tolerance Distributed Systems Santa Clara University Distributed Checkpointing Distributed Checkpointing Capture the global state of a distributed system Chandy and Lamport: Distributed snapshot

More information

A Timing Assumption and a t-resilient Protocol for Implementing an Eventual Leader Service in Asynchronous Shared Memory Systems

A Timing Assumption and a t-resilient Protocol for Implementing an Eventual Leader Service in Asynchronous Shared Memory Systems A Timing Assumption and a t-resilient Protocol for Implementing an Eventual Leader Service in Asynchronous Shared Memory Systems Antonio FERNÁNDEZ y Ernesto JIMÉNEZ z Michel RAYNAL? Gilles TRÉDAN? y LADyR,

More information

Parsimonious Asynchronous Byzantine-Fault-Tolerant Atomic Broadcast

Parsimonious Asynchronous Byzantine-Fault-Tolerant Atomic Broadcast Parsimonious Asynchronous Byzantine-Fault-Tolerant Atomic Broadcast HariGovind V. Ramasamy Christian Cachin August 19, 2005 Abstract Atomic broadcast is a communication primitive that allows a group of

More information

21. Distributed Algorithms

21. Distributed Algorithms 21. Distributed Algorithms We dene a distributed system as a collection of individual computing devices that can communicate with each other [2]. This denition is very broad, it includes anything, from

More information

Module 11. Directed Graphs. Contents

Module 11. Directed Graphs. Contents Module 11 Directed Graphs Contents 11.1 Basic concepts......................... 256 Underlying graph of a digraph................ 257 Out-degrees and in-degrees.................. 258 Isomorphism..........................

More information

Byzantine Failures. Nikola Knezevic. knl

Byzantine Failures. Nikola Knezevic. knl Byzantine Failures Nikola Knezevic knl Different Types of Failures Crash / Fail-stop Send Omissions Receive Omissions General Omission Arbitrary failures, authenticated messages Arbitrary failures Arbitrary

More information

Two-Phase Atomic Commitment Protocol in Asynchronous Distributed Systems with Crash Failure

Two-Phase Atomic Commitment Protocol in Asynchronous Distributed Systems with Crash Failure Two-Phase Atomic Commitment Protocol in Asynchronous Distributed Systems with Crash Failure Yong-Hwan Cho, Sung-Hoon Park and Seon-Hyong Lee School of Electrical and Computer Engineering, Chungbuk National

More information

CSE 486/586 Distributed Systems

CSE 486/586 Distributed Systems CSE 486/586 Distributed Systems Mutual Exclusion Steve Ko Computer Sciences and Engineering University at Buffalo CSE 486/586 Recap: Consensus On a synchronous system There s an algorithm that works. On

More information

A Case Study of Agreement Problems in Distributed Systems : Non-Blocking Atomic Commitment

A Case Study of Agreement Problems in Distributed Systems : Non-Blocking Atomic Commitment A Case Study of Agreement Problems in Distributed Systems : Non-Blocking Atomic Commitment Michel RAYNAL IRISA, Campus de Beaulieu 35042 Rennes Cedex (France) raynal @irisa.fr Abstract This paper considers

More information

Quiescent Consensus in Mobile Ad-hoc Networks using Eventually Storage-Free Broadcasts

Quiescent Consensus in Mobile Ad-hoc Networks using Eventually Storage-Free Broadcasts Quiescent Consensus in Mobile Ad-hoc Networks using Eventually Storage-Free Broadcasts François Bonnet Département Info & Télécom, École Normale Supérieure de Cachan, France Paul Ezhilchelvan School of

More information

Synchrony Weakened by Message Adversaries vs Asynchrony Enriched with Failure Detectors. Michel Raynal, Julien Stainer

Synchrony Weakened by Message Adversaries vs Asynchrony Enriched with Failure Detectors. Michel Raynal, Julien Stainer Synchrony Weakened by Message Adversaries vs Asynchrony Enriched with Failure Detectors Michel Raynal, Julien Stainer Synchrony Weakened by Message Adversaries vs Asynchrony Enriched with Failure Detectors

More information

HW Graph Theory SOLUTIONS (hbovik) - Q

HW Graph Theory SOLUTIONS (hbovik) - Q 1, Diestel 9.3: An arithmetic progression is an increasing sequence of numbers of the form a, a+d, a+ d, a + 3d.... Van der Waerden s theorem says that no matter how we partition the natural numbers into

More information

A MODULAR FRAMEWORK TO IMPLEMENT FAULT TOLERANT DISTRIBUTED SERVICES. P. Nicolas Kokkalis

A MODULAR FRAMEWORK TO IMPLEMENT FAULT TOLERANT DISTRIBUTED SERVICES. P. Nicolas Kokkalis A MODULAR FRAMEWORK TO IMPLEMENT FAULT TOLERANT DISTRIBUTED SERVICES by P. Nicolas Kokkalis A thesis submitted in conformity with the requirements for the degree of Master of Science Graduate Department

More information

Fault-Tolerant Distributed Consensus

Fault-Tolerant Distributed Consensus Fault-Tolerant Distributed Consensus Lawrence Kesteloot January 20, 1995 1 Introduction A fault-tolerant system is one that can sustain a reasonable number of process or communication failures, both intermittent

More information

Generic Proofs of Consensus Numbers for Abstract Data Types

Generic Proofs of Consensus Numbers for Abstract Data Types Generic Proofs of Consensus Numbers for Abstract Data Types Edward Talmage and Jennifer Welch Parasol Laboratory, Texas A&M University College Station, Texas, USA etalmage@tamu.edu, welch@cse.tamu.edu

More information

22 Elementary Graph Algorithms. There are two standard ways to represent a

22 Elementary Graph Algorithms. There are two standard ways to represent a VI Graph Algorithms Elementary Graph Algorithms Minimum Spanning Trees Single-Source Shortest Paths All-Pairs Shortest Paths 22 Elementary Graph Algorithms There are two standard ways to represent a graph

More information

Coordination 1. To do. Mutual exclusion Election algorithms Next time: Global state. q q q

Coordination 1. To do. Mutual exclusion Election algorithms Next time: Global state. q q q Coordination 1 To do q q q Mutual exclusion Election algorithms Next time: Global state Coordination and agreement in US Congress 1798-2015 Process coordination How can processes coordinate their action?

More information

ACONCURRENT system may be viewed as a collection of

ACONCURRENT system may be viewed as a collection of 252 IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, VOL. 10, NO. 3, MARCH 1999 Constructing a Reliable Test&Set Bit Frank Stomp and Gadi Taubenfeld AbstractÐThe problem of computing with faulty

More information

Distributed Systems (ICE 601) Fault Tolerance

Distributed Systems (ICE 601) Fault Tolerance Distributed Systems (ICE 601) Fault Tolerance Dongman Lee ICU Introduction Failure Model Fault Tolerance Models state machine primary-backup Class Overview Introduction Dependability availability reliability

More information

CSE 5306 Distributed Systems

CSE 5306 Distributed Systems CSE 5306 Distributed Systems Fault Tolerance Jia Rao http://ranger.uta.edu/~jrao/ 1 Failure in Distributed Systems Partial failure Happens when one component of a distributed system fails Often leaves

More information

Eventually k-bounded Wait-Free Distributed Daemons

Eventually k-bounded Wait-Free Distributed Daemons Eventually k-bounded Wait-Free Distributed Daemons Yantao Song and Scott M. Pike Texas A&M University Department of Computer Science College Station, TX 77843-3112, USA {yantao, pike}@tamu.edu Technical

More information

Distributed Systems. 09. State Machine Replication & Virtual Synchrony. Paul Krzyzanowski. Rutgers University. Fall Paul Krzyzanowski

Distributed Systems. 09. State Machine Replication & Virtual Synchrony. Paul Krzyzanowski. Rutgers University. Fall Paul Krzyzanowski Distributed Systems 09. State Machine Replication & Virtual Synchrony Paul Krzyzanowski Rutgers University Fall 2016 1 State machine replication 2 State machine replication We want high scalability and

More information

Graph Theory Questions from Past Papers

Graph Theory Questions from Past Papers Graph Theory Questions from Past Papers Bilkent University, Laurence Barker, 19 October 2017 Do not forget to justify your answers in terms which could be understood by people who know the background theory

More information

Election Algorithms. has elected i. will eventually set elected i

Election Algorithms. has elected i. will eventually set elected i Election Algorithms Election 8 algorithm designed to designate one unique rocess out of a set of rocesses with similar caabilities to take over certain functions in a distributes system central server

More information

Chapter S:V. V. Formal Properties of A*
