Generating Fast Indulgent Algorithms

Size: px
Start display at page:

Download "Generating Fast Indulgent Algorithms"

Transcription

1 Generating Fast Indulgent Algorithms Dan Alistarh 1, Seth Gilbert 2, Rachid Guerraoui 1, and Corentin Travers 3 1 EPFL, Switzerland 2 National University of Singapore 3 Université de Bordeaux 1, France Abstract. Synchronous distributed algorithms are easier to design and prove correct than algorithms that tolerate asynchrony. Yet, in the real world, networks experience asynchrony and other timing anomalies. In this paper, we address the question of how to efficiently transform an algorithm that relies on synchronization into an algorithm that tolerates asynchronous executions. We introduce a transformation technique from synchronous algorithms to indulgent algorithms [1], which induces only a constant overhead in terms of time complexity in well-behaved executions. Our technique is based on a new abstraction we call an asynchrony detector, which the participating processes implement collectively. The resulting transformation works for a large class of colorless tasks, including consensus and set agreement. Interestingly, we also show that our technique is relevant for colored tasks, by applying it to the renaming problem, to obtain the first indulgent renaming algorithm. 1 Introduction The feasibility and complexity of distributed tasks has been thoroughly studied both in the synchronous and asynchronous models. To better capture the properties of realworld systems, Dwork, Lynch, and Stockmeyer [2] proposed the partially synchronous model, in which the distributed system may alternate between synchronous to asynchronous periods. This line of research inspired the introduction of indulgent algorithms [1], i.e. algorithms that guarantee correctness and efficiency when the system is synchronous, and maintain safety even when the system is asynchronous. Several indulgent algorithms have been designed for specific distributed problems, such as consensus (e.g., [3, 4]). However, designing and proving correctness of such algorithms is usually a difficult task, especially if the algorithm has to provide good performance guarantees. Contribution. In this paper, we introduce a general transformation technique from synchronous algorithms to indulgent algorithms, which induces only a constant overhead in terms of time complexity. Our technique is based on a new primitive called an asynchrony detector, which identifies periods of asynchrony in a fault-prone asynchronous system. We showcase the resulting transformation to obtain indulgent algorithms for a large class of colorless agreement tasks, including consensus and set agreement. We also apply our transformation to the distinct class of colored tasks, to obtain the first indulgent renaming algorithm.

2 Detecting Asynchrony. Central to our technique is a new abstraction, called an asynchrony detector, which we design as a distributed service for detecting periods of asynchrony. The service detects asynchrony both at a local level, by determining whether the view of a process is consistent with a synchronous execution, and at a global level, by determining whether the collective view of a set of processes could have been observed in a synchronous execution. We present an implementation of an asynchrony detector, based on the idea that each process maintains a log of the messages sent and received, which it exchanges with other processes. This creates a view of the system for every process, which we use to detect asynchronous executions. The Transformation Technique. Based on this abstraction, we introduce a general technique allowing synchronous algorithms to tolerate asynchrony, while maintaining time efficiency in well-behaved executions. The main idea behind the transformation is the following: as long as the asynchrony detector signals a synchronous execution, processes run the synchronous algorithm. If the system is well behaved, then the synchronous algorithm yields an output, on which the process decides. Otherwise, if the detector notices asynchrony, we revert to an existing asynchronous backup algorithm with weaker termination and performance guarantees. Transforming Agreement Algorithms. We first showcase the technique by transforming algorithms for a large class of agreement tasks, called colorless tasks, which includes consensus and set agreement. Intuitively, a colorless task allows processes to adopt each other s output values without violating the task specification, while ensuring that every value returned has been proposed by a process. We show that any synchronous algorithm solving a colorless task can be made indulgent at the cost of two rounds of communication. For example, if a synchronous algorithm solves synchronous consensus in t+1 rounds, where t is the maximum number of crash failures (i.e. the algorithm it is time-optimal), then the resulting indulgent algorithm will solve consensus in t + 3 rounds if the system is initially synchronous, or will revert to a safe backup, e.g. Paxos [4, 5] or ASAP [6], otherwise. The crux of the technique is the hand-off procedure: we ensure that, if a process decides using the synchronous algorithm, any other process either decides or adopts a state which is consistent with the decision. In this second case, we show that a process can recover a consistent state by examining the views of other processes. The validity property will ensure that the backup protocol generates a valid output configuration. Transforming Renaming Algorithms. We also apply our technique to the renaming problem [7], and obtain the first indulgent renaming algorithm. Starting from the synchronous protocol of [8], our protocol renames in a tight namespace of N names and terminates in (log N + 3) rounds, in synchronous executions. In asynchronous executions, the protocol renames in a namespace of size N + t. Roadmap. In Section 2, we present the model, while Section 3 presents an overview of related work. We define asynchrony detectors in Section 4. Section 5 presents the transformation for colorless agreement tasks, while Section 6 applies it to the renaming problem. In Section 7 we discuss our results. Due to space limitations, the proofs of some basic results are omitted, and we present detailed sketches for some of the proofs.

3 2 Model We consider an eventually synchronous system with N processes Π = {p 1, p 2,..., p N }, in which t < N/2 processes may fail by crashing. Processes communicate via messagepassing in rounds, which we model much as in [3, 9, 10]. In particular, time is divided into rounds, which are synchronized. However, the system is asynchronous, i.e. there is no guarantee that a message sent in a round is also delivered in the same round. We do assume that processes receive at least N t messages in every round, and that a process always receives its own message in every round. Also, we assume that there exists a global stabilization time GST 0 after which the system becomes synchronous, i.e. every message is delivered in the same round in which it was sent. We denote such a system by ES(N, t). Although indulgent algorithms are designed to work in this asynchronous setting, they are optimized for the case in which the system is initially synchronous, i.e. when GST = 0. We denote the synchronous message-passing model with t < N failures by S(N, t). In case the system stabilizes at a later point in the execution, i.e. 0 < GST <, then the algorithms are still guaranteed to terminate, although they might be less efficient. If the system never stabilizes, i.e. GST =, indulgent algorithms might not terminate, although they always maintain safety. In the following, we say that an execution is synchronous if every message sent by a correct process in the course of the execution is delivered in the same round in which it was sent. Alternatively, if process p i receives a message m from process p j in round r 2, then every process received all messages sent by process p j in all rounds r < r. The view of a process p at a round r is given by the messages that p received at round r and in all previous rounds. We say that the view of process p is synchronous at round r if there exists an r-round synchronous execution which is indistinguishable from p s view at round r. 3 Related Work Starting with seminal work by Dwork, Lynch and Stockmeyer [2], a variety of different models have been introduced to express relaxations of the standard asynchronous model of computation. These include failure detectors [11], round-by-round fault detectors (RRFD) [12], and, more recently, indulgent algorithms [1]. In [3, 9], Guerraoui and Dutta address the complexity of indulgent consensus in the presence of an eventually perfect failure detector. They prove a tight lower bound of t + 2 rounds on the time complexity of the problem, even in synchronous runs, thus proving that there is an inherent price to tolerating asynchronous executions. Our approach is more general than that of this reference, since we transform a whole class of synchronous distributed algorithms, solving various tasks, into their indulgent counterparts. On the other hand, since our technique induces a delay of two rounds of communication over the synchronous algorithm, in the case of consensus, we miss the lower bound of t + 2 rounds by one round. Recent work studied the complexity of agreement problems, such as consensus [6] and k-set agreement [10], if the system becomes synchronous after an unknown stabilization time GST. In [6], the authors present a consensus algorithm that terminates in

4 f + 2 rounds after GST, where f is the number of failures in the system. In [10], the authors consider k-set agreement in the same setting, proving that t/k + 4 rounds after GST are enough for k-set agreement, and that at least t/k + 2 rounds are required. The algorithms from these references work with the same time complexity in the indulgent setting, where GST = 0. On the other hand, the transformation in the current paper does not immediately yield algorithms that would work in a window of synchrony. From the point of view of the technique, references [6, 10] also use the idea of detecting asynchrony as part of the algorithms, although this technique has been generalized in the current work to address a large family of distributed tasks. Reference [13] considered a setting in which failures stop after GST, in which case 3 rounds of communication are necessary and sufficient. Leader-based, Paxos-like algorithms, e.g. [4, 5], form another class of algorithms that tolerate asynchrony, and can also be seen as indulgent algorithms. A precise definition of colorless tasks is given in [14]. Note that, in this paper, we augment their definition to include the standard validity property (see Section 5). 4 Asynchrony Detectors An asynchrony detector is a distributed service that detects periods of asynchrony in an asynchronous system that may be initially synchronous. The service returns a YES/NO indication at the end of every round, and has the property that processes which receive YES at some round share a synchronous execution prefix. Next, we make this definition precise. Definition 1 (Asynchrony Detector). Let d be a positive integer. A d-delay asynchrony detector in ES(N, t) is a distributed service that, in every round r, returns either YES or NO, at each process. The detector ensures the following properties. (Local detection) If process p receives YES at round r, then there exists an r-round synchronous execution in which p has the same view as its current view at round r. (Global detection) For all processes that receive YES in round r, there exists an (r d)-round synchronous execution prefix S[1, 2,..., r d] that is indistinguishable from their views at the end of round r d. (Non-triviality) The detector never returns NO during a synchronous execution. The local detection property ensures that, if the detector returns YES, then there exists a synchronous execution consistent with the process view. On the other hand, the global detection property ensures that, for processes that receive YES from the detector, the (r R)-round execution prefix was synchronous enough, i.e. there exists a synchronous execution consistent with what these processes perceived during the prefix. The non-triviality property ensures that there are no false positives. 4.1 Implementing an Asynchrony Detector Next, we present an implementation of a 2-delay asynchrony detector in ES(N, t), which we call AD(2). The pseudocode is presented in Figure 1.

5 The main idea behind the detector, implemented in the process procedure, is that processes maintain a detailed view of the state of the system by aggregating all messages received in every round. For each round, each process maintains an Active set of processes, i.e. processes that sent at least one message in the round; all other processes are in the Failed set for that round (lines 2 4). Whenever a process receives a new message, it merges the contents of the Active and Failed sets of the sender with its own (lines 8 9). Asynchrony is detected by checking if there exists any process that is in the Active set in some round r, while being in the Failed set in some previous round r < r (lines 10 12). In the next round, each process sends its updated view of the system together with a synch flag, which was set to true, if asynchrony was detected procedure detector() i msg i ; synch i true; Active i [ ]; Failed i [ ]; for each round R c do send( msg i ) msgset i receive() (synch i, msg i) process(msgset i, R c) if synch i = true then output YES else output NO procedure process( msgset i, r ) i if synch i = true then Active i[r c] processes from which p i receives a message in round R c F ailed i[r c] processes from which p i did not receive a message in round R c if there exists p j msgset i with synch j = false then synch i false for every msg j msgset i do for round r from 1 to R c do Active i[r] msg j.active j[r] Active i[r] Failed i[r] msg j.failed j[r] Failed i[r] for round r from 1 to R c 1 do for round k from r + 1 to R c do if (Active i[k] Failed i[r] ) then synch i false if synch i = true then msg i (synch i, (Active i[r]) r [1,Rc], (F ailed i[r]) r [1,Rc]) else msg i (synch i,, ) return (synch i, msg i) Fig. 1. The AD(2) asynchrony detection protocol. 4.2 Proof of Correctness In this section, we prove that the protocol presented in the Section 4.1 satisfies the definition of an asynchrony detector. First, to see that the local detection condition is satisfied, notice that the contents of the Active and Failed sets at each process p can be used to construct a synchronous execution which is coherent with process p s view.

6 In the following, we focus on the global detection property. We show that, for a fixed round r > 0, given a set of processes P Π that receive YES from AD(2) at the end of round r + 2, there exists an r-round synchronous execution S[1, r] such that the views of processes in P at the end of round r are consistent with S[1, r]. We begin by proving that if two processes receive YES from the asynchronous detector in round r + 2, then they must have received eachother s round r + 1 messages, either directly, or through a relay. Note that, because of the round structure, a process s round r + 1 message only contains information that it has acquired up to round r. In the following, we will use a superscript notation to denote the round at which the local variables are seen. For example, Active r+2 q [r + 1] denotes the set Active[r + 2] at process q, as seen from the end of round r + 2. Lemma 1. Let p and q be two processes that receive YES from AD(2) at the end of round r + 2. Then p Activeq r+2 [r + 1] and q Active r+2 p [r + 1]. Proof. We prove that p Active r+2 q [r + 1] the proof of the second statement is symmetric. Assume, for the sake of contradiction, that p / Active r+2 q [r + 1]. Then, by lines 8 9 of the process() procedure, none of the processes that send a message to q in round r + 2 received a message from p in round r + 1. However, this set of processes contains at least N t > t elements, and therefore, in round r + 2, process p receives a message from at least one process that did not receive a message from p in round r + 1. Therefore p Active r+2 p [r + 2] F ailed r+2 p [r + 1] (recall that p receives its own message in every round). Following the process() procedure for p, we obtain that synch p = false in round r + 2, which means that process p receives NO from AD(2) in round r + 2, contradiction. Lemma 2. Let p and q be two processes in P. Then, for all rounds k < l r, Active r p[l] F ailed r q[k] =, and Active r p[l] F ailed r q[k] =, where the Active and Failed sets are seen from the end of round r. Proof. We prove that, given r l > k, Active r p[l] F ailed r q[k] =. Assume, for the sake of contradiction, that there exist rounds k < l r and a processor s such that s Active r p[l] F ailed r q[k]. Lemma 1 ensures that p and q communicate in round r + 1, therefore it follows that s F ailed r+2 p [k]. This means that s Active r+2 p [l] F ailed r+2 p [k], for k < l, therefore p cannot receive YES in round r + 2, contradiction. The next lemma provides a sufficient condition for a set of processes to share a synchronous execution up to the end of some round R. The proof follows from the observation that the required synchronous execution E can be constructed by exactly following the contents of the Active and Failed sets by processes at every round in the execution. Lemma 3. Let E be an R-round execution in ES(N, t), and P be a set of processes in Π such that, at the end of round R, the following two properties are satisfied: 1. For any p and q in P, and any round r {1, 2,..., R 1}, Active R p [r + 1] F ailed R q [r] =. 2. p P ActiveR p [R] N t.

7 Then there exists a synchronous execution E which is indistinguishable from the views of processes in P at the end of round R. Finally, we prove that if a set of processes P receive YES from AD(2) at the end of some round R+2, then there exists a synchronous execution consistent with their views at the end of round R, for any R > 0, i.e. that AD(2) is indeed a 2-round asynchrony detector. The proof follows from the previous results. Lemma 4. Let R > 0 be a round and P be a set of processes that receive YES from AD(2) at the end of round R + 2. Then there exists a synchronous execution consistent with their views at the end of round R. 5 Generating Indulgent Algorithms for Colorless Tasks 5.1 Task Definition In the following, a task is a tuple (I, O, ), where I is the set of vectors of input values, O is a set of vectors of output values, and is a total relation from I to O. A solution to a task, given an input vector I, yields an output vector O O such that O (I). Intuitively, a colorless task is a terminating task in which any process can adopt any input or output value of any other process, without violating the task specification, and in which any (decided) output value is a (proposed) input value. We also assume that the output values have to verify a predicate P, such as agreement or k-agreement. For example, in the case of consensus, the predicate P states that all output values should be equal. Let val(v ) be the set of values in a vector V. We precisely define this family of tasks as follows. A colorless task satisfies the following properties: (1) Termination: every correct process eventually outputs; (2) Validity: for every O (I), val(o) val(i); (3) The Colorless property: If O (I), then for every I with val(i ) val(i) : I I and (I ) (I). Also, for every O with val(o ) val(o) : O O and O (I). Finally, we assume that the outputs satisfy a generic property (4) Output Predicate: every O O satisfies a given predicate P. Consensus and k-set agreement are canonical examples of colorless tasks. 5.2 Transformation Description We present an emulation technique that generates an indulgent protocol in ES(N, t) out of any protocol in S(N, t) solving a given colorless task T, at the cost of two communication rounds. If the system is not synchronous, the generated protocol will run a given backup protocol Backup which ensures safety, even in asynchronous executions. For example, if an protocol solves synchronous consensus in t + 1 rounds (i.e. it is optimal), then the resulting protocol will solve consensus in t + 3 rounds if the system is initially synchronous. Otherwise, the protocol reverts to a safe backup, e.g. Paxos [5], or ASAP [6]. We fix a protocol A solving a colorless task in the synchronous model S(N, t). The running time of the synchronous protocol is known to be of R rounds. In the first phase

8 of the transformation, each process p runs the AD(2) asynchrony detector in parallel with the protocol A, as long as the detector returns a YES indication at every round. Note that the protocol s messages are included in the detector s messages (or viceversa), preventing the possibility that the protocol encounters asynchronous message deliveries without the detector noticing. If the detector returns NO during this phase, the process stops running the synchronous protocol, and continues running only AD(2). If the process receives YES at the end of round R + 2, then it returns the decision value that A produced at the end of round R 4. On the other hand, if the process receives NO from AD(2) in round R + 2, i.e. asynchrony was detected, then the process will run the second phase of the transformation. More precisely, in phase two, the process will run a backup agreement protocol that tolerates periods of asynchrony (for example, the K4 protocol [10], if the task is k-set agreement). The main question is how to initialize the backup protocol, given that some of the processes may have already decided in phase one, without breaking the properties of the task. We solve this problem as follows. Let Supp (the support set) be the set of processes that received YES from AD(2) in round R + 1 that process p receives messages from in round R + 2. There are two cases. (1) If the set Supp is empty, then the process starts running the backup protocol using its initial proposal value. (2) If the set Supp is non-empty, then the process obtains a new proposal value as follows. It picks one process from Supp and adopts its state at the end of round R 1. Then, in round R, it simulates receiving the messages in j Supp [R], where we maintain the notation used in Section 4. We will show that in this case, the simulated protocol A will necessarily return a decision value at the end of simulated round R. The process p then runs the backup protocol, using as initial value the decision value resulting from the simulation of the first R rounds. msgset R+1 j 5.3 Proof of Correctness We now prove that the resulting protocol verifies the task specification. The proofs of termination, validity, and the colorless property follow from the properties of the A and Backup protocols, therefore we will concentrate on proving that the resulting protocol also satisfies the output predicate P. Theorem 1 (Output Predicate). The indulgent transformation protocol satisfies the output predicate P associated to the task T. Assume for the sake of contradiction that there exists an execution in which the output of the transformation breaks the output predicate P. If all process decisions are made at the end of round R + 2, then, by the global detection property of AD(2), there exists a synchronous execution of A in which the same outputs are decided, which break the predicate P, contradiction. If all decisions occur after round R + 2, first notice that, by the validity and colorless properties, the inputs processes propose to the 4 Since AD(2) returns YES at process p at the end of round R + 2, it follows that it must have returned YES at p at the end of round R as well. The local detection property of the asynchrony detector implies that the protocol A has to return a decision value, since it executes a synchronous execution.

9 Backup protocol are always valid inputs for the task. It follows that, since all decisions are output by Backup, there exists an execution of the Backup protocol in which the predicate P is broken, again a contradiction. Therefore, at least one process outputs at the end of round R+2, and some processes decide at some later round. We prove the following claim. Claim. If a process decides at the end of round R + 2, then (i) all correct processes will have a non-empty support set Supp and (ii) there exists an R-round synchronous execution consistent with the views that all correct processes adopt at the end of round R + 2. Proof (Sketch). First, let d be a process that decides at the end of round R + 2. Then, in round R + 2, process d received a message from at least N t processes that got YES from AD(2) at the end of round R + 1. Since N 2t + 1, it follows that every process that has not crashed by the end of round R + 2 will have received at least one message from a process that has received YES from AD(2) in round R + 1; therefore, all non-crashed processes that get NO from AD(2) in round R + 2 will execute case 2, which ensures the first claim. Let Q = {q 1,..., q l } be the non-crashed processes at the end of round R+2. By the above claim, we know that these processes either decide or simulate an execution. We prove that all views simulated in this round are consistent with a synchronous execution up to the end of round R, in the sense of Lemma 3. To prove that the intersection of their simulated views in round R contains at least (N t) messages, notice that the processes from which process d receives messages in round R + 2 are necessarily in this intersection, since otherwise process d would receive NO in round R + 2. To prove the first condition of Lemma 3, note that process d s view of round R, i.e. the set msgset R+2 d [R], contains all messages simulated as received in round R by the processes that receive NO in round R + 2. Since N t > t, every process that receives NO in round R + 2 from the detector also receives a message supporting d s decision in round R + 2; process d receives the same message and does not notice any asynchrony. Therefore, we can apply Lemma 3 to obtain that there exists a synchronous execution of the protocol A in which the processes in Q obtain the same decision values as the values obtained through the simulation or decision at the end of round R + 2. Returning to the proof of the output predicate, recall that there exists at least process d which outputs at the end of round R + 2. From the above Claim, it follows that all non-crashed processes simulate synchronous views of the first R rounds. Therefore all non-crashed processes will receive an output from the synchronous protocol A. Moreover, these synchronous views of processes are consistent with a synchronous execution, therefore the set of outputs received by non-crashed processes verifies the predicate P. Hence all the inputs that the processes propose to the Backup protocol verify the predicate P. Since Backup respects validity, it follows that the outputs of Backup will also verify the predicate P. 6 A Protocol for Strong Indulgent Renaming 6.1 Protocol Description In this section, we present an emulation technique that transforms any synchronous renaming protocol into an indulgent renaming protocol. For simplicity, we will assume

10 that the synchronous renaming protocol is the one by Herlihy et al. [8], which is timeoptimal, terminating in log N + 1 synchronous rounds. The resulting indulgent protocol will rename in N names using log N + 3 rounds of communication if the system is initially synchronous, and will eventually rename into N + t names if the system is asynchronous, by safely reverting to a backup constituted by the asynchronous renaming algorithm by Attiya et al. [7]. Again, the protocol is structured into two phases. First Phase. During the first log N +1 rounds, processes run the AD(2) asynchrony detector in parallel with the synchronous renaming algorithm. Note that the protocol s messages are included in the detector s messages. If the detector returns NO at one of these rounds, then the process stops running the synchronous algorithm, and continues only with the detector. If at the end of round [log N] + 1, the process receives YES from AD(2), then it also receives a name name i as the decision value of the synchronous protocol. Second Phase. At the end of round [log N] + 1, the processes start the asynchronous renaming algorithm of [7]. More precisely, each process builds a vector V with a single entry, which contains the tuple v i, name i, J i, b i, r i, where v i is the processes initial value. The entry name i is the proposed name, which is either the name returned by the synchronous renaming algorithm, if the process received YES from the detector, or, otherwise. The entry J i counts the number of times the process proposed a name it is 1 if the process has received YES from the detector, and 0 otherwise; b i is the decision bit, which is initially 0. Finally, r i is the round number 5 when the entry was last updated, which is in this case log n + 1. The processes broadcast their vectors V for the next two rounds, while continuing to run the asynchrony detector in parallel. The contents of the vector V are updated at every round, as follows: if a vector V containing new entries is received, the process adds all the new entries to its vector; if there are conflicting entries corresponding to the same process, the tie is broken using the round number r i. If, at the end of round log N + 3 the process receives YES from the detector, then it decides on name i. Otherwise, it continues runnning the AttiyaRenaming algorithm until decision is possible. 6.2 Proof of Correctness The first step in the proof of correctness of the transformation provides some properties of the asynchronous renaming algorithm of [7]. More precisely, the first Lemma states that the asynchronous renaming algorithm remains correct even though processes propose names initially, that is at the beginning of round log N + 2. The proof follows from an examination of the protocol and proofs from [7]. Lemma 5. The asynchronous renaming protocol of [7] ensures termination, name uniqueness, and a name space bound of N + t, even if processes propose names at the beginning of the first round. The previous Lemma ensures that the transformation guarantees termination, i.e. that every correct process eventually returns a name. The non-triviality property of the 5 This entry in the vector is implied in the original version of the algorithm [7].

11 asynchrony detector ensures that the resulting algorithm will terminate in log N + 3 rounds in any synchronous run. In the following, we will concentrate on the uniqueness of the names and on the bounds on the resulting namespace. We start by proving that the protocol does not generate duplicate names. Lemma 6 (Uniqueness). Given any two names n i, n j returned by processes in an execution, we have that n i n j. Proof (Sketch). Assume for the sake of contradiction that there exists a run in which two processes p i, p j decide on the same name n 0. First, we consider the case in which both decisions occurred at round log N + 3, the first round at with a process can decide using our emulation. Notice that, if a decision is made, the processes necessarily decide on the decision value of the simulated synchronous protocol 6. By the global detection property of AD(2) it then follows that there exists a synchronous execution of the synchronous renaming protocol in which two distinct processes return the same value, contradicting the correctness of the protocol. Similarly, we can show that if both decisions occur after round log N + 3, we can reduce the correctness of the transformation to the correctness of the asynchronous protocol. Therefore, the remaining case is that in which one of the decisions occurs at round log N + 3, and the other decision occurs at a later round, i.e. it is a decision made by the asynchronous renaming protocol. In this case, let p i be the process that decides on n 0 at the end of round log N +3. This implies that process p i received YES at the end of round log N + 3 from AD(2). Therefore, since p i sees a synchronous view, there exists a set S of at least N t processes that received p i s message reserving name n 0 in round log N + 2. It then follows that each non-crashed process receives a message from a process in the set S in round log N + 3. By the structure of the protocol, we obtain that each process has the entry v i, n 0, 1, 0, log N + 1 in their V vector at the end of round log N + 3. It follows from the structure of the asynchronous protocol that no process other than p i will ever decide on the name n 0 at any later round, which concludes the proof of the Lemma. Finally, we prove that the transformation ensures the following guarantees on the size of the namespace. Lemma 7 (Namespace Size). The transformation ensures the following properties: (1) In synchronous executions, the resulting algorithm will rename in a namespace of at most N names. (2) In any execution, the resulting algorithm will rename in a namespace of at most N + t names. Proof (Sketch). For the proof of the first property, notice that, in a synchronous execution, any output combination for the transformation is an output combination for the synchronous renaming protocol. For the second property, let l 0 be the number of names decided on at the end of round log N +3 in a run of the protocol. These names are clearly between 1 and N. Lemma 6 guarantees that none of these names is decided on in the rest of the execution. On the other hand, Lemma 5 and the namespace bound of N + t for the asynchronous protocol ensure that the asynchronous protocol decides exclusively on names between 1 and N + t, which concludes the proof of the claim. 6 A simple analysis of the asynchronous renaming protocol shows that a process cannot decide after two rounds of communication, unless it had already proposed a value at the beginning of the first round.

12 7 Conclusions and Future Work In this paper, we have introduced a general transformation technique from synchronous algorithms to indulgent algorithms, and applied it to obtain indulgent solutions for a large class of distributed tasks, including consensus, set agreement and renaming. Our results suggest that, even though it is generally hard to design asynchronous algorithms in fault-prone systems, one can obtain efficient algorithms that tolerate asynchronous executions starting from synchronous algorithms. In terms of future work, we first envision generalizing our technique to generate algorithms that also work in a window of synchrony, and investigating its limitations in terms of time and communication complexity. Another interesting research direction would be to analyze if similar techniques exist in the case of Byzantine failures in particular, if, starting from a synchronous fault-tolerant algorithm, one can obtain a Byzantine fault-tolerant algorithm, tolerating asynchronous executions. 8 Acknowledgements The authors would like to thank Prof. Hagit Attiya and Nikola Knežević for their help on previous drafts of this paper, and the anonymous reviewers for their useful feedback. References 1. R. Guerraoui, Indulgent algorithms, in PODC 2000, pp , ACM, July C. Dwork, N. A. Lynch, and L. Stockmeyer, Consensus in the presence of partial synchrony, J. ACM, vol. 35, pp , Apr P. Dutta and R. Guerraoui, The inherent price of indulgence., in PODC 02: Proceedings of the annual ACM symposium on Principles of distributed computing, pp , L. Lamport, Fast paxos, Distributed Computing, vol. 19, no. 2, pp , L. Lamport, Generalized consensus and paxos, Microsoft Research Technical Report MSR- TR , March D. Alistarh, S. Gilbert, R. Guerraoui, and C. Travers, How to solve consensus in the smallest window of synchrony, in DISC, pp , H. Attiya, A. Bar-Noy, D. Dolev, D. Peleg, and R. Reischuk, Renaming in an asynchronous environment, Journal of the ACM, vol. 37, no. 3, pp , S. Chaudhuri, M. Herlihy, and M. R. Tuttle, Wait-free implementations in message-passing systems, Theor. Comput. Sci., vol. 220, no. 1, pp , P. Dutta and R. Guerraoui, The inherent price of indulgence, Distributed Computing, vol. 18, no. 1, pp , D. Alistarh, S. Gilbert, R. Guerraoui, and C. Travers, Of choices, failures and asynchrony: The many faces of set agreement, in ISAAC T. D. Chandra and S. Toueg, Unreliable failure detectors for asynchronous systems (preliminary version), in ACM Symposium on Principles of Distributed Computing, pp , Aug E. Gafni, Round-by-round fault detectors (extended abstract): Unifying synchrony and asynchrony, in Proceedings of the 17th Symposium on Principles of Distributed Computing, P. Dutta, R. Guerraoui, and I. Keidar, The overhead of consensus failure recovery, Distributed Computing, vol. 19, no. 5-6, pp , C. Delporte-Gallet, H. Fauconnier, R. Guerraoui, and A. Tielmann, The disagreement power of an adversary, in DISC, pp. 8 21, 2009.

A General Characterization of Indulgence

A General Characterization of Indulgence A General Characterization of Indulgence R. Guerraoui 1,2 N. Lynch 2 (1) School of Computer and Communication Sciences, EPFL (2) Computer Science and Artificial Intelligence Laboratory, MIT Abstract. An

More information

Anonymous Agreement: The Janus Algorithm

Anonymous Agreement: The Janus Algorithm Anonymous Agreement: The Janus Algorithm Zohir Bouzid 1, Pierre Sutra 1, and Corentin Travers 2 1 University Pierre et Marie Curie - Paris 6, LIP6-CNRS 7606, France. name.surname@lip6.fr 2 LaBRI University

More information

Specifying and Proving Broadcast Properties with TLA

Specifying and Proving Broadcast Properties with TLA Specifying and Proving Broadcast Properties with TLA William Hipschman Department of Computer Science The University of North Carolina at Chapel Hill Abstract Although group communication is vitally important

More information

Byzantine Consensus in Directed Graphs

Byzantine Consensus in Directed Graphs Byzantine Consensus in Directed Graphs Lewis Tseng 1,3, and Nitin Vaidya 2,3 1 Department of Computer Science, 2 Department of Electrical and Computer Engineering, and 3 Coordinated Science Laboratory

More information

A Timing Assumption and a t-resilient Protocol for Implementing an Eventual Leader Service in Asynchronous Shared Memory Systems

A Timing Assumption and a t-resilient Protocol for Implementing an Eventual Leader Service in Asynchronous Shared Memory Systems A Timing Assumption and a t-resilient Protocol for Implementing an Eventual Leader Service in Asynchronous Shared Memory Systems Antonio FERNÁNDEZ y Ernesto JIMÉNEZ z Michel RAYNAL? Gilles TRÉDAN? y LADyR,

More information

Consensus. Chapter Two Friends. 8.3 Impossibility of Consensus. 8.2 Consensus 8.3. IMPOSSIBILITY OF CONSENSUS 55

Consensus. Chapter Two Friends. 8.3 Impossibility of Consensus. 8.2 Consensus 8.3. IMPOSSIBILITY OF CONSENSUS 55 8.3. IMPOSSIBILITY OF CONSENSUS 55 Agreement All correct nodes decide for the same value. Termination All correct nodes terminate in finite time. Validity The decision value must be the input value of

More information

Two-Phase Atomic Commitment Protocol in Asynchronous Distributed Systems with Crash Failure

Two-Phase Atomic Commitment Protocol in Asynchronous Distributed Systems with Crash Failure Two-Phase Atomic Commitment Protocol in Asynchronous Distributed Systems with Crash Failure Yong-Hwan Cho, Sung-Hoon Park and Seon-Hyong Lee School of Electrical and Computer Engineering, Chungbuk National

More information

Implementing Shared Registers in Asynchronous Message-Passing Systems, 1995; Attiya, Bar-Noy, Dolev

Implementing Shared Registers in Asynchronous Message-Passing Systems, 1995; Attiya, Bar-Noy, Dolev Implementing Shared Registers in Asynchronous Message-Passing Systems, 1995; Attiya, Bar-Noy, Dolev Eric Ruppert, York University, www.cse.yorku.ca/ ruppert INDEX TERMS: distributed computing, shared memory,

More information

Semi-Passive Replication in the Presence of Byzantine Faults

Semi-Passive Replication in the Presence of Byzantine Faults Semi-Passive Replication in the Presence of Byzantine Faults HariGovind V. Ramasamy Adnan Agbaria William H. Sanders University of Illinois at Urbana-Champaign 1308 W. Main Street, Urbana IL 61801, USA

More information

Anonymity-Preserving Failure Detectors

Anonymity-Preserving Failure Detectors Anonymity-Preserving Failure Detectors Zohir Bouzid and Corentin Travers LaBRI, U. Bordeaux, France name.surname@labri.fr Abstract. The paper investigates the consensus problem in anonymous, failures prone

More information

Consensus in the Presence of Partial Synchrony

Consensus in the Presence of Partial Synchrony Consensus in the Presence of Partial Synchrony CYNTHIA DWORK AND NANCY LYNCH.Massachusetts Institute of Technology, Cambridge, Massachusetts AND LARRY STOCKMEYER IBM Almaden Research Center, San Jose,

More information

Dfinity Consensus, Explored

Dfinity Consensus, Explored Dfinity Consensus, Explored Ittai Abraham, Dahlia Malkhi, Kartik Nayak, and Ling Ren VMware Research {iabraham,dmalkhi,nkartik,lingren}@vmware.com Abstract. We explore a Byzantine Consensus protocol called

More information

Ruminations on Domain-Based Reliable Broadcast

Ruminations on Domain-Based Reliable Broadcast Ruminations on Domain-Based Reliable Broadcast Svend Frølund Fernando Pedone Hewlett-Packard Laboratories Palo Alto, CA 94304, USA Abstract A distributed system is no longer confined to a single administrative

More information

Research Report. (Im)Possibilities of Predicate Detection in Crash-Affected Systems. RZ 3361 (# 93407) 20/08/2001 Computer Science 27 pages

Research Report. (Im)Possibilities of Predicate Detection in Crash-Affected Systems. RZ 3361 (# 93407) 20/08/2001 Computer Science 27 pages RZ 3361 (# 93407) 20/08/2001 Computer Science 27 pages Research Report (Im)Possibilities of Predicate Detection in Crash-Affected Systems Felix C. Gärtner and Stefan Pleisch Department of Computer Science

More information

Perspectives on the CAP Theorem

Perspectives on the CAP Theorem Perspectives on the CAP Theorem The MIT Faculty has made this article openly available. Please share how this access benefits you. Your story matters. Citation As Published Publisher Gilbert, Seth, and

More information

Initial Assumptions. Modern Distributed Computing. Network Topology. Initial Input

Initial Assumptions. Modern Distributed Computing. Network Topology. Initial Input Initial Assumptions Modern Distributed Computing Theory and Applications Ioannis Chatzigiannakis Sapienza University of Rome Lecture 4 Tuesday, March 6, 03 Exercises correspond to problems studied during

More information

Brewer's Conjecture and the Feasibility of Consistent, Available, Partition-Tolerant Web Services

Brewer's Conjecture and the Feasibility of Consistent, Available, Partition-Tolerant Web Services PODC 2004 The PODC Steering Committee is pleased to announce that PODC 2004 will be held in St. John's, Newfoundland. This will be the thirteenth PODC to be held in Canada but the first to be held there

More information

Fork Sequential Consistency is Blocking

Fork Sequential Consistency is Blocking Fork Sequential Consistency is Blocking Christian Cachin Idit Keidar Alexander Shraer May 14, 2008 Abstract We consider an untrusted server storing shared data on behalf of clients. We show that no storage

More information

BYZANTINE AGREEMENT CH / $ IEEE. by H. R. Strong and D. Dolev. IBM Research Laboratory, K55/281 San Jose, CA 95193

BYZANTINE AGREEMENT CH / $ IEEE. by H. R. Strong and D. Dolev. IBM Research Laboratory, K55/281 San Jose, CA 95193 BYZANTINE AGREEMENT by H. R. Strong and D. Dolev IBM Research Laboratory, K55/281 San Jose, CA 95193 ABSTRACT Byzantine Agreement is a paradigm for problems of reliable consistency and synchronization

More information

21. Distributed Algorithms

21. Distributed Algorithms 21. Distributed Algorithms We dene a distributed system as a collection of individual computing devices that can communicate with each other [2]. This denition is very broad, it includes anything, from

More information

Consensus Problem. Pradipta De

Consensus Problem. Pradipta De Consensus Problem Slides are based on the book chapter from Distributed Computing: Principles, Paradigms and Algorithms (Chapter 14) by Kshemkalyani and Singhal Pradipta De pradipta.de@sunykorea.ac.kr

More information

Consensus. Chapter Two Friends. 2.3 Impossibility of Consensus. 2.2 Consensus 16 CHAPTER 2. CONSENSUS

Consensus. Chapter Two Friends. 2.3 Impossibility of Consensus. 2.2 Consensus 16 CHAPTER 2. CONSENSUS 16 CHAPTER 2. CONSENSUS Agreement All correct nodes decide for the same value. Termination All correct nodes terminate in finite time. Validity The decision value must be the input value of a node. Chapter

More information

Asynchronous Reconfiguration for Paxos State Machines

Asynchronous Reconfiguration for Paxos State Machines Asynchronous Reconfiguration for Paxos State Machines Leander Jehl and Hein Meling Department of Electrical Engineering and Computer Science University of Stavanger, Norway Abstract. This paper addresses

More information

Fork Sequential Consistency is Blocking

Fork Sequential Consistency is Blocking Fork Sequential Consistency is Blocking Christian Cachin Idit Keidar Alexander Shraer Novembe4, 008 Abstract We consider an untrusted server storing shared data on behalf of clients. We show that no storage

More information

From a Store-collect Object and Ω to Efficient Asynchronous Consensus

From a Store-collect Object and Ω to Efficient Asynchronous Consensus From a Store-collect Object and Ω to Efficient Asynchronous Consensus Michel Raynal, Julien Stainer To cite this version: Michel Raynal, Julien Stainer. From a Store-collect Object and Ω to Efficient Asynchronous

More information

A Case Study of Agreement Problems in Distributed Systems : Non-Blocking Atomic Commitment

A Case Study of Agreement Problems in Distributed Systems : Non-Blocking Atomic Commitment A Case Study of Agreement Problems in Distributed Systems : Non-Blocking Atomic Commitment Michel RAYNAL IRISA, Campus de Beaulieu 35042 Rennes Cedex (France) raynal @irisa.fr Abstract This paper considers

More information

Self-stabilizing Byzantine Digital Clock Synchronization

Self-stabilizing Byzantine Digital Clock Synchronization Self-stabilizing Byzantine Digital Clock Synchronization Ezra N. Hoch, Danny Dolev and Ariel Daliot The Hebrew University of Jerusalem We present a scheme that achieves self-stabilizing Byzantine digital

More information

Parsimonious Asynchronous Byzantine-Fault-Tolerant Atomic Broadcast

Parsimonious Asynchronous Byzantine-Fault-Tolerant Atomic Broadcast Parsimonious Asynchronous Byzantine-Fault-Tolerant Atomic Broadcast HariGovind V. Ramasamy Christian Cachin August 19, 2005 Abstract Atomic broadcast is a communication primitive that allows a group of

More information

A Correctness Proof for a Practical Byzantine-Fault-Tolerant Replication Algorithm

A Correctness Proof for a Practical Byzantine-Fault-Tolerant Replication Algorithm Appears as Technical Memo MIT/LCS/TM-590, MIT Laboratory for Computer Science, June 1999 A Correctness Proof for a Practical Byzantine-Fault-Tolerant Replication Algorithm Miguel Castro and Barbara Liskov

More information

Distributed Algorithms 6.046J, Spring, Nancy Lynch

Distributed Algorithms 6.046J, Spring, Nancy Lynch Distributed Algorithms 6.046J, Spring, 205 Nancy Lynch What are Distributed Algorithms? Algorithms that run on networked processors, or on multiprocessors that share memory. They solve many kinds of problems:

More information

Parallélisme. Aim of the talk. Decision tasks. Example: consensus

Parallélisme. Aim of the talk. Decision tasks. Example: consensus Parallélisme Aim of the talk Geometry and Distributed Systems Can we implement some functions on some distributed architecture, even if there are some crashes? Eric Goubault Commissariat à l Energie Atomique

More information

6.852: Distributed Algorithms Fall, Instructor: Nancy Lynch TAs: Cameron Musco, Katerina Sotiraki Course Secretary: Joanne Hanley

6.852: Distributed Algorithms Fall, Instructor: Nancy Lynch TAs: Cameron Musco, Katerina Sotiraki Course Secretary: Joanne Hanley 6.852: Distributed Algorithms Fall, 2015 Instructor: Nancy Lynch TAs: Cameron Musco, Katerina Sotiraki Course Secretary: Joanne Hanley What are Distributed Algorithms? Algorithms that run on networked

More information

Introduction to Distributed Systems Seif Haridi

Introduction to Distributed Systems Seif Haridi Introduction to Distributed Systems Seif Haridi haridi@kth.se What is a distributed system? A set of nodes, connected by a network, which appear to its users as a single coherent system p1 p2. pn send

More information

Distributed systems. Consensus

Distributed systems. Consensus Distributed systems Consensus Prof R. Guerraoui Distributed Programming Laboratory Consensus B A C 2 Consensus In the consensus problem, the processes propose values and have to agree on one among these

More information

The Encoding Complexity of Network Coding

The Encoding Complexity of Network Coding The Encoding Complexity of Network Coding Michael Langberg Alexander Sprintson Jehoshua Bruck California Institute of Technology Email: mikel,spalex,bruck @caltech.edu Abstract In the multicast network

More information

Capacity of Byzantine Agreement: Complete Characterization of Four-Node Networks

Capacity of Byzantine Agreement: Complete Characterization of Four-Node Networks Capacity of Byzantine Agreement: Complete Characterization of Four-Node Networks Guanfeng Liang and Nitin Vaidya Department of Electrical and Computer Engineering, and Coordinated Science Laboratory University

More information

Sigma: A Fault-Tolerant Mutual Exclusion Algorithm in Dynamic Distributed Systems Subject to Process Crashes and Memory Losses

Sigma: A Fault-Tolerant Mutual Exclusion Algorithm in Dynamic Distributed Systems Subject to Process Crashes and Memory Losses Sigma: A Fault-Tolerant Mutual Exclusion Algorithm in Dynamic Distributed Systems Subject to Process Crashes and Memory Losses Wei Chen Shi-Ding Lin Qiao Lian Zheng Zhang Microsoft Research Asia {weic,

More information

arxiv: v1 [cs.dc] 13 May 2017

arxiv: v1 [cs.dc] 13 May 2017 Which Broadcast Abstraction Captures k-set Agreement? Damien Imbs, Achour Mostéfaoui, Matthieu Perrin, Michel Raynal, LIF, Université Aix-Marseille, 13288 Marseille, France LINA, Université de Nantes,

More information

Authenticated Agreement

Authenticated Agreement Chapter 18 Authenticated Agreement Byzantine nodes are able to lie about their inputs as well as received messages. Can we detect certain lies and limit the power of byzantine nodes? Possibly, the authenticity

More information

Practical Byzantine Fault Tolerance. Miguel Castro and Barbara Liskov

Practical Byzantine Fault Tolerance. Miguel Castro and Barbara Liskov Practical Byzantine Fault Tolerance Miguel Castro and Barbara Liskov Outline 1. Introduction to Byzantine Fault Tolerance Problem 2. PBFT Algorithm a. Models and overview b. Three-phase protocol c. View-change

More information

Tolerating Arbitrary Failures With State Machine Replication

Tolerating Arbitrary Failures With State Machine Replication &CHAPTER 2 Tolerating Arbitrary Failures With State Machine Replication ASSIA DOUDOU 1, BENOÎT GARBINATO 2, and RACHID GUERRAOUI 1, 1 Swiss Federal Institute of Technology in Lausanne, EPFL; and 2 University

More information

Implementing a Register in a Dynamic Distributed System

Implementing a Register in a Dynamic Distributed System Implementing a Register in a Dynamic Distributed System Roberto Baldoni, Silvia Bonomi, Anne-Marie Kermarrec, Michel Raynal Sapienza Università di Roma, Via Ariosto 25, 85 Roma, Italy INRIA, Univeristé

More information

Time and Space Lower Bounds for Implementations Using k-cas

Time and Space Lower Bounds for Implementations Using k-cas Time and Space Lower Bounds for Implementations Using k-cas Hagit Attiya Danny Hendler September 12, 2006 Abstract This paper presents lower bounds on the time- and space-complexity of implementations

More information

arxiv: v2 [cs.dc] 12 Sep 2017

arxiv: v2 [cs.dc] 12 Sep 2017 Efficient Synchronous Byzantine Consensus Ittai Abraham 1, Srinivas Devadas 2, Danny Dolev 3, Kartik Nayak 4, and Ling Ren 2 arxiv:1704.02397v2 [cs.dc] 12 Sep 2017 1 VMware Research iabraham@vmware.com

More information

What Can be Computed in a Distributed System?

What Can be Computed in a Distributed System? What Can be Computed in a Distributed System? Michel Raynal Institut Universitaire de France &IRISA,Université de Rennes, France & Department of Computing, Polytechnic University, Hong Kong raynal@irisa.fr

More information

Asynchronous Convex Hull Consensus in the Presence of Crash Faults

Asynchronous Convex Hull Consensus in the Presence of Crash Faults Asynchronous Convex Hull Consensus in the Presence of Crash Faults ABSTRACT Lewis Tseng Department of Computer Science Coordinated Science Laboratory University of Illinois at Urbana-Champaign ltseng3@illinois.edu

More information

Simulating Few by Many: k-concurrency via k-set Consensus

Simulating Few by Many: k-concurrency via k-set Consensus Simulating Few by Many: k-concurrency via k-set Consensus Eli Gafni May 29, 2007 Abstract We introduce a new novel emulation by which we show an equivalence between k-set Consensus and k-concurrency: Any

More information

Oh-RAM! One and a Half Round Atomic Memory

Oh-RAM! One and a Half Round Atomic Memory Oh-RAM! One and a Half Round Atomic Memory Theophanis Hadjistasi Nicolas Nicolaou Alexander Schwarzmann July 21, 2018 arxiv:1610.08373v1 [cs.dc] 26 Oct 2016 Abstract Emulating atomic read/write shared

More information

An optimal novel Byzantine agreement protocol (ONBAP) for heterogeneous distributed database processing systems

An optimal novel Byzantine agreement protocol (ONBAP) for heterogeneous distributed database processing systems Available online at www.sciencedirect.com Procedia Technology 6 (2012 ) 57 66 2 nd International Conference on Communication, Computing & Security An optimal novel Byzantine agreement protocol (ONBAP)

More information

Fully-Adaptive Algorithms for Long-Lived Renaming

Fully-Adaptive Algorithms for Long-Lived Renaming Fully-Adaptive Algorithms for Long-Lived Renaming Alex Brodsky 1, Faith Ellen 2, and Philipp Woelfel 2 1 Dept. of Applied Computer Science, University of Winnipeg, Winnipeg, Canada, a.brodsky@uwinnipeg.ca

More information

3 No-Wait Job Shops with Variable Processing Times

3 No-Wait Job Shops with Variable Processing Times 3 No-Wait Job Shops with Variable Processing Times In this chapter we assume that, on top of the classical no-wait job shop setting, we are given a set of processing times for each operation. We may select

More information

Fault-Tolerant Distributed Consensus

Fault-Tolerant Distributed Consensus Fault-Tolerant Distributed Consensus Lawrence Kesteloot January 20, 1995 1 Introduction A fault-tolerant system is one that can sustain a reasonable number of process or communication failures, both intermittent

More information

Byzantine Fault-Tolerance with Commutative Commands

Byzantine Fault-Tolerance with Commutative Commands Byzantine Fault-Tolerance with Commutative Commands Pavel Raykov 1, Nicolas Schiper 2, and Fernando Pedone 2 1 Swiss Federal Institute of Technology (ETH) Zurich, Switzerland 2 University of Lugano (USI)

More information

The Complexity of Renaming

The Complexity of Renaming The Complexity of Renaming Dan Alistarh EPFL James Aspnes Yale Seth Gilbert NUS Rachid Guerraoui EPFL Abstract We study the complexity of renaming, a fundamental problem in distributed computing in which

More information

Wait-Free Regular Storage from Byzantine Components

Wait-Free Regular Storage from Byzantine Components Wait-Free Regular Storage from Byzantine Components Ittai Abraham Gregory Chockler Idit Keidar Dahlia Malkhi July 26, 2006 Abstract We consider the problem of implementing a wait-free regular register

More information

Assignment 12: Commit Protocols and Replication Solution

Assignment 12: Commit Protocols and Replication Solution Data Modelling and Databases Exercise dates: May 24 / May 25, 2018 Ce Zhang, Gustavo Alonso Last update: June 04, 2018 Spring Semester 2018 Head TA: Ingo Müller Assignment 12: Commit Protocols and Replication

More information

Consensus in Byzantine Asynchronous Systems

Consensus in Byzantine Asynchronous Systems Consensus in Byzantine Asynchronous Systems R. BALDONI Universitá La Sapienza, Roma, Italy J.-M. HÉLARY IRISA, Université de Rennes 1, France M. RAYNAL IRISA, Université de Rennes 1, France L. TANGUY IRISA,

More information

ACONCURRENT system may be viewed as a collection of

ACONCURRENT system may be viewed as a collection of 252 IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, VOL. 10, NO. 3, MARCH 1999 Constructing a Reliable Test&Set Bit Frank Stomp and Gadi Taubenfeld AbstractÐThe problem of computing with faulty

More information

Distributed Algorithms (PhD course) Consensus SARDAR MUHAMMAD SULAMAN

Distributed Algorithms (PhD course) Consensus SARDAR MUHAMMAD SULAMAN Distributed Algorithms (PhD course) Consensus SARDAR MUHAMMAD SULAMAN Consensus (Recapitulation) A consensus abstraction is specified in terms of two events: 1. Propose ( propose v )» Each process has

More information

Coordination 1. To do. Mutual exclusion Election algorithms Next time: Global state. q q q

Coordination 1. To do. Mutual exclusion Election algorithms Next time: Global state. q q q Coordination 1 To do q q q Mutual exclusion Election algorithms Next time: Global state Coordination and agreement in US Congress 1798-2015 Process coordination How can processes coordinate their action?

More information

Exercise 12: Commit Protocols and Replication

Exercise 12: Commit Protocols and Replication Data Modelling and Databases (DMDB) ETH Zurich Spring Semester 2017 Systems Group Lecturer(s): Gustavo Alonso, Ce Zhang Date: May 22, 2017 Assistant(s): Claude Barthels, Eleftherios Sidirourgos, Eliza

More information

The Alpha of Indulgent Consensus

The Alpha of Indulgent Consensus The Computer Journal Advance Access published August 3, 2006 Ó The Author 2006. Published by Oxford University Press on behalf of The British Computer Society. All rights reserved. For Permissions, please

More information

Byzantine Failures. Nikola Knezevic. knl

Byzantine Failures. Nikola Knezevic. knl Byzantine Failures Nikola Knezevic knl Different Types of Failures Crash / Fail-stop Send Omissions Receive Omissions General Omission Arbitrary failures, authenticated messages Arbitrary failures Arbitrary

More information

From Consensus to Atomic Broadcast: Time-Free Byzantine-Resistant Protocols without Signatures

From Consensus to Atomic Broadcast: Time-Free Byzantine-Resistant Protocols without Signatures From Consensus to Atomic Broadcast: Time-Free Byzantine-Resistant Protocols without Signatures Miguel Correia, Nuno Ferreira Neves, Paulo Veríssimo DI FCUL TR 04 5 June 2004 Departamento de Informática

More information

The Building Blocks of Consensus

The Building Blocks of Consensus The Building Blocks of Consensus Yee Jiun Song 1, Robbert van Renesse 1, Fred B. Schneider 1, and Danny Dolev 2 1 Cornell University 2 The Hebrew University of Jerusalem Abstract. Consensus is an important

More information

Distributed Algorithms 6.046J, Spring, 2015 Part 2. Nancy Lynch

Distributed Algorithms 6.046J, Spring, 2015 Part 2. Nancy Lynch Distributed Algorithms 6.046J, Spring, 2015 Part 2 Nancy Lynch 1 This Week Synchronous distributed algorithms: Leader Election Maximal Independent Set Breadth-First Spanning Trees Shortest Paths Trees

More information

Subconsensus Tasks: Renaming is Weaker than Set Agreement

Subconsensus Tasks: Renaming is Weaker than Set Agreement Subconsensus Tasks: enaming is Weaker than Set Agreement Eli Gafni 1, Sergio ajsbaum 2, and Maurice Herlihy 3 1 Computer Science Department UCLA Los Angles, CA 90095 eli@cs.ucla.edu 2 Instituto de Matemáticas,

More information

Generalized Lattice Agreement

Generalized Lattice Agreement ABSTRACT Jose M. Falerio Microsoft Research India t-josfal@microsoft.com Generalized Lattice Agreement G. Ramalingam Microsoft Research India grama@microsoft.com Lattice agreement is a key decision problem

More information

Computer Science Technical Report

Computer Science Technical Report Computer Science Technical Report Feasibility of Stepwise Addition of Multitolerance to High Atomicity Programs Ali Ebnenasir and Sandeep S. Kulkarni Michigan Technological University Computer Science

More information

Multi-writer Regular Registers in Dynamic Distributed Systems with Byzantine Failures

Multi-writer Regular Registers in Dynamic Distributed Systems with Byzantine Failures Multi-writer Regular Registers in Dynamic Distributed Systems with Byzantine Failures Silvia Bonomi, Amir Soltani Nezhad Università degli Studi di Roma La Sapienza, Via Ariosto 25, 00185 Roma, Italy bonomi@dis.uniroma1.it

More information

A Mechanism for Sequential Consistency in a Distributed Objects System

A Mechanism for Sequential Consistency in a Distributed Objects System A Mechanism for Sequential Consistency in a Distributed Objects System Cristian Ţăpuş, Aleksey Nogin, Jason Hickey, and Jerome White California Institute of Technology Computer Science Department MC 256-80,

More information

Distributed Computing over Communication Networks: Leader Election

Distributed Computing over Communication Networks: Leader Election Distributed Computing over Communication Networks: Leader Election Motivation Reasons for electing a leader? Reasons for not electing a leader? Motivation Reasons for electing a leader? Once elected, coordination

More information

Paxos. Sistemi Distribuiti Laurea magistrale in ingegneria informatica A.A Leonardo Querzoni. giovedì 19 aprile 12

Paxos. Sistemi Distribuiti Laurea magistrale in ingegneria informatica A.A Leonardo Querzoni. giovedì 19 aprile 12 Sistemi Distribuiti Laurea magistrale in ingegneria informatica A.A. 2011-2012 Leonardo Querzoni The Paxos family of algorithms was introduced in 1999 to provide a viable solution to consensus in asynchronous

More information

Simulating Few by Many: Limited Concurrency = Set Consensus (Extended Abstract)

Simulating Few by Many: Limited Concurrency = Set Consensus (Extended Abstract) Simulating Few by Many: Limited Concurrency = Set Consensus (Extended Abstract) Eli Gafni, UCLA Rachid Guerraoui, EPFL May 5, 2009 Abstract Perfection is the enemy of the good. Most backoff schemes wait

More information

The Long March of BFT. Weird Things Happen in Distributed Systems. A specter is haunting the system s community... A hierarchy of failure models

The Long March of BFT. Weird Things Happen in Distributed Systems. A specter is haunting the system s community... A hierarchy of failure models A specter is haunting the system s community... The Long March of BFT Lorenzo Alvisi UT Austin BFT Fail-stop A hierarchy of failure models Crash Weird Things Happen in Distributed Systems Send Omission

More information

Optimistic Erasure-Coded Distributed Storage

Optimistic Erasure-Coded Distributed Storage Optimistic Erasure-Coded Distributed Storage Partha Dutta IBM India Research Lab Bangalore, India Rachid Guerraoui EPFL IC Lausanne, Switzerland Ron R. Levy EPFL IC Lausanne, Switzerland Abstract We study

More information

Distributed Algorithms Benoît Garbinato

Distributed Algorithms Benoît Garbinato Distributed Algorithms Benoît Garbinato 1 Distributed systems networks distributed As long as there were no machines, programming was no problem networks distributed at all; when we had a few weak computers,

More information

Consensus, impossibility results and Paxos. Ken Birman

Consensus, impossibility results and Paxos. Ken Birman Consensus, impossibility results and Paxos Ken Birman Consensus a classic problem Consensus abstraction underlies many distributed systems and protocols N processes They start execution with inputs {0,1}

More information

Consensus and related problems

Consensus and related problems Consensus and related problems Today l Consensus l Google s Chubby l Paxos for Chubby Consensus and failures How to make process agree on a value after one or more have proposed what the value should be?

More information

Signature-Free Broadcast-Based Intrusion Tolerance: Never Decide a Byzantine Value

Signature-Free Broadcast-Based Intrusion Tolerance: Never Decide a Byzantine Value Signature-Free Broadcast-Based Intrusion Tolerance: Never Decide a Byzantine Value Achour Mostéfaoui and Michel Raynal IRISA, Université de Rennes 1, 35042 Rennes, France {achour,raynal}@irisa.fr Abstract.

More information

Synchrony Weakened by Message Adversaries vs Asynchrony Enriched with Failure Detectors. Michel Raynal, Julien Stainer

Synchrony Weakened by Message Adversaries vs Asynchrony Enriched with Failure Detectors. Michel Raynal, Julien Stainer Synchrony Weakened by Message Adversaries vs Asynchrony Enriched with Failure Detectors Michel Raynal, Julien Stainer Synchrony Weakened by Message Adversaries vs Asynchrony Enriched with Failure Detectors

More information

Replicated State Machines for Collision-Prone Wireless Networks

Replicated State Machines for Collision-Prone Wireless Networks Replicated State Machines for Collision-Prone Wireless Networks Gregory Chockler Seth Gilbert MIT Computer Science and Artificial Intelligence Lab Cambridge, MA 02139 {grishac,sethg}@theory.csail.mit.edu

More information

The Fault Detection Problem

The Fault Detection Problem The Fault Detection Problem Andreas Haeberlen 1 and Petr Kuznetsov 2 1 Max Planck Institute for Software Systems (MPI-SWS) 2 TU Berlin / Deutsche Telekom Laboratories Abstract. One of the most important

More information

Distributed minimum spanning tree problem

Distributed minimum spanning tree problem Distributed minimum spanning tree problem Juho-Kustaa Kangas 24th November 2012 Abstract Given a connected weighted undirected graph, the minimum spanning tree problem asks for a spanning subtree with

More information

Distributed algorithms

Distributed algorithms Distributed algorithms Prof R. Guerraoui lpdwww.epfl.ch Exam: Written Reference: Book - Springer Verlag http://lpd.epfl.ch/site/education/da - Introduction to Reliable (and Secure) Distributed Programming

More information

On the Composition of Authenticated Byzantine Agreement

On the Composition of Authenticated Byzantine Agreement On the Composition of Authenticated Byzantine Agreement Yehuda Lindell Anna Lysyanskaya Tal Rabin July 28, 2004 Abstract A fundamental problem of distributed computing is that of simulating a secure broadcast

More information

Fault-Tolerance & Paxos

Fault-Tolerance & Paxos Chapter 15 Fault-Tolerance & Paxos How do you create a fault-tolerant distributed system? In this chapter we start out with simple questions, and, step by step, improve our solutions until we arrive at

More information

Topics in Reliable Distributed Systems

Topics in Reliable Distributed Systems Topics in Reliable Distributed Systems 049017 1 T R A N S A C T I O N S Y S T E M S What is A Database? Organized collection of data typically persistent organization models: relational, object-based,

More information

2386 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 52, NO. 6, JUNE 2006

2386 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 52, NO. 6, JUNE 2006 2386 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 52, NO. 6, JUNE 2006 The Encoding Complexity of Network Coding Michael Langberg, Member, IEEE, Alexander Sprintson, Member, IEEE, and Jehoshua Bruck,

More information

Erez Petrank. Department of Computer Science. Haifa, Israel. Abstract

Erez Petrank. Department of Computer Science. Haifa, Israel. Abstract The Best of Both Worlds: Guaranteeing Termination in Fast Randomized Byzantine Agreement Protocols Oded Goldreich Erez Petrank Department of Computer Science Technion Haifa, Israel. Abstract All known

More information

Incompatibility Dimensions and Integration of Atomic Commit Protocols

Incompatibility Dimensions and Integration of Atomic Commit Protocols The International Arab Journal of Information Technology, Vol. 5, No. 4, October 2008 381 Incompatibility Dimensions and Integration of Atomic Commit Protocols Yousef Al-Houmaily Department of Computer

More information

To do. Consensus and related problems. q Failure. q Raft

To do. Consensus and related problems. q Failure. q Raft Consensus and related problems To do q Failure q Consensus and related problems q Raft Consensus We have seen protocols tailored for individual types of consensus/agreements Which process can enter the

More information

Time-Free Authenticated Byzantine Consensus

Time-Free Authenticated Byzantine Consensus Time-Free Authenticated Byzantine Consensus Hamouma Moumen, Achour Mostefaoui To cite this version: Hamouma Moumen, Achour Mostefaoui. Time-Free Authenticated Byzantine Consensus. Franck Capello and Hans-Peter

More information

Mastering Agreement Problems in Distributed Systems

Mastering Agreement Problems in Distributed Systems focus fault tolerance Mastering Agreement Problems in Distributed Systems Michel Raynal, IRISA Mukesh Singhal, Ohio State University Overcoming agreement problems in distributed systems is a primary challenge

More information

A MODULAR FRAMEWORK TO IMPLEMENT FAULT TOLERANT DISTRIBUTED SERVICES. P. Nicolas Kokkalis

A MODULAR FRAMEWORK TO IMPLEMENT FAULT TOLERANT DISTRIBUTED SERVICES. P. Nicolas Kokkalis A MODULAR FRAMEWORK TO IMPLEMENT FAULT TOLERANT DISTRIBUTED SERVICES by P. Nicolas Kokkalis A thesis submitted in conformity with the requirements for the degree of Master of Science Graduate Department

More information

Efficient Reduction for Wait-Free Termination Detection in a Crash-Prone Distributed System

Efficient Reduction for Wait-Free Termination Detection in a Crash-Prone Distributed System Efficient Reduction for Wait-Free Termination Detection in a Crash-Prone Distributed System Neeraj Mittal 1, Felix C. Freiling 2, S. Venkatesan 1, and Lucia Draque Penso 2 1 Department of Computer Science,

More information

Eventually k-bounded Wait-Free Distributed Daemons

Eventually k-bounded Wait-Free Distributed Daemons Eventually k-bounded Wait-Free Distributed Daemons Yantao Song and Scott M. Pike Texas A&M University Department of Computer Science College Station, TX 77843-3112, USA {yantao, pike}@tamu.edu Technical

More information

In search of the holy grail: Looking for the weakest failure detector for wait-free set agreement

In search of the holy grail: Looking for the weakest failure detector for wait-free set agreement In search of the holy grail: Looking for the weakest failure detector for wait-free set agreement Michel Raynal, Corentin Travers To cite this version: Michel Raynal, Corentin Travers. In search of the

More information

DISTRIBUTED COMPUTING COLUMN. Stefan Schmid TU Berlin & T-Labs, Germany

DISTRIBUTED COMPUTING COLUMN. Stefan Schmid TU Berlin & T-Labs, Germany DISTRIBUTED COMPUTING COLUMN Stefan Schmid TU Berlin & T-Labs, Germany stefan.schmid@tu-berlin.de 1 The Renaming Problem: Recent Developments and Open Questions Dan Alistarh (Microsoft Research) 1 Introduction

More information

Distributed Algorithms Failure detection and Consensus. Ludovic Henrio CNRS - projet SCALE

Distributed Algorithms Failure detection and Consensus. Ludovic Henrio CNRS - projet SCALE Distributed Algorithms Failure detection and Consensus Ludovic Henrio CNRS - projet SCALE ludovic.henrio@cnrs.fr Acknowledgement The slides for this lecture are based on ideas and materials from the following

More information