Implementing a Regular Register in an Eventually Synchronous Distributed System prone to Continuous Churn
Roberto Baldoni, Silvia Bonomi, Michel Raynal

Abstract — Due to their capability to hide the complexity generated by the messages exchanged between processes, shared objects are one of the main abstractions provided to developers of distributed applications. Implementations of such objects, in modern distributed systems, have to take into account the fact that almost all services, implemented on top of distributed infrastructures, are no longer fully managed due to either their size or their maintenance cost. Therefore, these infrastructures exhibit several autonomic behaviors in order to, for example, tolerate failures and the continuous arrival and departure of nodes (churn phenomenon). Among all the shared objects, the register object is a fundamental one. Several protocols have been proposed to build fault-resilient registers on top of message-passing systems, but, unfortunately, failures are not the only challenge in modern distributed systems and new issues arise from the presence of churn. This paper addresses the construction of a multi-writer/multi-reader regular register in an eventually synchronous distributed system affected by the continuous arrival/departure of participants. In particular, a general protocol implementing a regular register is proposed and feasibility conditions associated with the arrival and departure of the processes are given. The protocol is proved correct under the assumption that a constraint on the churn is satisfied.

Index Terms — Regular Register, Dynamic Distributed Systems, Churn, Distributed Algorithms.

1 INTRODUCTION

Context. Dealing with failures has been one of the main challenges in the construction of real reliable applications able to work in a distributed system.
These applications are inherently managed, in the sense that they implicitly assume the existence of a superior manager (i.e., the application/service provider) that controls the processes running the application. The manager does its best to guarantee that the assumptions made on the underlying distributed system (e.g., a majority of correct processes) hold over time, by activating appropriate reactive or proactive recovery procedures [26]. As an example, the manager can either add new processes when crashes occur or ensure the required degree of synchrony of the underlying distributed platform in terms of processes and communication links. Air traffic control, telecommunication, banking systems and e-government systems are just a few examples of such application domains. In this context, robust abstractions have been defined (shared memory, communication, agreement, etc.) that behave correctly despite asynchrony and failures and that simplify application design and development. When considering protocols implementing such abstractions, in nearly all cases the system is well defined, in the sense that the whole set of participating processes is finite and known (directly or transitively) by each process.

(Affiliations: R. Baldoni and S. Bonomi, Università La Sapienza, via Ariosto 25, Roma, Italy; M. Raynal, Senior Member, Institut Universitaire de France, IRISA, Université de Rennes, Campus de Beaulieu, Rennes, France.)

The system composition is modified only when either a process crashes or a new process is added. Therefore, if a process does not crash, it lives for the entire duration of the computation.

Motivation. A new challenge is emerging due to the advent of new classes of applications and technologies such as smart environments, sensor networks, mobile systems, peer-to-peer systems, cloud computing etc.
In these settings, the underlying distributed systems cannot be fully managed but need some degree of self-management that depends on the specific application domain. However, it is possible to delineate some common consequences of the presence of such self-management: first, there is no entity that can always ensure the validity of the system assumptions during the entire computation and, second, no one knows accurately who joins and who leaves the system at any time, introducing a kind of unpredictability in the system composition (this phenomenon of arrival and departure of processes in a system is also known as churn) [6]. As a consequence, distributed computing abstractions have to deal not only with asynchrony and failures, but also with this dynamic dimension, where a process that does not crash can leave the system at any time, implying that the membership can fully change several times during the same computation. Moreover, this dynamic behavior means that no process can have a precise knowledge of the number of processes composing the system at any given time. Thus, it becomes of primary importance to check under which churn assumptions a protocol implementing a distributed
computing abstraction is correct. Hence, the abstractions and the protocols implementing them have to be reconsidered to take into account this new adversary setting. This self-defined and continuously evolving distributed system, that we will name in the following dynamic distributed system, makes abstractions more difficult to understand and master than in distributed systems where the set of processes is fixed and known by all participants. The churn notion thus becomes a system parameter whose aim is to make tractable systems whose composition evolves over time (e.g., [15], [19], [23]).

Contribution and roadmap. In this paper, a general churn model that we defined in [5] is considered and used to characterize a dynamic distributed computation where the number of participants changes in a given range and the arrival and departure of processes is a non-quiescent phenomenon that depends on join and leave distributions. Such a model places constraints on process arrivals and departures. Specifically, the computation size is constrained to a range between n_0 - k_1 and n_0 + k_2, where n_0 is the number of processes participating in the computation at time t_0, while k_1 and k_2 are two integers greater than or equal to zero that depend on the join and leave distributions. In particular, this paper addresses the problem of deterministically building and maintaining a distributed computation implementing a multi-writer/multi-reader regular register. Processes participating in the computation are called active processes. We provide an implementation of a regular register based on a request/reply message pattern and we prove that: any operation issued on the regular register terminates if the number of reply messages needed to perform the operation is at most n_0 - k_1 (Lemma 1), and any operation issued on the regular register is valid if the number of reply messages needed to perform the operation is at least (n_0 + k_2)/2 (Lemma 2).
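The interplay between these two conditions can be sketched with a small arithmetic check (function and variable names are ours, not the paper's; we read the validity bound as a strict majority of the at most n_0 + k_2 participants):

```python
def reply_threshold_range(n0: int, k1: int, k2: int):
    """Admissible values of the reply threshold C, if any.

    Termination needs C replies to be obtainable from the n0 - k1
    processes guaranteed to be present (C <= n0 - k1); validity needs
    a majority of the at most n0 + k2 processes (2*C > n0 + k2).
    Hypothetical helper, not from the paper's pseudocode.
    """
    lo = (n0 + k2) // 2 + 1   # smallest C with 2*C > n0 + k2
    hi = n0 - k1              # largest C that can always be gathered
    return (lo, hi) if lo <= hi else None

# The range is non-empty exactly when n0 > 2*k1 + k2.
```

For instance, with n_0 = 10, k_1 = 2, k_2 = 3, any C in {7, 8} works; with k_1 = 3, k_2 = 4 no threshold satisfies both bounds.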
From these two conditions it follows that n_0, k_1 and k_2 cannot be chosen arbitrarily: they are closely related by the condition n_0 > 2k_1 + k_2 (Corollary 1). Let us finally remark that the interest in addressing the regular register abstraction lies in the fact that it is a fundamental notion for building storage systems. Up to now, storage systems that cope with churn ensure regular register consistency criteria only in a probabilistic way [2]. The result of this paper thus gives a bound on the churn that a storage system can cope with while still providing a deterministic regular consistency guarantee.

The paper is structured as follows: Section 2 defines the system model and, in particular, Section 2.3 defines the churn model. Section 3 introduces the regular register specification for a dynamic distributed system. Section 4 presents a protocol implementing a regular register, together with its correctness proof, in an eventually synchronous system. Finally, two sections on related work and concluding remarks conclude the paper.

2 SYSTEM MODEL

2.1 Dynamic Distributed System

In a dynamic distributed system, processes may join and leave the system at their will. In order to model processes continuously arriving to and departing from the system, we assume the infinite arrival model (as defined in [24]). The set of processes that can participate in the distributed system, i.e., the distributed system population, is composed of a potentially infinite set of processes Π = {..., p_i, p_j, p_k, ...}, each one having a unique identifier (i.e., its index). However, the distributed system is composed, at each time, of a finite subset of the population. A process enters the distributed system by executing the join_System() procedure; this operation aims at connecting the new process to the processes that already belong to the system. A process leaves the system by means of the leave_System() operation.
Processes belonging to the distributed system may fail by crashing before leaving the system; if a process crashes, it stops performing any action. A process that never crashes is said to be correct. In the following we assume the existence of a protocol managing the arrivals and departures of processes; such a protocol is also responsible for maintaining connectivity among the processes that are part of the distributed system. Some examples of topologies and protocols keeping the system connected in a dynamic environment are [16], [17], [20], [27]. The system is eventually synchronous (sometimes also called partially synchronous), that is, after an unknown but finite time the system behaves synchronously [7], [9]. The passage of time is measured by a fictional global clock, represented by integer values, not accessible to processes. Processes belonging to the distributed system communicate by exchanging messages through either reliable point-to-point channels or broadcast primitives. Both communication primitives can be characterized by the following property:

Eventual Time Delivery: there exist a bound δ, known by the processes, and a time t̄ such that any message sent (broadcast) at time t ≥ t̄ is delivered by time t + δ by all the processes that are in the system during the whole interval [t, t + δ].

It is important to notice that processes only know that the time t̄ exists; they never know, nor can they deduce or predict, when the synchrony period starts.

2.2 Distributed Computation

Processes belonging to the distributed system may decide autonomously to join a distributed computation running on top of the system (e.g., a regular register computation). Hence, a distributed computation is composed, at each
instant of time, of a subset of the processes of the distributed system. A process p_i belonging to the distributed system that wants to join the distributed computation has to execute the join_Computation() operation. This operation, invoked at some time t, is not instantaneous and takes time to be executed; how long depends on the specific implementation provided for the join_Computation() operation. However, from time t, the process p_i can receive and process messages sent by any other process that participates in the computation. When a process p_j participating in the distributed computation wishes to leave, it executes the leave_Computation() operation. Without loss of generality, we assume that if a process leaves the computation and later wishes to re-join, it executes the join_Computation() operation again with a new identity.

Figure 1 shows the distributed system and the distributed computation layers.

Fig. 1. Distributed System and Distributed Computation

It is important to notice that (i) there may exist processes belonging to the distributed system that never join the distributed computation (i.e., they execute the join_System() procedure but never invoke the join_Computation() operation) and (ii) there may exist processes that, after leaving the distributed computation, remain inside the distributed system (i.e., they are correct but stop processing messages related to the computation). To this aim, it is important to identify the subset of processes that are actively participating in the distributed computation.

Definition 1: A process is active in the distributed computation from the time it returns from the join_Computation() operation until the time it starts executing the leave_Computation() operation. A(t) denotes the set of processes that are active at time t, while A([t, t']) denotes the set of processes that are active during the whole interval [t, t'] (i.e., p_i ∈ A([t, t']) iff p_i ∈ A(τ) for each τ ∈ [t, t']).

2.3 Churn Model

Processes may join and leave the distributed computation at any time. To model this activity, we consider the churn model that we introduced in [5]. The model is based on the definition of two functions: (i) the join function λ(t) (defining the join of new processes to the distributed computation with respect to time) and (ii) the leave function µ(t) (defining the leave of processes from the distributed computation with respect to time). Both are discrete functions of time.

Definition 2: (Join function) The join function λ(t) is a discrete time function that returns the number of processes that invoke the join_Computation() operation at time t.

Definition 3: (Leave function) The leave function µ(t) is a discrete time function that returns the number of processes that invoke the leave_Computation() operation at time t.

Let t_0 be the starting time of the system. We assume that at time t_0 no process joins or leaves the distributed computation (i.e., λ(t_0) = 0 and µ(t_0) = 0); therefore, at t_0 the computation is composed of a set Π_0 of processes and the size of the distributed computation is n_0 (i.e., |Π_0| = n_0). Moreover, for any time t < t_0 we have λ(t) = µ(t) = 0. The churn is continuous, meaning that processes never stop joining and leaving the computation, i.e., the following conditions hold: ∄t : ∀τ > t : λ(τ) = 0, and ∄t : ∀τ > t : µ(τ) = 0. As soon as churn starts, the size of the computation and the computation membership change. The number of participants of the computation can be calculated as follows.
Definition 4: (Node function) Let n_0 be the number of processes participating in the computation at start time t_0. N(t) is the number of processes in the computation at time t, for every t ≥ t_0 (i.e., N(t) = N(t-1) + λ(t) - µ(t), with N(t_0) = n_0).

Based on the previous definitions, let us derive the constraints that a join function and a leave function have to satisfy so that the distributed computation size remains in a given interval. Note that such behavior is typical of real applications like peer-to-peer systems, VoIP-based applications, etc. [13], [14]. Let n_0 be the number of processes of the distributed computation at the start time t_0 and let k_1, k_2 be two non-negative integers; the following lemma (proved in [5]) states the constraints on the join function and the leave function such that the distributed computation size falls in the interval N = [n_0 - k_1, n_0 + k_2].

Lemma 1: Let k_1 and k_2 be two integers such that k_1, k_2 ≥ 0 and let n_0 be the number of processes in the distributed computation at starting time t_0. Given a join function λ(t) and a leave function µ(t), the node function N(t) falls in the interval N = [n_0 - k_1, n_0 + k_2] if and only if:
(c1) Σ_{τ=t_0}^{t} µ(τ) ≤ Σ_{τ=t_0}^{t} λ(τ) + k_1, for every t;
(c2) Σ_{τ=t_0}^{t} µ(τ) ≥ Σ_{τ=t_0}^{t} λ(τ) - k_2, for every t.

Fig. 2. Distributed System Size in an interval N = [n_0 - k_1, n_0 + k_2]

An example of the evolution of the size of a distributed computation over time is shown in Figure 2. Note that constraints (c1) and (c2) have to be satisfied independently of the computation; in fact, they just follow from the requirement of having the computation size fall in the range defined by n_0, k_1 and k_2.

3 REGULAR REGISTER IN A DYNAMIC DISTRIBUTED SYSTEM

A register is a shared variable accessed by a set of processes through two operations, namely read() and write(). Informally, the write() operation updates the value stored in the shared variable while the read() obtains the value contained in the variable (i.e., the last written value). In case of concurrency while accessing the shared variable, the meaning of last written value becomes ambiguous. Depending on the semantics of the operations, three types of register have been defined by Lamport [18]: safe, regular and atomic.

3.1 Regular Register Computation

Processes participating in the distributed computation implement a regular register abstraction. As a specialization of the generic model of the computation defined in Section 2, in the following we consider the existence of a join_register operation and of a leave_register operation. In particular, in the case of a regular register computation, the aim of the join_register operation is to transfer the current value of the register variable to the new process, to guarantee the persistence of the value of the register despite churn. The protocol implementing the join_register operation is presented in Section 4. We model the leave_register operation as an implicit operation: when a process p_i leaves the computation, it just stops sending and processing messages related to the register computation.
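Returning to the churn model, Definition 4 and the constraints (c1)/(c2) of Lemma 1 can be exercised with a short sketch (illustrative names of ours, not the paper's): N(t) is obtained from the recurrence N(t) = N(t-1) + λ(t) - µ(t), and the constraints are checked on the cumulative sums of the join and leave functions.

```python
from itertools import accumulate

def node_sizes(n0, joins, leaves):
    """N(t) = N(t-1) + lambda(t) - mu(t), with N(t0) = n0 (Definition 4)."""
    sizes = [n0]
    for lam, mu in zip(joins, leaves):
        sizes.append(sizes[-1] + lam - mu)
    return sizes

def satisfies_c1_c2(k1, k2, joins, leaves):
    """Constraints (c1) and (c2) of Lemma 1: for every t,
    sum(mu) <= sum(lambda) + k1  and  sum(mu) >= sum(lambda) - k2.
    Note that, as remarked above, they do not depend on n0."""
    return all(lam - k2 <= mu <= lam + k1
               for lam, mu in zip(accumulate(joins), accumulate(leaves)))
```

For instance, with n_0 = 5 and k_1 = k_2 = 1, joins [1, 0] and leaves [0, 1] keep N(t) inside [4, 6], while joins [2, 0] would push the size to 7 and violate (c2).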
In this way, it is possible to handle, from the register computation point of view, process failures and process leaves in the same way. Thus, in the following, we do not distinguish between voluntary leaves and failures, but refer to both of them as leaves. Moreover, to simplify the notation, whenever not strictly necessary, we refer to the join_register() operation as the join() operation.

3.2 Operation executions

Every operation issued on a register is, in general, not instantaneous and can be characterized by two events occurring at its boundary: an invocation event and a reply event. These events occur at two time instants (invocation time and reply time) according to the fictional global time. An operation op is complete if both the invocation event and the reply event occur (i.e., the process executing the operation does not crash between the invocation and the reply). On the contrary, an operation op is said to be failed if it is invoked by a process that crashes before the reply event occurs. Given two operations op and op', their invocation times (t_B(op) and t_B(op')) and reply times (t_E(op) and t_E(op')), we say that op precedes op' (op ≺ op') iff t_E(op) < t_B(op'). If op does not precede op' and op' does not precede op, then op and op' are concurrent (op || op'). Given a write(v) operation, the value v is said to be written when the operation is complete. As a consequence, failed write() operations are incomplete operations. As in [12], we consider that if a process crashes during a write() operation, such a write() is concurrent with all the subsequent operations.

3.3 Multi-reader/Multi-writer Specification

The notion of a regular register, as specified in [18], is not directly applicable in a dynamic distributed system like the one presented in the previous section, because it does not consider failures, process joins and leaves.
To this aim, we focus on the multi-writer/multi-reader regular register abstraction as defined in [22] and [25], and we adapt it to consider arrivals and departures of processes. Before introducing the specification, let us introduce the notion of relevant write.

Definition 5: A write() operation w is relevant for a read() operation r if: (i) w || r, or (ii) w ≺ r and there is no write() operation w' such that w ≺ w' ≺ r.

We are now in the position to specify a regular register for a dynamic distributed system. A protocol implements a regular register in a dynamic distributed system if the following properties are satisfied.

Termination: If a correct process participating in the computation invokes a read or write operation and does not leave the system, it eventually returns from that operation.

Multi-Writer Regularity 1 (MWR1): A read operation op returns any of the values written by some write() that is relevant for op.

We assume that each process p_i issues a read() or a write() operation only after it has returned from its join_register() operation [4].
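The precedence relation of Section 3.2 and Definition 5 can be made concrete with a small sketch in which an operation is just its pair of boundary times (t_B, t_E); the Python rendering and names are ours:

```python
def precedes(op1, op2):
    """op1 precedes op2 iff op1's reply time is before op2's invocation."""
    return op1[1] < op2[0]

def concurrent(op1, op2):
    """Neither operation precedes the other."""
    return not precedes(op1, op2) and not precedes(op2, op1)

def relevant_writes(writes, read):
    """Definition 5: w is relevant for `read` if (i) w is concurrent
    with it, or (ii) w precedes it with no other write in between."""
    rel = []
    for w in writes:
        if concurrent(w, read):
            rel.append(w)
        elif precedes(w, read) and not any(
                precedes(w, w2) and precedes(w2, read) for w2 in writes):
            rel.append(w)
    return rel
```

With writes spanning (0, 1), (2, 3) and (5, 8) and a read spanning (6, 9), the first write is overwritten before the read starts, so only (2, 3) (the last preceding write) and (5, 8) (concurrent with the read) are relevant.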
4 REGULAR REGISTER IN EVENTUALLY SYNCHRONOUS DISTRIBUTED SYSTEMS

In [5] we presented an implementation of the regular register for a synchronous distributed system. Such an implementation is based on the following considerations: (i) the join_register() operation is executed once by each process and (ii) read() and write() operations are executed frequently. This led us to design a protocol having local reads and fast writes, by exploiting the synchrony of the communication. Moving to an eventually synchronous system, read() operations are no longer local: they require gathering information from a certain number of active processes in the system in order to retrieve the last written value. Hence, the price to pay for not relying on synchrony is that read() operations cannot be local anymore.

This section presents a protocol implementing a regular register in an eventually synchronous distributed system with continuous churn, where the number of processes participating in the distributed computation is always in the range [n_0 - k_1, n_0 + k_2]. To cope with the absence of synchrony assumptions holding at every time, the protocol implements join_register(), read() and write() operations involving all the processes belonging to the computation. The basic idea behind the join_register() and read() operations is to have two phases: (i) the process issuing the operation broadcasts an INQUIRY message, then waits until it receives enough replies to confirm that the operation has been processed by enough processes; (ii) the process helps other processes that join the computation concurrently to terminate the operation by sending them the updated value. Concerning the write() operation, the basic idea is that the writer broadcasts a WRITE message and then just waits until it receives enough acknowledgments for the operation. In the following section, we provide the details of the protocols implementing these operations.
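The intuition for why gathering "enough replies" suffices can be illustrated by a brute-force pigeonhole check (ours, for illustration only): any two reply sets of size C drawn from at most n_0 + k_2 processes must share a process whenever 2C > n_0 + k_2, which is what lets a read's reply set meet the set of processes that acknowledged the last write.

```python
from itertools import combinations

def always_intersect(C, population):
    """Exhaustively check that every pair of size-C subsets of
    `population` has a common element -- the pigeonhole fact behind
    requiring C replies out of at most n0 + k2 active processes."""
    subsets = [set(s) for s in combinations(population, C)]
    return all(a & b for a, b in combinations(subsets, 2))

# With 5 processes, size-3 reply sets always intersect (2*3 > 5),
# while size-2 reply sets need not.
```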
4.1 Protocol

Each process p_i maintains the following local variables.
- Two variables denoted register_i and sn_i, such that register_i is the local copy of the register, while sn_i is the sequence number of the last write operation that updated register_i.
- A boolean active_i, initialized to false. It flips to true just after p_i has joined the regular register computation.
- Two set variables, denoted replies_i and reply_to_i. The first one is used both in the join_register() operation and in the read() operation, while reply_to_i is used only during the join period. The local variable replies_i contains the 3-tuples <id, value, sn> that p_i has received from other processes, while reply_to_i contains the processes that are joining the regular register computation concurrently with p_i.
- read_sn_i is a sequence number used by p_i to timestamp its read requests. The value read_sn_i = 0 is used by the join operation.
- reading_i is a boolean whose value is true while p_i is reading.
- write_ack_i is a set used by p_i (when it writes a new value) to store the identifiers of the processes that have acknowledged p_i's last write.
- dl_prev_i is a set where, while p_i is joining the distributed computation, p_i stores the identifiers of processes that have acknowledged p_i's inquiry message while these processes were not yet active (so, these processes were joining the computation too) or while they were reading. When it terminates its join operation, p_i has to send them a reply to prevent them from being blocked forever.

The join_register() operation. The protocol implementing this operation is described in Figure 3. After having initialized its local variables, p_i broadcasts an INQUIRY(i, read_sn_i) message to inform the other processes that it wants to obtain the value of the regular register (line 04; as indicated, read_sn_i is then equal to 0).
Then, after it has received a number C of replies (line 05)², p_i updates its local pair (register_i, sn_i) (lines 06-07), becomes active (line 08), and sends a reply to the processes in the set reply_to_i (lines 09-11). It sends such a reply message also to the processes in its set dl_prev_i to prevent them from waiting forever (see the proof of Lemma 3). In addition to the triple <i, register_i, sn_i>, a reply message sent by a process p_i to a process p_j also carries the read sequence number r_sn that identifies the corresponding request issued by p_j.

When a process p_i delivers a message INQUIRY(j, r_sn), it answers p_j by sending back a REPLY(<i, register_i, sn_i>, r_sn) message containing its local variables. If p_i is active and reading (line 15), it also sends a DL_PREV() message to p_j (line 17); this is required so that p_j sends to p_i the value p_j has obtained when it terminated its join operation. If p_i is not yet active, it postpones its answer until it becomes active (line 19 and lines 09-11) and sends a DL_PREV message (line 20).

When p_i delivers a REPLY(<j, value, sn>, r_sn) message from a process p_j, if the reply message is an answer to its INQUIRY(i, read_sn_i) message (line 23), p_i adds <j, value, sn> to the set of replies it has received so far and sends back an ACK(i, r_sn) message to p_j (lines 24-25). Finally, when p_i delivers a message DL_PREV(j, r_sn), it adds its content to the set dl_prev_i (line 28), in order to remember that it has to send a reply to p_j when it becomes active (lines 09-10).

The read() operation. A read is a simplified version of the join operation³. Hence, the code of the read() operation,

2. In the correctness proofs section we compute the value of C that allows any operation to terminate and be valid.
3. As indicated before, the read identified (i, 0) is the join_register() operation issued by p_i.
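As a side note, the per-process state listed in Section 4.1 can be rendered as a plain container (a Python rendering of ours; the paper itself uses pseudocode):

```python
from dataclasses import dataclass, field

@dataclass
class RegisterState:
    """Local variables of a process p_i (Section 4.1)."""
    register: object = None                      # local copy of the register
    sn: int = -1                                 # seq. number of last applied write
    active: bool = False                         # true once the join has completed
    reading: bool = False                        # true while a read is in progress
    read_sn: int = 0                             # timestamps this process's reads
    replies: set = field(default_factory=set)    # <id, value, sn> triples received
    reply_to: set = field(default_factory=set)   # processes joining concurrently
    write_ack: set = field(default_factory=set)  # acks for the last write
    dl_prev: set = field(default_factory=set)    # processes to answer after joining
```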
operation join_register(i):
(01) register_i ← ⊥; sn_i ← -1; active_i ← false;
(02) reading_i ← false; replies_i ← ∅; reply_to_i ← ∅;
(03) write_ack_i ← ∅; dl_prev_i ← ∅; read_sn_i ← 0;
(04) broadcast INQUIRY(i, read_sn_i);
(05) wait until (|replies_i| > C);
(06) let <id, val, sn> ∈ replies_i such that (∀ <-, -, sn'> ∈ replies_i : sn' ≤ sn);
(07) if (sn > sn_i) then sn_i ← sn; register_i ← val end if;
(08) active_i ← true;
(09) for each <j, r_sn> ∈ reply_to_i ∪ dl_prev_i do
(10)   send REPLY(<i, register_i, sn_i>, r_sn) to p_j
(11) end for;
(12) return(ok).

(13) when INQUIRY(j, r_sn) is delivered:
(14) if (active_i)
(15)   then send REPLY(<i, register_i, sn_i>, r_sn) to p_j;
(16)        if (reading_i) then
(17)          send DL_PREV(i, r_sn) to p_j
(18)        end if;
(19)   else reply_to_i ← reply_to_i ∪ {<j, r_sn>};
(20)        send DL_PREV(i, r_sn) to p_j
(21) end if.

(22) when REPLY(<j, value, sn>, r_sn) is delivered:
(23) if (r_sn = read_sn_i) then
(24)   replies_i ← replies_i ∪ {<j, value, sn>};
(25)   send ACK(i, r_sn) to p_j
(26) end if.

(27) when DL_PREV(j, r_sn) is delivered:
(28) dl_prev_i ← dl_prev_i ∪ {<j, r_sn>}.

Fig. 3. The join_register() protocol (code for p_i)

described in Figure 4, is a simplified version of the code of the join_register() operation. Each read invocation is identified by a pair made up of the process index i and a sequence number read_sn_i (line 03). p_i first broadcasts a read request READ(i, read_sn_i). Then, after it has received C replies, p_i selects the one with the greatest sequence number, updates (if needed) its local pair (register_i, sn_i), and returns the value of register_i. When p_i delivers a message READ(j, r_sn) while being active (line 09), it sends back a reply (line 10). If it is joining the system, p_i stores p_j's identifier to remember that it has to send back a reply to p_j when it terminates the join operation (line 11).
operation read(i):
(01) read_sn_i ← read_sn_i + 1;
(02) replies_i ← ∅; reading_i ← true;
(03) broadcast READ(i, read_sn_i);
(04) wait until (|replies_i| > C);
(05) let <id, val, sn> ∈ replies_i such that (∀ <-, -, sn'> ∈ replies_i : sn' ≤ sn);
(06) if (sn > sn_i) then sn_i ← sn; register_i ← val end if;
(07) reading_i ← false; return(register_i).

(08) when READ(j, r_sn) is delivered:
(09) if (active_i)
(10)   then send REPLY(<i, register_i, sn_i>, r_sn) to p_j
(11)   else reply_to_i ← reply_to_i ∪ {<j, r_sn>}
(12) end if.

Fig. 4. The read() protocol (code for p_i)

The write() operation. The code of the write operation is described in Figure 5. Let us recall that it is assumed that a single process at a time issues a write. When a process p_i wants to write, it first issues a read operation in order to obtain the sequence number associated with the last value written (line 01)⁴. Then, after it has broadcast the WRITE(i, <v, sn_i>) message to disseminate the new value and its sequence number to the other processes (line 04), p_i waits until it has received C acknowledgments. When this happens, it terminates the write operation by returning the control value ok (lines 05-06).

When a message WRITE(j, <val, sn>) is delivered, p_i takes into account the pair (val, sn) if it is more up-to-date than its current pair (line 08). In all cases, it sends back to the sender p_j an ACK(i, sn) message so that p_j can terminate its write operation (line 09). When an ACK(j, sn) message is delivered, p_i adds its sender j to its set write_ack_i if this message is an answer to its last write (line 11).

operation write(v):
(01) read(i);
(02) sn_i ← sn_i + 1; register_i ← v;
(03) write_ack_i ← ∅;
(04) broadcast WRITE(i, <v, sn_i>);
(05) wait until (|write_ack_i| > C);
(06) return(ok).

(07) when WRITE(j, <val, sn>) is delivered:
(08) if (sn > sn_i) then register_i ← val; sn_i ← sn end if;
(09) send ACK(i, sn) to p_j.

(10) when ACK(j, sn) is delivered:
(11) if (sn = sn_i) then write_ack_i ← write_ack_i ∪ {j} end if.

Fig. 5.
The write() protocol (code for p_i)

Due to lack of space, we omit the correctness proofs, which can be found in the supplementary material.

5 RELATED WORK

Several works have recently addressed the implementation of concurrent data structures on wired message-passing dynamic systems (e.g., [1], [4], [8], [11], [21]). In [21], a Reconfigurable Atomic Memory for Basic Objects (RAMBO) is presented. RAMBO works in a distributed system where processes can join and fail during the execution of the algorithm. To guarantee the reliability of data in spite of network changes, RAMBO replicates data at several network locations and defines configurations to manage small and transient changes. Each configuration is composed of a set of members, a set of read-quorums and a set of write-quorums. In order to manage large changes to the set of participating processes, RAMBO defines a reconfiguration procedure whose aim is to move from an existing configuration to a new one where the set of members, read-quorums or write-quorums is modified. In order to ensure atomicity, the reconfiguration procedure is implemented by a distributed consensus algorithm that makes all the processes agree on the same sequence of configurations. Therefore, RAMBO cannot be implemented in a fully asynchronous system.

4. In the absence of concurrent write operations, this read obtains the greatest sequence number. The same strategy is used in protocols implementing atomic registers (e.g., [3], [10]).

It
is important to note that in RAMBO the notion of churn is abstracted by defining a sequence of configurations. Note that RAMBO poses some constraints on the removal of old configurations; in particular, a configuration S cannot be removed until each operation executed by processes belonging to S has ended. As a consequence, many old configurations may take a long time to be removed. [11] and [8] present some improvements to the original RAMBO protocol, and in particular to its reconfiguration mechanism. In [11] the reconfiguration protocol has been changed by parallelizing the installation of new configurations and the removal of an arbitrary number of old configurations. In [8], the authors present a mechanism that combines the features of RAMBO and the underlying consensus algorithm to speed up the reconfiguration and reduce the time during which old configurations are accessible.

In [1], Aguilera et al. show that an atomic register can be realized without consensus and, thus, on a fully asynchronous distributed system, provided that the number of reconfiguration operations is finite and thus the churn is quiescent (i.e., there exists a finite time after which there are no more joins or failures). Configurations are managed by taking into account all the changes (i.e., joins and failures of processes) suggested by the participants, and the quorums are represented by any majority of processes. To ensure the liveness of read and write operations, the authors assume that the number of reconfigurations is finite and that there is a majority of correct processes in each reconfiguration.

In [4], we presented an implementation of a regular register in an eventually synchronous distributed system prone to continuous churn. Contrary to what has been presented in this paper, [4] assumes that the size of the distributed system is constant (i.e., at any instant of time the same number of processes join and leave the distributed system).
In particular, we have shown that, if the distributed system size n does not change, a regular register can be implemented if at any time at least n/2 active processes participate in the register implementation, with no constraint on the value of n. The same paper shows that no regular register can be implemented in a fully asynchronous system in the presence of continuous churn. Let us finally remark that the result presented in [4] can be seen as a particular case of the result presented in the previous section, obtained by taking k_1 = k_2 = 0 and letting n_0·c processes (where c is a percentage of the nodes) invoke the join operation while n_0·c processes leave the system at every time unit (i.e., λ(t) = μ(t) = n_0·c). Figure 6 summarizes the system model assumptions and the constraints on processes employed by the different algorithms. Note that churn-quiescent implementations (e.g., [1], [8], [11], [21]) do not explicitly use the notion of active process; they instead use the notion of correct process. It is, however, possible to consider an active process as being a correct process. The converse is not true, because a correct process does not pass through a join operation. This is a consequence of the fact that churn-quiescent implementations do not separate the distributed system from the distributed computation.

Fig. 6. Register in Dynamic Systems prone to Churn

6 CONCLUSION

In modern distributed systems, the continuous departure and arrival of processes (churn) is part of the system model and creates additional unpredictability that distributed applications have to master. As an example, churn creates conditions for consistency violations in large-scale storage systems, and the probability of such violations usually increases with the churn. This is why such storage systems do not provide deterministic consistency guarantees (e.g., regular or atomic registers).
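The relationship between the feasibility conditions of the previous section and the constant-size special case of [4] can be sketched as a simple check. The function name and its encoding of the conditions are ours, not part of the paper's formal development:

```python
def feasible(n0, k1, k2, n_t, active_t):
    """Hedged sketch (our own encoding) of the feasibility conditions:
    bounded system size, enough active processes, and the constraint
    n0 > 2*k1 + k2 on the initial system size."""
    size_ok = (n0 - k1) <= n_t <= (n0 + k2)   # system size stays bounded
    active_ok = active_t > (n0 + k2) / 2      # enough active processes
    model_ok = n0 > 2 * k1 + k2               # constraint on the initial size
    return size_ok and active_ok and model_ok

# Special case of [4]: k1 = k2 = 0, i.e., the system size stays at n0 and
# the condition reduces to "more than n0/2 active processes".
print(feasible(n0=100, k1=0, k2=0, n_t=100, active_t=51))  # True
print(feasible(n0=100, k1=0, k2=0, n_t=100, active_t=50))  # False
```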
Hence, there is a need to capture the churn of a dynamic system through tractable yet realistic models, in order to pave the way to distributed applications whose correctness can be formally proved. This paper, based on the generic churn model defined in [5], has presented the implementation of a single-writer/multiple-reader regular register in such a model. It has been formally proved that a regular register can be implemented in an eventually synchronous distributed system if, at any time, the number of active processes is greater than (n_0 + k_2)/2, the number of processes in the distributed system remains between n_0 − k_1 and n_0 + k_2 (where n_0 is the number of processes in the system at time t_0), and n_0 is greater than 2k_1 + k_2. Interestingly, this implementation has shown in a precise way that, when one wants to implement a shared register in a dynamic system, there is a tradeoff between the acceptable degree of churn and the synchrony of the underlying system (namely, the churn has to decrease when one moves from a synchronous system to an eventually synchronous one).

ACKNOWLEDGEMENTS

The authors want to thank the anonymous reviewers for their comments, which greatly improved the content and presentation of the paper. This work has been partially supported by the STREP EU project SM4ALL and the IP EU project SOFIA.
REFERENCES

[1] Aguilera M. K., Keidar I., Malkhi D., Shraer A., Dynamic Atomic Storage Without Consensus. Proc. 28th Annual ACM Symposium on Principles of Distributed Computing (PODC), 2009.
[2] Anderson E., Li X., Shah M. A., Tucek J., Wylie J. J., What Consistency Does Your Key-Value Store Actually Provide? (to appear) Proc. 6th Workshop on Hot Topics in System Dependability (HotDep).
[3] Attiya H., Bar-Noy A., Dolev D., Sharing Memory Robustly in Message-Passing Systems. JACM, 42(1).
[4] Baldoni R., Bonomi S., Kermarrec A.-M., Raynal M., Implementing a Register in a Dynamic Distributed System. Proc. 29th IEEE Int'l Conference on Distributed Computing Systems (ICDCS'09), IEEE Computer Society Press, Montreal (Canada), June 2009.
[5] Baldoni R., Bonomi S., Raynal M., Regular Register: an Implementation in a Churn Prone Environment. 16th International Colloquium on Structural Information and Communication Complexity (SIROCCO), Springer-Verlag LNCS #5869.
[6] Baldoni R., Shvartsman A. A., Theoretical Aspects of Dynamic Distributed Systems: Report on the Workshop. SIGACT News, 40(4):87-89.
[7] Chandra T., Toueg S., Unreliable Failure Detectors for Reliable Distributed Systems. JACM, 43(2).
[8] Chockler G., Gilbert S., Gramoli V., Musial P. M., Shvartsman A., Reconfigurable Distributed Storage for Dynamic Networks. Journal of Parallel and Distributed Computing, 69(1), 2009.
[9] Dwork C., Lynch N., Stockmeyer L., Consensus in the Presence of Partial Synchrony. JACM, 35(2).
[10] Friedman R., Raynal M., Travers C., Abstractions for Implementing Atomic Objects in Distributed Systems. 9th Int'l Conference on Principles of Distributed Systems (OPODIS'05), LNCS #3974.
[11] Gilbert S., Lynch N., Shvartsman A., RAMBO II: Rapidly Reconfigurable Atomic Memory for Dynamic Networks. Proc. International Conference on Dependable Systems and Networks (DSN 2003).
[12] Guerraoui R., Levy R. R., Pochon B.,
and Pugh J., The Collective Memory of Amnesic Processes. ACM Transactions on Algorithms, 4(1), 2008.
[13] Godfrey B., Shenker S., Stoica I., Minimizing Churn in Distributed Systems. Proc. 2006 Conference on Applications, Technologies, Architectures, and Protocols for Computer Communications (SIGCOMM), 2006.
[14] Guha S., Daswani N., Jain R., An Experimental Study of the Skype Peer-to-Peer VoIP System. Proc. 5th International Workshop on Peer-to-Peer Systems (IPTPS), 2006.
[15] Ko S., Hoque I., Gupta I., Using Tractable and Realistic Churn Models to Analyze Quiescence Behavior of Distributed Protocols. Proc. 27th IEEE Int'l Symposium on Reliable Distributed Systems (SRDS'08).
[16] Kuhn F., Schmid S., Wattenhofer R., A Self-Repairing Peer-to-Peer System Resilient to Dynamic Adversarial Churn. Proc. 4th International Workshop on Peer-to-Peer Systems (IPTPS).
[17] Kuhn F., Schmid S., Smit J., Wattenhofer R., A Blueprint for Constructing Peer-to-Peer Systems Robust to Dynamic Worst-Case Joins and Leaves. Proc. 14th IEEE International Workshop on Quality of Service (IWQoS), 2006.
[18] Lamport L., On Interprocess Communication, Part 1: Models, Part 2: Algorithms. Distributed Computing, 1(2):77-101.
[19] Liben-Nowell D., Balakrishnan H., Karger D. R., Analysis of the Evolution of Peer-to-Peer Systems. Proc. 21st ACM Symposium on Principles of Distributed Computing (PODC), ACM Press.
[20] Liben-Nowell D., Karger D. R., Kaashoek M. F., Dabek F., Balakrishnan H., Stoica I., Morris R., Chord: A Scalable Peer-to-Peer Lookup Protocol for Internet Applications. IEEE/ACM Transactions on Networking, 11(1), 2003.
[21] Lynch N., Shvartsman A., RAMBO: A Reconfigurable Atomic Memory Service for Dynamic Networks. Proc. 16th Int'l Symposium on Distributed Computing (DISC'02), Springer-Verlag LNCS #2508.
[22] Malkhi D., Reiter M. K.,
Byzantine Quorum Systems. Distributed Computing, 11(4), 1998.
[23] Mostefaoui A., Raynal M., Travers C., Peterson S., El Abbadi, Agrawal D., From Static Distributed Systems to Dynamic Systems. Proc. 24th IEEE Symposium on Reliable Distributed Systems (SRDS'05), IEEE Computer Society Press.
[24] Merritt M., Taubenfeld G., Computing with Infinitely Many Processes. Proc. 14th Int'l Symposium on Distributed Computing (DISC'00), LNCS #1914.
[25] Shao C., Pierce E., Welch J., Multi-Writer Consistency Conditions for Shared Memory Objects. Proc. 17th Int'l Symposium on Distributed Computing (DISC'03), Springer-Verlag LNCS #2848.
[26] Sousa P., Bessani A. N., Correia M., Ferreira Neves N., Veríssimo P., Highly Available Intrusion-Tolerant Services with Proactive-Reactive Recovery. IEEE Transactions on Parallel and Distributed Systems, 21(4), 2010.
[27] Voulgaris S., Gavidia D., van Steen M., CYCLON: Inexpensive Membership Management for Unstructured P2P Overlays. Journal of Network and Systems Management, 13(2), 2005.

Roberto Baldoni is Professor at the University of Rome La Sapienza, where he leads the Distributed Systems group and the MIDLAB laboratory. His research interests include distributed computing, dependable and secure distributed systems, distributed information systems, and distributed event-based processing. Roberto's research at the University of Rome has been funded over the years by the European Commission, the Italian Ministry of Research, IBM, Microsoft, Finmeccanica, and Telecom Italia. In 2010, he received the Science2business Award and the IBM Faculty Award. Roberto is the author of around 150 research papers spanning the theory and practice of distributed systems. Roberto belongs to the Steering Committee of ACM DEBS, which he chaired in 2008, and he is a member of ACM, IEEE, and the IFIP WG.

Silvia Bonomi is a PhD in Computer Science at the University of Rome La Sapienza.
She is doing research in various computer science fields, including dynamic distributed systems and event-based systems. In these research fields, she has published several papers in peer-reviewed scientific forums. As part of the MIDLAB research group, she is currently involved in an EU-funded project dealing with energy saving in private and public buildings (the GreenerBuildings project); she has also worked on dependable distributed systems (the ReSIST network of excellence) and on the definition of new semantic tools for e-government (SemanticGov).

Michel Raynal is a professor of computer science at the University of Rennes, France. His main research interests are the basic principles of distributed computing systems. He is a world-leading researcher in the domain of distributed computing. He is the author of numerous papers on distributed computing (more than 120 in journals and 250 in international conferences) and is well known for his distributed algorithms and his nine books on distributed computing. He has chaired the program committees of the major conferences on the topic (e.g., ICDCS, DISC, SIROCCO, and OPODIS). He has also served on the program committees of many international conferences, and is the recipient of several Best Paper awards (ICDCS 1999, 2000 and 2001, SSS 2009, Europar 2010). He has been invited by many universities all over the world to give lectures on distributed computing. His h-index is 45. He has recently written two books published by Morgan & Claypool: Communication and Agreement Abstractions for Fault-Tolerant Asynchronous Distributed Systems (June 2010) and Fault-Tolerant Agreement in Synchronous Distributed Systems (September 2010). Since 2010, Michel Raynal has been a senior member of the prestigious Institut Universitaire de France.
Time-Free Authenticated Byzantine Consensus Hamouma Moumen, Achour Mostefaoui To cite this version: Hamouma Moumen, Achour Mostefaoui. Time-Free Authenticated Byzantine Consensus. Franck Capello and Hans-Peter
More informationA Dual Digraph Approach for Leaderless Atomic Broadcast
A Dual Digraph Approach for Leaderless Atomic Broadcast (Extended Version) Marius Poke Faculty of Mechanical Engineering Helmut Schmidt University marius.poke@hsu-hh.de Colin W. Glass Faculty of Mechanical
More informationDistributed Algorithms Benoît Garbinato
Distributed Algorithms Benoît Garbinato 1 Distributed systems networks distributed As long as there were no machines, programming was no problem networks distributed at all; when we had a few weak computers,
More informationDistributed Algorithms Reliable Broadcast
Distributed Algorithms Reliable Broadcast Alberto Montresor University of Trento, Italy 2016/04/26 This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. Contents
More informationFrom a Store-collect Object and Ω to Efficient Asynchronous Consensus
From a Store-collect Object and Ω to Efficient Asynchronous Consensus Michel Raynal, Julien Stainer To cite this version: Michel Raynal, Julien Stainer. From a Store-collect Object and Ω to Efficient Asynchronous
More informationQuiescent Consensus in Mobile Ad-hoc Networks using Eventually Storage-Free Broadcasts
Quiescent Consensus in Mobile Ad-hoc Networks using Eventually Storage-Free Broadcasts François Bonnet Département Info & Télécom, École Normale Supérieure de Cachan, France Paul Ezhilchelvan School of
More informationInitial Assumptions. Modern Distributed Computing. Network Topology. Initial Input
Initial Assumptions Modern Distributed Computing Theory and Applications Ioannis Chatzigiannakis Sapienza University of Rome Lecture 4 Tuesday, March 6, 03 Exercises correspond to problems studied during
More informationDistributed Systems COMP 212. Lecture 19 Othon Michail
Distributed Systems COMP 212 Lecture 19 Othon Michail Fault Tolerance 2/31 What is a Distributed System? 3/31 Distributed vs Single-machine Systems A key difference: partial failures One component fails
More informationThe Alpha of Indulgent Consensus
The Computer Journal Advance Access published August 3, 2006 Ó The Author 2006. Published by Oxford University Press on behalf of The British Computer Society. All rights reserved. For Permissions, please
More informationExclusion-Freeness in Multi-party Exchange Protocols
Exclusion-Freeness in Multi-party Exchange Protocols Nicolás González-Deleito and Olivier Markowitch Université Libre de Bruxelles Bd. du Triomphe CP212 1050 Bruxelles Belgium {ngonzale,omarkow}@ulb.ac.be
More informationGeoQuorums: implementing atomic memory
Distrib. Comput. (2005) 18(2): 125 155 DOI 10.1007/s00446-005-0140-9 SPEC ISSUE DISC 03 Shlomi Dolev Seth Gilbert Nancy A. Lynch Alexander A. Shvartsman JenniferL.Welch GeoQuorums: implementing atomic
More informationPractical Byzantine Fault Tolerance. Miguel Castro and Barbara Liskov
Practical Byzantine Fault Tolerance Miguel Castro and Barbara Liskov Outline 1. Introduction to Byzantine Fault Tolerance Problem 2. PBFT Algorithm a. Models and overview b. Three-phase protocol c. View-change
More informationEtna: a Fault-tolerant Algorithm for Atomic Mutable DHT Data
Etna: a Fault-tolerant Algorithm for Atomic Mutable DHT Data Athicha Muthitacharoen Seth Gilbert Robert Morris athicha@lcs.mit.edu sethg@mit.edu rtm@lcs.mit.edu MIT Computer Science and Artificial Intelligence
More informationMODELS OF DISTRIBUTED SYSTEMS
Distributed Systems Fö 2/3-1 Distributed Systems Fö 2/3-2 MODELS OF DISTRIBUTED SYSTEMS Basic Elements 1. Architectural Models 2. Interaction Models Resources in a distributed system are shared between
More informationFault Tolerance. Distributed Systems. September 2002
Fault Tolerance Distributed Systems September 2002 Basics A component provides services to clients. To provide services, the component may require the services from other components a component may depend
More informationArvind Krishnamurthy Fall Collection of individual computing devices/processes that can communicate with each other
Distributed Systems Arvind Krishnamurthy Fall 2003 Concurrent Systems Collection of individual computing devices/processes that can communicate with each other General definition encompasses a wide range
More informationBrewer's Conjecture and the Feasibility of Consistent, Available, Partition-Tolerant Web Services
PODC 2004 The PODC Steering Committee is pleased to announce that PODC 2004 will be held in St. John's, Newfoundland. This will be the thirteenth PODC to be held in Canada but the first to be held there
More informationByzantine Fault-Tolerance with Commutative Commands
Byzantine Fault-Tolerance with Commutative Commands Pavel Raykov 1, Nicolas Schiper 2, and Fernando Pedone 2 1 Swiss Federal Institute of Technology (ETH) Zurich, Switzerland 2 University of Lugano (USI)
More informationSynchronization is coming back, but is it the same?
Synchronization is coming back, but is it the same? Michel Raynal To cite this version: Michel Raynal. Synchronization is coming back, but is it the same?. [Research Report] PI 1875, 2007, pp.16.
More information