Eventually k-bounded Wait-Free Distributed Daemons


Yantao Song and Scott M. Pike
Texas A&M University, Department of Computer Science
College Station, TX, USA
{yantao,
Technical Report: TAMU-CS-TR

Abstract

Wait-free scheduling is unsolvable in asynchronous message-passing systems subject to crash faults. Given the practical importance of this problem, we examine its solvability under partial synchrony relative to the eventually perfect failure detector 3P. Specifically, we present a new oracle-based solution to the dining philosophers problem that is wait-free in the presence of arbitrarily many crash faults. Additionally, our solution satisfies eventual k-bounded waiting, which guarantees that every execution has an infinite suffix where no process can overtake any live hungry neighbor more than k consecutive times. Finally, our algorithm uses only bounded space and bounded-capacity channels, and is quiescent with respect to crashed processes. Among other practical applications, our results support wait-free distributed daemons for fairly scheduling self-stabilizing protocols in the presence of crash faults.

Keywords: self-stabilization, daemons, wait-freedom

1. Introduction

Self-stabilization [11] is a fundamental technique for developing dependable systems. Starting from any configuration, self-stabilizing algorithms always converge to a closed set of safe states from which correct behavior follows. As such, stabilization is useful for autonomic systems that must bootstrap from arbitrary initial states. More importantly, however, stabilization is an effective technique for recovering from transient faults, which, in general, can drive systems into arbitrary configurations. A fundamental assumption for self-stabilization is that every correct process executes infinitely many steps.
(This work was supported by the Advanced Research Program of the Texas Higher Education Coordinating Board under Project Number. Another version of this paper was published at the 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN 2007).)

This assumption is necessary to guarantee convergence. For example, suppose that some live process j executes only finitely many steps, followed by a local transient fault that yields an unsafe state. If subsequent steps by j are necessary to detect and/or correct the fault, the overall system may never recover from the unsafe state. Recent work has examined the assumption that live processes take infinitely many steps. The work in [12] shows that this requires the underlying microprocessors to be self-stabilizing as well. For example, soft errors may cause micro-code-controlled processors to loop in a subset of code without the fetch-decode-execute cycle. This undermines convergence, because the microprocessor ceases to execute the subsequent instructions of its application processes.

A parallel line of work has examined self-stabilizing daemons [1, 3, 15, 19, 6]. In general, conflicting actions can impose scheduling constraints. For instance, algorithms using shared memory must coordinate access to critical sections of code that update shared variables. Concurrency control is often coordinated by a daemon that schedules a set of processes to execute non-conflicting actions. Distributed daemons are commonly implemented by dining philosopher algorithms, where each diner represents a process in the stabilizing protocol. As an abstraction of local mutual exclusion, processes with conflicting actions are connected as neighbors in the conflict graph, and each diner becomes hungry infinitely often.
When scheduled to eat, a diner can execute any enabled action in the stabilizing protocol, because the mutual exclusion of dining guarantees that no conflicting neighbor will be scheduled to eat simultaneously. Many dining-based daemons stabilize from transient faults to the daemon itself [1, 3, 15, 19, 6]. This is because transient corruptions to the daemon could result in deadlock, which would prevent correct processes from eating infinitely often. A limitation of these daemons, however, is that none addresses the pragmatic possibility of crash faults, whereby processes cease execution without warning and never recover. As it turns out, no purely asynchronous daemon can mask the impact of crash faults entirely; starvation of correct processes is unavoidable [8]. Since diners that starve are never scheduled to eat again, convergence cannot be guaranteed. The conclusion is that stabilization becomes impossible in crash-faulty environments unless we consider some recourse to crash-fault detection.

This paper explores the solvability of wait-free, eventually k-bounded distributed daemons as schedulers for self-stabilizing protocols in the presence of crash faults. We assume that only the stabilizing protocol is subject to transient faults, but that both the protocol and the daemon layers are subject to crash faults. Thus, we consider daemons that are wait-free, but not necessarily stabilizing for transient faults to daemon variables. Our work demonstrates the solvability of wait-free scheduling in partially synchronous systems sufficient to implement the eventually perfect failure detector 3P from the Chandra-Toueg hierarchy [7]. 3P detectors always suspect crashed processes and eventually stop suspecting correct processes. As such, 3P oracles can make finitely many false-positive mistakes during any run. Although 3P provides unreliable information, we show that 3P is still sufficient to solve wait-free dining under eventual weak exclusion 3WX. This safety model guarantees that, for every run, there exists an unknown time after which no two live neighbors eat simultaneously. As such, dining under 3WX permits finitely many scheduling mistakes during any run. Interestingly, wait-free dining is impossible with 3P oracles under the slightly stronger criterion of perpetual weak exclusion [20], which guarantees that no two live neighbors ever eat simultaneously. Our interest in 3WX is motivated by two factors. First, it admits practical wait-free implementations using only 3P, which is a modestly powerful oracle that is implementable in many realistic models of partial synchrony [7, 13, 14].
Second, 3WX is well-suited as a scheduling model for stabilizing algorithms, insofar as each scheduling mistake can be viewed as a sharing violation that precipitates at worst a transient fault on the stabilization layer. Despite making mistakes under 3WX, a wait-free daemon guarantees that every correct process will execute infinitely many steps, which thereby guarantees convergence to safe states after finitely many transient faults.

Our distributed daemon satisfies several useful properties in addition to wait-freedom. First, it satisfies a degree of eventual fairness (eventual k-bounded waiting), which guarantees that every execution has an infinite suffix where no process overtakes any hungry neighbor more than k consecutive times. Additionally, our algorithm uses only bounded space and requires only bounded-capacity channels. Finally, our algorithm is also quiescent with respect to crashed processes, which means that correct processes eventually stop sending messages to crashed neighbors.

2. Background and Terminology

Computational Model. We consider asynchronous message-passing systems augmented with a local, eventually perfect failure detector 3P 1 (defined below). As such, message delays and relative process speeds are unbounded, but each process has access to a local oracle that provides information about crash faults in each run. A crash fault occurs when a process ceases execution without warning and never recovers [9]. For each run α, a process i is either faulty or correct. We say that i is faulty in α if i crashes at some time t in α; otherwise, i is correct in α. Additionally, process i is live at time t if i has not crashed by time t. Consequently, correct processes are always live, and faulty processes are live only prior to crashing. Each system is modeled by a set of n distributed processes Π = {p 1, p 2, ..., p n} that communicate only by asynchronous message passing.
We assume reliable FIFO channels: every message sent to a correct process is eventually received by that process in the order sent, and messages are neither lost, duplicated, nor corrupted.

Failure Detectors. An unreliable failure detector can be viewed as a distributed oracle that can be queried for (possibly incorrect) information about crash faults in Π. Each process has access to its own local detector module that outputs a set of processes currently suspected of having crashed. Unreliable failure detectors are characterized by the kinds of mistakes they can make. Mistakes include false negatives (i.e., not suspecting a crashed process), as well as false positives (i.e., wrongfully suspecting a correct process). In Chandra and Toueg's original definition [7], each class of failure detectors is defined by two properties: completeness and accuracy. Completeness restricts false negatives, while accuracy restricts false positives.

We use a locally scope-restricted refinement of the eventually perfect failure detector called 3P 1 [4, 17]. This oracle satisfies the properties of 3P, but only with respect to immediate neighbors in the dining conflict graph (defined below):

Local Strong Completeness: Every crashed process is eventually and permanently suspected by all correct neighbors.

Local Eventual Strong Accuracy: For every run, there exists a time after which no correct process is suspected by any correct neighbor.

Therefore, 3P 1 may commit finitely many false-positive mistakes during any run by suspecting correct neighbors. Eventually, however, each 3P 1 detector converges, after which it provides reliable information about its neighbors. Unfortunately, the time of convergence is unknown and may vary from run to run.
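The two 3P 1 properties above can be modeled as a toy simulation. This is a minimal sketch under our own assumptions (a global logical clock and a known convergence time, which real detectors do not have); all names are illustrative, not from the paper.

```python
# Toy model of an eventually-perfect-style detector module: after an
# (unknown to the processes) convergence time, its suspect set is exactly
# the crashed neighbors; before that, it may also wrongly suspect correct
# neighbors. For simplicity, crashed neighbors are suspected throughout.

class EventuallyPerfectDetector:
    def __init__(self, neighbors, crashed, convergence_time):
        self.neighbors = set(neighbors)
        self.crashed = set(crashed)            # ground truth, hidden from processes
        self.convergence_time = convergence_time

    def suspects(self, now, noisy_guess=frozenset()):
        """Return the suspect set output at logical time `now`."""
        if now >= self.convergence_time:
            # Converged: local strong completeness + eventual strong accuracy.
            return self.crashed & self.neighbors
        # Before convergence: finitely many false positives are allowed.
        return (self.crashed | set(noisy_guess)) & self.neighbors

det = EventuallyPerfectDetector(neighbors={"j", "k"}, crashed={"k"}, convergence_time=10)
early = det.suspects(now=3, noisy_guess={"j"})   # may wrongly suspect correct j
late = det.suspects(now=42)                      # only the crashed neighbor k
```

The key point the sketch captures is that queries before convergence are unreliable, while every query after convergence returns exactly the crashed neighbors.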

Distributed Daemons and Dining Philosophers. The local program executed by each process i in a stabilizing protocol can be modeled as a set of actions (guarded commands). Processes i and j are connected as neighbors in a conflict graph if i and j cannot be scheduled independently; that is, the actions of i and j have overlapping constraints and should not be scheduled to execute simultaneously. A distributed daemon [11, 3] is a scheduler that continually selects a non-empty subset of processes to execute a set of non-conflicting actions. Distributed daemons are often implemented as solutions to the well-known dining philosophers problem, which is a classic paradigm of process synchronization. Originally proposed by Dijkstra for a ring topology [10], dining was later generalized by Lynch for local mutual exclusion problems on arbitrary conflict graphs [18].

A dining instance is modeled by an undirected conflict graph C = (Π, E), where each vertex i ∈ Π represents a diner, and each edge (i, j) ∈ E indicates a potential conflict between neighbors i and j. We assume that each pair of neighbors in the conflict graph is connected by a reliable FIFO channel. At any time, the state of a diner is either thinking, hungry, or eating. These abstract states correspond to three basic phases of an ordinary process: executing independently, requesting shared resources, and utilizing shared resources in a critical section, respectively. A hungry session of any process i is the (inclusive) time period from when i becomes hungry until i is scheduled to eat. Initially, every process is thinking. Although processes may think forever, they are also permitted to become hungry at any time. By contrast, correct processes can eat only for a finite (but not necessarily bounded) period of time. Hungry neighbors are said to be in conflict, because they compete for shared but mutually exclusive resources.
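The dining model above (conflict graph plus per-diner state) can be sketched directly. A minimal illustration with our own names, not the paper's code:

```python
# Minimal model of a dining instance: an undirected conflict graph
# C = (Pi, E) plus a dining state per process.

THINKING, HUNGRY, EATING = "thinking", "hungry", "eating"

class DiningInstance:
    def __init__(self, processes, edges):
        self.neighbors = {p: set() for p in processes}
        for (i, j) in edges:            # edges are undirected
            self.neighbors[i].add(j)
            self.neighbors[j].add(i)
        # Initially, every process is thinking.
        self.state = {p: THINKING for p in processes}

    def in_conflict(self, i, j):
        # Hungry neighbors compete for shared, mutually exclusive resources.
        return (j in self.neighbors[i]
                and self.state[i] == HUNGRY and self.state[j] == HUNGRY)

# Dijkstra's original ring topology on four diners.
ring = DiningInstance(range(4), [(0, 1), (1, 2), (2, 3), (3, 0)])
ring.state[0] = ring.state[1] = HUNGRY
```

Non-neighbors are never in conflict, no matter their states, which is exactly why a daemon may schedule them together.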
A correct dining solution under eventual weak exclusion 3WX must satisfy the following two requirements:

Safety (3WX): For every run, there exists a time after which no two live neighbors ever eat simultaneously.

Progress (Wait-Freedom): Every correct hungry process eventually eats, regardless of process crashes.

The safety criterion permits dining solutions to make at most finitely many scheduling mistakes in any run. The progress criterion prevents dining solutions from starving correct hungry processes by never scheduling them to eat. A dining algorithm that satisfies progress in the presence of arbitrarily many crash faults is called wait-free [16].

Fairness. We say that a daemon satisfies perpetual k-bounded waiting (2k-BW) if no process i can be selected more than k consecutive times while any correct neighbor j remains continuously hungry [5, 2]. A daemon satisfies eventual k-bounded waiting (3k-BW) if, for every run, there exists a time after which no process i can be selected more than k consecutive times while any correct neighbor j remains continuously hungry. Thus, bounded waiting is a measure of fairness, where 2k-BW denotes perpetual k-fairness and 3k-BW denotes eventual k-fairness. In the context of dining-based daemons, every run of a distributed daemon that satisfies 3k-BW has an infinite suffix that guarantees k-fairness among correct processes. That is, for every run, there exists a (potentially unknown) time after which no correct hungry process i can be overtaken more than k consecutive times by any correct neighbor j. In this paper, we present a dining-based daemon that achieves 3k-BW for k = 2.

3. Algorithm Description

Our algorithm is related to the asynchronous doorway dining algorithm of Choy and Singh [8], insofar as we use forks for safety and an asynchronous doorway for fairness. Processes connected by an edge in the conflict graph share the corresponding fork. In order to eat, a hungry process must collect and hold all of its shared forks.
This provides a simple basis for safety, since at most one neighbor can hold a given fork at any time. If two neighbors compete for one fork, the conflict is always resolved in favor of the neighbor with higher priority. Process priorities are static and represented by colors, which are assigned to processes at the beginning of each run. Standard node-coloring algorithms can be used to assign colors to processes such that no two neighbors have the same color. An asynchronous doorway is used to prevent higher-priority processes from starving lower-priority neighbors. In the original doorway algorithm, every hungry process must go through two phases in order to eat: Phase 1, outside the doorway; Phase 2, inside the doorway. In Phase 1, in order to enter the doorway, a process collects acks from all of its neighbors. In Phase 2, in order to eat, a process must hold all of its shared forks continuously. Thus, when a process i goes to eat, i must be inside the doorway and hold all of its shared forks.

Phase 1: In the original doorway solution, when a process i becomes hungry, it tries to enter the doorway. In order to do so, process i must receive one acknowledgment from each neighbor through the ping-ack protocol. An acknowledgment (ack) indicates that the corresponding neighbor allows i to enter the doorway. If a neighbor j is outside the doorway, then j will send the ack to process i; otherwise, j will defer sending the ack to i until j exits the doorway. Provided that processes inside the doorway eventually exit it, every hungry process must eventually enter the doorway. The doorway provides a basis for fairness, simply because a hungry process inside the doorway prevents its neighbors from entering the doorway.
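The two eligibility checks of the original two-phase scheme can be restated in executable form. A sketch with our own function names, not the paper's:

```python
# Phase 1: a hungry process may enter the doorway once it holds an ack
# from every neighbor. Phase 2: it may eat once it is inside the doorway
# and holds every fork it shares with a neighbor.

def can_enter_doorway(neighbors, acks_received):
    # One ack per neighbor is required to pass the doorway.
    return set(neighbors) <= set(acks_received)

def can_eat(inside_doorway, neighbors, forks_held):
    # Eating additionally requires holding all shared forks.
    return inside_doorway and set(neighbors) <= set(forks_held)
```

Note that this is the crash-free version: a crashed neighbor never acks and never relinquishes a fork, which is exactly the blocking problem the oracle-based revision below addresses.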

In our fault model, crashed processes stop sending messages, including ack messages. Consequently, in the original doorway solution, if a process crashes, its neighbors may be blocked outside the doorway forever and starve. To solve this problem, we introduce 3P 1 into our algorithm. The local strong completeness property guarantees that every crashed process will be eventually and permanently suspected by all correct neighbors. As such, these neighbors can use suspicion by 3P 1 in place of the missing acks to enter the doorway. On the other hand, in the original doorway solution, while some hungry process i waits for the ack from a neighbor j, other neighbors can enter the doorway arbitrarily (though finitely) many times. To achieve eventual 2-bounded waiting, we introduce a modified doorway in our algorithm. Specifically, each process i grants at most one ack per neighbor j per hungry session of i. Counting the one additional ack that may have been sent while i was still thinking, this mechanism ensures that no neighbor of i can enter the doorway more than twice while i remains continuously hungry.

We revise the original ping-ack protocol as follows. Each hungry process i sends a ping message to each neighbor j to request the doorway ack. Upon receiving the ping from i, j sends the ack if either (1) j is thinking, or (2) j is hungry, outside the doorway, and has not already sent an ack to i during the current hungry session of j. Otherwise, j defers sending the ack to i until after j eats and exits the doorway. Notice that it is possible for two neighbors to enter the doorway simultaneously. If two neighbors suspect each other (before 3P 1 converges), then both can enter the doorway regardless of ack messages. Alternatively, neighbors can receive acks from each other simultaneously while outside the doorway, and then enter together. The symmetry between hungry neighbors inside the doorway is resolved by the color-based priority scheme in phase 2.
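The responder side of the revised ping-ack rule (grant at most one ack per neighbor per hungry session) can be sketched as a single decision function. Names are illustrative, not the paper's:

```python
# Decide how a process answers an incoming ping, following the revised
# rule above: ack if thinking, or if hungry, outside the doorway, and
# not yet replied to this requester during the current hungry session;
# otherwise defer until after eating and exiting the doorway.

def on_ping(state, inside_doorway, replied_to, requester):
    """Return ('ack' | 'defer') and the updated replied set."""
    if state == "thinking":
        return "ack", replied_to                  # case (1): always grant
    if state == "hungry" and not inside_doorway and requester not in replied_to:
        return "ack", replied_to | {requester}    # case (2): first ack this session
    return "defer", replied_to                    # otherwise: defer

# One hungry session of j: the first ping from i is acked, a repeat is deferred.
d1, replied = on_ping("hungry", False, set(), "i")
d2, replied = on_ping("hungry", False, replied, "i")
```

Tracking `replied_to` only while hungry mirrors line 10 of Algorithm 1, where the replied flag is set only when the responder is hungry.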
Phase 2: In the original doorway algorithm, in order to eat, a hungry process must hold all of its shared forks continuously. After a hungry process enters the doorway, it begins to collect all shared forks. Every fork is associated with one edge in the conflict graph. Processes connected by an edge share the corresponding fork, which is used to resolve conflicts over the overlapping set of resources they both need. When two neighbors compete for one fork, the conflict is always resolved in favor of the neighbor with higher priority. This provides a simple basis for safety, since at most one neighbor can hold a given fork at any time. In the original doorway solution, any process that crashes while holding forks will cause its hungry neighbors to starve, because the forks necessary for eating can never be acquired. To solve this problem, we again use 3P 1: hungry neighbors can use suspicion by 3P 1 in place of a missing fork to proceed to eat.

In phase 2, we use the following fork-collection scheme. Each hungry process i that enters the doorway sends a request for each missing fork to the corresponding neighbor j. Upon receiving this request, process j sends the shared fork only if (1) j is outside the doorway, or (2) j is hungry and inside the doorway but has lower priority than i (where process priorities are represented by the static node colors). Otherwise, j defers the fork request until after j eats. Our algorithm is shown as Algorithm 1.

3.1. Local Variables

In addition to the 3P 1 module, every process i has nine types of local variables, which are partitioned into three sets: variables describing the state of process i, variables for the ping-ack protocol, and variables for the fork-collection scheme.

State Variables. Each process i has an integer-valued variable color i. Upon initialization, we assume that each color variable is assigned a locally-unique value so that no two neighbors have the same color.
Several node-coloring approximation algorithms can compute such colorings in polynomial time using only O(δ) distinct values, where δ is the maximum degree of the conflict graph. Color values denote process priority and are static after initialization. For each pair of neighbors i and j, process i has higher priority than j if and only if color i > color j. Every process also has two variables describing its current state: a trivalent variable state i and a boolean variable inside i. Variable state i denotes the current dining phase: thinking, hungry, or eating; inside i indicates whether process i is inside the doorway or not. Initially, every process is outside the doorway and thinking.

Ping-Ack Variables. Process i has four local boolean variables associated with the ping-ack protocol for each neighbor j: pinged ij, ack ij, deferred ij, and replied ij. Initially, all of these variables are false. The local variable pinged ij is true if and only if there is a pending ping request from i to j. A pending ping request initiated by i to j covers three situations: a ping request is on its way from i to j, or is being deferred by j, or a reply ack is on its way to i. Process i also needs to remember received acks until i enters the doorway. The local variable ack ij is true if and only if process i is hungry, outside the doorway, and has received an ack from j during the current hungry session of i. Variable deferred ij is true if and only if process i is currently deferring a ping request from j. Finally, to achieve eventual 2-bounded waiting, process i needs to record which ack messages it has sent while hungry. The local variable replied ij is true if and only if process i has sent an ack to neighbor j during the current hungry session of i.
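The node-coloring step above can be sketched as a simple greedy pass, which needs at most δ+1 colors. This is one standard technique, not necessarily the one the paper assumes:

```python
# Greedy node coloring: each vertex takes the smallest color not used by
# an already-colored neighbor, so at most delta+1 colors are used, where
# delta is the maximum degree. Neighbors always end up with distinct
# colors, as required for the static priorities.

def greedy_coloring(neighbors):
    color = {}
    for v in neighbors:                      # any fixed vertex order works
        used = {color[u] for u in neighbors[v] if u in color}
        c = 0
        while c in used:
            c += 1
        color[v] = c
    return color

# Adjacency map for a ring of four diners (delta = 2).
ring = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {2, 0}}
colors = greedy_coloring(ring)
```

Any such coloring yields a strict priority order between every pair of neighbors, since neighbors never share a color.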

N(i) denotes the set of neighbors of process i.                        Code for process i

 1: {state i = thinking}                                               Action 1
 2:   state i := (thinking or hungry);                                 Become Hungry

Ping-Ack Actions
 3: {(state i = hungry) ∧ ¬inside i}                                   Action 2
 4:   ∀j ∈ N(i) where (¬pinged ij ∧ ¬ack ij) do                        Request Acks from Neighbors
 5:     send ping i to j; pinged ij := true;
 6: {receive ping from j ∈ N(i)}                                       Action 3
 7:   if (inside i ∨ replied ij)                                       Inside the Doorway, or Has Sent the Ack
 8:     deferred ij := true;                                           Defer Sending Ack
 9:   else                                                             Thinking, or Hungry and Has Not Sent an Ack
10:     send ack i to j; replied ij := (state i = hungry);             Send an Ack
11: {receive ack from j ∈ N(i)}                                        Action 4
12:   ack ij := ((state i = hungry) ∧ ¬inside i);                      Receive an Ack
13:   pinged ij := false;
14: {(state i = hungry) ∧ (∀j ∈ N(i) :: (ack ij ∨ (j ∈ 3P 1)))}        Action 5
15:   inside i := true;                                                Enter the Doorway
16:   ∀j ∈ N(i) do
17:     ack ij := false; replied ij := false;

Fork Collection Actions
18: {(state i = hungry) ∧ inside i}                                    Action 6
19:   ∀j ∈ N(i) where (token ij ∧ ¬fork ij) do                         Request Missing Forks
20:     send request(color i) to j; token ij := false;
21: {receive request(color j) from j ∈ N(i)}                           Action 7
22:   token ij := true;                                                Receive a Fork Request
23:   if (¬inside i ∨ ((state i = hungry) ∧ (color i < color j)))
24:     send fork i to j; fork ij := false;
25: {receive fork j from j ∈ N(i)}                                     Action 8
26:   fork ij := true;                                                 Receive a Fork

Other Actions
27: {(state i = hungry) ∧ inside i ∧ (∀j ∈ N(i) :: (fork ij ∨ (j ∈ 3P 1)))}   Action 9
28:   state i := eating;                                               Enter Critical Section
29: {state i = eating}                                                 Action 10
30:   inside i := false;                                               Exit the Doorway
31:   state i := thinking;
32:   ∀j ∈ N(i) where (token ij ∧ fork ij) do
33:     send fork i to j; fork ij := false;                            Send Deferred Forks
34:   ∀j ∈ N(i) where (deferred ij) do
35:     send ack i to j; deferred ij := false;                         Send Deferred Acks

Algorithm 1. Wait-Free, Eventual k-Bounded Waiting (3k-BW) for Eventual Weak Exclusion (3WX)

Each hungry process resets all of its local ack and replied variables to false upon entering the doorway. By contrast, true deferred variables remain true until the process eats and exits.

Fork Collection Variables. Process i has two local boolean variables associated with the fork-collection scheme for each neighbor j: fork ij and token ij. Symmetrically, j has variables fork ji and token ji for i. The local variable fork ij is true if and only if process i holds the fork shared with j. Because the fork is unique and exclusive, fork ij and fork ji cannot be true simultaneously; both can be false simultaneously if and only if the fork is in transit. The variables token ij and token ji govern fork requests between the neighboring processes i and j. In general, if a process i is hungry and holds the token token ij, then i is permitted to request the missing fork by sending the token to the corresponding neighbor j. When both fork ij and token ij are true, process i is deferring the fork request from its neighbor j. Initially, between each pair of neighbors, the fork is at the neighbor with the higher color, and the token is at the neighbor with the lower color.

3.2. Algorithm Actions

A thinking process can become hungry at any time by executing Action 1. Action 1 is not an internal action of Algorithm 1 and is formalized only for completeness of process behaviors. Upon becoming hungry, processes are still outside the doorway.

Ping-Ack Actions (Actions 2, 3, 4, and 5). While hungry and outside the doorway, Action 2 is always enabled. By Action 2, for each neighbor j, if the ack from j is missing and no ping request to j is pending, then process i requests an ack from j. As a result, pinged ij becomes true to indicate the pending ping request. When process i receives a ping message, it decides whether to send the ack in Action 3.
The ping request can be deferred for two reasons: i is inside the doorway, or i is outside the doorway but has already sent an ack during its current hungry session. Otherwise, i sends the ack immediately; if i is hungry and outside the doorway, the corresponding replied variable is set to true. Processes receive acks by Action 4; the corresponding pinged variable is then set back to false to indicate that no ping request is pending with that neighbor. Action 5 determines when a hungry process enters the doorway: if, for each neighbor j, a hungry process i has either received the ack or continuously suspects j by 3P 1, then i eventually enters the doorway. After entering the doorway, i no longer needs to remember received acks, and while inside the doorway, i always defers ping requests; thus, upon entry, i resets its local ack and replied variables to false.

Fork Collection Actions (Actions 6, 7, and 8). Process i sends requests for missing forks in Action 6, which is enabled while i is hungry and inside the doorway. Process i encodes its color in the request messages as a parameter. Any process that receives a fork request in Action 7 decides whether or not to send the shared fork: if the process is outside the doorway, or hungry but with a lower color, it sends the shared fork immediately; otherwise, it defers the fork request until after it eats. Action 8 simply receives a fork.

Other Actions. Action 9 determines when a hungry process goes to eat: if a hungry process i is inside the doorway and, for each neighbor j, either holds the shared fork continuously or suspects j, then i eventually eats. Correct processes can eat only for a finite period of time. By executing Action 10, an eating process stops eating, transitions back to thinking, and exits the doorway. All deferred fork requests and deferred ping requests are then granted.
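The eat guard of Action 9 can be restated in executable form. A sketch with our own names (the suspect set standing in for the 3P 1 output):

```python
# Action 9's guard: a hungry process inside the doorway may eat once,
# for every neighbor, it either holds the shared fork or currently
# suspects that neighbor via its local detector module.

def eat_guard(state, inside_doorway, neighbors, forks_held, suspects):
    return (state == "hungry" and inside_doorway
            and all(j in forks_held or j in suspects for j in neighbors))

ok = eat_guard("hungry", True, {"j", "k"}, {"j"}, {"k"})       # fork from j; k suspected
blocked = eat_guard("hungry", True, {"j", "k"}, {"j"}, set())  # k neither held nor suspected
```

Before the detector converges, false suspicion can let the guard pass without a fork, which is precisely why safety is only eventual (3WX) rather than perpetual.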
4. Safety Proof

This section proves that Algorithm 1 satisfies the safety property: eventual weak exclusion. The safety proof relies on two assumptions: 3P 1 can make only finitely many false-positive mistakes, and the fork between any pair of neighbors is unique and exclusive. First, we prove the uniqueness of forks in Lemmas 1.1 and 1.2. Next, we prove safety in Theorem 1.

Lemma 1.1: In Algorithm 1, when a process receives a fork request, the process must be holding the requested fork.

Suppose that a process j received a request for a fork that j does not hold. If j were outside the doorway, or hungry with a lower color, j would duplicate the fork (Action 7). As a result, the uniqueness of forks, and hence the safety property, would be violated. Lemma 1.1 shows that this situation never happens; its proof rests on the FIFO channels.

Proof: Lemma 1.1 is proved by direct construction in two steps. The first step shows that when a process receives a fork, that process must not be holding the corresponding token. Using this result, we prove Lemma 1.1 in the second step.

Suppose that a process j receives a fork from its neighbor i. When i sent the fork, i must have been holding the corresponding token (Actions 7 and 10). After i sent the fork, i may send the token to j to re-request the fork (Action 6). Because channels are FIFO, the token must arrive at j after the fork arrives. Thus, when a process j receives a fork, j must not be holding the corresponding token, which is either at the neighbor i or in transit to j.

From the above, we can also conclude that while a fork is in transit, the fork recipient cannot hold the corresponding

token. Hence, when a process i sends a token to its neighbor j, the fork cannot be in transit to i; otherwise, i could not hold the token. Also, j cannot send the fork away without holding the token. Thus, when i sends a fork request to j, the fork is either at j or in transit to j. Because channels are FIFO, when the token arrives at j, j must hold the fork. Lemma 1.1 holds.

Lemma 1.2: The fork is unique between each pair of neighbors.

If two duplicate forks existed between two neighbors, both neighbors could eat simultaneously infinitely often. Thus, to prove the safety property, we must prove the uniqueness of forks.

Proof: A fork could be duplicated only if some process sent a fork that it did not hold. However, a process i sends a fork only because i received a fork request (Actions 7 and 10). By Lemma 1.1, when i received the fork request, i must have been holding the fork. Hence processes never send forks they do not hold, and forks cannot be duplicated. Lemma 1.2 holds.

Theorem 1: Algorithm 1 satisfies eventual weak exclusion 3WX: for each execution, there exists a time after which no two live neighbors eat simultaneously.

Proof: The safety proof is by direct construction and depends only on the local eventual strong accuracy property of 3P 1. This property guarantees that for each run, there exists a time after which no correct process is suspected by any correct neighbor. Faulty processes cannot prevent safety from being established: since faulty processes eventually crash, they can eat only finitely many times in any run. Next, we focus on correct processes.

Consider any execution α of Algorithm 1. Let t denote the time in α when 3P 1 converges. Let i be any correct process that begins eating after time t. By Action 9, process i can go to eat only if, for each neighbor j, either i holds the shared fork continuously or i suspects j. Since 3P 1 never suspects correct neighbors after time t, i must hold every fork shared with its correct neighbors in order to eat.
So long as i remains eating, Action 7 guarantees that i defers all fork requests; thus, i relinquishes no fork while eating. Furthermore, 3P 1 has already converged in α, so no correct neighbor can suspect i either. Also, by Lemma 1.2, the fork is unique between each pair of neighbors, so no correct neighbor can hold the fork shared with i. Consequently, Action 9 remains disabled for every correct hungry neighbor of i until after i transitions back to thinking. We conclude that no two live neighbors eat simultaneously after time t.

5. Progress Proof

Theorem 2: Algorithm 1 satisfies wait-free progress: every correct hungry process eventually eats.

Proof: In order to eat, every hungry process must go through two phases: phase 1, outside the doorway; phase 2, inside the doorway. Correspondingly, our progress proof also consists of two parts. Progress in phase 1 means that every correct hungry process outside the doorway eventually enters the doorway; progress in phase 2 means that every correct hungry process inside the doorway eventually eats. Because progress in phase 1 relies on progress in phase 2, we prove the two phases in reverse order: the progress proofs for phases 2 and 1 appear in Lemmas 2.3 and 2.4, respectively.

Lemma 2.1: Let processes i and j be correct neighbors, where i is hungry and inside the doorway. If color i > color j and i does not suspect j, then i eventually holds the fork shared with j continuously until after i eats.

Proof: If process i has not sent a fork request to j and the fork is missing, i will request the fork shared with j (Action 6). When process j receives the fork request, because color i > color j, j defers the fork request only while it is eating (Action 7). Since j is correct, j eats only for a finite period of time. Thus, j eventually exits eating and sends all deferred forks, including the fork shared with i (Action 10). Thus, i eventually holds the shared fork.
Next we show that i holds the fork continuously until after i eats. Because color_i > color_j, by Action 7, while i is hungry and inside the doorway, i defers any fork request from j. Consequently, i holds the fork continuously until after i eats. Thus, Lemma 2.1 holds. □

Lemma 2.2: Between each pair of neighbors i and j, at most one pending ping request initiated by process i exists at any time.

Proof: While a ping request from i to j is pending, the variable pinged_ij remains true until after i receives the corresponding ack from j. While pinged_ij is true, process i cannot send another ping message to j (Action 2). Lemma 2.2 holds. □

Lemma 2.3 (progress in phase 2): Every correct hungry process inside the doorway eventually eats.

Proof: We prove Lemma 2.3 by complete induction on process colors. The base case shows that every correct hungry process inside the doorway with the highest color hc eventually eats. The inductive step shows that if every correct hungry process inside the doorway with a color higher than d eventually eats, then every correct hungry process inside the doorway with color d eventually eats as well.

Base Case: Every correct hungry process inside the doorway with the highest color hc eventually eats.

Let i be a correct hungry process inside the doorway, where color_i = hc. Because no two neighboring processes
have the same color, color_i is higher than the colors of all of i's neighbors. We begin the argument after ◊P1 converges and show that i is then guaranteed to eat. Partition the neighbors of i into two sets: correct and faulty. All faulty neighbors eventually crash, so by the strong completeness of ◊P1, i eventually and permanently suspects every faulty neighbor. Conversely, i cannot suspect correct neighbors after ◊P1 converges. Since color_i is higher than the color of every neighbor, Lemma 2.1 implies that i eventually holds the forks shared with its correct neighbors continuously until after i eats. Thus, eventually, for each neighbor j, i either suspects j permanently or holds the fork shared with j continuously. Consequently, Action 9 is enabled continuously at i, and i eventually eats.

Inductive Step: If every correct hungry process inside the doorway with a color higher than d eventually eats, then every correct hungry process inside the doorway with color d eventually eats as well.

Consider a correct hungry process i inside the doorway with color d. We prove that i is guaranteed to eat after ◊P1 converges. Partition the neighbors of i into three sets: faulty neighbors, Low_i (correct neighbors with a color lower than d), and High_i (correct neighbors with a color higher than d). Because no two neighbors share a color, every correct neighbor belongs to Low_i or High_i. All faulty neighbors eventually crash; by the strong completeness of ◊P1, process i suspects them eventually and permanently. For each neighbor j in Low_i, by Lemma 2.1, process i eventually holds the fork shared with j continuously until after i eats. For each neighbor j in High_i, process i eventually holds the fork shared with j: if i does not hold the fork shared with j and has not sent a fork request to j, i sends one (Action 6).
When process j receives the fork request, j defers it only while j is inside the doorway (either eating or hungry). By the inductive hypothesis, if j is hungry and inside the doorway, j eventually eats. Because j is correct, j eventually finishes eating and sends all deferred forks, including the fork shared with i. Process i may lose forks to its neighbors in High_i before i eats, but i eventually holds all forks shared with correct neighbors continuously. Process i can lose the fork to its neighbor j only when j is also hungry and inside the doorway. Next, we show that after ◊P1 converges, i can lose the fork to j at most once before i eats.

After ◊P1 converges, j cannot suspect correct processes. Thus, to enter the doorway, j must collect acks from all of its correct neighbors, including process i. It is possible that j receives an ack from i that was sent before i entered the doorway. However, by Lemma 2.2, at most one ping request from i to j is pending at any time. Moreover, while i is inside the doorway, i defers every ping request. Thus, after ◊P1 converges, while i is inside the doorway, j can receive at most one ack from i, namely one sent before i entered the doorway. Consequently, j can enter the doorway at most once while i is inside the doorway. By the inductive hypothesis, j eventually eats and exits the doorway. After that, j is blocked outside the doorway until after i exits the doorway, and while j is blocked outside the doorway, process i cannot lose the fork to j. Thus, i holds the fork shared with j continuously until after i eats. For each neighbor j, i either suspects j permanently or holds the fork continuously. Therefore, Action 9 is enabled continuously at i, and i eventually eats. □

Lemma 2.4 (progress in phase 1): Every correct hungry process outside the doorway eventually enters the doorway.
Proof: We say that a process i belongs to the set H(t) if and only if i is correct, hungry, and outside the doorway at time t, and no correct hungry neighbor of i has been outside the doorway longer than i at time t. Let t_ih denote the time when process i started its current hungry session; process i became hungry at time t_ih and remains hungry through time t. Because neighbors can become hungry at the same time, the set H(t) may include neighboring processes. It suffices to prove that every process in H(t) eventually enters the doorway: if a correct hungry process j is outside the doorway but not in H(t), then j eventually joins some set H(t') at a later time t' and enters the doorway.

To show that every process i in H(t) eventually enters the doorway, we prove the following: for each neighbor j, i either suspects j permanently or eventually receives an ack from j. Every faulty neighbor eventually crashes, so by the strong completeness of ◊P1, process i suspects all faulty neighbors eventually and permanently. Next, we show that i eventually receives an ack from each correct neighbor j.

After process i becomes hungry, i starts to collect acks from all neighbors (Action 2). By Lemma 2.2, after executing Action 2, for each neighbor j, if the ack from j is missing, then exactly one ping request initiated by i is pending. This pending ping request may have been sent during the current hungry session or during a previous one. Suppose that neighbor j receives the ping message at time t_jr. Because the ping message may have been sent during a previous hungry session of i, t_jr may be earlier than t_ih; hence, neighbor j can receive this ping message at any time. Process j grants the ping request unless one of the following two conditions holds (Action 3, Line 7); in either case, j eventually sends the deferred ack.

(1) Process j is inside the doorway (either hungry or eating). If j is hungry, by Lemma 2.3, j eventually eats.
Because correct processes eat only for a finite period of time, j eventually exits the doorway and sends the deferred ack (Action 10).

(2) replied_ji = true. When replied_ji is true, process j is hungry and outside the doorway at time t_jr. We will show
that j eventually enters the doorway before time t and sends the deferred ack to i after eating. Suppose that replied_ji was set to true at time t_js and remained true from t_js to t_jr. To set replied_ji to true at time t_js, j must have sent an ack (denoted ack_js) to i and have been hungry and outside the doorway at time t_js. Also, j must remain hungry and stay outside the doorway from t_js to t_jr. By Lemma 2.2, before receiving ack_js, process i cannot send another ping request to j. Recall that j receives a ping request from i at time t_jr. Thus, i must have received ack_js and become hungry again before time t_jr; in other words, i must become hungry during the time period (t_js, t_jr) at least once (Action 2). Note that process i becomes hungry at time t_ih and remains hungry through time t; therefore, t_js < t_ih.

Process j must enter the doorway at least once during the time period (t_js, t). Otherwise, j would stay outside the doorway from t_js to t, and since t_js < t_ih, j would have stayed outside the doorway longer than i at time t; then i could not be in the set H(t). Thus, j must enter the doorway at least once during (t_js, t). Furthermore, since j stays outside the doorway from t_js to t_jr, j must enter the doorway after time t_jr and before time t. After j enters the doorway, by Lemma 2.3, j eventually eats; after eating, j sends the deferred ack to i.

Thus, for each neighbor j, i either suspects j permanently or eventually receives an ack from j. Hence, Action 5 is enabled continuously, and process i eventually enters the doorway. Lemma 2.4 holds. □

6. Eventual k-Bounded Waiting Proof

Theorem 3: Algorithm 1 satisfies eventual 2-bounded waiting: for each execution, there exists a time after which no live process i goes to eat more than twice while any live neighbor is hungry.

Proof: Suppose ◊P1 converges at time t_c. At time t_c, there may exist a set of live hungry processes, denoted Hungry(t_c).
By Theorem 2, every correct hungry process eventually eats. Therefore, there exists a time t_1 ≥ t_c after which all correct processes in Hungry(t_c) have eaten. Thus, after time t_1, no hungry session of a correct process starts before time t_c. Faulty processes eventually crash, so there exists a time t_2 after which every live process is correct. Let t_3 = max(t_1, t_2). Next, we show that after time t_3, no live process i goes to eat more than twice while any live neighbor j is hungry.

After t_3, since ◊P1 has already converged, no correct process wrongfully suspects any correct neighbor. Thus, to enter the doorway, process i must receive an ack from each correct neighbor. If process i went to eat more than twice, then its neighbor j would have to send at least three acks to i. After j sends the first ack message, replied_ji is set to true; when j receives the second ping message, j defers it (Action 3). Thus, while j is hungry, j can send at most one ack message to i. Although j sends at most one ack while hungry, i can still receive two acks from j: it is possible that j sent an ack to i just before becoming hungry, and that this ack was still in transit to i when j became hungry. Consequently, i can enter the doorway at most twice while j is hungry, and hence i can go to eat at most twice while j is hungry. Theorem 3 holds. □

7. Requirements on Communication Channels and Local Memory

Bounded Space Complexity. In Algorithm 1, each process i maintains nine types of local variables. The variables state_i and inside_i need a fixed amount of local memory. The variable color_i needs log2(δ) bits, where δ is the maximum degree of the conflict graph. For each neighbor j, process i associates the remaining six boolean variables with j: fork_ij, token_ij, pinged_ij, ack_ij, replied_ij, and deferred_ij. In total, each process needs log2(δ) + 6δ + c_1 bits of memory, where c_1 is a constant.
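As a quick illustration, the per-process bound log2(δ) + 6δ + c_1 can be tallied as follows; the value c_1 = 2 (one bit each for state_i and inside_i, say) is an assumed constant for the sketch.

```python
import math

# Tally the per-process memory bound log2(delta) + 6*delta + c1 from above.
# delta: maximum degree of the conflict graph; c1 = 2 is an assumed constant
# covering state_i and inside_i.

def local_bits(delta, c1=2):
    color_bits = math.ceil(math.log2(delta)) if delta > 1 else 1
    return color_bits + 6 * delta + c1   # six booleans per neighbor
```

For example, local_bits(8) = 3 + 48 + 2 = 53 bits; the 6δ term dominates, so the bound grows linearly in δ.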
In the worst case (i.e., when the conflict graph is a clique), δ = n − 1, so each process needs O(n) bits of local memory.

Bounded Capacity of Communication Channels. At any time, the number of messages between each pair of neighbors is also bounded: at most four messages are in transit between each pair of neighbors i and j. Algorithm 1 has four types of messages: ping, ack, fork, and token. Because the fork and the token are unique between each pair of neighbors, at most one fork and one token are in transit simultaneously. By Lemma 2.2, at any time, at most one ping or ack message initiated by i is in transit; counting the ping/ack message initiated by j, at most two ping/ack messages are in transit between each pair of neighbors at any time. Thus, at most four messages are in transit between each pair of neighbors at any time. The size of each message is bounded as well: we encode the process id into fork messages and the local variable color into fork request messages, and both a process id and a color need log2(n) bits, so each message needs O(log n) bits.

Quiescent Communication with Crashed Processes. In Algorithm 1, processes eventually stop communicating with crashed processes. After a neighbor j of process i crashes, by Lemma 2.2, i can send at most one more ping request; from the perspective of process i, this ping request is pending forever. Thus, eventually no ping/ack message travels between i and j. Similarly, by Action 6, i can send at most one fork request (token) to j, simply because i cannot
get the token back. Thus, eventually no fork/token message travels between i and j. Therefore, eventually no message exists between i and j.

8. Conclusion

We have explored wait-free scheduling in environments subject to crash faults. This problem is of practical importance to stabilization, because wait-free scheduling is necessary to establish convergence in the presence of transient faults. Unfortunately, wait-free distributed daemons cannot be implemented in purely asynchronous environments subject to crash faults. As such, we have examined solvability under partial synchrony relative to the eventually perfect failure detector ◊P. Our work demonstrates that ◊P is sufficient to achieve wait-free dining philosophers under eventual weak exclusion (◊WX), even in the presence of arbitrarily many crash faults. Our algorithm uses a local refinement of the eventually perfect failure detector, ◊P1, which can be implemented in sparse networks that are partitionable by crash faults; as such, our solution is also practical in the sense that it can scale to larger networks.

We have also shown that eventual k-fairness can be achieved using ◊P1. Specifically, our algorithm satisfies eventual k-bounded waiting (◊k-BW), which guarantees that every execution has an infinite suffix in which no process can overtake any live hungry neighbor more than k consecutive times. Our algorithm is efficient insofar as it requires only bounded space and bounded-capacity channels. Additionally, it is quiescent with respect to crashed processes.

A natural question is whether ◊P is the weakest failure detector to implement wait-free, eventually fair daemons. This question goes to the necessity of the oracular assumptions in this paper. Parallel work in [21] has shown that ◊P is necessary for wait-free, ◊k-BW daemons; our work shows that ◊P is also sufficient.
The composition of these two results implies that ◊P is the weakest failure detector to implement wait-free, ◊k-BW daemons. Thus, if wait-free, eventually fair scheduling is necessary for stabilization, then a corollary of our work is that ◊P is also necessary.

References

[1] G. Antonoiu and P. K. Srimani. Mutual exclusion between neighboring nodes in a tree that stabilizes using read/write atomicity. In Proceedings of the 5th International Euro-Par Conference on Parallel Processing, volume 1685, pages , 
[2] H. Attiya and J. Welch. Distributed Computing: Fundamentals, Simulations, and Advanced Topics. John Wiley and Sons, Inc., 
[3] J. Beauquier, A. K. Datta, M. Gradinariu, and F. Magniette. Self-stabilizing local mutual exclusion and daemon refinement. Chicago J. Theor. Comput. Sci., 2002(1), July 
[4] J. Beauquier and S. Kekkonen-Moneta. Fault-tolerance and self-stabilization: impossibility results and solutions using self-stabilizing failure detectors. International Journal of Systems Science, 28(11): , 
[5] J. E. Burns, P. Jackson, N. A. Lynch, M. J. Fischer, and G. L. Peterson. Data requirements for implementation of n-process mutual exclusion using a single shared variable. J. ACM, 29(1): , 
[6] S. Cantarell, A. K. Datta, and F. Petit. Self-stabilizing atomicity refinement allowing neighborhood concurrency. In Proceedings of the 6th International Symposium on Self-Stabilizing Systems, pages , 
[7] T. D. Chandra and S. Toueg. Unreliable failure detectors for reliable distributed systems. J. ACM, 43(2): , 
[8] M. Choy and A. K. Singh. Efficient fault-tolerant algorithms for distributed resource allocation. ACM TOPLAS, 17(3): , 
[9] F. Cristian. Understanding fault-tolerant distributed systems. Commun. ACM, 34(2):56–78, 
[10] E. W. Dijkstra. Hierarchical ordering of sequential processes. Acta Informatica, 1(2): , 
[11] S. Dolev. Self-Stabilization. MIT Press, 
[12] S. Dolev and Y. A. Haviv. Self-stabilizing microprocessor: Analyzing and overcoming soft errors.
IEEE Trans. Comput., 55(4): , 
[13] C. Dwork, N. A. Lynch, and L. Stockmeyer. Consensus in the presence of partial synchrony. J. ACM, 35(2): , Apr 
[14] C. Fetzer, U. Schmid, and M. Susskraut. On the possibility of consensus in asynchronous systems with finite average response times. In Proceedings of the 25th IEEE International Conference on Distributed Computing Systems.
[15] M. G. Gouda and F. F. Haddix. The alternator. In Proceedings of the 19th IEEE International Conference on Distributed Computing Systems Workshop on Self-stabilizing Systems, pages 48–53, 
[16] M. Herlihy. Wait-free synchronization. ACM TOPLAS, 13(1): , 
[17] M. Hutle and J. Widder. Self-stabilizing failure detector algorithms. In Proceedings of the IASTED International Conference on Parallel and Distributed Computing and Networks, pages , 
[18] N. Lynch. Fast allocation of nearby resources in a distributed system. In Proceedings of the 12th ACM Symposium on Theory of Computing, pages 70–81, 
[19] M. Mizuno and M. Nesterenko. A transformation of self-stabilizing serial model programs for asynchronous parallel computing environments. Information Processing Letters, 66(6): , June 
[20] S. M. Pike and P. A. G. Sivilotti. Dining philosophers with crash locality 1. In Proceedings of the 24th IEEE International Conference on Distributed Computing Systems, pages 22–29, 
[21] Y. Song, S. M. Pike, and S. Sastry. The weakest failure detector for wait-free, eventually fair mutual exclusion. Technical Report TAMU-CS-TR , Texas A&M University, Feb 2007.


Fork Sequential Consistency is Blocking Fork Sequential Consistency is Blocking Christian Cachin Idit Keidar Alexander Shraer Novembe4, 008 Abstract We consider an untrusted server storing shared data on behalf of clients. We show that no storage

More information

CS455: Introduction to Distributed Systems [Spring 2018] Dept. Of Computer Science, Colorado State University

CS455: Introduction to Distributed Systems [Spring 2018] Dept. Of Computer Science, Colorado State University CS 455: INTRODUCTION TO DISTRIBUTED SYSTEMS [ELECTION ALGORITHMS] Shrideep Pallickara Computer Science Colorado State University Frequently asked questions from the previous class survey Does a process

More information

6.852: Distributed Algorithms Fall, Instructor: Nancy Lynch TAs: Cameron Musco, Katerina Sotiraki Course Secretary: Joanne Hanley

6.852: Distributed Algorithms Fall, Instructor: Nancy Lynch TAs: Cameron Musco, Katerina Sotiraki Course Secretary: Joanne Hanley 6.852: Distributed Algorithms Fall, 2015 Instructor: Nancy Lynch TAs: Cameron Musco, Katerina Sotiraki Course Secretary: Joanne Hanley What are Distributed Algorithms? Algorithms that run on networked

More information

Fork Sequential Consistency is Blocking

Fork Sequential Consistency is Blocking Fork Sequential Consistency is Blocking Christian Cachin Idit Keidar Alexander Shraer May 14, 2008 Abstract We consider an untrusted server storing shared data on behalf of clients. We show that no storage

More information

A Synchronization Algorithm for Distributed Systems

A Synchronization Algorithm for Distributed Systems A Synchronization Algorithm for Distributed Systems Tai-Kuo Woo Department of Computer Science Jacksonville University Jacksonville, FL 32211 Kenneth Block Department of Computer and Information Science

More information

Distributed systems. Lecture 6: distributed transactions, elections, consensus and replication. Malte Schwarzkopf

Distributed systems. Lecture 6: distributed transactions, elections, consensus and replication. Malte Schwarzkopf Distributed systems Lecture 6: distributed transactions, elections, consensus and replication Malte Schwarzkopf Last time Saw how we can build ordered multicast Messages between processes in a group Need

More information

The Wait-Free Hierarchy

The Wait-Free Hierarchy Jennifer L. Welch References 1 M. Herlihy, Wait-Free Synchronization, ACM TOPLAS, 13(1):124-149 (1991) M. Fischer, N. Lynch, and M. Paterson, Impossibility of Distributed Consensus with One Faulty Process,

More information

Treewidth and graph minors

Treewidth and graph minors Treewidth and graph minors Lectures 9 and 10, December 29, 2011, January 5, 2012 We shall touch upon the theory of Graph Minors by Robertson and Seymour. This theory gives a very general condition under

More information

ARTICLE IN PRESS. An anonymous self-stabilizing algorithm for 1-maximal independent set in trees

ARTICLE IN PRESS. An anonymous self-stabilizing algorithm for 1-maximal independent set in trees S0020-0190(04)00098-5/SCO AID:3078 Vol. ( ) P.1 (1-7) ELSGMLTM(IPL):m3 v 1.193 Prn:15/04/2004; 13:20 ipl3078 by:r.m. p. 1 Information Processing Letters ( ) www.elsevier.com/locate/ipl An anonymous self-stabilizing

More information

Self-Stabilizing Distributed Queuing

Self-Stabilizing Distributed Queuing Self-Stabilizing Distributed Queuing Srikanta Tirthapura Dept. of Electrical and Computer Engg. Iowa State University Ames, IA, USA, 50011 snt@iastate.edu Maurice Herlihy Computer Science Department Brown

More information

21. Distributed Algorithms

21. Distributed Algorithms 21. Distributed Algorithms We dene a distributed system as a collection of individual computing devices that can communicate with each other [2]. This denition is very broad, it includes anything, from

More information

Advanced Algorithms Class Notes for Monday, October 23, 2012 Min Ye, Mingfu Shao, and Bernard Moret

Advanced Algorithms Class Notes for Monday, October 23, 2012 Min Ye, Mingfu Shao, and Bernard Moret Advanced Algorithms Class Notes for Monday, October 23, 2012 Min Ye, Mingfu Shao, and Bernard Moret Greedy Algorithms (continued) The best known application where the greedy algorithm is optimal is surely

More information

Self-stabilizing Population Protocols

Self-stabilizing Population Protocols Self-stabilizing Population Protocols Dana Angluin, James Aspnes, Michael J. Fischer and Hong Jiang Yale University This paper studies self-stabilization in networks of anonymous, asynchronously interacting

More information

Maximal Independent Set

Maximal Independent Set Chapter 4 Maximal Independent Set In this chapter we present a first highlight of this course, a fast maximal independent set (MIS) algorithm. The algorithm is the first randomized algorithm that we study

More information

Propagated Timestamps: A Scheme for The Stabilization of Maximum Flow Routing Protocols

Propagated Timestamps: A Scheme for The Stabilization of Maximum Flow Routing Protocols Propagated Timestamps: A Scheme for The Stabilization of Maximum Flow Routing Protocols Jorge A. Cobb Mohamed Waris Department of Computer Science University of Houston Houston, TX 77204-3475 Abstract

More information

An Eternal Domination Problem in Grids

An Eternal Domination Problem in Grids Theory and Applications of Graphs Volume Issue 1 Article 2 2017 An Eternal Domination Problem in Grids William Klostermeyer University of North Florida, klostermeyer@hotmail.com Margaret-Ellen Messinger

More information

6.852: Distributed Algorithms Fall, Class 12

6.852: Distributed Algorithms Fall, Class 12 6.852: Distributed Algorithms Fall, 2009 Class 12 Today s plan Weak logical time and vector timestamps Consistent global snapshots and stable property detection. Applications: Distributed termination.

More information

Lecture 1: Introduction to distributed Algorithms

Lecture 1: Introduction to distributed Algorithms Distributed Algorithms M.Tech., CSE, 2016 Lecture 1: Introduction to distributed Algorithms Faculty: K.R. Chowdhary : Professor of CS Disclaimer: These notes have not been subjected to the usual scrutiny

More information

Matching Algorithms. Proof. If a bipartite graph has a perfect matching, then it is easy to see that the right hand side is a necessary condition.

Matching Algorithms. Proof. If a bipartite graph has a perfect matching, then it is easy to see that the right hand side is a necessary condition. 18.433 Combinatorial Optimization Matching Algorithms September 9,14,16 Lecturer: Santosh Vempala Given a graph G = (V, E), a matching M is a set of edges with the property that no two of the edges have

More information

Computer Science Technical Report

Computer Science Technical Report Computer Science Technical Report Feasibility of Stepwise Addition of Multitolerance to High Atomicity Programs Ali Ebnenasir and Sandeep S. Kulkarni Michigan Technological University Computer Science

More information

Time and Space Lower Bounds for Implementations Using k-cas

Time and Space Lower Bounds for Implementations Using k-cas Time and Space Lower Bounds for Implementations Using k-cas Hagit Attiya Danny Hendler September 12, 2006 Abstract This paper presents lower bounds on the time- and space-complexity of implementations

More information

Verteilte Systeme/Distributed Systems Ch. 5: Various distributed algorithms

Verteilte Systeme/Distributed Systems Ch. 5: Various distributed algorithms Verteilte Systeme/Distributed Systems Ch. 5: Various distributed algorithms Holger Karl Computer Networks Group Universität Paderborn Goal of this chapter Apart from issues in distributed time and resulting

More information

Theorem 2.9: nearest addition algorithm

Theorem 2.9: nearest addition algorithm There are severe limits on our ability to compute near-optimal tours It is NP-complete to decide whether a given undirected =(,)has a Hamiltonian cycle An approximation algorithm for the TSP can be used

More information

Incompatibility Dimensions and Integration of Atomic Commit Protocols

Incompatibility Dimensions and Integration of Atomic Commit Protocols The International Arab Journal of Information Technology, Vol. 5, No. 4, October 2008 381 Incompatibility Dimensions and Integration of Atomic Commit Protocols Yousef Al-Houmaily Department of Computer

More information

Process Management And Synchronization

Process Management And Synchronization Process Management And Synchronization In a single processor multiprogramming system the processor switches between the various jobs until to finish the execution of all jobs. These jobs will share the

More information

Introduction to Distributed Systems Seif Haridi

Introduction to Distributed Systems Seif Haridi Introduction to Distributed Systems Seif Haridi haridi@kth.se What is a distributed system? A set of nodes, connected by a network, which appear to its users as a single coherent system p1 p2. pn send

More information

Distributed algorithms

Distributed algorithms Distributed algorithms Prof R. Guerraoui lpdwww.epfl.ch Exam: Written Reference: Book - Springer Verlag http://lpd.epfl.ch/site/education/da - Introduction to Reliable (and Secure) Distributed Programming

More information

A Delay-Optimal Group Mutual Exclusion Algorithm for a Tree Network

A Delay-Optimal Group Mutual Exclusion Algorithm for a Tree Network JOURNAL OF INFORMATION SCIENCE AND ENGINEERING 24, 573-583 (2008) Short Paper A Delay-Optimal Group Mutual Exclusion Algorithm for a Tree Network VINAY MADENUR AND NEERAJ MITTAL + Internet Services Qualcomm,

More information

Petri Nets. Robert A. McGuigan, Department of Mathematics, Westfield State

Petri Nets. Robert A. McGuigan, Department of Mathematics, Westfield State 24 Petri Nets Author: College. Robert A. McGuigan, Department of Mathematics, Westfield State Prerequisites: The prerequisites for this chapter are graphs and digraphs. See Sections 9.1, 9.2, and 10.1

More information

A DAG-BASED ALGORITHM FOR DISTRIBUTED MUTUAL EXCLUSION ATHESIS MASTER OF SCIENCE

A DAG-BASED ALGORITHM FOR DISTRIBUTED MUTUAL EXCLUSION ATHESIS MASTER OF SCIENCE A DAG-BASED ALGORITHM FOR DISTRIBUTED MUTUAL EXCLUSION by Mitchell L. Neilsen ATHESIS submitted in partial fulfillment of the requirements for the degree MASTER OF SCIENCE Department of Computing and Information

More information

Distributed Systems. Characteristics of Distributed Systems. Lecture Notes 1 Basic Concepts. Operating Systems. Anand Tripathi

Distributed Systems. Characteristics of Distributed Systems. Lecture Notes 1 Basic Concepts. Operating Systems. Anand Tripathi 1 Lecture Notes 1 Basic Concepts Anand Tripathi CSci 8980 Operating Systems Anand Tripathi CSci 8980 1 Distributed Systems A set of computers (hosts or nodes) connected through a communication network.

More information

Distributed Systems. Characteristics of Distributed Systems. Characteristics of Distributed Systems. Goals in Distributed System Designs

Distributed Systems. Characteristics of Distributed Systems. Characteristics of Distributed Systems. Goals in Distributed System Designs 1 Anand Tripathi CSci 8980 Operating Systems Lecture Notes 1 Basic Concepts Distributed Systems A set of computers (hosts or nodes) connected through a communication network. Nodes may have different speeds

More information

Efficient Reductions for Wait-Free Termination Detection in Faulty Distributed Systems

Efficient Reductions for Wait-Free Termination Detection in Faulty Distributed Systems Aachen Department of Computer Science Technical Report Efficient Reductions for Wait-Free Termination Detection in Faulty Distributed Systems Neeraj Mittal, S. Venkatesan, Felix Freiling and Lucia Draque

More information

Wait-Free Regular Storage from Byzantine Components

Wait-Free Regular Storage from Byzantine Components Wait-Free Regular Storage from Byzantine Components Ittai Abraham Gregory Chockler Idit Keidar Dahlia Malkhi July 26, 2006 Abstract We consider the problem of implementing a wait-free regular register

More information

Mutual Exclusion: Classical Algorithms for Locks

Mutual Exclusion: Classical Algorithms for Locks Mutual Exclusion: Classical Algorithms for Locks John Mellor-Crummey Department of Computer Science Rice University johnmc@cs.rice.edu COMP 422 Lecture 18 21 March 2006 Motivation Ensure that a block of

More information

Asynchronous Reconfiguration for Paxos State Machines

Asynchronous Reconfiguration for Paxos State Machines Asynchronous Reconfiguration for Paxos State Machines Leander Jehl and Hein Meling Department of Electrical Engineering and Computer Science University of Stavanger, Norway Abstract. This paper addresses

More information

Algorithm 23 works. Instead of a spanning tree, one can use routing.

Algorithm 23 works. Instead of a spanning tree, one can use routing. Chapter 5 Shared Objects 5.1 Introduction Assume that there is a common resource (e.g. a common variable or data structure), which different nodes in a network need to access from time to time. If the

More information

Computing with Infinitely Many Processes under assumptions on concurrency and participation -M.Merritt&G.Taubenfeld. Dean Christakos & Deva Seetharam

Computing with Infinitely Many Processes under assumptions on concurrency and participation -M.Merritt&G.Taubenfeld. Dean Christakos & Deva Seetharam Computing with Infinitely Many Processes under assumptions on concurrency and participation -M.Merritt&G.Taubenfeld Dean Christakos & Deva Seetharam November 25, 2003 Abstract This paper explores four

More information

Constructions of k-critical P 5 -free graphs

Constructions of k-critical P 5 -free graphs 1 2 Constructions of k-critical P 5 -free graphs Chính T. Hoàng Brian Moore Daniel Recoskie Joe Sawada Martin Vatshelle 3 January 2, 2013 4 5 6 7 8 Abstract With respect to a class C of graphs, a graph

More information

A Dag-Based Algorithm for Distributed Mutual Exclusion. Kansas State University. Manhattan, Kansas maintains [18]. algorithms [11].

A Dag-Based Algorithm for Distributed Mutual Exclusion. Kansas State University. Manhattan, Kansas maintains [18]. algorithms [11]. A Dag-Based Algorithm for Distributed Mutual Exclusion Mitchell L. Neilsen Masaaki Mizuno Department of Computing and Information Sciences Kansas State University Manhattan, Kansas 66506 Abstract The paper

More information

On the Complexity of the Policy Improvement Algorithm. for Markov Decision Processes

On the Complexity of the Policy Improvement Algorithm. for Markov Decision Processes On the Complexity of the Policy Improvement Algorithm for Markov Decision Processes Mary Melekopoglou Anne Condon Computer Sciences Department University of Wisconsin - Madison 0 West Dayton Street Madison,

More information

Intuitive distributed algorithms. with F#

Intuitive distributed algorithms. with F# Intuitive distributed algorithms with F# Natallia Dzenisenka Alena Hall @nata_dzen @lenadroid A tour of a variety of intuitivedistributed algorithms used in practical distributed systems. and how to prototype

More information

FOUR EDGE-INDEPENDENT SPANNING TREES 1

FOUR EDGE-INDEPENDENT SPANNING TREES 1 FOUR EDGE-INDEPENDENT SPANNING TREES 1 Alexander Hoyer and Robin Thomas School of Mathematics Georgia Institute of Technology Atlanta, Georgia 30332-0160, USA ABSTRACT We prove an ear-decomposition theorem

More information

Lecture 2 - Graph Theory Fundamentals - Reachability and Exploration 1

Lecture 2 - Graph Theory Fundamentals - Reachability and Exploration 1 CME 305: Discrete Mathematics and Algorithms Instructor: Professor Aaron Sidford (sidford@stanford.edu) January 11, 2018 Lecture 2 - Graph Theory Fundamentals - Reachability and Exploration 1 In this lecture

More information

Consensus, impossibility results and Paxos. Ken Birman

Consensus, impossibility results and Paxos. Ken Birman Consensus, impossibility results and Paxos Ken Birman Consensus a classic problem Consensus abstraction underlies many distributed systems and protocols N processes They start execution with inputs {0,1}

More information

A Timing Assumption and a t-resilient Protocol for Implementing an Eventual Leader Service in Asynchronous Shared Memory Systems

A Timing Assumption and a t-resilient Protocol for Implementing an Eventual Leader Service in Asynchronous Shared Memory Systems A Timing Assumption and a t-resilient Protocol for Implementing an Eventual Leader Service in Asynchronous Shared Memory Systems Antonio FERNÁNDEZ y Ernesto JIMÉNEZ z Michel RAYNAL? Gilles TRÉDAN? y LADyR,

More information

Mutual Exclusion. 1 Formal problem definitions. Time notion CSE /17/2015. Outline of this lecture:

Mutual Exclusion. 1 Formal problem definitions. Time notion CSE /17/2015. Outline of this lecture: CSE 539 03/17/2015 Mutual Exclusion Lecture 15 Scribe: Son Dinh Outline of this lecture: 1. Formal problem definitions 2. Solution for 2 threads 3. Solution for n threads 4. Inherent costs of mutual exclusion

More information

Constant RMR Transformation to Augment Reader-Writer Locks with Atomic Upgrade/Downgrade Support

Constant RMR Transformation to Augment Reader-Writer Locks with Atomic Upgrade/Downgrade Support Constant RMR Transformation to Augment Reader-Writer Locks with Atomic Upgrade/Downgrade Support Jake Stern Leichtling Thesis Advisor: Prasad Jayanti With significant contributions from Michael Diamond.

More information

Petri Nets ~------~ R-ES-O---N-A-N-C-E-I--se-p-te-m--be-r Applications.

Petri Nets ~------~ R-ES-O---N-A-N-C-E-I--se-p-te-m--be-r Applications. Petri Nets 2. Applications Y Narahari Y Narahari is currently an Associate Professor of Computer Science and Automation at the Indian Institute of Science, Bangalore. His research interests are broadly

More information

Concurrent & Distributed 7Systems Safety & Liveness. Uwe R. Zimmer - The Australian National University

Concurrent & Distributed 7Systems Safety & Liveness. Uwe R. Zimmer - The Australian National University Concurrent & Distributed 7Systems 2017 Safety & Liveness Uwe R. Zimmer - The Australian National University References for this chapter [ Ben2006 ] Ben-Ari, M Principles of Concurrent and Distributed Programming

More information

A Reduction of Conway s Thrackle Conjecture

A Reduction of Conway s Thrackle Conjecture A Reduction of Conway s Thrackle Conjecture Wei Li, Karen Daniels, and Konstantin Rybnikov Department of Computer Science and Department of Mathematical Sciences University of Massachusetts, Lowell 01854

More information

Impossibility of Agreement in Asynchronous Systems

Impossibility of Agreement in Asynchronous Systems Consensus protocol P 8 schedule from C finite or infinite sequence σ of events that can be applied, in turn, from C a run is the sequence of steps associated with a schedule let σ finite, then C' = σ(c)

More information

Massachusetts Institute of Technology Department of Electrical Engineering and Computer Science Algorithms For Inference Fall 2014

Massachusetts Institute of Technology Department of Electrical Engineering and Computer Science Algorithms For Inference Fall 2014 Massachusetts Institute of Technology Department of Electrical Engineering and Computer Science 6.438 Algorithms For Inference Fall 2014 Recitation-6: Hardness of Inference Contents 1 NP-Hardness Part-II

More information

Distributed Systems. coordination Johan Montelius ID2201. Distributed Systems ID2201

Distributed Systems. coordination Johan Montelius ID2201. Distributed Systems ID2201 Distributed Systems ID2201 coordination Johan Montelius 1 Coordination Coordinating several threads in one node is a problem, coordination in a network is of course worse: failure of nodes and networks

More information

Optimal Torus Exploration by Oblivious Mobile Robots

Optimal Torus Exploration by Oblivious Mobile Robots Optimal Torus Exploration by Oblivious Mobile Robots S. Devismes 1, A. Lamani 2, F. Petit 3, and S. Tixeuil 3 1 VERIMAG UMR 514, Université Joseph Fourier, France 2 MIS, Université de Picardie Jules Verne

More information