Information Processing Letters 31 (1989) 203-208
North-Holland, 22 May 1989

Friedemann N
Department of Computer Science, University of Kaiserslautern, P.O. Box 3049, D-6750 Kaiserslautern, Fed. Rep. Germany

Communicated by T. Lengauer
Received 2 November 1987
Revised 4 December 1988

Keywords: Termination detection, distributed algorithm, CSP, distributed computing

In recent years, a surprising number of distributed termination detection algorithms with various characteristics have been presented. One of the most elegant solutions is derived stepwise together with an invariant and is due to Dijkstra, Feijen and van Gasteren [9]. Discussions of the principle and variants can be found in [1,14,15]. We present a similar algorithm which detects termination faster under the assumption that whenever a (synchronous) message is sent from some process P_i to another process P_j, an acknowledgement carrying one bit of status information is sent in the opposite direction. This can usually be done at negligible cost without an explicit message.

We consider n processes P_1, ..., P_n (n ≥ 2), each being either active or passive. In an underlying computation, the processes cooperate by exchanging synchronous messages (so-called basic messages). The computation satisfies the following conditions [10]:
(1) only active processes may send messages,
(2) a process may change from passive to active only on receipt of a message,
(3) a process may change [...]

[...] are not only interested in a method by which a given process P_n is enabled to detect termination when it has occurred, but in an efficient control algorithm which enables P_n to determine whether the underlying computation
(a) has terminated before the control algorithm finished, or
(b) had not yet terminated when the algorithm was started.
If the underlying computation terminates while the control algorithm is running, either result is acceptable. We call an algorithm with these properties a termination test.
Such an algorithm can be regarded as a specialization of a general stable property detection algorithm for distributed systems [8]. Notice that since nontermination is not a stable property, it is impossible to state that a distributed computation is still active "now". We assume that the underlying computation is started before the termination test.

As opposed to a termination detection algorithm [...] can be reactivated [...] be a more reasonable [...]

Volume 31, Number 4, INFORMATION PROCESSING LETTERS, 22 May 1989

[...] never reports a negative result. Those detection algorithms which proceed in rounds (e.g., [4,12]) may also detect case (b). On the other hand, any termination test algorithm can easily be transformed into a detection algorithm (with possibly unbounded message complexity) by simply restarting the algorithm after a negative result.

Although the algorithm in [9] is not efficient when used as a termination test, we use it as a starting point by presenting a version with a slightly changed set of rules, to allow for the early announcement of a negative result. We assume that for the purposes of the termination test, a control token circles around a ring P_n, P_{n-1}, ..., P_1, P_n. Notice that being passive does not prevent a process from sending or receiving the token. We further assume that the token and the processes can be either black or white; initially, all processes are supposed to be white. The following rules [...]

Rule 1. A process sending a basic message to a recipient with a higher index than its own becomes black.
Rule 2. P_n, when passive, may initiate a test by [...]
Rule 3. An active process keeps the token until [...]
Rule 4. A [...] process P_i (i ≠ n) which has the token propagates a black token if P_i or the token is black, otherwise it propagates a white token.
Rule 5. A process transmitting the token becomes white.
[...] sent to P_n that has no effect, it simplifies Rules 6-8.)

To see why a second round is necessary in most cases (Rule 7), a time diagram (Fig. 1) is useful. According to Rule 1, P_2 becomes black when sending message (a). If this were the last message of the underlying computation and all processes were passive shortly afterwards, the system would already be terminated at the time instant the current control round is started. The token then becomes black when passing by P_2, resulting in a false alarm. A second round is guaranteed to return a white token, since the first round made all processes white (Rule 5).
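The false alarm described above can be reproduced with a few lines of Python. This is only a minimal sketch under stated assumptions: all processes are already passive, so only the color-propagation rule and the "becomes white" rule act during a round, and the names `control_round` and `false_alarm_demo` as well as the dictionary representation are hypothetical and not part of the paper.

```python
def control_round(black: dict) -> bool:
    """Run one token round over the passive processes P_{n-1}, ..., P_1.

    `black` maps a process index to True if that process is black.
    Returns True if the token comes back black (test fails).
    """
    token_black = False  # P_n starts the round with a white token
    for i in sorted(black, reverse=True):
        # A process forwards a black token if it or the token is black.
        token_black = token_black or black[i]
        # A process transmitting the token becomes white.
        black[i] = False
    return token_black


def false_alarm_demo() -> tuple:
    # Assumed scenario with n = 4: P_2 sent the last basic message to a
    # higher-indexed process and therefore turned black before the
    # computation terminated.
    black = {1: False, 2: True, 3: False}
    first = control_round(black)   # token returns black: false alarm
    second = control_round(black)  # every process was whitened: token white
    return first, second
```

Calling `false_alarm_demo()` yields a black token in the first round and a white one in the second, which is the behavior that the single-round variant developed later is meant to avoid.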
To avoid false alarms, an algorithm has to distinguish case (a) from case (b), where a message is sent to a node which has already been visited by the token in the current control round.

According to Rule 3, active processes delay the propagation of the token. While this is acceptable for termination detection (in every control round except the last one, at least one basic message is exchanged, thereby bounding the number of token passes), it slows down the termination test. Especially to enable a fast negative response in the case that the computation has not terminated, Rule 3 could be replaced by the following rule.

Rule 3'. An active process P_i (i ≠ n) which has the token propagates a black token.

When the token encounters an active process, a second confirmation round (Rule 7) is not necessary. The changes of the rules which take into account this possibility are straightforward.

[...] receives a black token for the first [...] P_n receives a black token the second [...]

[Fig. 1: time diagram of the processes P_1, ..., P_n showing the previous and the current control round]

[Fig. 2]

We now show how to modify the algorithm in order to enable a test in one single round (even if the token did not encounter an active process) such that Rule 7 and Rule 8 can be replaced by the following one.

Rule 7'. If P_n receives a black token it announces the failure of the test.

The general idea of the improvement is simply that (according to the comments concluding the last section) a process has to register precisely those messages it sends to other processes which have already been visited by the token in the current control round. These are necessarily processes with a higher index. For that purpose a binary state indicator with values 0 (or "even") and 1 ("odd") is postulated to exist in each process, initialized to 0. To achieve that the control token changes the value of the state indicator (see Fig. 2), Rule 5 is replaced by Rule 5'.

Rule 5'. A process sending the token becomes white and changes its state.

If care is taken that at any time at most one token exists (i.e., P_n does not restart the algorithm before the completion of a previous control round), then exactly those messages which cross the diagonal line representing a control wave in the time diagram (messages (b), (c), and (d) in Fig. 2) are received in a different state than they were sent. Only messages of type (b) are of interest here, leading to the following replacement of Rule 1.

Rule 1'. A process sending a basic message to a recipient in a different state and with a higher index than its own becomes black.

Obviously, the sender of the message must be informed about the state of the receiver. We will discuss in the next section how this can be achieved. Surprisingly, in CSP this can be done without explicit messages and without changing the underlying communication protocol.

Notice that [...] on the one hand and Rules 1', 5', and 7' on the other hand are [...] (i.e., an algorithm based on Rules 1', 2, 3 (instead of 3'), 4, 5', 6, 7' is also a valid termination test).
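The interplay of the state indicator and the token can be traced in a small simulation. The following Python sketch is illustrative only and rests on several assumptions that are not taken from the paper: the class and method names are hypothetical, the token moves one hop per `pass_token` call, an active token holder forwards a black token immediately, and P_n flips its own state when it starts a round.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Process:
    index: int
    active: bool = False
    black: bool = False
    state: int = 0  # binary state indicator, flipped when the token is sent on


class Ring:
    def __init__(self, n: int) -> None:
        self.n = n
        self.procs = {i: Process(i) for i in range(1, n + 1)}
        self.holder: Optional[int] = None  # index of the current token holder
        self.token_black = False

    def send_basic(self, src: int, dst: int) -> None:
        sender, receiver = self.procs[src], self.procs[dst]
        assert sender.active, "only active processes may send"
        # Sender turns black only if the recipient is in a different state
        # and has a higher index (it was already visited in this round).
        if receiver.state != sender.state and receiver.index > sender.index:
            sender.black = True
        receiver.active = True

    def start_test(self) -> None:
        p_n = self.procs[self.n]
        assert not p_n.active and self.holder is None
        p_n.state ^= 1            # assumption: P_n also changes state
        self.token_black = False  # a white token is sent to P_{n-1}
        self.holder = self.n - 1

    def pass_token(self) -> Optional[bool]:
        """Let the holder forward the token; return the verdict once it reaches P_n."""
        p = self.procs[self.holder]
        self.token_black = self.token_black or p.black or p.active
        p.black = False  # sender of the token becomes white ...
        p.state ^= 1     # ... and changes its state
        if p.index == 1:
            self.holder = None
            return not self.token_black  # True means "terminated"
        self.holder = p.index - 1
        return None


def message_before_round() -> bool:
    ring = Ring(3)
    ring.procs[1].active = True
    ring.send_basic(1, 2)  # same state on both sides: P_1 stays white
    ring.procs[1].active = ring.procs[2].active = False
    ring.start_test()
    ring.pass_token()
    return ring.pass_token()


def message_crossing_wave() -> bool:
    ring = Ring(3)
    ring.procs[1].active = True
    ring.start_test()
    ring.pass_token()      # P_2 is passive and white, token stays white
    ring.send_basic(1, 2)  # P_2 was already visited: P_1 turns black
    ring.procs[1].active = False
    return ring.pass_token()
```

In the first scenario the last basic message precedes the round, so a single round reports termination. In the second scenario the message reaches a process the token has already passed, the sender turns black, and the same single round reports failure while P_2 is indeed still active.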
When used as a termination detector analogously to [9] (i.e., when the next round is initiated automatically after an unsuccessful test), the original Rule 3 should be used instead of Rule 3' in order to bound the number of control messages. Rule 7' then becomes as follows.

Rule 7''. If P_n receives a black token it starts [...]

The new detection algorithm (based on Rules 1', 2, 3, 4, 5', 6, 7'') usually generates fewer messages and detects termination faster than [9] (based on Rules 1-6, 7''), since no extra round after termination is necessary. The original algorithm [9] needs an extra confirmation round whenever some process sent a message to a process with a higher index after the previous round. In particular, a single round is sufficient if the algorithm is initiated after termination.

In [9] an invariant is established which (when stated verbally) reads "if the token is white then all processes the token has visited in the current round are passive or at least one process the token has not yet visited is black". It is easy to see that our variant of the algorithm satisfies this invariant and is therefore correct.

4. An implementation in CSP

Rule 1' requires that the sender is informed about the state of the receiver. We contend that in virtually all cases this can be done at no extra cost once a connection between two processes has been established by an underlying protocol and the processes have synchronized for communication. Since usually the transmission of a message is acknowledged by a control message (which unblocks the sender), it should be easy to let the one-bit state information piggyback on the acknowledgement. [...]

[...] information about the state of each process must be exchanged by the nontrivial handshaking protocol anyhow in order to reach agreement among the processes in selecting matching communication commands [7,11,13].

We now show how a CSP program P = [P_1 || ... || P_n] without a termination test can be systematically transformed into another program P' with an incorporated termination test based on the previously stated rules. As usual we make use of an extended version of the original CSP definition [10], where not only input commands are allowed in guards, but also output commands [5]. This extension is powerful since it allows signals to propagate backwards, as has been noted by Bougé [6].

As shown in [2,16] we may assume without loss of generality that each process P_i is already transformed into a semantically equivalent normal form

  P_i :: INIT; *[ ... ]

with a top-level repetitive command, where each B_k is an optional boolean expression list and none of the lists INIT and S_k contains an I/O command or repetitive command. The index sets [...] are assumed to be disjoint. We only consider simple variables and expressions in I/O commands; structured variables and expressions [10] can be handled analogously by tagging the construction identifier with "even" or "odd".

For the purpose of termination detection, those alternative parts

  □ B_k ; P_jk ? V_k → S_k

with input guards for which j_k < i (a message is received from a process with a smaller index) are split up into two parts:

  □ state = 0 ; B_k ; P_jk ? even(V_k) → S_k
  □ state = 1 ; B_k ; P_jk ? odd(V_k) → S_k

The alternative parts

  □ B_k ; P_jk ! e_k → S_k

with output [...] i (a message is sent to a [...]

The expression constructors even and odd match a corresponding input command only if the receiver is in state 0 or 1, respectively. The sender sets the boolean flag black whenever it finds out that it is in a different state, thus complying with Rule 1'.
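How the sender can learn the receiver's state from the matching constructor alone may be easier to see outside CSP. The Python fragment below is a rough model, not the CSP transformation itself: the functions `receiver_accepts` and `rendezvous` are hypothetical, and the assumption is that the sender offers its message under both tags and that exactly one offer matches the receiver's current state.

```python
def receiver_accepts(receiver_state: int, tag: str) -> bool:
    """Model of the split input guards: `even` matches only in state 0,
    `odd` matches only in state 1."""
    return (tag == "even" and receiver_state == 0) or (
        tag == "odd" and receiver_state == 1
    )


def rendezvous(sender_state: int, sender_index: int,
               receiver_state: int, receiver_index: int,
               black: bool) -> bool:
    """Model one synchronous send to a higher-indexed receiver.

    Returns the sender's new `black` flag. No extra message is exchanged:
    the sender infers the receiver's state from the tag that matched.
    """
    assert receiver_index > sender_index
    matched = next(
        tag for tag in ("even", "odd") if receiver_accepts(receiver_state, tag)
    )
    learned_state = 0 if matched == "even" else 1
    # The flag is set when the states differ; an earlier black flag persists.
    return black or (learned_state != sender_state)
```

For instance, a sender in state 0 whose message is accepted under the odd tag concludes that the receiver has already been visited by the token in the current round and marks itself black.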
For the propagation of the token we add two guarded commands at the top-level loop to all processes P_1, ..., P_{n-1}. They implement Rules 3', 4, and 5' (if i = 1, then P_{i-1} denotes P_n).

  □ P_{i+1} ? token(color) → have_token := true
  □ have_token ; P_{i-1} ! token(color ∨ black ∨ ¬PASSIVE) → [...] ; state := (state + 1) mod 2

PASSIVE may be seen as a system variable indicating whether the process is active or passive. A possible interpretation of PASSIVE, which is consistent with the specifications in Section 1, consists of the following definition:

  PASSIVE ≡ the process is at its top-level loop ∧ ¬B_k for all k in the index set of the alternatives with output guards.

This predicate can easily be implemented. Notice, however, that it does not handle the cases where some output commands are enabled but permanently (or temporarily) blocked because the receiving processes are not ready to accept the messages. For the sake of simplicity we also ignore the so-called Distributed Termination Convention [3] and other problems caused by processes which terminate by leaving the top-level loop of the original program.

For the purpose of termination detection the token should only be accepted or propagated if the process is passive (Rule 3 instead of Rule 3'). The guarded commands may then be changed to

  □ PASSIVE ; P_{i+1} ? token(color) → [...]
  □ have_token ; PASSIVE ; P_{i-1} ! token(color ∨ black) → [...]

It is not possible, however, to combine the commands into one single command

  □ PASSIVE ; P_{i+1} ? token(color) → P_{i-1} ! token(color ∨ black) ; [...]

since the use of an unconditional output command [...]

P_n has a special role since it initiates the control round and has to announce the result. Because of Rule 1', P_n can never become black. [...] that START and TERMINATED are two [...] in an obvious way by the environment or parts of the program not shown here according to Rules 2, 6, and 7'.

  [...] state := (state + 1) mod 2
  □ P_1 ? token(color) → TERMINATED := ¬color

When detecting termination, P_n should inform all other processes, which may then terminate properly in the sense of [10] by leaving the top-level loop. Otherwise P_n may restart the [...] some later time by setting START [...]

Finally, we note the initialization of the control variables added to the INIT parts: [...] The default initial value of START should be [...]

The implementation in CSP makes use of a trick due to Bougé [6] which allows information to be transmitted backwards along unidirectional channels without explicit messages. [...] this feature is peculiar to CSP (it relies on the semantics of the "alternative send"), it is always the case that in a synchronous communication scheme (e.g., Ada, Occam) the sender of a message gets some knowledge about the receiver's state when the message is accepted. The only potential problem is the efficient realization of Rule 1'. In practice, however, it should be possible to piggyback the additional one-bit state information [...] message of the underlying protocol at no cost.

Previous solutions to the distributed termination problem based on synchronous communication which require only one control round after termination have been published by Rana [12] and by Apt and Richier [4]. Rana's solution uses synchronized real time clocks. Apt and Richier present a symmetric algorithm with tightly synchronized virtual clocks. In their solution the clocks of the sender and the receiver are synchronized after every basic communication, requiring two control messages for each basic message.

Our principle can also be applied to control topologies other than rings (e.g., trees [14], star networks, and parallel or sequential graph traversal schemes) where a particular process can collect the colors of the other processes. To detect messages which are sent to already visited processes and to differentiate these messages from messages which, during a previous control round, have been sent by already visited processes to processes which have not yet been visited (messages of type (b) and [...] in Fig. 2), a state modulo 3 is sufficient. Rule 1 is then replaced by the following one.

Rule 1''. A process in state s sending a basic message to a recipient in state (s + 1) mod 3 becomes black.

References

[1] K.R. Apt, Correctness proofs of distributed termination algorithms, ACM Trans. Programming Languages Systems 8 (1986) 388-405.
[2] K.R. Apt, L. Bougé and Ph. Clermont, Two normal form theorems for CSP programs, Inform. Process. Lett. 26 (1987) 165-171.
[3] K.R. Apt and N. Francez, Modeling the distributed termination convention of CSP, ACM Trans. Programming Languages Systems 6 (1984) 370-379.
[4] K.R. Apt and J.-L. Richier, Real time clocks versus virtual clocks, in: M. Broy, ed., Control Flow and Data Flow: Concepts of Distributed Programming (Springer, Berlin, 1985) 475-501.
[5] A.J. Bernstein, Output guards and nondeterminism in "Communicating sequential processes", ACM Trans. Programming Languages Systems 2 (1980) 234-238.
[6] L. Bougé, On the existence of generic broadcast algorithms in networks of communicating sequential processes, in: J. van Leeuwen, ed., Proc. 2nd Internat. Workshop on Distributed Algorithms, Lecture Notes in Computer Science, Vol. 312 (Springer, Berlin, 1988) 388-407.
[7] G.N. Buckley and A. Silberschatz, An effective implementation for the generalized input-output construct of CSP, ACM Trans. Programming Languages Systems 5 (1983) 223-235.
[8] K.M. Chandy and L. Lamport, Distributed snapshots: Determining global states of distributed systems, ACM Trans. Comput. Systems 3 (1985) 63-75.
[9] E.W. Dijkstra, W.H.J. Feijen and A.J.M. van Gasteren, Derivation of a termination detection algorithm for distributed computations, Inform. Process. Lett. 16 (1983) 217-219.
[10] C.A.R. Hoare, Communicating sequential processes, Comm. ACM 21 (1978) 666-677.
[11] R.B. Kieburtz and A. Silberschatz, Comments on "Communicating sequential processes", ACM Trans. Programming Languages Systems 1 (1979) 218-225.
[12] S.P. Rana, A distributed solution of the distributed termination problem, Inform. Process. Lett. 17 (1983) 43-46.
[13] A. Silberschatz, Communication and synchronization in distributed systems, IEEE Trans. Software Eng. SE-5 (1979) 542-546.
[14] R.W. Topor, Termination detection for distributed computations, Inform. Process. Lett. 18 (1984) 33-36.
[15] J.P. Verjus, On the proof of a distributed algorithm, Inform. Process. Lett. 25 (1987) 145-147.
[16] D. Zöbel, Normalform-Transformationen für CSP-Programme, Informatik - Forschung und Entwicklung 3 (1988) 64-76.