Event Ordering. Greg Bilodeau CS 5204 November 3, 2009

Similar documents
OUTLINE. Introduction Clock synchronization Logical clocks Global state Mutual exclusion Election algorithms Deadlocks in distributed systems

Clock Synchronization. Synchronization. Clock Synchronization Algorithms. Physical Clock Synchronization. Tanenbaum Chapter 6 plus additional papers

Advanced Topics in Distributed Systems. Dr. Ayman A. Abdel-Hamid. Computer Science Department Virginia Tech

Process groups and message ordering

Parallel and Distributed Systems. Programming Models. Why Parallel or Distributed Computing? What is a parallel computer?

Last Class: Naming. Today: Classical Problems in Distributed Systems. Naming. Time ordering and clock synchronization (today)

Multi-threaded programming in Java

Coordination and Agreement

Arranging lunch value of preserving the causal order. a: how about lunch? meet at 12? a: <receives b then c>: which is ok?

Lecture 7: Logical Time

Chapter 6 Synchronization

Coordination 1. To do. Mutual exclusion Election algorithms Next time: Global state. q q q

Coordination and Agreement

2017 Paul Krzyzanowski 1

DISTRIBUTED MUTEX. EE324 Lecture 11

CSE 5306 Distributed Systems

CSE 486/586 Distributed Systems

Frequently asked questions from the previous class survey

Synchronization. Clock Synchronization

CS455: Introduction to Distributed Systems [Spring 2018] Dept. Of Computer Science, Colorado State University

Coordination and Agreement

Distributed Systems. coordination Johan Montelius ID2201. Distributed Systems ID2201

Clock and Time. THOAI NAM Faculty of Information Technology HCMC University of Technology

殷亚凤. Synchronization. Distributed Systems [6]

Three Models. 1. Time Order 2. Distributed Algorithms 3. Nature of Distributed Systems1. DEPT. OF Comp Sc. and Engg., IIT Delhi

wait with priority An enhanced version of the wait operation accepts an optional priority argument:

Causal Order Protocols for Group Communication

SYNCHRONIZATION. DISTRIBUTED SYSTEMS Principles and Paradigms. Second Edition. Chapter 6 ANDREW S. TANENBAUM MAARTEN VAN STEEN

Chapter 6 Synchronization (1)

Synchronization. Chapter 5

Distributed KIDS Labs 1

Coordination 2. Today. How can processes agree on an action or a value? l Group communication l Basic, reliable and l ordered multicast

CSE 486/586 Distributed Systems

CSE 5306 Distributed Systems. Consistency and Replication

Time in Distributed Systems

Distributed Systems Exam 1 Review. Paul Krzyzanowski. Rutgers University. Fall 2016

Distributed Coordination 1/39

CS 455: INTRODUCTION TO DISTRIBUTED SYSTEMS [DISTRIBUTED MUTUAL EXCLUSION] Frequently asked questions from the previous class survey

Virtual Synchrony. Ki Suh Lee Some slides are borrowed from Ken, Jared (cs ) and JusBn (cs )

Homework #2 Nathan Balon CIS 578 October 31, 2004

CS455: Introduction to Distributed Systems [Spring 2018] Dept. Of Computer Science, Colorado State University

Important Lessons. A Distributed Algorithm (2) Today's Lecture - Replication

Lecture 6: Logical Time

TDDD82 Secure Mobile Systems Lecture 1: Introduction and Distributed Systems Models

Distributed Systems - I

Distributed Systems. 09. State Machine Replication & Virtual Synchrony. Paul Krzyzanowski. Rutgers University. Fall Paul Krzyzanowski

Group ID: 3 Page 1 of 15. Authors: Anne-Marie Bosneag Ioan Raicu Davis Ford Liqiang Zhang

Reliable Distributed System Approaches

Distributed Systems Question Bank UNIT 1 Chapter 1 1. Define distributed systems. What are the significant issues of the distributed systems?

The UNIVERSITY of EDINBURGH. SCHOOL of INFORMATICS. CS4/MSc. Distributed Systems. Björn Franke. Room 2414

QUESTIONS Distributed Computing Systems. Prof. Ananthanarayana V.S. Dept. Of Information Technology N.I.T.K., Surathkal

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING UNIT-1

(Pessimistic) Timestamp Ordering. Rules for read and write Operations. Read Operations and Timestamps. Write Operations and Timestamps

Distributed Systems Exam 1 Review Paul Krzyzanowski. Rutgers University. Fall 2016

CSE 124: LAMPORT AND VECTOR CLOCKS. George Porter October 30, 2017

Programming with Process Groups: Group and Multicast Semantics

CMSC 714 Lecture 14 Lamport Clocks and Eraser

Distributed Mutual Exclusion

Coupling Thursday, October 21, :23 PM

Several of these problems are motivated by trying to use solutiions used in `centralized computing to distributed computing

CSE 5306 Distributed Systems

Global shared variables. Message passing paradigm. Communication Ports. Port characteristics. Sending a message 07/11/2018

Data-Centric Consistency Models. The general organization of a logical data store, physically distributed and replicated across multiple processes.

Synchronization. Distributed Systems IT332

Distributed Systems. Pre-Exam 1 Review. Paul Krzyzanowski. Rutgers University. Fall 2015

A multi-tasking wordset for Standard Forth. Andrew Haley Consulting Engineer 8 September 2017

Synchronization Part 2. REK s adaptation of Claypool s adaptation oftanenbaum s Distributed Systems Chapter 5 and Silberschatz Chapter 17

DEPARTMENT OF INFORMATION TECHNOLOGY QUESTION BANK. UNIT I PART A (2 marks)

Causal Order Multicast Protocol Using Different Information from Brokers to Subscribers

On the interconnection of message passing systems

Data Replication CS 188 Distributed Systems February 3, 2015

VALLIAMMAI ENGINEERING COLLEGE

Process Synchroniztion Mutual Exclusion & Election Algorithms

CSE 5306 Distributed Systems. Synchronization

CSE 380 Computer Operating Systems

Concurrency and OS recap. Based on Operating System Concepts with Java, Sixth Edition, 2003, Avi Silberschatz, Peter Galvin e Greg Gagne

Non-blocking Array-based Algorithms for Stacks and Queues. Niloufar Shafiei

Distributed Systems Multicast & Group Communication Services

Indirect Communication

Failure Tolerance. Distributed Systems Santa Clara University

Virtual Synchrony. Jared Cantwell

(Pessimistic) Timestamp Ordering

殷亚凤. Consistency and Replication. Distributed Systems [7]

Synchronisation in. Distributed Systems. Co-operation and Co-ordination in. Distributed Systems. Kinds of Synchronisation. Clock Synchronization

Verteilte Systeme (Distributed Systems)

Distributed Systems COMP 212. Lecture 17 Othon Michail

An application-level implementation of causal timestamps and causal ordering

Basic vs. Reliable Multicast

Chapter 5 Asynchronous Concurrent Execution

Linearizability CMPT 401. Sequential Consistency. Passive Replication

Rollback-Recovery Protocols for Send-Deterministic Applications. Amina Guermouche, Thomas Ropars, Elisabeth Brunet, Marc Snir and Franck Cappello

Time Synchronization and Logical Clocks

Concurrent & Distributed Systems Supervision Exercises

Lecture 10: Clocks and Time

Distributed Systems Coordination and Agreement

Checkpointing and Rollback Recovery in Distributed Systems: Existing Solutions, Open Issues and Proposed Solutions

Lecture 12: Time Distributed Systems

Distributed Systems (5DV020) Group communication. Fall Group communication. Fall Indirect communication

Synchronisation and Coordination (Part 2)

Chapter 8 Fault Tolerance

Transcription:

Greg Bilodeau CS 5204 November 3, 2009

Fault Tolerance How do we prepare for rollback and recovery in a distributed system? How do we ensure the proper processing order of communications between distributed processes?

Time No shared clock All specifications of a system must be given in terms of events observable within that system Can we construct a concept of time that would be useful from events of a distributed system?

Events An event is just an event of interest example: a communication between processes Single process defined as totally ordered sequence of events

Events Happened before relation : If a and b from same process and a comes before b a is a send and b a receive from different processes If a b and b c, then a c Events a and b concurrent if!a b and!b a

Events Another definition: events causally affect each other a b means it is possible for a to causally affect b a and b are concurrent if they cannot causally affect each other

Assigns a number to an event Simple counter Clock Condition: For a, b: if a b then C(a) < C(b) C(p 1 ) < C(p 2 ) C(p 1 ) < C(q 2 ) C1: Line between local events C2: Line between send and receive Logical Clocks

Logical Clocks How we meet these conditions: C1: Each process increments its clock between successive events C2: Requires each message to include a timestamp equal to time the message was sent Receiver sets its own clock to a value greater than or equal to its own value and greater than the timestamp from the message cannot move its clock backward

Example of Lamport s Algorithm Event Ordering 1 2 3 10 P 1 1 4 6 P 2 5 P 3 1 2 3 4 5 6 7 8 9 CS 5204 Operating Systems 9

Lamport s Approach Just order events according to times at which they occur If times are equal, choose one to proceed Example: mutual exclusion problem Assume all messages received in order Assume all messages eventually received Each process has own request queue Conditions we must achieve: Process with resource must release before used by others Requests must be granted in order made Every request must eventually be granted

Lamport s Mutual Exclusion Example Event Ordering Process P i sends T m :P i message to all others, adds message to own request queue Process P j adds resource request to its queue, sends a time stamped acknowledgement When finished, P i removes the message from its queue, sends a time stamped removal to all others Process P j removes the resource request from the queue P i can use the resource when: It s own request is ordered before any others in its queue It has received a message from all others stamped later than T m

Limits of Lamport Clock times cannot guarantee causal relationship We can say if a b then C(a) < C(b) CANNOT say if C(a) < C(b) then a b Concept of time is exclusive to each process, i.e. causality only in same process We can provides this through: Using physical clocks Using vector clocks

Vector Time The vector time for p i, VT(p i ): Length n, where n is number of processes in group Initialized to all zeros p i increments VT(p i )[i] when sending m Each message sent in time-stamped with VT(p i ) Receiving processes in the group modify their vector clock: Vector time-stamp of m counts the number of messages that causally precede m on a per-sender basis

Vector Clocks P 1 (1,0,0) (2,0,0) (3,4,1) (2,3,1) P 2 (0,1,0) (2,2,0) (2,4,1) P 3 (0,0,2) (0,0,1) CS 5204 Operating Systems 14

Birman-Schiper-Stephenson ISIS toolkit tools for building software in loosely coupled distributed environments CBCAST multicast primitive Fault-tolerant, causally ordered message delivery Asynchronous ABCAST Extension allowing total ordering Synchronous Group communication Imposes overhead proportional to group size

Birman-Schiper-Stephenson Cooperative processes form groups Processes multicast to all members of their group(s) Delivery times are uncertain possible to receive messages out of causal sequence Message processing mechanism must provide lossless, uncorrupted and sequenced delivery Distinction between receiving and delivering Allows delay of delivery until some condition satisfied i.e. causal order maintained

P 1 Send(M 1 ) Causal Ordering of Messages Space P 2 Send(M 2 ) P 3 Time CS 5204 Operating Systems 17

Vector Clocks in BSS Values in vector clock indicate how many multicasts preceded message by each process; must process same number from each before same state is reached Recipient will delay delivery of the message using a delay queue until corresponding number of messages have been received

Conclusions Causal relationships between events of processes in a distributed environment are critical when discussing fault-tolerance and rollback/recovery Achieving total ordering of events is difficult in the absence of a shared clock Mechanisms to provide shared logical clocks use simple counters but can enforce causal orderings

Questions?