Chapter 6 Synchronization (1)

Similar documents
Time in Distributed Systems

CSE 5306 Distributed Systems. Synchronization

CSE 5306 Distributed Systems

Clock Synchronization. Synchronization. Clock Synchronization Algorithms. Physical Clock Synchronization. Tanenbaum Chapter 6 plus additional papers

Distributed Coordination 1/39

Chapter 6 Synchronization

TIME ATTRIBUTION 11/4/2018. George Porter Nov 6 and 8, 2018

Advanced Topics in Distributed Systems. Dr. Ayman A. Abdel-Hamid. Computer Science Department Virginia Tech

SYNCHRONIZATION. DISTRIBUTED SYSTEMS Principles and Paradigms. Second Edition. Chapter 6 ANDREW S. TANENBAUM MAARTEN VAN STEEN

CS555: Distributed Systems [Fall 2017] Dept. Of Computer Science, Colorado State University

Time Synchronization and Logical Clocks

殷亚凤. Synchronization. Distributed Systems [6]

Synchronization Part 1. REK s adaptation of Claypool s adaptation of Tanenbaum s Distributed Systems Chapter 5

Time. COS 418: Distributed Systems Lecture 3. Wyatt Lloyd

Time Synchronization and Logical Clocks

Last Class: Naming. Today: Classical Problems in Distributed Systems. Naming. Time ordering and clock synchronization (today)

Verteilte Systeme (Distributed Systems)

Distributed Systems Exam 1 Review Paul Krzyzanowski. Rutgers University. Fall 2016

Lecture 10: Clocks and Time

Lecture 12: Time Distributed Systems

Distributed Systems. Clock Synchronization: Physical Clocks. Paul Krzyzanowski

Synchronisation in. Distributed Systems. Co-operation and Co-ordination in. Distributed Systems. Kinds of Synchronisation. Clock Synchronization

Distributed Systems (5DV147)

ICT 6544 Distributed Systems Lecture 7

CSE 124: LAMPORT AND VECTOR CLOCKS. George Porter October 30, 2017

Distributed Systems COMP 212. Lecture 17 Othon Michail

Clock-Synchronisation

Chapter 8 Fault Tolerance

Arranging lunch value of preserving the causal order. a: how about lunch? meet at 12? a: <receives b then c>: which is ok?

Time in Distributed Systems

Synchronization. Distributed Systems IT332

Coordination 2. Today. How can processes agree on an action or a value? l Group communication l Basic, reliable and l ordered multicast

Clock and Time. THOAI NAM Faculty of Information Technology HCMC University of Technology

Lecture 7: Logical Time

Event Ordering. Greg Bilodeau CS 5204 November 3, 2009

Chapter 6 Synchronization (2)

Distributed Systems. Pre-Exam 1 Review. Paul Krzyzanowski. Rutgers University. Fall 2015

CS 455: INTRODUCTION TO DISTRIBUTED SYSTEMS [DISTRIBUTED MUTUAL EXCLUSION] Frequently asked questions from the previous class survey

Distributed Systems. 05. Clock Synchronization. Paul Krzyzanowski. Rutgers University. Fall 2017

CS455: Introduction to Distributed Systems [Spring 2018] Dept. Of Computer Science, Colorado State University

OUTLINE. Introduction Clock synchronization Logical clocks Global state Mutual exclusion Election algorithms Deadlocks in distributed systems

Synchronization. Clock Synchronization


Process groups and message ordering

Distributed Operating Systems. Distributed Synchronization

Distributed Systems Principles and Paradigms

2. Time and Global States Page 1. University of Freiburg, Germany Department of Computer Science. Distributed Systems

TIME AND SYNCHRONIZATION. I. Physical Clock Synchronization: Motivation and Challenges

Large Systems: Design + Implementation: Communication Coordination Replication. Image (c) Facebook

Implementing a NTP-Based Time Service within a Distributed Middleware System

Distributed Systems Exam 1 Review. Paul Krzyzanowski. Rutgers University. Fall 2016

Data Replication CS 188 Distributed Systems February 3, 2015

CSE 124: TIME SYNCHRONIZATION, CRISTIAN S ALGORITHM, BERKELEY ALGORITHM, NTP. George Porter October 27, 2017

Distributed Systems Principles and Paradigms

MYE017 Distributed Systems. Kostas Magoutis

Coordination 1. To do. Mutual exclusion Election algorithms Next time: Global state. q q q

Chapter 8 Fault Tolerance

2017 Paul Krzyzanowski 1

Parallel and Distributed Systems. Programming Models. Why Parallel or Distributed Computing? What is a parallel computer?

Fault Tolerance. Distributed Software Systems. Definitions

Module 7 - Replication

Last Class: Naming. DNS Implementation

Lecture 6: Logical Time

DISTRIBUTED REAL-TIME SYSTEMS

Multi-threaded programming in Java

Distributed Systems. Day 7: Time and Snapshots [Part 1]

CSE 380 Computer Operating Systems

Ad hoc and Sensor Networks Time Synchronization

Synchronization. Chapter 5

Failure Tolerance. Distributed Systems Santa Clara University

DISTRIBUTED MUTEX. EE324 Lecture 11

Distributed Systems Principles and Paradigms. Chapter 08: Fault Tolerance

DISTRIBUTED SYSTEMS. Second Edition. Andrew S. Tanenbaum Maarten Van Steen. Vrije Universiteit Amsterdam, 7'he Netherlands PEARSON.

Important Lessons. A Distributed Algorithm (2) Today's Lecture - Replication

INTRODUCTION TO WIRELESS SENSOR NETWORKS. CHAPTER 7: TIME SYNCHRONIZATION Anna Förster

Distributed Systems Multicast & Group Communication Services

Distributed Synchronization. EECS 591 Farnam Jahanian University of Michigan

Distributed Clock Synchronization Algorithms: A Survey

Distributed Systems COMP 212. Lecture 1 Othon Michail

wait with priority An enhanced version of the wait operation accepts an optional priority argument:

Basic vs. Reliable Multicast

CSE 5306 Distributed Systems. Consistency and Replication

Distributed Systems. coordination Johan Montelius ID2201. Distributed Systems ID2201

Distributed Systems Synchronization

Coordination and Agreement

Midterm Examination ECE 419S 2015: Distributed Systems Date: March 13th, 2015, 6-8 p.m.

Specifying and Proving Broadcast Properties with TLA

The Timed Asynchronous Distributed System Model By Flaviu Cristian and Christof Fetzer

Chapter 7 Consistency And Replication

Reliable Distributed System Approaches

Chapter 4 Communication

Three Models. 1. Time Order 2. Distributed Algorithms 3. Nature of Distributed Systems1. DEPT. OF Comp Sc. and Engg., IIT Delhi

Coordination and Agreement

Distributed Systems Fault Tolerance

Distributed Systems. 09. State Machine Replication & Virtual Synchrony. Paul Krzyzanowski. Rutgers University. Fall Paul Krzyzanowski

Process Synchroniztion Mutual Exclusion & Election Algorithms

DISTRIBUTED SYSTEMS Principles and Paradigms Second Edition ANDREW S. TANENBAUM MAARTEN VAN STEEN. Chapter 7 Consistency And Replication

殷亚凤. Consistency and Replication. Distributed Systems [7]

Concurrency and OS recap. Based on Operating System Concepts with Java, Sixth Edition, 2003, Avi Silberschatz, Peter Galvin e Greg Gagne

On the interconnection of message passing systems

Transcription:

DISTRIBUTED SYSTEMS Principles and Paradigms Second Edition ANDREW S. TANENBAUM MAARTEN VAN STEEN Chapter 6 Synchronization (1) With material from Ken Birman Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, e, (c) 7 Prentice-Hall, Inc. All rights reserved. -13-397-5

Plan Clock synchronization in distributed systems Physical clocks Logical clocks Ordered multicasting JGroups Mutual exclusion Election Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, e, (c) 7 Prentice-Hall, Inc. All rights reserved. -13-397-5

Time We have been somewhat careful to avoid talking about time until now But, in distributed system we need practical ways to deal with time E.g., we may need to agree that update A occurred before update B Or offer a lease on a resource that expires at time 1:1:1.5 Or guarantee that a time critical event will reach all interested parties within 1ms Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, e, (c) 7 Prentice-Hall, Inc. All rights reserved. -13-397-5

But what does Time mean? Time on a machine s local clock But was it set accurately? And could it drift, e.g. run fast or slow? What about faults, like stuck bits? Time on a global clock? E.g., with GPS receiver Still not accurate enough to determine which events happens before other events Or could try to agree on time Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, e, (c) 7 Prentice-Hall, Inc. All rights reserved. -13-397-5

Basic approaches Physical time Synchronize local clocks (internal synchronization) Synchronize with external source (external synchronization) Logical time Avoid absolute statements about time Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, e, (c) 7 Prentice-Hall, Inc. All rights reserved. -13-397-5

Computer Clocks Typical computer has timer circuit for keeping tracks of time Quartz crystal that oscillate at a well-defined frequence Counter decremented each oscillation Holding register incremented when counter reaches zero Time always goes forwards on a single computer But not when compared across distributed nodes Crystals oscillate at slightly different frequency -> clock skew Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, e, (c) 7 Prentice-Hall, Inc. All rights reserved. -13-397-5

Computer Clocks Difference between time on M and M1 goes from > to < Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, e, (c) 7 Prentice-Hall, Inc. All rights reserved. -13-397-5

Physical Time How to get absolute physical time on computer? Atomic clocks are expensive Universal Coordinated Time (UTC) sources often broadcast UTC second starts E.g., WWV, Fort Collins, Colorado Accuracy of ~ 1 msec (1 x 1-3 sec) Global Positioning System Claimed error < 6 nsec for (6 x 1-9 sec) Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, e, (c) 7 Prentice-Hall, Inc. All rights reserved. -13-397-5

Clock Synchronization Algorithms Generally have time server Needs external time source WWV, GPS, ninja, administrator, System model Timers work within a maximum drift rate, clock time offset To guarantee that clocks do not drift more than seconds apart, synchronize at least every seconds: 1 skew Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, e, (c) 7 Prentice-Hall, Inc. All rights reserved. -13-397-5

Server Cristian s Algorithm Client A s offset relative to B,, is given by dt req dt res θ T 3 + (T T 1 ) + (T 4 T 3 ) T 4 = (T T 1 ) + (T 3 T 4 ) : speed up A s (virtual) clock towards B s : slow down A s (virtual) clock towards B s Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, e, (c) 7 Prentice-Hall, Inc. All rights reserved. -13-397-5

Server Network Time Protocol Client The delay estimation,, is given by Calculate and multiple (8) times Choose for which is minimal as offset estimate Divide time servers into strata External clock is stratum Stratum A server adjust their time according to stratum B server if B < A, then becomes stratum B+1 server Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, e, (c) 7 Prentice-Hall, Inc. All rights reserved. -13-397-5

The Berkeley Algorithm Time daemon/server is actively requesting current time difference from machines Calculates average and requests machines to adjust clock Works if no interface with external machine with physical time synchronization Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, e, (c) 7 Prentice-Hall, Inc. All rights reserved. -13-397-5

Clock Synchronization in Wireless Networks Challenges Most machines cannot contact each other directly Large overhead in multihop routing Cannot deploy fixed time servers Need to optimize for energy consumption Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, e, (c) 7 Prentice-Hall, Inc. All rights reserved. -13-397-5

Reference Broadcast Synchronization Variable: Context switches, system call overhead (RBS) Variable: Media Access Control protocol Constant: Sufficiently constant: Can timestamp early Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, e, (c) 7 Prentice-Hall, Inc. All rights reserved. -13-397-5

Reference Broadcast Synchronization (RBS) Assume we know clock skew and have corrected for it I.e., clock offset can be regarded as constant Algorithm Transmitter broadcasts m reference packets Each of n receivers note when reference was observed according to local clock Receivers exchange observations Calculate pairwise offset as Offset[i, j] = 1 m m (T i,k T j,k ) Where k=1 is time of receipt of message k at node i Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, e, (c) 7 Prentice-Hall, Inc. All rights reserved. -13-397-5

Logical Time The Berkeley and RBS algorithms do not necessarily give any correspondence to UTC But they do give correspondence to computer clock A lot of times this is not necessary If we just want to know if event a happened before or after event b Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, e, (c) 7 Prentice-Hall, Inc. All rights reserved. -13-397-5

Lamport s Approach Leslie Lamport suggested that we should reduce time to its basics Cannot order events according to a global clock None available Can use logical clock Time basically becomes a way of labeling events so that we may ask if event A happened before event B Answer should be consistent with what could have happened with respect to a global clock Often this is what matters Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, e, (c) 7 Prentice-Hall, Inc. All rights reserved. -13-397-5

Drawing time-line pictures: p snd p (m) m q rcv q (m) deliv q (m) D Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, e, (c) 7 Prentice-Hall, Inc. All rights reserved. -13-397-5

Drawing time-line pictures: p A snd p (m) B m q C rcv q (m) deliv q (m) D A, B, C and D are events. Could be anything meaningful to the application microcode, program code, file write, message handling, So are snd(m) and rcv(m) and deliv(m) What ordering claims are meaningful? Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, e, (c) 7 Prentice-Hall, Inc. All rights reserved. -13-397-5

Drawing time-line pictures: p A snd p (m) B m q C rcv q (m) deliv q (m) D A happens before B, and C before D Local ordering at a single process Write and Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, e, (c) 7 Prentice-Hall, Inc. All rights reserved. -13-397-5

Drawing time-line pictures: p A snd p (m) B m q C rcv q (m) deliv q (m) D snd p (m) also happens before rcv q (m) Distributed ordering introduced by a message Write Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, e, (c) 7 Prentice-Hall, Inc. All rights reserved. -13-397-5

Drawing time-line pictures: p A snd p (m) B m q C rcv q (m) deliv q (m) D A happens before D Transitivity: A happens before snd p (m), which happens before rcv q (m), which happens before D Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, e, (c) 7 Prentice-Hall, Inc. All rights reserved. -13-397-5

Drawing time-line pictures: p A snd p (m) B m q C rcv q (m) deliv q (m) D B and D are concurrent Looks like B happens first, but D has no way to know. No information flowed Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, e, (c) 7 Prentice-Hall, Inc. All rights reserved. -13-397-5

The Happens-Before Relation We ll say that A happens-before B, written A B, if 1) A P B according to the local ordering, or ) A is a snd and B is a rcv and A M B, or A and B are related under the transitive closure of rules 1. and. Thus, A D So far, this is just a mathematical notation, not a systems tool A new event seen by a process happens logically after other events seen by that process A message receive happens logically after a message has been sent Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, e, (c) 7 Prentice-Hall, Inc. All rights reserved. -13-397-5

Logical clocks A simple tool that can capture parts of the happens before relation First version: uses just a single integer Designed for big (64-bit or more) counters Each process p maintains C p, a local counter A message m will carry C m Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, e, (c) 7 Prentice-Hall, Inc. All rights reserved. -13-397-5

Rules for managing logical clocks When an event happens at a process p it increments C p Any event that matters to p Normally, also snd and rcv events (since we want receive to occur after the matching send) When p sends m, set C m = C p When q receives m, set C q = max(c q, C m )+1 Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, e, (c) 7 Prentice-Hall, Inc. All rights reserved. -13-397-5

Time-line with LT annotations p A snd p (m) B C p 1 1 3 3 3 3 m q C rcv q (m) D deliv q (m) C q 1 1 1 1 3 3 3 4 5 5 C(A) = 1, C(snd p (m)) =, C(m) = C(rcv q (m)) = max(1,)+1 = 3, etc Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, e, (c) 7 Prentice-Hall, Inc. All rights reserved. -13-397-5

Logical clocks If A happens before B, A B, then C(A)<C(B) A B : A = E En = B, where each pair is ordered either by p or m LT associated with these only increase But converse might not be true: If C(A)<C(B) can t be sure that A B This is because processes that don t communicate still assign timestamps and hence events will seem to have an order Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, e, (c) 7 Prentice-Hall, Inc. All rights reserved. -13-397-5

Can we do better? One option is to use vector clocks Here we treat timestamps as a vector One counter for each process Rules for managing vector times differ from what we did with logical clocks Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, e, (c) 7 Prentice-Hall, Inc. All rights reserved. -13-397-5

Vector clocks Clock is a vector: e.g. VC(A)=[1, ] We ll just assign p index and q index 1 Vector clocks require either agreement on the numbering/static membership, or that the actual process id s be included with the vector Rules for managing vector clock When event happens at p, increment VC p [index p ] Normally, also increment for snd and rcv events When sending a message, set VC(m)=VC p When receiving, set VC q =max(vc q, VC(m)) Where max is max on components of vector VC(m)[index p ]-1 events causally preceded m at p Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, e, (c) 7 Prentice-Hall, Inc. All rights reserved. -13-397-5

Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, e, (c) 7 Prentice-Hall, Inc. All rights reserved. -13-397-5 Time-line with VT annotations p q m D A C B rcv q (m) deliv q (m) snd p (m) VC q 1 1 1 1 3 3 4 VC p 1 1 3 3 3 3 VT(m)=[,] Could also be [1,] if we decide not to increment the clock on a snd event. Decision depends on how the timestamps will be used.

Rules for comparison of VCs We ll say that VC A VC B if i, VC A [i] VC B [i] And we ll say that VT A < VT B if VC A VC B but VC A VC B That is, for some i, VC A [i] < VC B [i] Examples? [,4] [,4] [1,3] < [7,3] [1,3] is incomparable to [3,1] Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, e, (c) 7 Prentice-Hall, Inc. All rights reserved. -13-397-5

Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, e, (c) 7 Prentice-Hall, Inc. All rights reserved. -13-397-5 Time-line with VC annotations VC(A)=[1,]. VC(D)=[,4]. So VC(A)<VC(D) VC(B)=[3,]. So VC(B) and VC(D) are incomparable p q m D A C B rcv q (m) deliv q (m) snd p (m) VC q 1 1 1 1 3 3 4 VC p 1 1 3 3 3 3 VC(m)=[,]

Vector time and happens before If A B, then VC(A)<VC(B) Write a chain of events from A to B Step by step the vector clocks get larger But also VC(A)<VC(B) then A B Two cases (formally by induction) If A and B both happen at same process p all events seen by p increments vector clocks If A happens at p and B at q, can trace the path back by which q learned VT(A)[p] since q only updates VT(A)[p] based on message receipt from, say, q If q <> p trace further back (Otherwise A and B happened concurrently) Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, e, (c) 7 Prentice-Hall, Inc. All rights reserved. -13-397-5

What can we use this for? May want different kinds of guarantees of when multicast messages are delivered at receivers None Delivery in arbitrary order (in practice FIFO) E.g., many reads information FIFO Delivery in order of sending (with respect to one process) E.g., only one process updates data but many reads Causal Delivery with respect to happens-before E.g., efficient distributed locking Total Delivery with respect to a total/global order E.g., for replicated updates I.e., ordered multi-casting Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, e, (c) 7 Prentice-Hall, Inc. All rights reserved. -13-397-5

What can we use this for? May want different kinds of guarantees of when multicast messages are delivered at receivers None Delivery in arbitrary order (in practice FIFO) E.g., many reads information FIFO Delivery in order of sending (with respect to one process) E.g., only one process updates data but many reads We will, for now, assume that There are no failures We have stable process groups Causal Delivery with respect to happens-before E.g., efficient distributed locking Total Delivery with respect to a total/global order E.g., for replicated updates I.e., ordered multi-casting Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, e, (c) 7 Prentice-Hall, Inc. All rights reserved. -13-397-5

Example: Totally Ordered Multicasting Update 1 Add 1$ to account Update Add 1% interest to account Update 1; Update (1$ + 1$)*1,1 = 1111$ Update ; Update 1 1$*1,1 + 1$ = 111$ Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, e, (c) 7 Prentice-Hall, Inc. All rights reserved. -13-397-5

Totally Ordered Multicasting Implementation Use a sequencer Get a number and use that as sequence number for multicast Receiving processes deliver in that order Distributed Each process attaches a Lamport clock to message When received, acknowledge to everybody Order messages in queue according to sequence number (and process number) Deliver message at front when it has been acknowledged by all processes Queues become identical Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, e, (c) 7 Prentice-Hall, Inc. All rights reserved. -13-397-5

Enforcing Causal Communication Want to ensure that is snd(m) snd(m*) then del(m) del(m*) Use vector timestamps Only increment when sending and receiving Adjust when delivering Deliver a message m from i when m is the next message expected from i i.e, ts(m)[i] = VC j [i] + 1 All messages that i has seen from other processes have been seen by us i.e., ts(m)[k] <= VC j [k], k!= j Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, e, (c) 7 Prentice-Hall, Inc. All rights reserved. -13-397-5

Cost Orderings give increased guarantees and implementation cost None < FIFO < causal < total None/FIFO ordering is easy to implement Except for failures, transport layers take care of this May be used efficiently Causal ordering harder to implement efficiently But can be implemented very efficiently May be used efficiently Do not need to wait for own massages Can deliver concurrent messages immediately Total ordering is hard to implement efficiently Sometimes needed Also hard to use efficiently Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, e, (c) 7 Prentice-Hall, Inc. All rights reserved. -13-397-5

Summary We cannot achieve global, synchronized time in a distributed system Physical time Can synchronize (with high probability) within a constant Logical time Advance time only as consequence of logical middleware events Can be used to implement, e.g., ordered multicast Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, e, (c) 7 Prentice-Hall, Inc. All rights reserved. -13-397-5