Time Synchronization and Logical Clocks

Similar documents
Time Synchronization and Logical Clocks

Time. COS 418: Distributed Systems Lecture 3. Wyatt Lloyd

TIME ATTRIBUTION 11/4/2018. George Porter Nov 6 and 8, 2018

CSE 124: LAMPORT AND VECTOR CLOCKS. George Porter October 30, 2017

CSE 124: TIME SYNCHRONIZATION, CRISTIAN S ALGORITHM, BERKELEY ALGORITHM, NTP. George Porter October 27, 2017

Synchronization Part 1. REK s adaptation of Claypool s adaptation of Tanenbaum s Distributed Systems Chapter 5

Last Class: Naming. Today: Classical Problems in Distributed Systems. Naming. Time ordering and clock synchronization (today)

Clock-Synchronisation

Distributed Coordination 1/39

Distributed Systems COMP 212. Lecture 17 Othon Michail

Distributed Systems. 05. Clock Synchronization. Paul Krzyzanowski. Rutgers University. Fall 2017

Time in Distributed Systems

Distributed Systems. Clock Synchronization: Physical Clocks. Paul Krzyzanowski

CSE 5306 Distributed Systems

Clock and Time. THOAI NAM Faculty of Information Technology HCMC University of Technology

Lecture 10: Clocks and Time

殷亚凤. Synchronization. Distributed Systems [6]

Advanced Topics in Distributed Systems. Dr. Ayman A. Abdel-Hamid. Computer Science Department Virginia Tech

Lecture 12: Time Distributed Systems

Verteilte Systeme (Distributed Systems)

Synchronization. Clock Synchronization

CS555: Distributed Systems [Fall 2017] Dept. Of Computer Science, Colorado State University

Chapter 6 Synchronization (1)

CSE 5306 Distributed Systems. Synchronization

Time in Distributed Systems

EECS 498 Introduction to Distributed Systems

Synchronisation in. Distributed Systems. Co-operation and Co-ordination in. Distributed Systems. Kinds of Synchronisation. Clock Synchronization

Last Class: Naming. DNS Implementation

SYNCHRONIZATION. DISTRIBUTED SYSTEMS Principles and Paradigms. Second Edition. Chapter 6 ANDREW S. TANENBAUM MAARTEN VAN STEEN

Lecture 7: Logical Time

Synchronization. Distributed Systems IT332

Chapter 6 Synchronization

2. Time and Global States Page 1. University of Freiburg, Germany Department of Computer Science. Distributed Systems


Arranging lunch value of preserving the causal order. a: how about lunch? meet at 12? a: <receives b then c>: which is ok?

Synchronization. Chapter 5

ICT 6544 Distributed Systems Lecture 7

Middleware and Distributed Systems. Time. Martin v. Löwis. Mittwoch, 22. Februar 12

DISTRIBUTED REAL-TIME SYSTEMS

Coordination 2. Today. How can processes agree on an action or a value? l Group communication l Basic, reliable and l ordered multicast

Clock Synchronization. Synchronization. Clock Synchronization Algorithms. Physical Clock Synchronization. Tanenbaum Chapter 6 plus additional papers

2017 Paul Krzyzanowski 1

Implementing a NTP-Based Time Service within a Distributed Middleware System

Distributed Systems. Time, clocks, and Ordering of events. Rik Sarkar. University of Edinburgh Fall 2014

Distributed Systems Exam 1 Review Paul Krzyzanowski. Rutgers University. Fall 2016

Causal Consistency and Two-Phase Commit

Multi-threaded programming in Java

INTERNATIONAL JOURNAL OF ENGINEERING SCIENCES & RESEARCH TECHNOLOGY

Frequently asked questions from the previous class survey

CS455: Introduction to Distributed Systems [Spring 2018] Dept. Of Computer Science, Colorado State University

Proseminar Distributed Systems Summer Semester Paxos algorithm. Stefan Resmerita

Distributed Operating Systems. Distributed Synchronization

Distributed Systems Exam 1 Review. Paul Krzyzanowski. Rutgers University. Fall 2016

Primary-Backup Replication

Lecture 6: Logical Time

Introduction to Distributed Systems

DISTRIBUTED MUTEX. EE324 Lecture 11

Wireless Communication Bluetooth, Timing

Availability versus consistency. Eventual Consistency: Bayou. Eventual consistency. Bayou: A Weakly Connected Replicated Storage System

TIME AND SYNCHRONIZATION. I. Physical Clock Synchronization: Motivation and Challenges

Distributed Systems. Pre-Exam 1 Review. Paul Krzyzanowski. Rutgers University. Fall 2015

Distributed Systems Fault Tolerance

SETTING UP AN NTP SERVER AT THE ROYAL OBSERVATORY OF BELGIUM

Synchronization of Network Devices Time by GPS Based Network Time Protocol Output

Data Replication CS 188 Distributed Systems February 3, 2015

Distributed Systems Principles and Paradigms

Implementing NTP. Release 3.8.0

Implementing NTP. Support was added for IPv6 addresses, VRFs, multicast-based associations, and burst and iburst modes for poll-based associations.

wait with priority An enhanced version of the wait operation accepts an optional priority argument:

TDDD82 Secure Mobile Systems Lecture 1: Introduction and Distributed Systems Models

Table of Contents 1 NTP Configuration 1-1

The ISP Column A monthly column on things Internet. Protocol Basics: The Network Time Protocol. NTP, Time, and Timekeeping. March 2014 Geoff Huston

Operation Manual NTP. Table of Contents

Distributed Systems Principles and Paradigms. Chapter 08: Fault Tolerance

Implementing NTPv4 in IPv6

Process groups and message ordering

Distributed Information Processing

CS6450: Distributed Systems Lecture 13. Ryan Stutsman

Last Class: Clock Synchronization. Today: More Canonical Problems

Distributed Systems Synchronization

Chapter 8 Fault Tolerance

Lecture 21 Disk Devices and Timers

Dynamo: Key-Value Cloud Storage

Three Models. 1. Time Order 2. Distributed Algorithms 3. Nature of Distributed Systems1. DEPT. OF Comp Sc. and Engg., IIT Delhi

OUTLINE. Introduction Clock synchronization Logical clocks Global state Mutual exclusion Election algorithms Deadlocks in distributed systems

Time. Computer Science and Engineering College of Engineering The Ohio State University. Lecture 37

Network Time Protocol

Horizontal or vertical scalability? Horizontal scaling is challenging. Today. Scaling Out Key-Value Storage

Consistency in Distributed Systems

1/10/16. RPC and Clocks. Tom Anderson. Last Time. Synchroniza>on RPC. Lab 1 RPC

Distributed Systems Principles and Paradigms

SETTING UP AN NTP SERVER AT THE ROYAL OBSERVATORY OF BELGIUM

Midterm Examination ECE 419S 2015: Distributed Systems Date: March 13th, 2015, 6-8 p.m.

Experiment on Cristian s and Berkeley Time Synchronization Algorithms

Primary/Backup. CS6450: Distributed Systems Lecture 3/4. Ryan Stutsman

Distributed Systems. 09. State Machine Replication & Virtual Synchrony. Paul Krzyzanowski. Rutgers University. Fall Paul Krzyzanowski

What you need to know about GNSS leap seconds

System Models 2. Lecture - System Models 2 1. Areas for Discussion. Introduction. Introduction. System Models. The Modelling Process - General

Event Ordering. Greg Bilodeau CS 5204 November 3, 2009

SEGR 550 Distributed Computing. Final Exam, Fall 2011

Transcription:

Time Synchronization and Logical Clocks CS 240: Computing Systems and Concurrency Lecture 5 Mootaz Elnozahy

Today 1. The need for time synchronization 2. Wall clock time synchronization 3. Logical Time 2

A distributed edit-compile workflow Physical time à 2143 < 2144 è make doesn t call compiler Lack of time synchronization result a possible object file mismatch 3

What makes time synchronization hard? 1. Quartz oscillator sensitive to temperature, age, vibration, radiation Accuracy ca. one part per million (one second of clock drift over 12 days) 2. The internet is: Asynchronous: arbitrary message delays Best-effort: messages don t always arrive 4

Today 1. The need for time synchronization 2. Wall clock time synchronization Cristian s algorithm, Berkeley algorithm, NTP 3. Logical Time Lamport clocks Vector clocks 5

Just use Coordinated Universal Time? UTC is broadcast from radio stations on land and satellite (e.g., the Global Positioning System) Computers with receivers can synchronize their clocks with these timing signals Signals from land-based stations are accurate to about 0.1 10 milliseconds Signals from GPS are accurate to about one microsecond Why can t we put GPS receivers on all our computers? 6

Synchronization to a time server Suppose a server with an accurate clock (e.g., GPSdisciplined crystal oscillator) Could simply issue an RPC to obtain the time: Client Server Time But this doesn t account for network latency Message delays will have outdated server s answer 7

Cristian s algorithm: Outline 1. Client sends a request packet, timestamped with its local clock T 1 2. Server timestamps its receipt of the request T 2 with its local clock 3. Server sends a response packet with its local clock T 3 and T 2 4. Client locally timestamps its receipt of the server s response T 4 Client T 1 T 4 Server T 2 T 3 How the client can use these timestamps to synchronize its local clock to the server s local clock? Time 8

Cristian s algorithm: Offset sample calculation Client Server Goal: Client sets clock ß T 3 + δ resp Client samples round trip time δ = δ req + δ resp = (T 4 T 1 ) (T 3 T 2 ) T 1 δ req T 2 But client knows δ, not δ resp Assume: δ req δ resp δ resp T 4 T 3 Client sets clock ß T 3 + ½δ Time 9

Today 1. The need for time synchronization 2. Wall clock time synchronization Cristian s algorithm, Berkeley algorithm, NTP 3. Logical Time Lamport clocks Vector clocks 10

Berkeley algorithm A single time server can fail, blocking timekeeping The Berkeley algorithm is a distributed algorithm for timekeeping Assumes all machines have equally-accurate local clocks Obtains average from participating computers and synchronizes clocks to that average 11

Berkeley algorithm Master machine: polls L other machines using Cristian s algorithm à { θ i } (i = 1 L) Master 12

Today 1. The need for time synchronization 2. Wall clock time synchronization Cristian s algorithm, Berkeley algorithm, NTP 3. Logical Time Lamport clocks Vector clocks 13

The Network Time Protocol (NTP) Enables clients to be accurately synchronized to UTC despite message delays Provides reliable service Survives lengthy losses of connectivity Communicates over redundant network paths Provides an accurate service Unlike the Berkeley algorithm, leverages heterogeneous accuracy in clocks 14

NTP: System structure Servers and time sources are arranged in layers (strata) Stratum 0: High-precision time sources themselves e.g., atomic clocks, shortwave radio time receivers Stratum 1: NTP servers directly connected to Stratum 0 Stratum 2: NTP servers that synchronize with Stratum 1 Stratum 2 servers are clients of Stratum 1 servers Stratum 3: NTP servers that synchronize with Stratum 2 Stratum 3 servers are clients of Stratum 2 servers Users computers synchronize with Stratum 3 servers 15

NTP operation: Server selection Messages between an NTP client and server are exchanged in pairs: request and response Use Cristian s algorithm For i th message exchange with a particular server, calculate: 1. Clock offset θ i from client to server 2. Round trip time δ i between client and server Over last eight exchanges with server k, the client computes its dispersion σ k = max i δ i min i δ i Client uses the server with minimum dispersion 16

NTP operation : Clock offset calculation Client tracks minimum round trip time and associated offset over the last eight message exchanges (δ 0, θ 0 ) θ 0 is the best estimate of offset: client adjusts its clock by θ 0 to synchronize to server Offset θ θ 0 Each point represents one sample δ 0 Round trip time δ 17

NTP operation: How to change time Can t just change time: Don t want time to run backwards Recall the make example Instead, change the update rate for the clock Changes time in a more gradual fashion Prevents inconsistent local timestamps 18

Clock synchronization: Take-away points Clocks on different systems will always behave differently Disagreement between machines can result in undesirable behavior NTP, Berkeley clock synchronization Rely on timestamps to estimate network delays 100s μs ms accuracy Clocks never exactly synchronized Often inadequate for distributed systems Often need to reason about the order of events Might need precision on the order of ns 19

Today 1. The need for time synchronization 2. Wall clock time synchronization Cristian s algorithm, Berkeley algorithm, NTP 3. Logical Time Lamport clocks Vector clocks 20

Motivation: Multi-site database replication A New York-based bank wants to make its transaction ledger database resilient to whole-site failures Replicate the database, keep one copy in sf, one in nyc San Francisco New York 21

The consequences of concurrent updates Replicate the database, keep one copy in sf, one in nyc Client sends query to the nearest copy Client sends update to both copies Deposit $100 $1,000 $1,100 $1,111 Inconsistent replicas! Updates should have been performed in the same order at each copy Pay 1% interest $1,000 $1,010 $1,110 22

Idea: Logical clocks Landmark 1978 paper by Leslie Lamport Insight: only the events themselves matter Idea: Disregard the precise clock time Instead, capture just a happens before relationship between a pair of events 23

Defining happens-before Consider three processes: P1, P2, and P3 Notation: Event a happens before event b (a à b) P1 P2 P3 Physical time 24

Defining happens-before 1. Can observe event order at a single process a P1 P2 P3 b Physical time 25

Defining happens-before 1. If same process and a occurs before b, then a à b a P1 P2 P3 b Physical time 26

Defining happens-before 1. If same process and a occurs before b, then a à b 2. Can observe ordering when processes communicate a P1 P2 P3 b c Physical time 27

Defining happens-before 1. If same process and a occurs before b, then a à b 2. If c is a message receipt of b, then b à c a P1 P2 P3 b c Physical time 28

Defining happens-before 1. If same process and a occurs before b, then a à b 2. If c is a message receipt of b, then b à c 3. Can observe ordering transitively a P1 P2 P3 b c Physical time 29

Defining happens-before 1. If same process and a occurs before b, then a à b 2. If c is a message receipt of b, then b à c 3. If a à b and b à c, then a à c a P1 P2 P3 b c Physical time 30

Concurrent events Not all events are related by à a, d not related by à so concurrent, written as a d a b P1 P2 c d P3 Physical time 31

Lamport clocks: Objective We seek a clock time C(a) for every event a Plan: Tag events with clock times; use clock times to make distributed system correct Clock condition: If a à b, then C(a) < C(b) 32

The Lamport Clock algorithm Each process P i maintains a local clock C i 1. Before executing an event, C i ß C i + 1 a P1 C 1 =0 P2 C 2 =0 P3 C 3 =0 b c Physical time 33

The Lamport Clock algorithm 1. Before executing an event a, C i ß C i + 1: Set event time C(a) ß C i a P1 C 1 =1 C(a) = 1 P2 C 2 =1 P3 C 3 =1 b c Physical time 34

The Lamport Clock algorithm 1. Before executing an event b, C i ß C i + 1: Set event time C(b) ß C i a b P1 C 1 =2 C(a) = 1 C(b) = 2 P2 C 2 =1 c P3 C 3 =1 Physical time 35

The Lamport Clock algorithm 1. Before executing an event b, C i ß C i + 1 2. Send the local clock in the message m a b P1 C 1 =2 C(a) = 1 C(b) = 2 C(m) = 2 P2 C 2 =1 c P3 C 3 =1 Physical time 36

The Lamport Clock algorithm 3. On process P j receiving a message m: Set C j and receive event time C(c) ß1 + max{ C j, C(m) } a b P1 C 1 =2 C(a) = 1 C(b) = 2 C(m) = 2 P2 C 2 =3 C(c) = 3 c P3 C 3 =1 Physical time 37

Ordering all events Break ties by appending the process number to each event: 1. Process P i timestamps event e with C i (e).i 2. C(a).i < C(b).j when: C(a) < C(b), or C(a) = C(b) and i < j Now, for any two events a and b, C(a) < C(b) or C(b) < C(a) This is called a total ordering of events 38

Making concurrent updates consistent P1 P2 Recall multi-site database replication: San Francisco (P1) deposited $100: New York (P2) paid 1% interest: % $ We reached an inconsistent state Could we design a system that uses Lamport Clock total order to make multi-site updates consistent? 39

Totally-Ordered Multicast Client sends update to one replica à Lamport timestamp C(x) Key idea: Place events into a local queue Sorted by increasing C(x) P1 s local queue: $ 1.1 % 1.2 P2 s local queue: % 1.2 P1 P2 Goal: All sites apply the updates in (the same) Lamport clock order 40

Totally-Ordered Multicast (Almost correct) 1. On receiving an event from client, broadcast to others (including yourself) 2. On receiving an event from replica: a) Add it to your local queue b) Broadcast an acknowledgement message to every process (including yourself) 3. On receiving an acknowledgement: Mark corresponding event acknowledged in your queue 4. Remove and process events everyone has ack ed from head of queue 41

Totally-Ordered Multicast (Almost correct) P1 queues $, P2 queues % P1 queues and ack s % P1 marks % fully ack ed P2 marks % fully ack ed $ P1 1.1 % 1.2 $ 1.1 1.1 % 1.2 $ % 1.2 P2 P2 processes % ack % % (Ack s to self not shown here) 42

Totally-Ordered Multicast (Correct version) 1. On receiving an event from client, broadcast to others (including yourself) 2. On receiving or processing an event: a) Add it to your local queue b) Broadcast an acknowledgement message to every process (including yourself) only from head of queue 3. When you receive an acknowledgement: Mark corresponding event acknowledged in your queue 4. Remove and process events everyone has ack ed from head of queue 43

Totally-Ordered Multicast (Correct version) $ 1.1 % 1.2 1.1 % $ 1.2 P1 $ 1.1 % 1.2 P2 ack $ $ % ack % (Ack s to self not shown here) $ % 44

So, are we done? Does totally-ordered multicast solve the problem of multi-site replication in general? Not by a long shot! 1. Our protocol assumed: No node failures No message loss No message corruption 2. All to all communication does not scale 3. Waits forever for message delays (performance?) 45

Take-away points: Lamport clocks Can totally-order events in a distributed system: that s useful! But: while by construction, a à b implies C(a) < C(b), The converse is not necessarily true: C(a) < C(b) does not imply a à b (possibly, a b) Can t use Lamport clock timestamps to infer causal relationships between events 46

Today 1. The need for time synchronization 2. Wall clock time synchronization Cristian s algorithm, Berkeley algorithm, NTP 3. Logical Time Lamport clocks Vector clocks 47

Vector clock (VC) Label each event e with a vector V(e) = [c 1, c 2, c n ] c i is a count of events in process i that causally precede e Initially, all vectors are [0, 0,, 0] Two update rules: 1. For each local event on process i, increment local entry c i 2. If process j receives message with vector [d 1, d 2,, d n ]: Set each local entry c k = max{c k, d k } Increment local entry c j 48

Vector clock: Example All counters start at [0, 0, 0] P1 P2 P3 Applying local update rule Applying message rule Local vector clock piggybacks on interprocess messages a b [1,0,0] [2,0,0] c d e [2,1,0] [2,2,0] f [0,0,1] [2,2,2] Physical time 49

Vector clocks can establish causality Rule for comparing vector clocks: V(a) = V(b) when a k = b k for all k V(a) < V(b) when a k b k for all k and V(a) V(b) Concurrency: a b if a i < b i and a j > b j, some i, j V(a) < V(z) when there is a chain of events linked by à between a and z a b [1,0,0] [2,0,0] c [2,1,0] z [2,2,0] 50

Two events a, z Lamport clocks: C(a) < C(z) Conclusion: None Vector clocks: V(a) < V(z) Conclusion: a à à z Vector clock timestamps tell us about causal event relationships 51

VC application: Causally-ordered bulletin board system Distributed bulletin board application Each post à multicast of the post to all other users Want: No user to see a reply before the corresponding original message post Deliver message only after all messages that causally precede it have been delivered Otherwise, the user would see a reply to a message they could not find 52

VC application: Causally-ordered bulletin board system P 0 Original post VC 0 = (1,0,0) VC 0 = (1,1,0) m P 1 1 s reply VC 1 = (1,1,0) P 2 VC 2 = (0,0,0) VC 2 = (1,0,0) m* VC = (1,1,0) 2 Physical time à User 0 posts, user 1 replies to 0 s post; user 2 observes 53

Wednesday Topic: Lab 1 Virtualization, sockets, RPCs 54

Why global timing? Suppose there were an infinitely-precise and globally consistent time standard That would be very handy. For example: 1. Who got last seat on airplane? 2. Mobile cloud gaming: Which was first, A shoots B or vice-versa? 3. Does this file need to be recompiled? 55

Totally-Ordered Multicast (Attempt #1) P1 queues $, P2 queues % P1 queues and ack s % P1 marks % fully ack ed P2 marks % fully ack ed P2 processes % $ P1 1.1 % 1.2 $ 1.1 1.1 % 1.2 $ % 1.2 ack % P2 % P2 queues and ack s $ P2 processes $ P1 marks $ fully ack ed P1 processes $, then % $ % ack $ Note: ack s to self not shown here $ 56

Totally-Ordered Multicast P1 queues $, P2 queues % P1 queues % P2 queues and ack s $ P2 marks $ fully ack ed P2 processes $ $ P1 (Correct version) 1.1 % 1.2 $ 1.1 % 1.2 1.1 % $ 1.2 P2 P1 marks $ fully ack ed P1 processes $ P1 ack s % ack $ P1 marks % fully ack ed P1 processes % P2 marks % fully ack ed P2 processes % $ % ack % (Ack s to self not shown here) $ % 57

Time standards Universal Time (UT1) In concept, based on astronomical observation of the sun at 0º longitude Known as Greenwich Mean Time International Atomic Time (TAI) Beginning of TAI is midnight on January 1, 1958 Each second is 9,192,631,770 cycles of radiation emitted by a Cesium atom Has diverged from UT1 due to slowing of earth s rotation Coordinated Universal Time (UTC) TAI + leap seconds, to be within 0.9 seconds of UT1 Currently TAI UTC = 36 58

VC application: Order processing Suppose we are running a distributed order processing system Each process = a different user Each event = an order A user has seen all orders with V(order) < the user s current vector 59