CSE 5306 Distributed Systems. Synchronization

Similar documents
CSE 5306 Distributed Systems

Chapter 6 Synchronization

Advanced Topics in Distributed Systems. Dr. Ayman A. Abdel-Hamid. Computer Science Department Virginia Tech

Clock Synchronization. Synchronization. Clock Synchronization Algorithms. Physical Clock Synchronization. Tanenbaum Chapter 6 plus additional papers

SYNCHRONIZATION. DISTRIBUTED SYSTEMS Principles and Paradigms. Second Edition. Chapter 6 ANDREW S. TANENBAUM MAARTEN VAN STEEN

Distributed Synchronization. EECS 591 Farnam Jahanian University of Michigan

殷亚凤. Synchronization. Distributed Systems [6]

Distributed Coordination 1/39

Synchronization. Clock Synchronization

Synchronization Part 1. REK s adaptation of Claypool s adaptation of Tanenbaum s Distributed Systems Chapter 5

ICT 6544 Distributed Systems Lecture 7

Synchronization. Chapter 5

Verteilte Systeme (Distributed Systems)

Last Class: Clock Synchronization. Today: More Canonical Problems

Synchronization. Distributed Systems IT332

Last Class: Naming. Today: Classical Problems in Distributed Systems. Naming. Time ordering and clock synchronization (today)

Time in Distributed Systems

Distributed Systems. coordination Johan Montelius ID2201. Distributed Systems ID2201

Mutual Exclusion. A Centralized Algorithm

PROCESS SYNCHRONIZATION

Distributed Systems COMP 212. Lecture 17 Othon Michail

Chapter 6 Synchronization (1)

Last Class: Clock Synchronization. Today: More Canonical Problems

Clock-Synchronisation

Synchronization Part 2. REK s adaptation of Claypool s adaptation oftanenbaum s Distributed Systems Chapter 5 and Silberschatz Chapter 17

CMPSCI 677 Operating Systems Spring Lecture 14: March 9

Clock and Time. THOAI NAM Faculty of Information Technology HCMC University of Technology

Distributed Operating Systems. Distributed Synchronization

Coordination 1. To do. Mutual exclusion Election algorithms Next time: Global state. q q q

Synchronisation in. Distributed Systems. Co-operation and Co-ordination in. Distributed Systems. Kinds of Synchronisation. Clock Synchronization

Process Synchroniztion Mutual Exclusion & Election Algorithms

Last Class: Naming. DNS Implementation

Chapter 6 Synchronization (2)

DISTRIBUTED MUTEX. EE324 Lecture 11

Synchronization Part II. CS403/534 Distributed Systems Erkay Savas Sabanci University

CS 455: INTRODUCTION TO DISTRIBUTED SYSTEMS [DISTRIBUTED MUTUAL EXCLUSION] Frequently asked questions from the previous class survey

Lecture 10: Clocks and Time

Frequently asked questions from the previous class survey

Distributed Systems Exam 1 Review. Paul Krzyzanowski. Rutgers University. Fall 2016

CS455: Introduction to Distributed Systems [Spring 2018] Dept. Of Computer Science, Colorado State University

CS455: Introduction to Distributed Systems [Spring 2018] Dept. Of Computer Science, Colorado State University

Distributed Systems. Clock Synchronization: Physical Clocks. Paul Krzyzanowski

Time Synchronization and Logical Clocks

CSE 380 Computer Operating Systems

Distributed Systems Synchronization

CSE 486/586 Distributed Systems

CS555: Distributed Systems [Fall 2017] Dept. Of Computer Science, Colorado State University

Lecture 12: Time Distributed Systems

OUTLINE. Introduction Clock synchronization Logical clocks Global state Mutual exclusion Election algorithms Deadlocks in distributed systems

Distributed Systems. 05. Clock Synchronization. Paul Krzyzanowski. Rutgers University. Fall 2017

Distributed Systems Principles and Paradigms

Distributed Systems COMP 212. Revision 2 Othon Michail


TIME ATTRIBUTION 11/4/2018. George Porter Nov 6 and 8, 2018

Exam 2 Review. Fall 2011

Coordination 2. Today. How can processes agree on an action or a value? l Group communication l Basic, reliable and l ordered multicast

Coordination and Agreement

Several of these problems are motivated by trying to use solutiions used in `centralized computing to distributed computing

Verteilte Systeme/Distributed Systems Ch. 5: Various distributed algorithms

Distributed Systems Exam 1 Review Paul Krzyzanowski. Rutgers University. Fall 2016

Distributed Systems (5DV147)

2. Time and Global States Page 1. University of Freiburg, Germany Department of Computer Science. Distributed Systems

Selected Questions. Exam 2 Fall 2006

An Efficient Approach of Election Algorithm in Distributed Systems

DISTRIBUTED REAL-TIME SYSTEMS

Silberschatz and Galvin Chapter 18

Event Ordering Silberschatz, Galvin and Gagne. Operating System Concepts

Homework #2 Nathan Balon CIS 578 October 31, 2004

Lecture 2: Leader election algorithms.

Chapter 16: Distributed Synchronization

Distributed Systems COMP 212. Lecture 19 Othon Michail

Distributed systems. Lecture 6: distributed transactions, elections, consensus and replication. Malte Schwarzkopf

Failure Tolerance. Distributed Systems Santa Clara University

CS /15/16. Paul Krzyzanowski 1. Question 1. Distributed Systems 2016 Exam 2 Review. Question 3. Question 2. Question 5.

Mutual Exclusion in DS

Chapter 18: Distributed

Algorithms and protocols for distributed systems

Distributed Systems. Pre-Exam 1 Review. Paul Krzyzanowski. Rutgers University. Fall 2015

Arranging lunch value of preserving the causal order. a: how about lunch? meet at 12? a: <receives b then c>: which is ok?

CSE 124: LAMPORT AND VECTOR CLOCKS. George Porter October 30, 2017

Time in Distributed Systems

Lecture 7: Logical Time

CS 417: the one-hour study guide for exam 2

Synchronization (contd.)

Multi-threaded programming in Java

wait with priority An enhanced version of the wait operation accepts an optional priority argument:

Important Lessons. A Distributed Algorithm (2) Today's Lecture - Replication

Distributed Systems Coordination and Agreement

CS Final Review

Time Synchronization and Logical Clocks

Distributed Systems 8L for Part IB

Distributed Mutual Exclusion

Distributed Coordination! Distr. Systems: Fundamental Characteristics!

Basic Protocols and Error Control Mechanisms

11/7/2018. Event Ordering. Module 18: Distributed Coordination. Distributed Mutual Exclusion (DME) Implementation of. DME: Centralized Approach

Last Time. 19: Distributed Coordination. Distributed Coordination. Recall. Event Ordering. Happens-before

Introduction to Distributed Systems Seif Haridi

Dep. Systems Requirements

Leader Election Algorithms in Distributed Systems

Proseminar Distributed Systems Summer Semester Paxos algorithm. Stefan Resmerita

Transcription:

CSE 5306 Distributed Systems Synchronization 1

Synchronization An important issue in distributed system is how processes cooperate and synchronize with one another Cooperation is partially supported by naming, which allow them to share resources Example of synchronization Access to shared resources Agreement on the ordering of events Will discuss Synchronization based on actual time Synchronization based on relative orders Algorithms for election of synchronization coordinators 2

Clock Synchronization When each machine has its own clock, an event that occurred after another event may nevertheless be assigned an earlier time 3

Physical Clock All computers have a circuit to keep track of time using a quartz crystal However, quartz crystals at different computers often run at a slightly different speed This leads to the so called clock skew between different machines Some systems (e.g., real-time systems) need external physical clock Solar day: interval between two consecutive noons Solar day varies due to many reasons International atomic time (TAI): transitions of cesium 133 atom Cannot be directly used as everyday clock. TAI second < Solar second Solution: leap second whenever the difference is 800msec -> UTC 4

Global Positioning System Used to locate a physical point on earth Need at least 3 satellites to measure: Longitude, Latitude, and Altitude (height) Example: computing a position in a 2D space 5

GPS Challenges Clock skew complicates the GPS localization The receiver s clock is generally not well synchronized with that of a satellite E.g. 1 sec of clock offset could lead to 300,000 kilometers error in distance estimation Other source of errors The position of satellite is not known precisely The receivers clock has a finite accuracy The signal propagation speed is not constant Earth is not a perfect sphere - need further correction 6

Clock Synchronization Algorithms The goal of synchronization is to keep all machines synchronized to an external reference clock, or just keep all machines together as well as possible The relation between clock time and UTC when clocks tick at different rates 7

Network Time Protocol (NTP) Pairwise clock synchronization e.g., a client synchronize its clock with a server What is the clock offset? 8

The Berkeley Algorithm Goal: just keep all machine together Steps The time daemon tell all machines its time Other machines answer how far ahead or behind The time daemon computes the average and tell other how to adjust 9

Clock Sync. in Wireless Networks In traditional distributed systems, we can deploy many time servers that can easily contact each other for efficient information dissemination However, in wireless networks, communication becomes expensive and unreliable RBS (Reference Broadcast Synchronization) is a clock synchronization protocol where a sender broadcasts a reference message that will allow its receivers to adjust their clocks 10

Reference Broadcast Synchronization To estimate the mutual, relative clock offset, two nodes exchange the time when they receive the same broadcast the difference is the offset in one broadcast The average of M offsets is then used as the result However, offset increases over time due to clock skew (how to overcome?) 11

Logical Clocks In many applications, what matters is not the real time It is the order of the events For the algorithms that synchronize the order of events, the clocks are often refereed to as logical clocks Example: Lamport s logical clock, which defines the happen-before relation If a and b are events in the same process, and a occurs before b, then a b is true. If a is the event of a message being sent by one process, and b is the event of the message being received by another process, then a b 12

Lamport s Logical Clocks Three processes, each with its own clock. The clocks run at different rates. Lamport s algorithm corrects the clock 13

Lamport s Algorithm Each process P i maintains a counter C i 1. Before executing an event P i executes C i C i + 1. 2. When process P i sends a message m to P j, it sets m s timestamp ts (m) equal to C i after having executed the previous step. 3. Upon the receipt of a message m, process P j adjusts its own local counter as C j max{c j, ts (m)}, after which it then executes the first step and delivers the message to the application. 14

Application of Lamport s Algorithm Updating a replicated database and leaving it in an inconsistent state. 15

Totally Ordered Multicasting Apply Lamport s algorithm Every message is timestamped and the local counter is adjusted according to every message Each update triggers a multicast to all servers Each server multicasts an acknowledgement for every received update request Pass the message to the application only when the message is at the head of the queue all acknowledgements of this message has been received The above steps guarantees that the messages are in the same order at every server, assuming message transmission is reliable 16

Problem with Lamport s Algorithm Lamport s algorithm guarantees that if event a happened before event b, then we have C(a)<C(b) However, this does not mean that C(a)<C(b) implies event a happened before event b 17

Vector Clocks (1) Vector clocks are constructed by letting each process P i maintain a vector VC i with the following two properties: 1. VC i [ i ] is the number of events that have occurred so far at P i. VC i [ i ] is the local logical clock at process P i. 2. If VC i [ j ] = k then P i knows that k events have occurred at P j. It is thus P i s knowledge of the local time at P j. 18

Vector Clocks (2) Steps carried out to accomplish the property 2 of previous slide Before executing an event P i executes VC i [i] VC i [i]+1 When process P i sends a message m to P j, it sets m s (vector) timestamp ts(m) equal to VC i after having executed the previous step. Upon the receipt of a message m, process P j adjusts its own vector by setting VC j [k] max{vc j [k], ts(m[k]} for each k, after which it executes the first step and delivers the message to the application. 19

Mutual Exclusion Concurrent access may corrupt the resource or make it inconsistent Token-based approach for mutual exclusion only 1 token is passed around in the system Process can only access when it has the token Easy to avoid starvation and deadlock However, situation becomes complicated if token is lost Permission-based approach for mutual exclusion A process has to get permission before accessing a resource Grant permission to only one process at any time 20

A Centralized Algorithm Three steps: (a) Process 1 asks the coordinator for resource. Permission is granted (b) Process 2 asks the coordinator for resource. The coordinator does not reply (c) When process 1 releases the resource, it notifies the coordinator. The coordinator then grants permission to process 2 Easy to implement, but has the single point of failure problem 21

A Distributed Algorithm When a process wants to access a shared resource, it builds a messages containing: Name of the resource, Its process number, and the current time Then, sends the message to all other processes, even to itself Three different cases If the receiver (of a message) is not accessing the resource and does not want to access it, it sends back an OK message to the sender. If the receiver already has access to the resource, it simply does not reply. Instead, it queues the request. If the receiver wants to access the resource as well but has not yet done so, it compares the timestamp of the incoming message with the one contained in the message that it has sent everyone. The lowest one wins. When a process receives OK from all other processes, it starts access 22

An Example (a) Two processes want to access a shared resource at the same moment (b) Process 0 has the lowest timestamp, so it wins. (c) When process 0 is done, it sends an OK also, so 2 can now go ahead. 23

Problems in the Distributed Algorithm Any process fails, the algorithm fail worse than the centralized algorithm Each process has to maintain a list of all other processes process addition, leaving, and crashing Every process needs to do the same amount of work as the coordinator in the centralized algorithm Improvements A majority voting, e.g., as long as you get more than half votes, you can access the resource The algorithm is still slow, expensive, and not robust Distributed algorithms are not always the best option 24

A Token Ring Algorithm When a ring is initialized, process 0 is given a token. Token is passed from k to k+1 (modulo the ring size) in a pointto-point message. Ordering is logical, usually based on the process number or other means When a process acquires the token from its neighbor, it checks to see if needs to access the shared resource. If yes, go ahead with the resource, and then release the resource and pass the token when it finishes If not, pass the token immediately to the next one Each process only needs to know who is next inline Problem: if the token is lost, it is hard to detect 25

Election Algorithms The main objective select one process as the coordinator to perform some special roles such as synchronization Examples of election algorithms The bully algorithm the ring algorithm Election algorithms for wireless networks 26

The Bully Algorithm Assume every process knows the process numbers of all other processes But do not know which ones are dead or alive When any process, P, notices the coordinator is no longer responding to requests, it initiates an election. P sends an ELECTION message to all processes with higher numbers. If no one responds, process P wins the election and becomes the coordinator. If one of the processes with higher numbers answers, it takes over. Process P s job is done. 27

An Example (1) The bully election algorithm. (a) Process 4 holds an election. (b) Processes 5 and 6 respond, telling 4 to stop. (c) Now 5 and 6 each hold an election. 28

An Example (2) The bully election algorithm. (d) Process 6 tells 5 to stop. (e) Process 6 wins and tells everyone. 29

The Ring Algorithm No token is used in the algorithm Every process knows the ring structure i.e., how processes are arranged Whenever a process notices that the coordinator is not functioning It builds an ELECTION message, add its process number in it, and pass it to the next alive process one in the ring Whoever receives the election message will add its process number in it and pass it to the next alive process Eventually, the message will circuit back to the original process that initiated the ELECTION message This process will converts it to a COORDINATOR message and notify other processes that the one with the highest number in the list wins 30

An Example Election algorithm using a ring when two nodes initiated the ELECTION at the same time 31

Election in Wireless Networks Bully or Ring algorithms are not realistic in wireless environments since they assume message passing is reliable which is not true in wireless environments Our goal is to select the best node as the coordinator (or leader) not just any node e.g. best capacity based on remaining battery To elect a leader, any node, called source, can initiate an election by sending an ELECTION message to its neighbors A tree is build during the propagation of this election message The best eligible node is passed from leafs to root 32

An Example (1) Election algorithm in a wireless network, with node a as the source. (a) Initial network. (b) (e) The build-tree phase 33

An Example (2) Election algorithm in a wireless network, with node a as the source. (a) Initial network. (b) (e) The build-tree phase 34

An Example (3) Election algorithm in a wireless network, with node a as the source. (a) Initial network. (b) (e) The build-tree phase 35