PROCESS SYNCHRONIZATION

Similar documents
Distributed Synchronization. EECS 591 Farnam Jahanian University of Michigan

Last Class: Clock Synchronization. Today: More Canonical Problems

Synchronization Part 2. REK s adaptation of Claypool s adaptation oftanenbaum s Distributed Systems Chapter 5 and Silberschatz Chapter 17

Mutual Exclusion. A Centralized Algorithm

Chapter 6 Synchronization

CMPSCI 677 Operating Systems Spring Lecture 14: March 9

Distributed Systems. coordination Johan Montelius ID2201. Distributed Systems ID2201

Clock Synchronization. Synchronization. Clock Synchronization Algorithms. Physical Clock Synchronization. Tanenbaum Chapter 6 plus additional papers

Process Synchroniztion Mutual Exclusion & Election Algorithms

Coordination 1. To do. Mutual exclusion Election algorithms Next time: Global state. q q q

Chapter 6 Synchronization (2)

Synchronization (contd.)

CSE 5306 Distributed Systems. Synchronization

Last Class: Clock Synchronization. Today: More Canonical Problems

Synchronization. Distributed Systems IT332

SYNCHRONIZATION. DISTRIBUTED SYSTEMS Principles and Paradigms. Second Edition. Chapter 6 ANDREW S. TANENBAUM MAARTEN VAN STEEN

CS 455: INTRODUCTION TO DISTRIBUTED SYSTEMS [DISTRIBUTED MUTUAL EXCLUSION] Frequently asked questions from the previous class survey

CS455: Introduction to Distributed Systems [Spring 2018] Dept. Of Computer Science, Colorado State University

Synchronization. Chapter 5

Algorithms and protocols for distributed systems

Coordination and Agreement

Advanced Topics in Distributed Systems. Dr. Ayman A. Abdel-Hamid. Computer Science Department Virginia Tech

Chapter 16: Distributed Synchronization

Chapter 18: Distributed

Synchronization. Clock Synchronization

Exam 2 Review. Fall 2011

Verteilte Systeme (Distributed Systems)

殷亚凤. Synchronization. Distributed Systems [6]

Synchronization Part II. CS403/534 Distributed Systems Erkay Savas Sabanci University

Distributed Systems (5DV147)

11/7/2018. Event Ordering. Module 18: Distributed Coordination. Distributed Mutual Exclusion (DME) Implementation of. DME: Centralized Approach

CSE 486/586 Distributed Systems

Distributed Operating Systems. Distributed Synchronization

Verteilte Systeme/Distributed Systems Ch. 5: Various distributed algorithms

Distributed Systems - II

CSE 5306 Distributed Systems

Distributed Systems Exam 1 Review. Paul Krzyzanowski. Rutgers University. Fall 2016

Distributed systems. Lecture 6: distributed transactions, elections, consensus and replication. Malte Schwarzkopf

Last Time. 19: Distributed Coordination. Distributed Coordination. Recall. Event Ordering. Happens-before

Event Ordering Silberschatz, Galvin and Gagne. Operating System Concepts

Silberschatz and Galvin Chapter 18

Homework #2 Nathan Balon CIS 578 October 31, 2004

CS /15/16. Paul Krzyzanowski 1. Question 1. Distributed Systems 2016 Exam 2 Review. Question 3. Question 2. Question 5.

SEGR 550 Distributed Computing. Final Exam, Fall 2011

OUTLINE. Introduction Clock synchronization Logical clocks Global state Mutual exclusion Election algorithms Deadlocks in distributed systems

An Efficient Approach of Election Algorithm in Distributed Systems

Distributed Systems Coordination and Agreement

Distributed Systems 8L for Part IB

Mutual Exclusion in DS

Frequently asked questions from the previous class survey

416 practice questions (PQs)

CS455: Introduction to Distributed Systems [Spring 2018] Dept. Of Computer Science, Colorado State University

Distributed Coordination! Distr. Systems: Fundamental Characteristics!

Last time. Distributed systems Lecture 6: Elections, distributed transactions, and replication. DrRobert N. M. Watson

CS Final Review

DISTRIBUTED MUTEX. EE324 Lecture 11

Distributed Coordination 1/39

Distributed Systems. Lec 9: Distributed File Systems NFS, AFS. Slide acks: Dave Andersen

Distributed Mutual Exclusion

Study of various Election algorithms on the basis of messagepassing

CSE 380 Computer Operating Systems

Several of these problems are motivated by trying to use solutiions used in `centralized computing to distributed computing

Failure Tolerance. Distributed Systems Santa Clara University

Fault-Tolerance & Paxos

Selected Questions. Exam 2 Fall 2006

Lecture 2: Leader election algorithms.

Distributed Systems Exam 1 Review Paul Krzyzanowski. Rutgers University. Fall 2016

Operating Systems. Mutual Exclusion in Distributed Systems. Lecturer: William Fornaciari

June 22/24/ Gerd Liefländer System Architecture Group Universität Karlsruhe (TH), System Architecture Group

Distributed Systems 11. Consensus. Paul Krzyzanowski

A Two-Layer Hybrid Algorithm for Achieving Mutual Exclusion in Distributed Systems

Exam 2 Review. October 29, Paul Krzyzanowski 1

CprE Fault Tolerance. Dr. Yong Guan. Department of Electrical and Computer Engineering & Information Assurance Center Iowa State University

Consistency in Distributed Systems

Distributed Systems Principles and Paradigms

Election Administration Algorithm for Distributed Computing

Distributed Mutual Exclusion Algorithms

Failures, Elections, and Raft

What is the Race Condition? And what is its solution? What is a critical section? And what is the critical section problem?

Process groups and message ordering

Tasks. Task Implementation and management

Leader Election Algorithms in Distributed Systems

Deadlocks. Today. Next Time. ! Resources & deadlocks! Dealing with deadlocks! Other issues. ! Memory management

Last Class: Naming. Today: Classical Problems in Distributed Systems. Naming. Time ordering and clock synchronization (today)

Coordination 2. Today. How can processes agree on an action or a value? l Group communication l Basic, reliable and l ordered multicast

Important Lessons. A Distributed Algorithm (2) Today's Lecture - Replication

Consensus and related problems

Definition of a DS A collection of independent computers that appear to its user as a single coherent system.

Slides for Chapter 15: Coordination and Agreement

Agreement in Distributed Systems CS 188 Distributed Systems February 19, 2015

CS455: Introduction to Distributed Systems [Spring 2018] Dept. Of Computer Science, Colorado State University

Distributed Transaction Management 2003

Today: Fault Tolerance

Operating Systems: Quiz2 December 15, Class: No. Name:

Distributed Systems. Pre-Exam 1 Review. Paul Krzyzanowski. Rutgers University. Fall 2015

CS 425 / ECE 428 Distributed Systems Fall 2017

Coordination and Agreement

How Bitcoin achieves Decentralization. How Bitcoin achieves Decentralization

Intuitive distributed algorithms. with F#

Remaining Contemplation Questions

Transcription:

DISTRIBUTED COMPUTER SYSTEMS PROCESS SYNCHRONIZATION Dr. Jack Lange Computer Science Department University of Pittsburgh Fall 2015 Process Synchronization Mutual Exclusion Algorithms Permission Based Centralized DeCentralized Distributed Token Based Token Ring Election Algorithms Bully Algorithm Ring Algorithm Election in Wireless Networks Election in Large Scale Systems 1!

Process Synchronization Concurrency and Collaboration are fundamental to process synchronization in distributed systems Processes may required coordination to carry out a global, collective task in a consistent manner Processes may seek to access simultaneously the same resource Access to a shared resource must Critical Section Exclusive access to the resource is required Algorithms are required to coordinate execution among processes Grant Mutual Exclusive Access by processes Leader Election, etc CONCURRENCY AND COLLABORATION MUTUAL EXCLUSION 2!

Mutual Exclusion in Distributed Systems The lack of a common shared memory makes the problem of mutual exclusion more challenging distributed systems as compared centralized systems Desirable properties for possible DME solutions Ensure Mutual Exclusion, No deadlocks Deadlock is a state of the system when no node is in its CS and yet no requesting node can proceed to its CS. No starvation Starvation occurs when some node waits indefinitely to enter its CS, while other nodes can enter and exit their CSs Mutual Exclusion Classification Mutual Exclusion Permission-Based Token-Based Centralized Token Ring Decentralized Distributed 3!

PERMISSION BASED DME CENTRALIZED ALGORITHM Centralized Algorithm Free Resource Centralized Algorithm achieves DME by closely mimicking ME in single processor systems One process, C, is the Coordinator Coordinates access to resources Other processes issue requests to access resource 1. Request Resource 2. Wait for Response 3. Receive Grant 4. Access Resource 5. Release Resource P Request(R) Grant(R) C Release(R) 4!

Centralized Algorithm Allocated Resource If resource is currently by another process, C does not reply until resource is released Maintain queue of pending requests, serviced in FIFO order Q { }! Q {P 2 }! Q {P 2 }! {P 3 }! Q {P 3 }! P 1 Request(R) Grant(R) C Release(R) Request(R) P 2 P 3 Centralized Algorithm Advantages Easy to understand, verify and implement Access Fairness FIFO order of service is simple to implement Note FIFO policy does not guarantee Fair Share access Limitations Not scalable to large scale distributed systems Coordinator is likely to become a bottleneck A process requesting a resource cannot distinguish between being in a blocked state, waiting for the resource to be released, from waiting for a failed coordinator Timeout mechanisms or alive messages may be required to be designed carefully to avoid early and late timeouts or unnecessary alive messages 5!

PERMISSION BASED DME DECENTRALIZED ALGORITHM Decentralized Permission Based DME Fully decentralized algorithms attempt to address the bottleneck problem of centralized algorithms Basic tenet of the algorithm is coordination replication and majority voting Each resource is assumed to be replicated n times Each replica has its own coordinator Therefore, n coordinators are needed Voting algorithm can be executed using DHT-based approach 6!

DME Decentralized Voting Algorithm Unlike in the centralized scheme, if the resource cannot be granted, a decentralized DME coordinator informs the requesting process that the resource is unavailable To gain access to a resource, a process my secure a majority vote m > n/2, where n is the number of coordinators A failed coordinator resets at arbitrary moments, not having to remember any vote it gave before the crash Ignoring previously granted permissions, may lead a recovered coordinator to grant permission again to another process Decentralized Permissions Analysis Benefits No single point of failure Implementable using DHT based structure If permission to access a resource is denied, because the process cannot obtain a majority vote, the process backs off a random amount of time and tries again later Limitations Overhead in terms of the number of messages generated can be prohibitively high Messages to secure majority, which may require multiple attempts Failure may be difficult to handle 7!

PERMISSION BASED DME DISTRIBUTED ALGORITHM A Distributed ME Algorithm Ricart and Agrawala Algorithm assumes there is a mechanism for totally ordering of all events in the system and a reliable message system Lamport s algorithm can be used for total ordering A process wanting to enter it CS sends a message with (CS name, process id, current time) to all processes, including itself When a process receives a CS request from another process, it reacts based on its current state with respect to the CS requested. Three possible cases must be considered 8!

RicartANDAgrawala Algorithm Basic Cases RA Algorithm distinguishes between 3 cases a. If the receiver is not in the CS and it does not want to enter the CS, it sends an OK message to the sender. b. If the receiver is in the CS, it does not reply and queues the request. c. If the receiver wants to enter the CS but has not yet, it compares the timestamp of the incoming message with the timestamp of its message sent to everyone The lowest timestamp wins. If the incoming timestamp is lower, the receiver sends an OK message to the sender. If its own timestamp is lower, the receiver queues the request and sends nothing. RA DME Algorithm CS Entry and Exit After a process sends out a request to enter a CS, it waits for an OK from all the other processes. Upon receiving all OK message, process enters the CS Upon exiting CS, it sends OK messages to all processes on its queue, which are waiting for that CS Upon sending OK messages, processes are deleted from the queue. 9!

RA Algorithm Example 8! P 0 P 0 P 0 8! 12! 8! OK! OK! OK! P 1 12! P 2 P 1 OK! P 2 P 1 P 2 12! P 0 and P 2 request access to the same resource, nearly at the same time Both send requests with timestamps, 8 and 12, respectively P 0 has the lowest timestamp, 8 P 0 wins access to the resource 12! Upon completion of its CS, P 0 sends OK message to P 2 P 2 can enter its own CS Distributed Permissions Analysis Benefits No central bottleneck This results in improved performance Fewer messages than the decentralized algorithm Limitations The system is exposed to n points of failure If a node fails to respond, the entire system locks up 10!

TOKEN BASED DME TOKEN RING ALGORITHM A Token Ring Algorithm P 2 P 1 P 3 P 0 P 2 P 4 P 6 P 0 P 4 P 1 P 3 P 5 P 7 P 7 P 6 P 5 An unordered group of processes on a network. A logical ring constructed in software Token circulates at high speed on the network A process must have a token to enter its CS 11!

DME Algorithm Comparison Algorithm Messages per Entry/Exit Delay Before Entry (in Message Times) Centralized 3 2 [1] Decentralized 2mk+m, k=1,2, 2mk [2] Problems Coordinator Crash Starvation, Low Efficiency Distributed 2 ( n 1 ) 2 ( n 1 ) Process Crash Token ring 1 to 0 to n 1 Lost Token, Process Crash Centralized is the most efficient. Token ring efficient when a large of processes seek to use critical region. [2] K is the number of attempts, before success [1] Assumes that messages are transmitted sequentially over the network ELECTION BACKGROUND 12!

Election Algorithms Many distributed algorithms such as mutual exclusion and deadlock detection require a coordinator process. When the coordinator process fails, the distributed group of processes must execute an election algorithm to determine a new coordinator process. Different criteria can be used to select the new coordinator Assumptions and Requirements Any process can call for an election. A process can call for at most one election at a time. Multiple processes can call an election simultaneously. Election results should not depend on which process calls for it. Each process maintains the following data structure Variable called elected An attribute value called attr, representing ID, MAC address, fastest CPU, etc The non-faulty process with the best (highest) election attribute value (e.g., highest ID or MAC address, or fastest CPU, etc.) is elected 13!

ELECTION ALGORITHMS BULLY ALGORITHM Bully Algorithm Main Characteristics Operating Assumptions Synchronous system All messages arrive within T M units transmission of time. A reply is dispatched within P P units of processing time after the receipt of a message. If no response is received in 2 T M + P P, the node is assumed to be faulty Node crashed Attribute = Process ID Each process knows all the other processes in the system Therefore, processes know each others IDs 14!

The Bully Algorithm When any process, P, notices that the coordinator is no longer responding it initiates an election: 1. P sends an election message to all processes with higher id numbers. 2. If no one responds, P wins the election and becomes coordinator. 3. If a higher process responds, it takes over. Process P s job is done. The Bully Algorithm At any moment, a process can receive an election message from one of its lower-numbered colleagues. The receiver sends an OK back to the sender and conducts its own election. Eventually only the bully process remains. Bully process becomes the new cooridinator The bully announces victory to all processes in the distributed group. 15!

Bully Algorithm Example Process 4 notices 7 down. Process 4 holds an election. Process 5 and 6 respond, telling 4 to stop. Now 5 and 6 each hold an election. Bully Algorithm Example! Process 6 tells process 5 to stop Process 6 (the bully) wins and tells everyone If processes 7 recovers, it restarts election process 16!

Performance of Bully Algorithm Best case scenario: The process with the second highest id notices the failure of the coordinator and elects itself. N-2 coordinator messages are sent. Turnaround time is one message transmission time. Worst case scenario: When the process with the least id detects the failure. N-1 processes altogether begin elections, each sending messages to processes with higher ids. The message overhead is O(N 2 ). Turnaround time is approximately 5 message transmission times if there are no failures during the run: election, answer, election, answer, coordinator ELECTION ALGORITHMS RING ALGORITHM 17!

Ring Algorithm Basic Operation RA assumes that the processes are logically ordered in a ring {implies a successor pointer and an active process list} that is unidirectional. When any process, P, notices that the coordinator is no longer responding it initiates an election: 1. P sends message containing P s process ID to the next available successor. Ring Algorithm Basic Operation 2. At each active process, the receiving process adds its process number to the list of processes in the message and forwards it to its successor. Eventually, the message gets back to the sender. 3. The initial sender sends out a second message letting everyone know who the coordinator is {the process with the highest number} and indicating the current members of the active list of processes. 18!

Ring Algorithm Example Even if two ELECTIONS start at once, every node will pick the same leader. ELECTION LARGE SCALE SYSTEMS 19!

Very Large Scale Networks To deal with scale, more than one node should be selected Nodes organized as peers and super-peers Normal nodes should have low-latency access to superpeers Super-peers should be evenly distributed across the network There should be a predefined portion of super-peers relative to the total number of nodes in the overlay network Each super-peer should not need to serve more than a fixed number of normal nodes Elections held within each peer group Super-peers coordinate among themselves Conclusion Mutual Exclusion Algorithms Permission Based Centralized DeCentralized Distributed Token Based Token Ring Election Algorithms Bully Algorithm Ring Algorithm Election in Large Scale Systems 20!