Co-Leader Based Leader Election Algorithm in Distributed Environment

Gajendra Tyagi, Ankit Mundra, Jitendra Kr. Tyagi and Nitin Rakesh
Department of Computer Science and Engineering, Jaypee University of Information Technology, Waknaghat, P.O. Waknaghat, Tehkandaghat, Distt. Solan.
ICC-2014. Editors: K. R. Venugopal and S. C. Lingareddy.

Abstract. In a distributed environment, several independent nodes work together by exchanging messages with each other. To achieve synchronization and coordination among the nodes, one node must be elected as the leader. Several algorithms have been proposed for this purpose; one of the classical and best-known is the bully algorithm. The ring algorithm is another classical approach. Both, however, exchange a large number of messages during the election process. Several amendments to the bully algorithm have been proposed to reduce the number of messages in the leader election process. This paper presents a co-leader based leader election mechanism by which the number of messages, and hence the network traffic, can be greatly reduced. We evaluate the proposed approach on testbeds of 10, 100 and 1000 nodes and present a comparative analysis.

Keywords: Bully algorithm, Leader-election, Co-leader, Message passing.

1. Introduction

In a distributed environment, several nodes work together as a single unit, so coordination among all these nodes is essential. This responsibility is assigned to the leader node, whose role is crucial both in distributed environments and in communication networks [1-5]. One of the main responsibilities of the leader node is to ensure synchronization between the different devices.

Various parameters are generally taken into consideration when electing the leader node. One is node degree: the node with the maximum number of connections to other nodes is given the highest priority [6]. Another is lifetime: the node with the maximum lifetime is preferred. A number of algorithms have been proposed so far, each with its own strategy and parameters for selecting the leader. All of these algorithms succeed in electing a leader, but they share the disadvantage of a large number of redundant election messages. In this paper, we reduce the number of election messages and avoid simultaneous election initiation by introducing the concept of a co-leader.

This paper is organized in five sections. Section 2 examines previous related work on leader selection algorithms. Section 3 presents the proposed approach and its procedure for leader selection. Section 4 compares it with the previous algorithms. Finally, the conclusion is given in Section 5.

2. Related Work

Various algorithms have been proposed for leader selection. One of the best known is the bully algorithm, proposed by Garcia-Molina. The bully algorithm and an enhanced version of it are discussed below.

2.1 Bully Algorithm

When a node p learns of the coordinator's failure through the timeout mechanism, it executes the bully algorithm [7]. Node p sends an election message to all nodes with a higher ID than its own.

Figure 1. (a) Node 3 detects the failure of the leader node. (b) Node 3 initiates an election by sending a message to the higher-ID nodes. (c) The higher-ID nodes 4, 5 and 6 reply with an OK message and initiate their own elections. (d) All the higher-ID nodes have initiated the election. (e) Node 6 sends an OK message to node 5 to stop its election. (f) Finally, node 6 broadcasts the coordinator message to all nodes.

The bully algorithm became popular because it reliably elects a leader, but it passes a large number of messages in the process.

2.2 Consensus Based Leader Election Algorithm

In this algorithm, the procedure for electing a new leader starts when a node in the network detects that the current leader node has failed. That node then sends an election message to all the higher-priority nodes. On receiving the election message, these nodes confirm the failure of the leader node by sending a CHECK EXIST message to the failed leader.

In Figure 2(a), node 4 detects the failure of the leader and sends the election message to nodes 5 and 6. In 2(b), nodes 5 and 6 confirm the failure of node 7. In 2(c), nodes 5 and 6 send OK messages to node 4, and node 4 grants node 6. In 2(d), node 6 sends the proposal. In 2(e), the ACCEPT message is received by node 6. In 2(f), node 6 broadcasts the COORDINATOR message.

This algorithm uses fewer messages than the bully algorithm [6,8]. A further reduction is possible, however; it is discussed in Section 3, and a comparison with the previous algorithms is shown in Section 4.

3. Proposed Approach

The algorithm rests on the following basic assumptions:

The system is synchronous, and each node is assigned a unique identification number in increasing order.
A timeout mechanism is used to identify the failure of nodes.
Each node knows the identification numbers of all other nodes.
In an election, the node with the lowest identification number is elected as the leader node.
The node with the second lowest identification number is elected as the co-leader.
A failed node can recover and rejoin the system at any time.

The algorithm uses the following messages: ELECTION, PROPOSAL, VERIFY, LEADER, CO LEADER, SENDER ALIVE and OK.
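The role assumptions above (lowest ID leads, second lowest is the co-leader) can be illustrated with a minimal Python sketch. The names `Msg` and `assign_roles` are hypothetical and do not come from the paper; the enum only collects the message names listed in the text.

```python
from enum import Enum


class Msg(Enum):
    """Message names listed in the text; the enum itself is illustrative."""
    ELECTION = "ELECTION"
    PROPOSAL = "PROPOSAL"
    VERIFY = "VERIFY"
    LEADER = "LEADER"
    CO_LEADER = "CO LEADER"
    SENDER_ALIVE = "SENDER ALIVE"
    OK = "OK"


def assign_roles(live_ids):
    """Return (leader, co_leader) for a collection of live node IDs.

    Under the stated assumptions the lowest ID is the leader and the
    second lowest is the co-leader. co_leader is None when only one
    node is alive (an edge case the text does not discuss).
    """
    ordered = sorted(set(live_ids))
    if not ordered:
        raise ValueError("no live nodes")
    leader = ordered[0]
    co_leader = ordered[1] if len(ordered) > 1 else None
    return leader, co_leader
```

For instance, `assign_roles([4, 2, 7, 1])` returns `(1, 2)`: node 1 leads and node 2 is the co-leader.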

Figure 3. (a) Leader failure is detected by node 4. (b) Node 4 sends the proposal to node 2, which is the co-leader. (c) The co-leader verifies the failure of the leader node. (d) After verification, the co-leader announces its leadership to all nodes. (e) All the nodes reply to the new leader. (f) The leader broadcasts the identity of the new co-leader.

Figure 4. Comparative analysis.

3.1 Execution of Proposed Algorithm

As already assumed, node failures are identified by a timeout mechanism and all nodes are assigned IDs in increasing order. Suppose a system has seven live nodes, each with a unique ID. The node with ID 1 is the leader of the system and the node with ID 2 is the co-leader. Suppose that at some instant the leader crashes and the failure is detected by node 4. Node 4 immediately sends a PROPOSAL message to the co-leader, asking it to become the new leader. Node 2, the co-leader, verifies the failure by sending a VERIFY message to the leader. If no response is received from the leader within a predefined time, the co-leader is sure of the leader's failure. The co-leader then broadcasts the LEADER message to all the nodes and waits for their OK messages. All the live nodes acknowledge the new leader by sending an OK message along with their ID. In this way the new leader, node 2, learns the next minimum ID. The new leader then broadcasts a CO LEADER message to all the nodes, telling them which node is the new co-leader. The whole procedure is illustrated in Figure 3.
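The execution described above can be traced with a small single-process simulation. This is a sketch under stated assumptions, not the authors' MPI implementation: the function name `co_leader_election` is hypothetical, each broadcast is counted as one message per recipient (the paper's own counting convention is not recoverable from this text), and the ELECTION and SENDER ALIVE messages are not modelled because their use is not described here.

```python
def co_leader_election(all_ids, leader, co_leader, detector):
    """Simulate one election after `leader` has crashed.

    Returns (new_leader, new_co_leader, message_log), where each log
    entry is a (message_name, sender, receiver) tuple.
    """
    log = []

    # Step 1: the detecting node proposes that the co-leader take over.
    if detector != co_leader:
        log.append(("PROPOSAL", detector, co_leader))

    # Step 2: the co-leader probes the old leader. The leader has
    # crashed, so no reply arrives and the timeout confirms the failure.
    log.append(("VERIFY", co_leader, leader))

    # Step 3: the co-leader announces itself to every other live node.
    # Assumption: a broadcast costs one message per recipient.
    others = sorted(n for n in all_ids if n not in (leader, co_leader))
    for node in others:
        log.append(("LEADER", co_leader, node))

    # Step 4: every live node acknowledges with OK, carrying its own ID.
    for node in others:
        log.append(("OK", node, co_leader))

    # Step 5: the smallest acknowledged ID becomes the new co-leader,
    # and the new leader broadcasts that choice.
    new_co_leader = min(others) if others else None
    if new_co_leader is not None:
        for node in others:
            log.append(("CO LEADER", co_leader, node))

    return co_leader, new_co_leader, log


# Seven-node scenario from the text: leader 1 crashes, node 4 detects it.
new_leader, new_co, log = co_leader_election(range(1, 8), 1, 2, 4)
print(new_leader, new_co, len(log))  # prints: 2 3 17
```

Under this counting convention an election among n nodes costs 2 + 3(n - 2) messages when the detector is not the co-leader, which gives 17 for n = 7. A different convention, such as counting each broadcast as a single message, would give a smaller figure.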

Table 1. Algorithm: Co-leader based leader selection algorithm.

Table 2. Number of messages comparison with previous algorithms.

4. Algorithm and Performance Evaluation

In this section we evaluate the performance of the proposed approach against the previous approaches in terms of the number of messages passed. We implemented testbeds of 10, 100 and 1000 nodes using the MPI framework. We then computed the number of messages required by the bully algorithm, the consensus algorithm and the proposed algorithm (shown in Tables 1 and 2).

5. Conclusion

In this paper we have proposed a co-leader based leader selection algorithm for distributed communication environments. The proposed approach provides a robust mechanism for electing the leader node among several candidates with fewer election messages. We have also compared the performance of our approach with the previous approaches and shown that it requires fewer messages. Because fewer messages are exchanged, network traffic is lower and overall performance improves.

References

[1] H. Abu-Amar and J. Lokre, Election in Asynchronous Complete Networks with Intermittent Link Failures, IEEE Transactions on Computers, vol. 43, no. 7, (1994).
[2] J. Brunkreef, J. P. Katoen and S. Mauw, Design and Analysis of Dynamic Leader Election Protocols in Broadcast Network, Distributed Computing, vol. 9, no. 4, (1996).
[3] H. M. Sayeed, M. Abu-Amara and H. Abu-Avara, Optimal Asynchronous Agreement and Leader Election Algorithm for Complete Networks with Byzantine Faulty Links, Distributed Computing, vol. 9, no. 3, (1995).
[4] Sung-Hoon-Park, Yoon Kim and Jeoung Sun Hwang, An Efficient Algorithm for Leader-Election in Synchronous Distributed Systems, IEEE Transactions on Computers, vol. 43, no. 7, (1999).
[5] G. Singh, Leader Election in the Presence of Link Failures, IEEE Transactions on Parallel and Distributed Systems, vol. 7, no. 3, March (1996).
[6] Hsu-Chia Cahng and Chi-Chun Lo, A Consensus-Based Leader Election Algorithm for Wireless Ad Hoc Networks.
[7] H. Garcia-Molina, Elections in a Distributed Computing System, IEEE Transactions on Computers, vol. C-31, January (1982).
[8] S. H. Park, Y. Kim and J. S. Hwang, An Efficient Algorithm for Leader-Election in Synchronous Distributed Systems, IEEE TENCON, (1999).


More information

Dep. Systems Requirements

Dep. Systems Requirements Dependable Systems Dep. Systems Requirements Availability the system is ready to be used immediately. A(t) = probability system is available for use at time t MTTF/(MTTF+MTTR) If MTTR can be kept small

More information

Synchronization Part 2. REK s adaptation of Claypool s adaptation oftanenbaum s Distributed Systems Chapter 5 and Silberschatz Chapter 17

Synchronization Part 2. REK s adaptation of Claypool s adaptation oftanenbaum s Distributed Systems Chapter 5 and Silberschatz Chapter 17 Synchronization Part 2 REK s adaptation of Claypool s adaptation oftanenbaum s Distributed Systems Chapter 5 and Silberschatz Chapter 17 1 Outline Part 2! Clock Synchronization! Clock Synchronization Algorithms!

More information

Today: Fault Tolerance. Failure Masking by Redundancy

Today: Fault Tolerance. Failure Masking by Redundancy Today: Fault Tolerance Agreement in presence of faults Two army problem Byzantine generals problem Reliable communication Distributed commit Two phase commit Three phase commit Failure recovery Checkpointing

More information

Study and Comparison of Mesh and Tree- Based Multicast Routing Protocols for MANETs

Study and Comparison of Mesh and Tree- Based Multicast Routing Protocols for MANETs Study and Comparison of Mesh and Tree- Based Multicast Routing Protocols for MANETs Rajneesh Gujral Associate Proffesor (CSE Deptt.) Maharishi Markandeshwar University, Mullana, Ambala Sanjeev Rana Associate

More information

Time Synchronization in Wireless Sensor Networks: CCTS

Time Synchronization in Wireless Sensor Networks: CCTS Time Synchronization in Wireless Sensor Networks: CCTS 1 Nerin Thomas, 2 Smita C Thomas 1, 2 M.G University, Mount Zion College of Engineering, Pathanamthitta, India Abstract: A time synchronization algorithm

More information

MENCIUS: BUILDING EFFICIENT

MENCIUS: BUILDING EFFICIENT MENCIUS: BUILDING EFFICIENT STATE MACHINE FOR WANS By: Yanhua Mao Flavio P. Junqueira Keith Marzullo Fabian Fuxa, Chun-Yu Hsiung November 14, 2018 AGENDA 1. Motivation 2. Breakthrough 3. Rules of Mencius

More information

殷亚凤. Synchronization. Distributed Systems [6]

殷亚凤. Synchronization. Distributed Systems [6] Synchronization Distributed Systems [6] 殷亚凤 Email: yafeng@nju.edu.cn Homepage: http://cs.nju.edu.cn/yafeng/ Room 301, Building of Computer Science and Technology Review Protocols Remote Procedure Call

More information

Paxos Made Simple. Leslie Lamport, 2001

Paxos Made Simple. Leslie Lamport, 2001 Paxos Made Simple Leslie Lamport, 2001 The Problem Reaching consensus on a proposed value, among a collection of processes Safety requirements: Only a value that has been proposed may be chosen Only a

More information

A Multipath AODV Reliable Data Transmission Routing Algorithm Based on LQI

A Multipath AODV Reliable Data Transmission Routing Algorithm Based on LQI Sensors & Transducers 2014 by IFSA Publishing, S. L. http://www.sensorsportal.com A Multipath AODV Reliable Data Transmission Routing Algorithm Based on LQI 1 Yongxian SONG, 2 Rongbiao ZHANG and Fuhuan

More information

Distributed Systems 8L for Part IB

Distributed Systems 8L for Part IB Distributed Systems 8L for Part IB Handout 3 Dr. Steven Hand 1 Distributed Mutual Exclusion In first part of course, saw need to coordinate concurrent processes / threads In particular considered how to

More information

21. Distributed Algorithms

21. Distributed Algorithms 21. Distributed Algorithms We dene a distributed system as a collection of individual computing devices that can communicate with each other [2]. This denition is very broad, it includes anything, from

More information

UNIT IV 1. What is meant by hardware and software clock? Clock devices can be programmed to generate interrupts at regular intervals in orders that, for example, time slicing can be implemented.the operating

More information

Data Consistency and Blockchain. Bei Chun Zhou (BlockChainZ)

Data Consistency and Blockchain. Bei Chun Zhou (BlockChainZ) Data Consistency and Blockchain Bei Chun Zhou (BlockChainZ) beichunz@cn.ibm.com 1 Data Consistency Point-in-time consistency Transaction consistency Application consistency 2 Strong Consistency ACID Atomicity.

More information

Distributed Systems. 09. State Machine Replication & Virtual Synchrony. Paul Krzyzanowski. Rutgers University. Fall Paul Krzyzanowski

Distributed Systems. 09. State Machine Replication & Virtual Synchrony. Paul Krzyzanowski. Rutgers University. Fall Paul Krzyzanowski Distributed Systems 09. State Machine Replication & Virtual Synchrony Paul Krzyzanowski Rutgers University Fall 2016 1 State machine replication 2 State machine replication We want high scalability and

More information

Efficient Hybrid Multicast Routing Protocol for Ad-Hoc Wireless Networks

Efficient Hybrid Multicast Routing Protocol for Ad-Hoc Wireless Networks Efficient Hybrid Multicast Routing Protocol for Ad-Hoc Wireless Networks Jayanta Biswas and Mukti Barai and S. K. Nandy CAD Lab, Indian Institute of Science Bangalore, 56, India {jayanta@cadl, mbarai@cadl,

More information

Implementation of an Adaptive MAC Protocol in WSN using Network Simulator-2

Implementation of an Adaptive MAC Protocol in WSN using Network Simulator-2 Implementation of an Adaptive MAC Protocol in WSN using Network Simulator-2 1 Suresh, 2 C.B.Vinutha, 3 Dr.M.Z Kurian 1 4 th Sem, M.Tech (Digital Electronics), SSIT, Tumkur 2 Lecturer, Dept.of E&C, SSIT,

More information

Distributed Coordination 1/39

Distributed Coordination 1/39 Distributed Coordination 1/39 Overview Synchronization of distributed processes requires new concepts in addition to synchronization of processes in single multi-core systems. Topics: Notion of global

More information

Distributed Systems Principles and Paradigms. Chapter 08: Fault Tolerance

Distributed Systems Principles and Paradigms. Chapter 08: Fault Tolerance Distributed Systems Principles and Paradigms Maarten van Steen VU Amsterdam, Dept. Computer Science Room R4.20, steen@cs.vu.nl Chapter 08: Fault Tolerance Version: December 2, 2010 2 / 65 Contents Chapter

More information

Consensus a classic problem. Consensus, impossibility results and Paxos. Distributed Consensus. Asynchronous networks.

Consensus a classic problem. Consensus, impossibility results and Paxos. Distributed Consensus. Asynchronous networks. Consensus, impossibility results and Paxos Ken Birman Consensus a classic problem Consensus abstraction underlies many distributed systems and protocols N processes They start execution with inputs {0,1}

More information

Distributed Systems 11. Consensus. Paul Krzyzanowski

Distributed Systems 11. Consensus. Paul Krzyzanowski Distributed Systems 11. Consensus Paul Krzyzanowski pxk@cs.rutgers.edu 1 Consensus Goal Allow a group of processes to agree on a result All processes must agree on the same value The value must be one

More information

Coordination and Agreement

Coordination and Agreement Coordination and Agreement Nicola Dragoni Embedded Systems Engineering DTU Informatics 1. Introduction 2. Distributed Mutual Exclusion 3. Elections 4. Multicast Communication 5. Consensus and related problems

More information

OUTLINE. Introduction Clock synchronization Logical clocks Global state Mutual exclusion Election algorithms Deadlocks in distributed systems

OUTLINE. Introduction Clock synchronization Logical clocks Global state Mutual exclusion Election algorithms Deadlocks in distributed systems Chapter 5 Synchronization OUTLINE Introduction Clock synchronization Logical clocks Global state Mutual exclusion Election algorithms Deadlocks in distributed systems Concurrent Processes Cooperating processes

More information

A Modified Leader Election Algorithm for MANET

A Modified Leader Election Algorithm for MANET A Modified Leader Election Algorithm for MANET Smita Bhoir Computer Engineering Department Ramrao Adik Institute of Technology Mumbai, India smitapatilbe@gmail.com Amarsinh Vidhate Computer Engineering

More information

Distributed Systems. 19. Fault Tolerance Paul Krzyzanowski. Rutgers University. Fall 2013

Distributed Systems. 19. Fault Tolerance Paul Krzyzanowski. Rutgers University. Fall 2013 Distributed Systems 19. Fault Tolerance Paul Krzyzanowski Rutgers University Fall 2013 November 27, 2013 2013 Paul Krzyzanowski 1 Faults Deviation from expected behavior Due to a variety of factors: Hardware

More information

Modified Low Energy Adaptive Clustering Hierarchy for Heterogeneous Wireless Sensor Network

Modified Low Energy Adaptive Clustering Hierarchy for Heterogeneous Wireless Sensor Network Modified Low Energy Adaptive Clustering Hierarchy for Heterogeneous Wireless Sensor Network C.Divya1, N.Krishnan2, A.Petchiammal3 Center for Information Technology and Engineering Manonmaniam Sundaranar

More information

Key Agreement in Ad-hoc Networks

Key Agreement in Ad-hoc Networks Key Agreement in Ad-hoc Networks N. Asokan and P. Ginzboorg Presented by Chuk Yang Seng 1 Introduction Ad-hoc Key Agreement Scenario: Small group of people at a conference in a room Wireless network session

More information

Analysis of Cluster based Routing Algorithms in Wireless Sensor Networks using NS2 simulator

Analysis of Cluster based Routing Algorithms in Wireless Sensor Networks using NS2 simulator Analysis of Cluster based Routing Algorithms in Wireless Sensor Networks using NS2 simulator Ashika R. Naik Department of Electronics & Tele-communication, Goa College of Engineering (India) ABSTRACT Wireless

More information

Public Key Management Scheme with Certificate Management Node for Wireless Ad Hoc Networks

Public Key Management Scheme with Certificate Management Node for Wireless Ad Hoc Networks Proceedings of the International Multiconference on Computer Science and Information Technology pp. 445 451 ISSN 1896-7094 c 2006 PIPS Public Key Management Scheme with Certificate Management Node for

More information

An Enhanced Super-Peer System Considering Mobility and Energy in Mobile Environments

An Enhanced Super-Peer System Considering Mobility and Energy in Mobile Environments An Enhanced Super-Peer System Considering Mobility and Energy in Mobile Environments Sun-Kyum Kim, Kwang-Jo Lee, Sung-Bong Yang Departement of Computer Science Yonsei University Repubilc of Korea {skyum,

More information

Assignment 12: Commit Protocols and Replication Solution

Assignment 12: Commit Protocols and Replication Solution Data Modelling and Databases Exercise dates: May 24 / May 25, 2018 Ce Zhang, Gustavo Alonso Last update: June 04, 2018 Spring Semester 2018 Head TA: Ingo Müller Assignment 12: Commit Protocols and Replication

More information

Fault Tolerance. Distributed Systems. September 2002

Fault Tolerance. Distributed Systems. September 2002 Fault Tolerance Distributed Systems September 2002 Basics A component provides services to clients. To provide services, the component may require the services from other components a component may depend

More information

P2 Recitation. Raft: A Consensus Algorithm for Replicated Logs

P2 Recitation. Raft: A Consensus Algorithm for Replicated Logs P2 Recitation Raft: A Consensus Algorithm for Replicated Logs Presented by Zeleena Kearney and Tushar Agarwal Diego Ongaro and John Ousterhout Stanford University Presentation adapted from the original

More information