Process groups and message ordering

Similar documents
Fault Tolerance. Distributed Software Systems. Definitions

Coordination 2. Today. How can processes agree on an action or a value? l Group communication l Basic, reliable and l ordered multicast

Clock Synchronization. Synchronization. Clock Synchronization Algorithms. Physical Clock Synchronization. Tanenbaum Chapter 6 plus additional papers

Advanced Topics in Distributed Systems. Dr. Ayman A. Abdel-Hamid. Computer Science Department Virginia Tech

Distributed Systems (ICE 601) Fault Tolerance

Distributed Systems Multicast & Group Communication Services

Fault Tolerance Part II. CS403/534 Distributed Systems Erkay Savas Sabanci University

Distributed Systems Fault Tolerance

Lecture 7: Logical Time

MODELS OF DISTRIBUTED SYSTEMS

Network Protocols. Sarah Diesburg Operating Systems CS 3430

MODELS OF DISTRIBUTED SYSTEMS

Distributed Synchronization. EECS 591 Farnam Jahanian University of Michigan

Coordination and Agreement

Failure Tolerance. Distributed Systems Santa Clara University

Event Ordering. Greg Bilodeau CS 5204 November 3, 2009

Coordination and Agreement

2017 Paul Krzyzanowski 1

Chapter 6 Synchronization

Failure Models. Fault Tolerance. Failure Masking by Redundancy. Agreement in Faulty Systems

Important Lessons. A Distributed Algorithm (2) Today's Lecture - Replication

Distributed Systems (5DV020) Group communication. Fall Group communication. Fall Indirect communication

Distributed Systems Principles and Paradigms. Chapter 08: Fault Tolerance

Chapter 8 Fault Tolerance

殷亚凤. Synchronization. Distributed Systems [6]

Distributed Systems. 09. State Machine Replication & Virtual Synchrony. Paul Krzyzanowski. Rutgers University. Fall Paul Krzyzanowski

Coordination 1. To do. Mutual exclusion Election algorithms Next time: Global state. q q q

Distributed Systems. Multicast and Agreement

Basic vs. Reliable Multicast

Introduction to Distributed Systems Seif Haridi

Distributed Systems. Pre-Exam 1 Review. Paul Krzyzanowski. Rutgers University. Fall 2015

CprE Fault Tolerance. Dr. Yong Guan. Department of Electrical and Computer Engineering & Information Assurance Center Iowa State University

THE TRANSPORT LAYER UNIT IV

Today: Fault Tolerance. Failure Masking by Redundancy

Distributed Systems Reliable Group Communication

Chapter 8 Fault Tolerance

Consistency in Distributed Systems

Fault Tolerance. Distributed Systems. September 2002

Group ID: 3 Page 1 of 15. Authors: Anne-Marie Bosneag Ioan Raicu Davis Ford Liqiang Zhang

Synchronization. Clock Synchronization

Protocol Specification. Using Finite State Machines

Midterm Examination ECE 419S 2015: Distributed Systems Date: March 13th, 2015, 6-8 p.m.

Improving Reliable Transport and Handoff Performance in Cellular Wireless Networks

Time in Distributed Systems

Information Network 1 TCP 1/2

Fault Tolerance. Basic Concepts

Distributed Synchronization: outline

PROCESS SYNCHRONIZATION

Basic Reliable Transport Protocols

OSI Transport Layer. Network Fundamentals Chapter 4. Version Cisco Systems, Inc. All rights reserved. Cisco Public 1

Distributed Systems Exam 1 Review. Paul Krzyzanowski. Rutgers University. Fall 2016

How Bitcoin achieves Decentralization. How Bitcoin achieves Decentralization

COMMUNICATION IN DISTRIBUTED SYSTEMS

Lecture 6: Logical Time

The UNIVERSITY of EDINBURGH. SCHOOL of INFORMATICS. CS4/MSc. Distributed Systems. Björn Franke. Room 2414

Primary-Backup Replication

Homework 2 COP The total number of paths required to reach the global state is 20 edges.

Time in Distributed Systems

System Models. 2.1 Introduction 2.2 Architectural Models 2.3 Fundamental Models. Nicola Dragoni Embedded Systems Engineering DTU Informatics

Distributed Systems. Day 9: Replication [Part 1]

Today: Fault Tolerance. Replica Management

CSE 5306 Distributed Systems

Coordination and Agreement

Exam 2 Review. October 29, Paul Krzyzanowski 1

Arranging lunch value of preserving the causal order. a: how about lunch? meet at 12? a: <receives b then c>: which is ok?

CS 268: IP Multicast Routing

Problem 7. Problem 8. Problem 9

CSE 124: LAMPORT AND VECTOR CLOCKS. George Porter October 30, 2017

Peer-to-peer Sender Authentication for . Vivek Pathak and Liviu Iftode Rutgers University

MYE017 Distributed Systems. Kostas Magoutis

Synchronisation and Coordination (Part 2)

The flow of data must not be allowed to overwhelm the receiver

Agreement in Distributed Systems CS 188 Distributed Systems February 19, 2015

Internetworking Terms. Internet Structure. Internet Structure. Chapter 15&16 Internetworking. Internetwork Structure & Terms

Distributed Systems Principles and Paradigms

CSE 486/586 Distributed Systems Reliable Multicast --- 1

CSE 486/586 Distributed Systems

Lecture 12: Time Distributed Systems

Distributed systems. Lecture 6: distributed transactions, elections, consensus and replication. Malte Schwarzkopf

Last Class: Naming. Today: Classical Problems in Distributed Systems. Naming. Time ordering and clock synchronization (today)

CSE 486/586 Distributed Systems

Parallel and Distributed Systems. Programming Models. Why Parallel or Distributed Computing? What is a parallel computer?

Computer Networking. Reliable Transport. Reliable Transport. Principles of reliable data transfer. Reliable data transfer. Elements of Procedure

Distributed Systems. coordination Johan Montelius ID2201. Distributed Systems ID2201

Distributed Systems Inter-Process Communication (IPC) in distributed systems

416 Distributed Systems. Networks review; Day 1 of 2 Jan 5 + 8, 2018

Failures, Elections, and Raft

Chapter 6 Synchronization (1)

Today: Fault Tolerance

Frequently asked questions from the previous class survey

Chapter 5: Distributed Systems: Fault Tolerance. Fall 2013 Jussi Kangasharju

CS455: Introduction to Distributed Systems [Spring 2018] Dept. Of Computer Science, Colorado State University

SYNCHRONIZATION. DISTRIBUTED SYSTEMS Principles and Paradigms. Second Edition. Chapter 6 ANDREW S. TANENBAUM MAARTEN VAN STEEN

System Models for Distributed Systems

Proseminar Distributed Systems Summer Semester Paxos algorithm. Stefan Resmerita

Distributed Systems COMP 212. Lecture 19 Othon Michail

CSE 486/586 Distributed Systems

Distributed Systems COMP 212. Lecture 17 Othon Michail


CS 162 Operating Systems and Systems Programming Professor: Anthony D. Joseph Spring Lecture 21: Network Protocols (and 2 Phase Commit)

Transcription:

Process groups and message ordering If processes belong to groups, certain algorithms can be used that depend on group properties membership create ( name ), kill ( name ) join ( name, process ), leave ( name, process ) internal structure? NO ( peer structure ) failure tolerant, complex protocols YES ( a single coordinator and point of failure ) simpler protocols e.g. all join requests must go to the coordinator concurrent joins avoided closed or open? OPEN a non-member can send a message to the group CLOSED only members can send to the group failures? a failed process leaves the group without executing leave robustness leave, join and failures happen during normal operation algorithms must be robust 1

Message delivery for a process group - assumptions ASSUMPTIONS messages are multicast to named process groups reliable channels: a given message is delivered reliably to all members of the group FIFO from a given source to a given destination processes don t crash (failure and restart not considered) processes behave as specified e.g. send the same values to all processes - we are not considering so-called Byzantine behaviour (when malicious or erroneous processes do not behave according to their specifications see Lamport s Byzantine Generals problem). 2

Ordering message delivery application process message service OS comms. interface may specify delivery order to message service e.g. arrival (no) order, total order, causal order may reorder delivery to application (on request for some order) by buffering messages assume FIFO from each source at this level (done by lower levels) total order = every process receives all messages in the same order (including its own). We first consider causal order 3

Message delivery causal order First, define causal order in terms of one-to-one messages; later, multicast to a process group application processes time P3 m m sent message m provably before sent message m' The above diagram shows a violation of causal delivery order Causal delivery order requires that, at P3, m is delivered before m' The definition relates to POTENTIAL causality, not application semantics DEFINITION of causal delivery order (where < means happened before ) send i ( m ) < send j ( m' ) => deliver k ( m ) < deliver k ( m') 4

Message delivery causal order for a process group If we know that all processes in a group receive all messages, the message delivery service can implement causal delivery order (for total order, see later) application processes, time P3 m m application process the message service can postpone the delivery of messages to the application process message service OS comms. interface 5

Message delivery causal order using Vector Clocks A vector clock is maintained by the message service at each node for each process: application processes 1,0,0 2,1,0 0,1,0 1,2,0 P3 1,0,1 1,1,2 vector notation: - fixed number of processes N - each process s message service keeps a vector of dimension N - for each process, each entry records the most up-to-date value of the state counter delivered to the application process, for the process at that position 6

Vector Clocks message service operation application processes 1,0,0 2,1,0 3,3,0 4,3,4 P3 0,1,0 1,2,0 1,3,0 1,4,4 1,0,1 1,1,2 1,3,3 1,3,4 Message service operation: before send increment local process state value in local vector on send, timestamp message with sending process s local vector on receive by message service see below on deliver to receiving application process, increment receiving process s state value in its local vector and update the other fields of the vector by comparing its values with the incoming vector (timestamp) and recording the higher value in each field, thus updating this process s knowledge of system state 7

Implementing causal order using Vector Clocks application processes 1,0,0 2,2,0 110 1,1,0 120 1,2,0 P3 000 0,0,0?? = message service P3 s vector is at (0,0,0) and a message with timestamp (1,2,0) arrives from i.e. has received a message from that P3 hasn t seen. More detail of P3 s message service: receiver vector sender sender vector decision new receiver vector 0,0,0 00 1,2,0 20 buffer 0,0,0 000 P3 is missing a message from that sender has already received 0,0,0 1,0,0 deliver 1,0,1 1,0,1 1,2,0 deliver 1,2,2 In each case: do the sender and receiver agree on the state of all other processes? If the sender has a higher state value for any of these others, the receiver is missing a message, so buffer the message 8

application processes Vector Clocks - example 1,0,0,0 2,0,2,0 P3 1,0,1,0 1,0,2,0? P4 1,0,0,1 1,0,2,2 sender sender vector receiver receiver vector decision new receiver vector P3 1,0,2,0 1,0,0,0 deliver -> 2,0,2,0 same states for and P4 P3 1020 1,0,2,0 P4 1001 1,0,0,1 deliver P4 -> 1022 1,0,2,2 same states for and P3 1,0,2,0 0,0,0,0 buffer -> 0,0,0,0 same state for P4, different for - missing message 1,0,0,0 0,0,0,0 deliver -> 1,1,0,0 reconsider buffered message: P3 1,0,2,0 1,1,0,0 deliver -> 1,2,2,0 same states t for and P4 9

Total order is not enforced by the vector clocks algorithm application processes 1,0,0,0 2,2,0,0 3,2,2,0 m1 1100 1,1,0,0 1200 1,2,0,0 1320 1,3,2,0 m2 P3 1,0,1,0 1,0,2,0 m3 1,2,3,0 P4 1,0,0,1 1,0,2,2 1,2,2,3 m2 and m3 are not causally related receives m1, m2, m3 receives m1, m2, m3 P3 receives m1, m3, m2 P4 receives m1, m3, m2 If the application requires total order this could be enforced by modifying the vector clock algorithm to include ACKs and delivery to self. 10

Totally ordered multicast The vectors can be a large overhead on message transmission and a simpler algorithm can be used if only total order is required. Recall the ASSUMPTIONS messages are multicast to named process groups reliable channels: a given message is delivered reliably to all members of the group FIFO from a given source to a given destination processes don t crash (failure and restart not considered) no Byzantine behaviour total order algorithm - sender multicasts to all including itself - all acknowledge receipt as a multicast message - message is delivered in timestamp order after all ACKs have been received If the delivery system must support both, so that applications i can choose, vector clocks can achieve both causal and total ordering. 11

Total ordered multicast outline of approach application processes 1 2 3 P3 ACKs 3 increments its clock to 1 and multicasts a message with timestamp 1 All delivery systems collect the message, multicast ACK and collect all ACKs - no contention deliver message to application processes and increment local clocks to 2. and P3 both multicast messages with timestamp 3 All delivery systems collect messages, multicast ACKs and collect ACKs. - contention so use a tie-breaker (e.g. lowest process ID wins) and deliver s message before P3s This is just a sketch of an approach. In practice, timeouts would have to be used to take account of long delays due to congestion and/or failure of components and/or communication links 12