Distributed Systems. 09. State Machine Replication & Virtual Synchrony. Paul Krzyzanowski. Rutgers University. Fall 2016.

State machine replication

State machine replication
We want high scalability and high availability. We achieve this via redundancy: replicated components take the place of ones that stop working.
- Active-passive: replicated components are standing by
- Active-active: replicated components are working
Replicated state machine:
- State machine = a program that takes inputs, produces outputs, and holds internal state (data)
- Replicated = run concurrently on several machines
If all replicas get the same set of inputs in the same order, they will perform the same computation and produce the same results.
To ensure correct execution & high availability:
- Each process must see & process the same inputs in the same sequence
- Obtain consensus at each state transition
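The determinism property above can be sketched in a few lines of Python. This is an illustrative toy (a tiny key-value store, not any particular toolkit's code): replicas that apply the same commands in the same order converge to identical state.

```python
class KVStateMachine:
    """A deterministic state machine: a tiny key-value store."""
    def __init__(self):
        self.state = {}

    def apply(self, command):
        op, key, value = command
        if op == "put":
            self.state[key] = value
        elif op == "delete":
            self.state.pop(key, None)

# Three replicas receive the identical, identically ordered input log
log = [("put", "x", 1), ("put", "y", 2), ("delete", "x", None)]
replicas = [KVStateMachine() for _ in range(3)]
for r in replicas:
    for cmd in log:
        r.apply(cmd)

# All replicas converge to the same state
assert all(r.state == {"y": 2} for r in replicas)
```

If the replicas received the commands in different orders (e.g., the delete before the put), their states would diverge, which is why consensus on the input sequence matters.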

State machine replication
Replicas = group of machines = process group.
- Load balancing: queries can go to any replica
- Fault tolerance: OK if some die; they all do the same thing
It is important for replicas to remain consistent: they need to receive the same messages, [usually] in the same order (at least for causally related messages).
What if one of the replicas dies?
- Then it does not get updates
- When it comes back up, it will be in a state prior to those updates
- Not good: applying new updates to that stale state will leave it inconsistent

Faults
Faults may be:
- Fail-silent: the system does not communicate
- Fail-stop: a fail-silent system that remains silent (it does not recover)
- Fail-recover: a fail-silent system that comes back online
- Byzantine: the system communicates, but with bad data
Synchronous vs. asynchronous systems:
- Synchronous = the system responds to a message within a bounded time
- Asynchronous = no assurance of when a message will arrive
- E.g., serial port transmission versus an IP packet: an IP network is asynchronous
In a distributed system, we assume processes are concurrent, asynchronous, and failure-prone.

Agreement in faulty systems
The two-army problem: good processors, but faulty communication lines. The goal is a coordinated attack, but confirming agreement requires an acknowledgement of every acknowledgement, an infinite regress: no finite exchange of messages can guarantee agreement.

Agreement in faulty systems
It is impossible to achieve consensus with asynchronous faulty processes: there is no foolproof way to check whether a process has failed or is alive but not communicating (or not communicating quickly enough).
We have to live with this:
- We cannot reliably detect a failed process
- Moreover, the system might recover
But we can propagate the knowledge that we think it failed:
- Take it out of the group (even if it is alive)
- If it recovers, it will have to re-join

Virtual Synchrony

Virtual Synchrony is a software model
A model for group management and group communication:
- A process can join or leave a group
- A process can send a message to a group
- Message ordering requirements are defined by the programmer
Atomic multicast: a message is either delivered to all processes in the group or to none.

Group View
Group view = the set of processes currently in the group.
- A multicast message is associated with a group view
- Every process in the group should have the same group view
- When a process joins or leaves the group, the group view changes
View change = a multicast message announcing the joining or leaving of a process.
Timeouts lead to failure detection; the resulting group membership change removes the dead member from the group.

Events
Group members receive three types of events:
1. New message received
2. View change: a group membership change
3. Checkpoint request: dump the state of your system so that a new process can read it
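These three event types map naturally onto a callback interface. The sketch below uses hypothetical method names (not any real toolkit's API) to show one plausible shape for a group member:

```python
class GroupMember:
    """Hypothetical sketch of a virtually synchronous group member."""
    def __init__(self, name):
        self.name = name
        self.log = []            # application state: delivered messages
        self.view = set()        # current group view

    def on_message(self, msg):            # 1. new message received
        self.log.append(msg)

    def on_view_change(self, new_view):   # 2. group membership changed
        self.view = set(new_view)

    def on_checkpoint_request(self):      # 3. dump state for a joining process
        return list(self.log)

# A member handles a view change, a message, and a checkpoint request
p = GroupMember("p")
p.on_view_change({"p", "q"})
p.on_message("m1")
assert p.on_checkpoint_request() == ["m1"]
```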

View Changes & Virtual Synchrony
[Figure: timeline of processes p, q, r, s, t (time 0 to 70) with successive view changes: G={p} → G={p, q} → G={p, q, r, s, t} → G={r, s, t}]

A view change is a barrier
What if a message is being multicast during a view change? Then two multicast messages are in transit at the same time: the view change (vc) and a message (m).
We need to guarantee all-or-nothing semantics:
- m is delivered to all processes in G before any process is delivered vc, OR
- m is not delivered to any process in G
Reliable multicasts with this property are virtually synchronous: all multicasts take place between view changes, so a view change acts as a barrier.
(Recall the distinction between receiving a message and delivering it to the application.)

Virtual Synchrony: implementation example
The ISIS toolkit: a fault-tolerant distributed system offering virtual synchrony.
- Achieves high update & membership event rates: hundreds of thousands of events per second on commodity hardware (as of 2009)
- Provides distributed consistency:
  - Applications can create & join groups and send multicasts
  - Applications will see the same events in an equivalent order
  - Group members can update group state in a consistent, fault-tolerant manner
Who uses it? The New York Stock Exchange, the Swiss Exchange, the US Navy's AEGIS, etc.
Similar models: Microsoft's scalable cluster service, IBM's DCS system, CORBA, Apache ZooKeeper (a configuration, synchronization, and naming service).

Implementation: Goals
- Message transmission is asynchronous (e.g., IP), so machines may receive messages in different orders
- Virtual synchrony preserves the illusion that events happen in the same order
- Uses TCP: reliable point-to-point message delivery
- Multicasting is implemented by sending a message to each group member
  - No guarantee that ALL group members receive the message: the sender may fail before transmission ends
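The last point can be illustrated with a toy sketch (names are illustrative, not a real transport API): multicast is just a loop of point-to-point sends, so if the sender fails mid-loop, only a prefix of the group receives the message.

```python
def multicast(members, msg, fail_after=None):
    """Send msg to each member in turn via point-to-point sends (as over TCP).

    fail_after simulates the sender crashing after that many sends;
    returns the list of members that actually received msg.
    """
    received = []
    for i, member in enumerate(members):
        if fail_after is not None and i >= fail_after:
            break                      # sender crashed mid-multicast
        received.append(member)        # reliable point-to-point delivery
    return received

members = ["p", "q", "r", "s"]
assert multicast(members, "m") == ["p", "q", "r", "s"]       # all receive
assert multicast(members, "m", fail_after=2) == ["p", "q"]   # partial delivery
```

This partial-delivery possibility is exactly what the stable-message and flush machinery on the following slides exists to repair.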

Implementation: Group Management
Group Membership Service (GMS): a failure detection service.
If a process p reports a process q as faulty:
- The GMS reports this to every process with a connection to q
- q is taken out of the process group and will need to re-join
The GMS imposes a consistent view of membership on all members.

Implementation: State Transfer
When a new member joins a group, it will need to import the current state of the group.
State transfer:
- Contact an existing member to request a state transfer (a checkpoint request)
- Initialize the new member (replica) to that checkpoint state
Important: enforce the group-view barrier.
- A state transfer is treated as an instantaneous event
- Guarantee that all messages sent to view G_i are delivered to all non-faulty processes in G_i before the next view change (to G_{i+1})
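A state transfer can be sketched as copying an existing member's checkpoint into the newcomer. This is an illustrative toy (not ISIS code); the class and method names are hypothetical:

```python
class Replica:
    """Toy replica whose state is a dictionary."""
    def __init__(self, state=None):
        self.state = dict(state or {})

    def checkpoint(self):
        # Snapshot of current state; treated as an instantaneous event
        return dict(self.state)

    def apply(self, key, value):
        self.state[key] = value

existing = Replica()
existing.apply("x", 1)
existing.apply("y", 2)

# A new member joins: initialize it from an existing member's checkpoint
newcomer = Replica(state=existing.checkpoint())
assert newcomer.state == existing.state
```

Because the transfer happens at the view-change barrier, the newcomer starts from a state that already reflects every message delivered in the previous view.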

Ensuring all messages are received
All messages sent to G_i must be delivered to all non-faulty processes before the view change to G_{i+1}.
But what if the sender failed?
- Each process stores a message until it knows that all members have received it
- At that point, the message is stable

View Change
Stable message = received (acknowledged) by all group members. Every process holds a message until it knows that it has been received by the whole group.
[Figure: during a view change, process P sends its unstable messages to the group members; each process then sends a flush message to the group.]
The view change is complete when each process has received a flush message from every other process in the group.

View change summary
Every process will:
1. Send any unstable messages to all group members
2. Wait for acknowledgements
3. Deliver any received messages that are not duplicates
4. Send a flush message to the group
5. Wait until it receives a flush message from every member of the group
Benefits:
- No need for a single master that propagates its updates to replicas
- Not transactional: not limited to one-at-a-time processing
- Message ordering is generally causal within a view, which is more efficient than imposing a total ordering
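The flush steps above can be sketched as a toy protocol (all class and method names are hypothetical): unstable messages are re-sent to everyone, and a process considers the view change complete once it holds a flush from every other member.

```python
class FlushMember:
    """Toy member participating in a view-change flush."""
    def __init__(self, name, group):
        self.name = name
        self.group = set(group)
        self.delivered = set()   # deduplicated delivered messages
        self.flushes = set()     # peers we have a flush message from

    def receive(self, msg):
        self.delivered.add(msg)  # set membership ignores duplicates

    def flush_from(self, peer):
        self.flushes.add(peer)

    def view_change_done(self):
        # Complete once flushed by every OTHER member of the group
        return self.flushes >= (self.group - {self.name})

group = {"p", "q", "r"}
members = {n: FlushMember(n, group) for n in group}

# Unstable message m1 is re-sent to all members (duplicates are harmless)
for m in members.values():
    m.receive("m1")
    m.receive("m1")

# Every process sends a flush to every other process
for receiver in members.values():
    for sender in group:
        if sender != receiver.name:
            receiver.flush_from(sender)

assert all(m.view_change_done() for m in members.values())
assert all(m.delivered == {"m1"} for m in members.values())
```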

The End