Replicated State Machine in Wide-area Networks


1 Replicated State Machine in Wide-area Networks. Yanhua Mao, CSE 223A, WI09

2 Building a replicated state machine with consensus. A general approach to replicating stateful deterministic services that provides strong consistency, e.g., a fault-tolerant file system for LANs. The servers agree on the same sequence of commands to execute, using a consensus protocol (e.g., Paxos or Fast Paxos) to reach agreement. [Figure: clients send commands x, y, z over the network; every server learns and executes the same sequence x, y, z.]
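To make the idea concrete, here is a minimal sketch (illustrative Python, not from the slides) of why agreeing on the command sequence suffices: replicas of a deterministic state machine that apply the same agreed log end in the same state.

```python
# Minimal sketch: a deterministic state machine stays consistent across
# replicas as long as all replicas apply the same agreed command sequence.
class RegisterStateMachine:
    """Deterministic service: a single read/write register."""
    def __init__(self):
        self.value = None

    def apply(self, cmd):
        op, arg = cmd
        if op == "write":
            self.value = arg
        return self.value

# Consensus (e.g., Paxos) guarantees every replica learns this same sequence.
agreed_log = [("write", "x"), ("write", "y"), ("write", "z")]

replicas = [RegisterStateMachine() for _ in range(3)]
for sm in replicas:
    for cmd in agreed_log:
        sm.apply(cmd)

assert all(sm.value == "z" for sm in replicas)  # identical final state
```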

3 Why wide-area? Reliability and availability: to cope with co-location failures (fire, etc.). Consistency: to coordinate wide-area applications, e.g., Yahoo's ZooKeeper and Google's Chubby. Are existing protocols efficient in the wide area? If not, can we improve them?

4 More on wide-area. Wide-area is more complex than simply "not local-area": both the servers and the clients may be distributed.

Clients \ Servers    LAN                            WAN
LAN                  (L, L): local-area system      (L, W)
WAN                  (W, L): cluster-based system   (W, W): grid, SOA, mobile

5 Paxos vs. Fast Paxos in the (W, L) topology. All servers are in the same LAN; clients talk to the servers via the WAN. Sources of asynchrony: the wide-area network and the client operating system. Observation: delays have high variance, and WAN delay dominates. [Figure: clients reach the LAN of servers over the WAN.]

6 Consensus speed. Theoretically: count the number of message delays to reach consensus, assuming each message takes the same delay. Practically: message delay varies greatly; a WAN message delay may be hundreds of times larger than a LAN message delay. [Figure: a client sends cmd to three replicas and receives results.]

7 Paxos analysis. [Figure: the client sends cmd over the WAN to the leader; the leader runs phases 1a/1b/2a among the LAN replicas and returns the result.] Wide-area delay for Paxos: the single message from the client to the leader.

8 Fast Paxos analysis. [Figure: clients send cmd over the WAN directly to all replicas; the leader has pre-established phases 1a/1b and an ANY 2a message, so replicas accept client commands directly.] Quorum size: 2f+1, so 2f+1 replicas must receive the cmd from the client to make progress. Wide-area delay (f=1): the third fastest message from the client to the replicas.

9 Classic Paxos vs. Fast Paxos in (W, L):

                     Paxos    Fast Paxos
Learning delay       3        2
Number of replicas   2f+1     3f+1
Quorum size          f+1      2f+1
Collision            No       Yes

Question: is Fast Paxos always faster in practice in collision-free runs?

10 Analysis result. Assumptions: WAN message delay dominates, and WAN message delays are IID and follow a continuous distribution. Result: Classic Paxos has a fixed probability (60%) of being faster for f=1, and does even better with larger f. Intuition: the WAN dominates and has high variance.

11 Quick proof. Given a wide-area latency distribution, draw 5 IID variables from it: A < B < C < D (sorted) for Fast Paxos and E for Paxos. E's relative order to A-D gives 5 equally likely cases. Fast Paxos finishes at C (its third fastest message); Paxos finishes at E. In the 3 cases where E falls below C, Paxos is faster; in the other 2, Fast Paxos is faster. Hence Paxos wins with probability 3/5 = 60%.
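The argument is easy to check numerically. Below is a small Monte Carlo sketch (illustrative; the exponential distribution is an arbitrary choice, since for any continuous IID distribution only E's rank matters): Fast Paxos with f=1 waits for the third fastest of four client-to-replica messages, Paxos for the single client-to-leader message.

```python
import random

# Draw 5 IID WAN delays per trial. Fast Paxos (f=1) needs the 3rd fastest
# of the 4 client->replica messages (A<B<C<D -> C); Paxos needs the single
# client->leader message E. Paxos wins iff E < C, i.e., iff E ranks among
# the 3 smallest of the 5 draws: probability 3/5.
def paxos_wins_once():
    a, b, c, d = sorted(random.expovariate(1.0) for _ in range(4))
    e = random.expovariate(1.0)
    return e < c  # c is Fast Paxos's third fastest message

trials = 200_000
wins = sum(paxos_wins_once() for _ in range(trials))
print(f"P(Paxos faster) ~ {wins / trials:.3f}")  # ~0.600
```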

12 Summary. Protocol performance is sensitive to network settings: Fast Paxos is better in (L, L); Paxos is better in (W, L) and (W, W). How to design a protocol that has the best of both across a wide range of network settings? Run Paxos and Fast Paxos concurrently, or switch between them when network conditions change.

13 High-performance replicated state machines for (W, W) systems. Model: a small number n of sites interconnected by an asynchronous wide-area network; up to f of the n servers can fail by crashing. System load: possibly variable and changing load from each site; network-bound (large requests) or CPU-bound (small requests). Goals: low latency under low client load, high throughput under high client load.

14 Paxos in steady state. [Figure: P0 is the leader. Instance 0: client 0 sends x to P0, which proposes x; replicas accept and learn, and client 0 receives x. Instance 1: client 2 sends z to P2, which forwards z to P0 to propose; replicas accept and learn, and client 2 receives z.] Pros: simple, low message complexity (3n-3), low latency at the leader (2 steps). Cons: high latency at non-leader replicas (4 steps); not load balanced: the bottleneck is the leader or the leader's links.

15 Fast Paxos in steady state. [Figure: four replicas P0-P3. Instance 0: P0 proposes x directly; all replicas accept and P0 learns x. Instance 1: P3 proposes z directly and learns z.] Pros: load balanced; low latency at all servers under no contention (2 steps). Cons: collisions add latency under contention; high message complexity (n²-1); more replicas needed (3f+1).

16 Paxos vs. Fast Paxos in a (W, W) system:

                       Paxos                      Fast Paxos             New alg?
WAN learning latency   leader 2, non-leader 4     2 (when no collision)  2
Collision              No                         Yes                    No
Number of replicas     2f+1                       3f+1                   2f+1
WAN msg complexity     leader 3n-3,               n²-1                   3n-3
                       non-leader 3n-2
Load balancing         No                         Yes                    Yes

Neither algorithm is perfect, but Paxos is better. Question: can we achieve the best of both with a new algorithm?
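As a quick arithmetic check of the message-complexity row (a sketch; the formulas are the table's own):

```python
# Per-value WAN message counts from the table above, as functions of the
# number of servers n (leader-proposed value for Paxos).
def paxos_wan_msgs(n):
    return 3 * n - 3        # propose, accept, learn among n servers

def fast_paxos_wan_msgs(n):
    return n * n - 1        # every server exchanges with every other

for n in (3, 4, 5):
    print(n, paxos_wan_msgs(n), fast_paxos_wan_msgs(n))
# n=5: 12 messages for Paxos vs. 24 for Fast Paxos; the gap widens with n.
```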

17 Deriving Mencius from Paxos. Approach: a rotating leader, a variant of consensus, and optimizations. Advantages: safety is easy to ensure when deriving from an existing protocol, and the design is flexible: others may derive their own protocols the same way.

18 First step: rotating the leader. Each consensus instance is assigned to a coordinator server; the coordinator is the default leader of that instance, using a simple, well-known assignment (e.g., round-robin). A server proposes client requests immediately to the next available instance it coordinates, so it never has to wait for other servers: this reduces latency. A server only proposes client requests to instances it coordinates, so there is no contention.

19 Mencius Rule 1. A server p maintains its index I_p, i.e., the next consensus instance it coordinates. Rule 1: upon receiving a client request v, p immediately proposes v to instance I_p and updates its index accordingly.
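A minimal sketch of Rule 1 (illustrative Python with a round-robin assignment; the class and method names are hypothetical, not the paper's code):

```python
# Round-robin assignment: server p coordinates instances p, p+n, p+2n, ...
class MenciusServer:
    def __init__(self, server_id, n_servers):
        self.id = server_id
        self.n = n_servers
        self.index = server_id  # I_p: next instance this server coordinates

    def on_client_request(self, v):
        """Rule 1: immediately suggest v to instance I_p, then advance I_p."""
        instance = self.index
        self.index += self.n    # next instance coordinated by this server
        return ("suggest", instance, v)  # phase-2a message to all replicas

p = MenciusServer(server_id=1, n_servers=3)
print(p.on_client_request("x"))  # ('suggest', 1, 'x')
print(p.on_client_request("y"))  # ('suggest', 4, 'y')
```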

20 Benefits of rotating the leader. All servers can now propose requests directly: potentially low latency at all sites. Load balancing at the servers: higher throughput under CPU-bound client load. Balanced communication pattern: higher throughput under network-bound load. [Figure: in Paxos, non-leaders only forward and accept while the leader proposes and learns; in Mencius, every server proposes, accepts, and learns.]

21 Ensuring liveness when servers have different loads. Rule 1 only works well when all servers have the same load, but servers may observe different client loads. Servers can skip their turns (propose no-ops). [Figure: without skipping, requests x, y, z stall behind an idle server's unused turns; with skipping, those turns are filled with no-ops and x, y, z commit promptly.]

22 Mencius Rule 2. Rule 2: if server p receives a propose message with some value v other than no-op for instance i, with i > I_p, then before accepting v and sending back an accept message, p updates its index to some I_p' > i and proposes no-ops for each instance in the range [I_p, I_p') that p coordinates, where I_p' is p's new index.
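A standalone sketch of Rule 2's skip logic under the same round-robin assignment (the function shape and message tuples are illustrative, not the paper's code):

```python
# When server p_id (of n) receives a suggestion with value v != no-op for
# instance i ahead of its index, it first skips (proposes no-op for) every
# instance it coordinates in [index, new_index), then accepts v.
def on_suggest(p_id, n, index, i, v):
    """Returns (new_index, messages) for the receiving server."""
    msgs = []
    if v != "no-op" and i > index:
        # New index: smallest instance > i that p_id coordinates.
        delta = (p_id - i) % n
        new_index = i + (delta if delta else n)
        msgs += [("skip", j) for j in range(index, new_index, n)]
        index = new_index
    msgs.append(("accept", i, v))
    return index, msgs

# Idle server 0 of 3 (index 0) learns server 2 suggested "z" to instance 2:
print(on_suggest(0, 3, 0, 2, "z"))  # (3, [('skip', 0), ('accept', 2, 'z')])
```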

23 Proposing no-ops is costly. Consider the case where only one server proposes requests: it takes 3n(n-1) messages to get one value v chosen, 3(n-1) for choosing v plus 3(n-1)² for the n-1 no-ops. Solution: use a simpler abstraction than consensus.

24 Simple consensus for efficiency. In simple consensus, the coordinator can propose either a client request or a no-op; non-coordinators can propose only no-op. Benefits: a no-op can be learned in one message delay when the coordinator skips (proposes a no-op), and no-ops are easy to piggyback, so skipping has essentially no extra cost.

25 Coordinated Paxos. Starting state: the coordinator is the default leader, starting from a state as if phase 1 were already done by the coordinator. Suggest: the coordinator proposes a request by running phase 2. Skip: the coordinator proposes a no-op, with fast learning for skips. Revoke: another replica runs the full phases 1 and 2 to propose a no-op; this is needed only when the coordinator is suspected to have failed.

26 Skips with simple consensus. [Figure: Replica 0 suggests x and y via 2a/3a messages; Replicas 1 and 2 broadcast explicit skip messages for their turns, so every replica learns the sequence x, no-op, no-op, y.] Pros: a simple and correct protocol. Cons: high message complexity: (n-1)(n+2).

27 Reducing message complexity. [Figure: instead of separate broadcasts, skips ride on the existing 2a/3a exchange (+skip); Replica 0's view is still x, no-op, no-op, y.] Idea: piggyback skips onto existing messages; skips can ride on future 2a messages too, so they no longer need to be broadcast.

28 Mencius Optimizations 1 & 2. When a server p receives a suggest message from server q for instance i, let r be a server other than p or q. Optimization 1: p does not send a separate skip message to q; instead, p uses the accept message that replies to the suggest to promise not to suggest any client request to an instance smaller than i in the future. Optimization 2: p does not send a skip message to r immediately; instead, p waits for a future suggest message from p to r to indicate that p has promised not to suggest any client requests to instances smaller than i.
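A sketch of what these optimizations amount to on the wire (the message format here is hypothetical): the skip information collapses to one piggybacked field, the sender's current index, carried on messages that are sent anyway.

```python
# "promised_below: k" means "I will never suggest a client request to any
# instance smaller than k", which implicitly skips all of the sender's
# coordinated instances below k.
def make_accept(sender, instance, value, index):
    # Optimization 1: the accept replying to a suggest carries the promise.
    return {"type": "accept", "from": sender, "instance": instance,
            "value": value, "promised_below": index}

def make_suggest(sender, instance, value, index):
    # Optimization 2: skips owed to third parties ride on future suggests.
    return {"type": "suggest", "from": sender, "instance": instance,
            "value": value, "promised_below": index}
```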

29 Gaps at idle replicas. A potentially unbounded number of requests can wait to be committed, but this only happens to idle replicas: once the replica sends requests, the pending skips are resolved. Idle replicas can also send skips periodically to accelerate resolution. [Figure: each replica's log fills the skipped turns with no-ops between requests x, y, z.]

30 Mencius Accelerator 1. When a server p receives a suggest message from server q, let r be a server other than p or q. Accelerator 1: p propagates skip messages to r if the total number of outstanding skip messages for r exceeds some constant α, or a skip message has been deferred for more than some time τ.
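A sketch of Accelerator 1's bookkeeping (the class is illustrative; only the constants α and τ come from the slide, spelled alpha and tau below, and their default values are arbitrary):

```python
import time

# Deferred skips for one destination r: flush once too many accumulate
# (> alpha) or the oldest has waited too long (> tau seconds).
class SkipBuffer:
    def __init__(self, alpha=20, tau=0.05):
        self.alpha, self.tau = alpha, tau
        self.pending = []      # deferred skip instance numbers for r
        self.oldest = None     # time the first pending skip was deferred

    def defer(self, instance):
        if not self.pending:
            self.oldest = time.monotonic()
        self.pending.append(instance)

    def should_flush(self):
        return (len(self.pending) > self.alpha or
                (self.pending and time.monotonic() - self.oldest > self.tau))

    def flush(self):
        out, self.pending, self.oldest = self.pending, [], None
        return [("skip", i) for i in out]
```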

31 Failures. Faulty processes cannot skip, leaving gaps in the request sequence. [Figure: Replica 2 has crashed; Replicas 0 and 1 learn x, no-op, y, but the instances coordinated by Replica 2 remain unresolved.]

32 Mencius Rule 3. Rule 3: let q be a server that another server p suspects has failed, and let C_q be the smallest instance that is coordinated by q and not learned by p; p then revokes q for all instances in the range [C_q, I_p] that q coordinates.

33 Imperfect failure detectors. A false suspicion may result in no-op being chosen even if the coordinator has not crashed and has proposed some value v. Solution: re-suggest v when the coordinator learns the no-op. What if a false suspicion happens again? Keep trying? ◊W (eventually weak failure detection) is not enough to provide liveness, since one server can be permanently falsely suspected; so we require ◊P (eventually perfect) instead. Eventually there are no false suspicions, hence ◊P guarantees eventual liveness.

34 Mencius Rule 4. Rule 4: if server p suggests a value v other than no-op to instance i, and p learns that no-op was chosen for i, then p suggests v again.
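A minimal sketch of Rule 4 (the bookkeeping attribute and helper names are hypothetical): the server remembers what it suggested to each instance and re-suggests when a revocation replaced its value with no-op.

```python
# Hypothetical bookkeeping: server.my_suggestions maps instance -> value
# this server suggested there; server.on_client_request is the Rule 1
# action (suggest to the next coordinated instance).
def on_learn(server, instance, chosen):
    """Called when `server` learns the chosen value for `instance`."""
    v = server.my_suggestions.pop(instance, None)
    if chosen == "no-op" and v is not None:
        # A revocation (false suspicion) overrode our suggestion: try again.
        server.on_client_request(v)
```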

35 Dealing with failure. Revoke: propose no-op on behalf of the faulty process. Problem: the full three phases of Paxos are costly. Solution: revoke a large block of instances at once. [Figure, animated over slides 35-39: the live servers keep committing x and y while the crashed server's instances are revoked in one large block.]


40 Mencius failure recovery. P2 may come back after failure recovery or a false suspicion. It finds the next available slot it coordinates and starts proposing requests to that slot. [Figure: P2 recovers and proposes request z to its next available slot.]

41 Mencius failure recovery, continued. The other servers catch up with P2 by skipping. [Figure: P0 and P1 skip their turns upon receiving the request z from P2.]

42 Mencius Optimization 3. Optimization 3: let q be a server that another server p suspects to have failed, and let C_q be the smallest instance that is coordinated by q and not learned by p. For some constant β, p revokes q for all instances in the range [C_q, I_p + 2β] that q coordinates, provided C_q < I_p + β.
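A sketch of Optimization 3's revocation range (illustrative; β is the slide's constant, spelled beta below, and the helper is hypothetical): rather than revoking up to I_p on every suspicion, p revokes a large block ahead of its index, amortizing the three Paxos phases.

```python
# Instances coordinated by q are those congruent to q_id mod n (round-robin).
def instances_to_revoke(c_q, i_p, q_id, n, beta):
    """Instances of suspected server q that p should revoke now."""
    if c_q >= i_p + beta:            # q is not holding p back yet: wait
        return []
    first = c_q + ((q_id - c_q) % n)  # smallest instance >= C_q owned by q
    return list(range(first, i_p + 2 * beta + 1, n))

print(instances_to_revoke(c_q=2, i_p=10, q_id=2, n=3, beta=5))
# [2, 5, 8, 11, 14, 17, 20]: one revocation covers the whole block.
```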

43 Mencius summary. Rule 1: suggest a client request immediately to the next consensus instance the server coordinates. Rule 2: update the index and skip unused instances when accepting a value. Rule 3: revoke suspected servers. Rule 4: re-suggest a client request after a false suspicion.

44 Mencius summary, continued. Optimization 1: piggyback skips onto accept messages. Optimization 2: piggyback skips onto future suggest (2a) messages. Accelerator 1: bound how many skips can be outstanding, and for how long, before flushing them. Optimization 3: issue revocations in large blocks to amortize the cost.

45 Mencius vs. Paxos and Fast Paxos in a (W, W) system:

                       Paxos                      Fast Paxos             Mencius
WAN learning latency   leader 2, non-leader 4     2 (when no collision)  2*
Collision              No                         Yes                    No
Number of replicas     2f+1                       3f+1                   2f+1
WAN msg complexity     leader 3n-3,               n²-1                   3n-3
                       non-leader 3n-2
Load balancing         No                         Yes                    Yes

Mencius achieves the best of both Paxos and Fast Paxos.

46 Delayed commit. [Figure: P0 suggests x to instance 0, then learns and commits x and later y. P2 suggests y to instance 2 and learns y, but must first learn instances 0 and 1 (x and no-op) before it can commit x and y.] Delayed commit adds up to one round trip of extra delay. Out-of-order commit reduces the extra delay when requests are commutable, a benefit of using simple consensus.
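A sketch of the out-of-order commit idea (illustrative; the register service matches the experimental setup below, but the function names are hypothetical): with simple consensus, an unresolved instance can only become the coordinator's suggested value or no-op, so a later learned request may commit early if it commutes with every such candidate.

```python
# Requests to different registers commute; a still-unknown slot whose only
# candidates are a known suggestion or no-op (None) is easy to check.
def commutes(a, b):
    return a is None or b is None or a["reg"] != b["reg"]

def can_commit_early(request, earlier_candidates):
    """earlier_candidates: suggested value (or None) per unresolved slot."""
    return all(commutes(request, c) for c in earlier_candidates)

y = {"reg": "r2", "op": "write"}
x = {"reg": "r1", "op": "write"}   # accepted but not yet learned as chosen
print(can_commit_early(y, [x]))    # True: x and y touch different registers
```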

47 Experimental setup: Mencius vs. Paxos. A clique topology over three sites (50 ms, 5~20 Mbps links). A simple replicated register service with k total registers (Mencius-k) and ρ bytes of dummy payload: ρ = 4000 is network-bound, ρ = 0 is CPU-bound. Out-of-order commit is an option. Clients send requests at a fixed rate.

48 Latency vs. system load (network-bound load, ρ = 4000, 20 Mbps for all links). [Figure: latency (ms) vs. throughput (ops) for Mencius, Mencius-1024, Mencius-128, Mencius-16, and Paxos.] Takeaway: Mencius has good latency. Latency goes to infinity when approaching maximum throughput; Mencius's latency is as good as Paxos's under high contention; enabling out-of-order commit reduces latency by up to 30%.

49 Throughput. [Figure: throughput (ops) for Mencius, Mencius-1024, Mencius-128, Mencius-16, and Paxos under ρ = 4000 (network-bound) and ρ = 0 (CPU-bound).] Throughput improves when network-bound, and by up to 50% when CPU-bound. Enabling out-of-order commit hurts throughput. Takeaway: Mencius has good throughput.

50 Throughput under failure. [Figure: throughput (ops) over time (sec) for Mencius, Paxos with a crashed follower, and Paxos with a crashed leader; markers show when the failure is suspected and when it is reported.] Mencius stalls temporarily when any server fails (caused by delayed commits), but its throughput stays higher than Paxos's; Paxos stalls temporarily when the leader fails (caused by duplicates).

51 Fault scalability. Star topology: a 10 Mbps link from each site to the central node. [Figure: throughput (ops) of Mencius and Paxos with 3, 5, and 7 sites, under ρ = 4000 (network-bound) and ρ = 0 (CPU-bound).] Takeaway: Mencius scales better than Paxos.

52 More evaluation results. Lower latency at all servers under low contention, even without out-of-order commit. Adaptable to available bandwidth: adaptation happens automatically, making it easy to take advantage of all available bandwidth.

53 Summary. Mencius is a rotating-leader Paxos: deriving it from Paxos makes safety easy to ensure and the design flexible, and simple consensus is the foundation of its efficiency. High performance: high throughput under high load; low latency under low load; low latency under medium to high load when out-of-order commit is enabled; better load balancing; better scalability.

54 References. Flavio Junqueira, Yanhua Mao, Keith Marzullo, "Classic Paxos vs. Fast Paxos: Caveat Emptor," HotDep 2007, Edinburgh, UK. Yanhua Mao, Flavio Junqueira, Keith Marzullo, "Mencius: Building Efficient Replicated State Machines for WANs," OSDI 2008, San Diego, CA, USA.
