HP: Hybrid Paxos for WANs
Transcription
1 HP: Hybrid Paxos for WANs. Dan Dobre, Matthias Majuntke, Marco Serafini and Neeraj Suri, TU Darmstadt, Germany. Neeraj Suri, EU-NSF ICT, March 2006. Dependable Embedded Systems & SW Group.
2 Safety-Critical Systems: resilience against catastrophic failures. State Machine Replication (SMR): Resilience of Critical Services; illusion of a single server that never fails. Wide-Area Replication: large and unpredictable delays in WANs; latency-optimal protocol. [Figure: clients send requests to a single server and get no reply; with SMR, clients send requests to n ≥ 2t+1 replicas and get a reply.] EDCC, Valencia, May 18, 2010, Matthias Majuntke.
3 Which Consensus Protocol? State Machine Replication (SMR): clients propose commands to replicas; agreement on the sequence of commands; replicas are in a consistent state when executing the command sequence; a consensus protocol is needed. Latency-optimal protocols. Latency: number of message delays between when a client proposes a command and when the command is learned by a learner (to be executed). Two protocols by Lamport. Classic Paxos (CP): 3 message delays (during normal operation); majority quorum for recovery. Fast Paxos (FP): 2 message delays (during normal operation); more message delays in presence of collisions; larger quorum for recovery. [Figure: CP message flow between client, leader and acceptors; FP message flow between client and acceptors.]
4 Paxos vs. Fast Paxos: Latency Compared. PlanetLab experiments: simulation of the CP and FP message patterns (different topologies). FP is not always faster than CP. Some clients prefer CP, some FP. A single crash can turn the setting.
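A minimal sketch of why FP is not always faster than CP, assuming symmetric one-way delays, no collisions, and fixed quorum sizes passed in as parameters. The function names, delay values and quorum sizes are illustrative assumptions, not taken from the PlanetLab experiments.

```python
def kth_smallest(values, k):
    """Time at which the k-th reply has arrived (k is 1-based)."""
    return sorted(values)[k - 1]

def cp_latency(client_to_leader, leader_to_acc, acc_to_client, quorum):
    """CP: propose -> 2a -> 2b, learned after a classic quorum of 2b messages."""
    arrivals = [client_to_leader + l2a + a2c
                for l2a, a2c in zip(leader_to_acc, acc_to_client)]
    return kth_smallest(arrivals, quorum)

def fp_latency(client_to_acc, acc_to_client, fast_quorum):
    """FP: propose -> 2bfast, learned after a fast quorum of 2bfast messages."""
    arrivals = [c2a + a2c for c2a, a2c in zip(client_to_acc, acc_to_client)]
    return kth_smallest(arrivals, fast_quorum)

# Hypothetical topology: 3 acceptors, acceptor 0 is the leader.
# Assumed quorum sizes: classic quorum 2, fast quorum 3.
leader_to_acc = [0, 40, 80]   # one-way delays in ms (made up)

client_a = [5, 40, 80]        # client A sits next to the leader
client_b = [60, 50, 50]       # client B is roughly equidistant from all acceptors

cp_a = cp_latency(client_a[0], leader_to_acc, client_a, quorum=2)   # 85
fp_a = fp_latency(client_a, client_a, fast_quorum=3)                # 160
cp_b = cp_latency(client_b[0], leader_to_acc, client_b, quorum=2)   # 150
fp_b = fp_latency(client_b, client_b, fast_quorum=3)                # 120
```

In this made-up topology client A is better served by CP and client B by FP, because the larger fast quorum forces a client to wait for its farthest acceptor.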
5 Motivation for a Hybrid Protocol. No clear winner between CP and FP with respect to latency. Hybrid protocol: Hybrid Paxos (HP). Runs CP and FP in parallel; chooses the quickest outcome of the two protocols. Implements Generalized Consensus: commuting commands may be chosen in any order. Does not negatively affect throughput: FP mode is switched off when not beneficial.
6 Outline of the Talk: Contribution; System Model; Background on Paxos and Generalized Consensus; Hybrid Paxos protocol; Evaluation; Discussion; Conclusion.
7 Contribution. Hybrid Paxos (HP): CP with an additional fast mode. Fast learning in absence of collisions; 3 message delays, as in CP, in presence of collisions. Latency optimal. 2f+1 servers, f may crash (optimal). Linear number of messages (optimal). First efficient implementation of Generalized Consensus. Experiments using Emulab: HP reaches the theoretical minimum of latency; HP does not negatively affect throughput.
8 System Model. Distributed system: n servers; any number of clients (may crash); communication via reliable FIFO channels. Crash-stop model: at most a minority of servers fails (n ≥ 2f+1), f = number of crashes. Asynchrony; Ω failure detector (eventually outputs the same correct leader). Generalized Consensus: a command history is an equivalence class of command sequences. Sequences c1 and c2 are equivalent iff executing them produces the same outputs and state (commuting commands).
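The equivalence of command sequences can be illustrated with the deposit/withdraw banking service used in the evaluation. The sketch below is an assumption-laden toy: the command encoding `(id, operation, amount)`, the output strings and the single-account state are invented for illustration, and equivalence is checked from one given initial balance only rather than for all states.

```python
def execute(sequence, balance=0):
    """Run (cmd_id, op, amount) commands; return (outputs per command, final balance)."""
    outputs = {}
    for cmd_id, op, amount in sequence:
        if op == "deposit":
            balance += amount
            outputs[cmd_id] = "ok"
        elif op == "withdraw":
            if balance >= amount:
                balance -= amount
                outputs[cmd_id] = "ok"
            else:
                outputs[cmd_id] = "insufficient funds"
        else:
            raise ValueError(f"unknown operation: {op}")
    return outputs, balance

def equivalent(seq_a, seq_b, balance=0):
    """Two sequences are equivalent here iff they yield the same outputs and state."""
    return execute(seq_a, balance) == execute(seq_b, balance)

d1 = ("d1", "deposit", 50)
d2 = ("d2", "deposit", 30)
w1 = ("w1", "withdraw", 60)

deposits_commute = equivalent([d1, d2], [d2, d1])          # True
withdraw_commutes = equivalent([d1, w1, d2], [d1, d2, w1])  # False
```

Reordering the two deposits changes nothing, while moving the withdrawal past a deposit changes both its output and the final balance, so only the deposits belong to the same equivalence class.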
9 Background on Generalized Consensus. The protocol operates on command histories (command history = equivalence class of command sequences). Terms on histories: prefix relation ⊑ on histories; glb of histories (largest common prefix, intersection); lub of histories (smallest common extension, union); h and h' are compatible iff there exists g with h ⊑ g and h' ⊑ g. Definition of Generalized Consensus. Consistency: every two learned histories are compatible. Nontriviality: if a history is chosen, then all commands it contains have been proposed. Conservatism: if history h is learned, then h was chosen. Progress: if command c is proposed, eventually a history containing c is learned.
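The history operations can be sketched on a small model, assuming every command has a unique id and that a history is represented by any one of its sequences. The conflict relation (only two deposits commute) and the function names are assumptions made for this illustration, not the protocol's actual data structures.

```python
def conflicts(a, b):
    """Assumed conflict relation: two commands commute iff both are deposits."""
    return not (a[1] == "deposit" and b[1] == "deposit")

def is_prefix(h, g):
    """True iff h ⊑ g, i.e. g is equivalent to h followed by some suffix."""
    rest = list(g)
    for c in h:
        if c not in rest:
            return False
        i = rest.index(c)
        # c must be movable to the front by swapping with commuting neighbours only
        if any(conflicts(x, c) for x in rest[:i]):
            return False
        del rest[i]
    return True

def lub(h, g):
    """Smallest common extension of h and g, or None if they are incompatible."""
    candidate = list(h) + [c for c in g if c not in h]
    return candidate if is_prefix(g, candidate) else None

def compatible(h, g):
    return lub(h, g) is not None

def glb(h, g):
    """Largest common prefix of h and g, built greedily in the order of h."""
    common = []
    for c in h:
        if c in g and is_prefix(common + [c], h) and is_prefix(common + [c], g):
            common.append(c)
    return common

d1, d2 = ("d1", "deposit"), ("d2", "deposit")
w1 = ("w1", "withdraw")

both_deposits = lub([d1], [d2])          # [d1, d2]: deposits commute
clash = compatible([d1], [w1])           # False: no common extension
shared = glb([d1, w1], [d1, d2])         # [d1]
```

Two histories that differ only in commuting deposits have a common extension, whereas a deposit and a withdrawal proposed concurrently on top of the same history do not.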
10 Background on Paxos Family. The following holds for CP, FP, and HP. Clients are proposers and learners; servers are acceptors; they cooperate to choose a single command history. Acceptors query Ω and elect a leader among them; a unique leader is needed for progress only. Paxos* protocols operate in rounds; each leader is preassigned a set of round numbers. Operation modes: recovery, to change rounds (must ensure consistency); normal operation. Quorums of acceptors. CP: any two quorums intersect. FP: requires larger fast quorums; the intersection of a quorum and a fast quorum FQ is larger than n − |FQ|. [Figure labels: |FQ|, n − |FQ| + 1, n − |FQ|.]
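The two quorum conditions can be checked by brute force for small n. The sketch assumes fixed-size quorums (every set of q acceptors is a classic quorum, every set of fq acceptors is a fast quorum) and majority classic quorums; these sizes and the helper names are assumptions for illustration.

```python
from itertools import combinations

def quorums_ok(n, q, fq):
    """Check both conditions over all size-q classic and size-fq fast quorums."""
    acceptors = range(n)
    classic = [set(c) for c in combinations(acceptors, q)]
    fast = [set(c) for c in combinations(acceptors, fq)]
    # CP: any two quorums intersect.
    classic_ok = all(a & b for a in classic for b in classic)
    # FP: a quorum and a fast quorum share more than n - |FQ| acceptors.
    fast_ok = all(len(c & f) > n - fq for c in classic for f in fast)
    return classic_ok and fast_ok

def min_fast_quorum(n):
    """Smallest fast quorum size that works with majority classic quorums."""
    q = n // 2 + 1
    for fq in range(q, n + 1):
        if quorums_ok(n, q, fq):
            return fq
    return None

sizes = {n: min_fast_quorum(n) for n in (3, 5, 7)}   # {3: 3, 5: 4, 7: 6}
```

Under these assumptions a system with n = 3 needs all three acceptors for a fast quorum, which is one reason the fast path can be slower than the classic path in a WAN.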
11 CP and FP Message Patterns. [Figure: message diagrams between client (cl), leader (ld) and acceptors (acc). Recovery (all protocols): Phase 1 (1a, 1b) and Phase 2 (2a, 2b). Normal operation of CP: propose, 2a, 2b. Normal operation of FP (fast mode): propose, 2bfast, chosen. Recovery from collision: messages 1a, 1b, 2a, 2b.]
12 Ideas behind Message Patterns. Normal operation, CP: the client sends a proposal (command) to the leader; the leader appends the command to its history and sends the history to the acceptors (2a); acceptors accept the history as their local history; acceptors send the history back to the client (2b). Normal operation, FP: the client sends the proposal to the acceptors; acceptors append commands to their local fast history (optimistic); acceptors send the history back to the client (and leader) (2bfast); collision recovery is triggered by the leader. Recovery (to start a new round). Phase 1: initiated by the new leader (1a); acceptors send their local histories to the leader (1b); the leader determines the chosen history (core of the protocol). Phase 2: the leader synchronizes the acceptors to the chosen history (2a); reply to clients (2b).
13 Combining the two protocols. [Figure: CP, HP and FP message patterns side by side between cl, ld and acc; HP combines propose, 2a, 2b and 2bfast.] Execute the CP and FP patterns in parallel: CP with an additional FP mode. Acceptors locally maintain a fast and a classic history: the history from ld is the classic history; commands from cl are appended to the fast history. No naïve combination: clients learn either by receiving a quorum of equal 2b messages (learn), or a fast quorum of equal 2bfast messages and one 2b message (hybrid learn). Needed also in FP for speculative execution.
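The client-side learning rule can be sketched as follows. This is a simplified reading, not the protocol's implementation: it assumes the single 2b message in a hybrid learn carries the same history as the fast quorum, compares histories by plain tuple equality rather than by equivalence class, and uses hypothetical message maps keyed by acceptor id.

```python
from collections import Counter

def try_learn(msgs_2b, msgs_2bfast, quorum, fast_quorum):
    """Return ("learn" | "hybrid learn", history), or None if nothing can be learned yet.

    msgs_2b / msgs_2bfast map acceptor id -> history reported in the latest
    2b / 2bfast message from that acceptor (histories must be hashable).
    """
    classic = Counter(msgs_2b.values())
    for history, count in classic.items():
        if count >= quorum:
            return ("learn", history)          # quorum of equal 2b messages
    fast = Counter(msgs_2bfast.values())
    for history, count in fast.items():
        # fast quorum of equal 2bfast messages plus one matching 2b message
        if count >= fast_quorum and classic[history] >= 1:
            return ("hybrid learn", history)
    return None

h = ("d1",)   # a one-command history, encoded as a tuple of command ids

classic_case = try_learn({0: h, 1: h}, {}, quorum=2, fast_quorum=3)
hybrid_case = try_learn({0: h}, {0: h, 1: h, 2: h}, quorum=2, fast_quorum=3)
not_yet = try_learn({}, {0: h, 1: h, 2: h}, quorum=2, fast_quorum=3)
```

The last call shows the point of "no naïve combination": under this reading a fast quorum of 2bfast messages alone is not enough for a hybrid learn.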
14 Hybrid Recovery. Same message pattern. Acceptors maintain separate histories: classic history and fast history. The leader performs CP-like and FP-like recoveries in parallel: it determines history fh from the FP recovery and history h from the CP recovery. Neither h nor fh alone is sufficient; goal: lub of h and fh. Problem: h and fh might be incompatible (no common extension). Determine the largest prefix pfh of fh which is compatible with h; pick the lub of pfh and h (smallest common extension). Why is this correct (sufficient for Consistency)? To show: any history lh learned by hybrid learn is a prefix of pfh. lh ⊑ fh, and all prefixes of fh compatible with h are prefixes of pfh. Sufficient to show: lh is compatible with h. By hybrid learning, some acceptor holds lh as classic history; lh and h have both been sent by the leader; hence lh and h are compatible.
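The merge step of hybrid recovery (largest prefix pfh of fh compatible with h, then the lub of pfh and h) can be sketched on a toy model. It assumes unique command ids, histories represented by one of their sequences, and a conflict relation in which only two deposits commute; all names are illustrative, and how the leader obtains h and fh from the 1b messages is not modelled.

```python
def conflicts(a, b):
    """Assumed conflict relation: two commands commute iff both are deposits."""
    return not (a[1] == "deposit" and b[1] == "deposit")

def is_prefix(h, g):
    """True iff h ⊑ g (g is equivalent to h followed by some suffix)."""
    rest = list(g)
    for c in h:
        if c not in rest:
            return False
        i = rest.index(c)
        if any(conflicts(x, c) for x in rest[:i]):
            return False
        del rest[i]
    return True

def lub(h, g):
    """Smallest common extension of h and g, or None if incompatible."""
    candidate = list(h) + [c for c in g if c not in h]
    return candidate if is_prefix(g, candidate) else None

def largest_compatible_prefix(fh, h):
    """Greedily grow the largest prefix pfh of fh that is compatible with h."""
    pfh = []
    for c in fh:
        extended = pfh + [c]
        if is_prefix(extended, fh) and lub(extended, h) is not None:
            pfh = extended
    return pfh

def hybrid_recover(h, fh):
    """History picked by the leader: lub of h and the compatible prefix of fh."""
    return lub(h, largest_compatible_prefix(fh, h))

d1, d2, d3 = ("d1", "deposit"), ("d2", "deposit"), ("d3", "deposit")
w1 = ("w1", "withdraw")

h = [d1, d2]            # classic history (hypothetical)
fh = [d2, d3, w1, d1]   # fast history (hypothetical); w1 precedes d1 here

pfh = largest_compatible_prefix(fh, h)   # [d2, d3]
chosen = hybrid_recover(h, fh)           # [d1, d2, d3]
```

Here fh itself is incompatible with h because the withdrawal is ordered before d1 in fh but would have to follow it in any extension of h, so only the deposits of fh survive the merge.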
15 Implementation Optimization. Optimization 1 (message complexity): the leader does not send the entire history to the acceptors (2a); FIFO channels. Optimization 2 (execution): implementing the state machine at the servers; only the leader executes commands (speculatively); prevents rollbacks at acceptors; clients receive history digests + result. Optimization 3 (latency): diverging fast and classic histories during normal mode prevent hybrid learning; periodically, acceptors locally align fh to h (as in hybrid recovery). Optimization 4 (throughput): FP mode is switched off during high load; the leader monitors load. Also true for FP.
16 Evaluation. Experimental setting: banking system with two operations, deposit and withdraw; deposit operations commute (Generalized Consensus). Emulab test bed: 20 ms link delay between clients and servers, 100 Mbps; topology similar to the Europe topology from the beginning of the presentation. Servers: 600 MHz PC, Fedora 6.
17 Latency. [Figures: latency of HP with varying withdraw rate (= probability of collisions); latency vs. throughput (with and without batching).]
18 Throughput. [Figures: throughput with increasing number of clients; throughput with increasing f.]
19 Related Work. [Lamport: ACM TOCS 1998] The Part-Time Parliament. [Lamport: Dist. Comp. 2006] Fast Paxos. [Lamport: TR2005] Generalized Consensus and Paxos. [Dobre, Suri: DSN2006] One-step Consensus with Zero-degradation. [Charron-Bost, Schiper: PRDC2006] Improving Fast Paxos: Being Optimistic with no Overhead; minimum latency of FP and CP only in failure-free runs. [Camargos, Schmidt, Pedone: NCA2008] Multicoordinated Agreement Protocols for Higher Availability; improved availability of CP by multiple leaders; collision resolution required. [Zielinski: DISC2005] Optimistic Generic Broadcast; parallel execution of CP and FP; not resilience optimal; quadratic message complexity. [Mao, Junqueira, Marzullo: OSDI2008] Mencius: Building Efficient Replicated State Machines for WANs; based on CP; partitions consensus instances among several leaders (throughput); each client has a LAN connection to one leader (latency); perfect failure detector needed.
20 Discussion. Comparison to CP: HP implements CP and is never worse than CP; FP mode is switched off when the leader is highly loaded. Comparison to FP: HP and FP need 2 message delays in absence of collisions; in presence of collisions HP needs 3 and FP needs 6 message delays. Experiments: the collision rate grows faster than the server utilization rate; servers are underutilized when the hybrid learning rate is below 50%; FP would spend >50% of the time recovering from collisions. Optimizations: batching is possible, increasing throughput by an order of magnitude.
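A back-of-the-envelope reading of the message-delay figures above, assuming each command collides independently with probability p and that every message delay costs the same. The model and the function names are assumptions for illustration and are not taken from the experiments.

```python
def expected_delays(p, fast_path, collision_path):
    """Expected message delays when a fraction p of commands collide."""
    return (1 - p) * fast_path + p * collision_path

def break_even_collision_rate(cp_delays, fast_path, collision_path):
    """Collision rate at which a fast protocol is exactly as slow as CP."""
    return (cp_delays - fast_path) / (collision_path - fast_path)

CP_DELAYS = 3   # CP: 3 message delays in normal operation

hp_half = expected_delays(0.5, fast_path=2, collision_path=3)   # 2.5
fp_half = expected_delays(0.5, fast_path=2, collision_path=6)   # 4.0
fp_limit = break_even_collision_rate(CP_DELAYS, 2, 6)           # 0.25
hp_limit = break_even_collision_rate(CP_DELAYS, 2, 3)           # 1.0
```

Under this simple model FP falls behind CP once more than a quarter of the commands collide, whereas HP only reaches CP's 3 message delays when every command collides.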
21 Summary. HP: Hybrid Paxos. Idea: add fast learning to Paxos. Generalized Consensus protocol. First protocol with 2 message delays in absence of collisions and 3 message delays otherwise. Optimal latency, resilience and number of messages. Generalized Consensus is a practical approach for WAN replication. HP can outperform state-of-the-art protocols. HP is a Generalized Consensus protocol that features minimal latency and maximum throughput in most situations!
22 Thank you for your attention! Questions?
Replicated State Machine in Wide-area Networks
Replicated State Machine in Wide-area Networks Yanhua Mao CSE223A WI09 1 Building replicated state machine with consensus General approach to replicate stateful deterministic services Provide strong consistency
More informationReducing the Costs of Large-Scale BFT Replication
Reducing the Costs of Large-Scale BFT Replication Marco Serafini & Neeraj Suri TU Darmstadt, Germany Neeraj Suri EU-NSF ICT March 2006 Dependable Embedded Systems & SW Group www.deeds.informatik.tu-darmstadt.de
More informationMencius: Another Paxos Variant???
Mencius: Another Paxos Variant??? Authors: Yanhua Mao, Flavio P. Junqueira, Keith Marzullo Presented by Isaiah Mayerchak & Yijia Liu State Machine Replication in WANs WAN = Wide Area Network Goals: Web
More informationFast Paxos Made Easy: Theory and Implementation
International Journal of Distributed Systems and Technologies, 6(1), 15-33, January-March 2015 15 Fast Paxos Made Easy: Theory and Implementation Wenbing Zhao, Department of Electrical and Computer Engineering,
More informationMENCIUS: BUILDING EFFICIENT
MENCIUS: BUILDING EFFICIENT STATE MACHINE FOR WANS By: Yanhua Mao Flavio P. Junqueira Keith Marzullo Fabian Fuxa, Chun-Yu Hsiung November 14, 2018 AGENDA 1. Motivation 2. Breakthrough 3. Rules of Mencius
More informationWhat is Distributed Storage Good For?
Efficient Robust Storage using Secret Tokens Dan Dobre, Matthias Majuntke, Marco Serafini and Neeraj Suri Dependable Embedded Systems & SW Group Neeraj Suri www.deeds.informatik.tu-darmstadt.de EU-NSF
More informationAGREEMENT PROTOCOLS. Paxos -a family of protocols for solving consensus
AGREEMENT PROTOCOLS Paxos -a family of protocols for solving consensus OUTLINE History of the Paxos algorithm Paxos Algorithm Family Implementation in existing systems References HISTORY OF THE PAXOS ALGORITHM
More informationSpecPaxos. James Connolly && Harrison Davis
SpecPaxos James Connolly && Harrison Davis Overview Background Fast Paxos Traditional Paxos Implementations Data Centers Mostly-Ordered-Multicast Network layer Speculative Paxos Protocol Application layer
More informationJust Say NO to Paxos Overhead: Replacing Consensus with Network Ordering
Just Say NO to Paxos Overhead: Replacing Consensus with Network Ordering Jialin Li, Ellis Michael, Naveen Kr. Sharma, Adriana Szekeres, Dan R. K. Ports Server failures are the common case in data centers
More informationRecovering from a Crash. Three-Phase Commit
Recovering from a Crash If INIT : abort locally and inform coordinator If Ready, contact another process Q and examine Q s state Lecture 18, page 23 Three-Phase Commit Two phase commit: problem if coordinator
More informationZyzzyva. Speculative Byzantine Fault Tolerance. Ramakrishna Kotla. L. Alvisi, M. Dahlin, A. Clement, E. Wong University of Texas at Austin
Zyzzyva Speculative Byzantine Fault Tolerance Ramakrishna Kotla L. Alvisi, M. Dahlin, A. Clement, E. Wong University of Texas at Austin The Goal Transform high-performance service into high-performance
More informationHT-Paxos: High Throughput State-Machine Replication Protocol for Large Clustered Data Centers
1 HT-Paxos: High Throughput State-Machine Replication Protocol for Large Clustered Data Centers Vinit Kumar 1 and Ajay Agarwal 2 1 Associate Professor with the Krishna Engineering College, Ghaziabad, India.
More informationPaxos and Replication. Dan Ports, CSEP 552
Paxos and Replication Dan Ports, CSEP 552 Today: achieving consensus with Paxos and how to use this to build a replicated system Last week Scaling a web service using front-end caching but what about the
More informationToday: Fault Tolerance
Today: Fault Tolerance Agreement in presence of faults Two army problem Byzantine generals problem Reliable communication Distributed commit Two phase commit Three phase commit Paxos Failure recovery Checkpointing
More informationToday: Fault Tolerance. Fault Tolerance
Today: Fault Tolerance Agreement in presence of faults Two army problem Byzantine generals problem Reliable communication Distributed commit Two phase commit Three phase commit Paxos Failure recovery Checkpointing
More informationExploiting Commutativity For Practical Fast Replication. Seo Jin Park and John Ousterhout
Exploiting Commutativity For Practical Fast Replication Seo Jin Park and John Ousterhout Overview Problem: consistent replication adds latency and throughput overheads Why? Replication happens after ordering
More informationPaxos. Sistemi Distribuiti Laurea magistrale in ingegneria informatica A.A Leonardo Querzoni. giovedì 19 aprile 12
Sistemi Distribuiti Laurea magistrale in ingegneria informatica A.A. 2011-2012 Leonardo Querzoni The Paxos family of algorithms was introduced in 1999 to provide a viable solution to consensus in asynchronous
More informationEnhancing Throughput of
Enhancing Throughput of NCA 2017 Zhongmiao Li, Peter Van Roy and Paolo Romano Enhancing Throughput of Partially Replicated State Machines via NCA 2017 Zhongmiao Li, Peter Van Roy and Paolo Romano Enhancing
More informationFast Paxos. Leslie Lamport
Distrib. Comput. (2006) 19:79 103 DOI 10.1007/s00446-006-0005-x ORIGINAL ARTICLE Fast Paxos Leslie Lamport Received: 20 August 2005 / Accepted: 5 April 2006 / Published online: 8 July 2006 Springer-Verlag
More informationProseminar Distributed Systems Summer Semester Paxos algorithm. Stefan Resmerita
Proseminar Distributed Systems Summer Semester 2016 Paxos algorithm stefan.resmerita@cs.uni-salzburg.at The Paxos algorithm Family of protocols for reaching consensus among distributed agents Agents may
More informationWeak Consistency as a Last Resort
Weak Consistency as a Last Resort Marco Serafini and Flavio Junqueira Yahoo! Research Barcelona, Spain { serafini, fpj }@yahoo-inc.com ABSTRACT It is well-known that using a replicated service requires
More informationBuilding Consistent Transactions with Inconsistent Replication
Building Consistent Transactions with Inconsistent Replication Irene Zhang, Naveen Kr. Sharma, Adriana Szekeres, Arvind Krishnamurthy, Dan R. K. Ports University of Washington Distributed storage systems
More informationThere Is More Consensus in Egalitarian Parliaments
There Is More Consensus in Egalitarian Parliaments Iulian Moraru, David Andersen, Michael Kaminsky Carnegie Mellon University Intel Labs Fault tolerance Redundancy State Machine Replication 3 State Machine
More informationCoordinating distributed systems part II. Marko Vukolić Distributed Systems and Cloud Computing
Coordinating distributed systems part II Marko Vukolić Distributed Systems and Cloud Computing Last Time Coordinating distributed systems part I Zookeeper At the heart of Zookeeper is the ZAB atomic broadcast
More informationSynchrony Weakened by Message Adversaries vs Asynchrony Enriched with Failure Detectors. Michel Raynal, Julien Stainer
Synchrony Weakened by Message Adversaries vs Asynchrony Enriched with Failure Detectors Michel Raynal, Julien Stainer Synchrony Weakened by Message Adversaries vs Asynchrony Enriched with Failure Detectors
More informationTheoretical Computer Science
Theoretical Computer Science 496 (2013) 170 183 Contents lists available at SciVerse ScienceDirect Theoretical Computer Science journal homepage: www.elsevier.com/locate/tcs Optimizing Paxos with batching
More informationBeyond FLP. Acknowledgement for presentation material. Chapter 8: Distributed Systems Principles and Paradigms: Tanenbaum and Van Steen
Beyond FLP Acknowledgement for presentation material Chapter 8: Distributed Systems Principles and Paradigms: Tanenbaum and Van Steen Paper trail blog: http://the-paper-trail.org/blog/consensus-protocols-paxos/
More informationA Formal Model of Crash Recovery in Distributed Software Transactional Memory (Extended Abstract)
A Formal Model of Crash Recovery in Distributed Software Transactional Memory (Extended Abstract) Paweł T. Wojciechowski, Jan Kończak Poznań University of Technology 60-965 Poznań, Poland {Pawel.T.Wojciechowski,Jan.Konczak}@cs.put.edu.pl
More information10. Replication. CSEP 545 Transaction Processing Philip A. Bernstein Sameh Elnikety. Copyright 2012 Philip A. Bernstein
10. Replication CSEP 545 Transaction Processing Philip A. Bernstein Sameh Elnikety Copyright 2012 Philip A. Bernstein 1 Outline 1. Introduction 2. Primary-Copy Replication 3. Multi-Master Replication 4.
More informationIntuitive distributed algorithms. with F#
Intuitive distributed algorithms with F# Natallia Dzenisenka Alena Hall @nata_dzen @lenadroid A tour of a variety of intuitivedistributed algorithms used in practical distributed systems. and how to prototype
More informationSDPaxos: Building Efficient Semi-Decentralized Geo-replicated State Machines
SDPaxos: Building Efficient Semi-Decentralized Geo-replicated State Machines Hanyu Zhao *, Quanlu Zhang, Zhi Yang *, Ming Wu, Yafei Dai * * Peking University Microsoft Research Replication for Fault Tolerance
More informationS-Paxos: Offloading the Leader for High Throughput State Machine Replication
212 31st International Symposium on Reliable Distributed Systems S-: Offloading the Leader for High Throughput State Machine Replication Martin Biely, Zarko Milosevic, Nuno Santos, André Schiper Ecole
More informationGenerating Fast Indulgent Algorithms
Generating Fast Indulgent Algorithms Dan Alistarh 1, Seth Gilbert 2, Rachid Guerraoui 1, and Corentin Travers 3 1 EPFL, Switzerland 2 National University of Singapore 3 Université de Bordeaux 1, France
More informationData Consistency and Blockchain. Bei Chun Zhou (BlockChainZ)
Data Consistency and Blockchain Bei Chun Zhou (BlockChainZ) beichunz@cn.ibm.com 1 Data Consistency Point-in-time consistency Transaction consistency Application consistency 2 Strong Consistency ACID Atomicity.
More informationDistributed Systems Consensus
Distributed Systems Consensus Amir H. Payberah amir@sics.se Amirkabir University of Technology (Tehran Polytechnic) Amir H. Payberah (Tehran Polytechnic) Consensus 1393/6/31 1 / 56 What is the Problem?
More informationAsynchronous Reconfiguration for Paxos State Machines
Asynchronous Reconfiguration for Paxos State Machines Leander Jehl and Hein Meling Department of Electrical Engineering and Computer Science University of Stavanger, Norway Abstract. This paper addresses
More informationCheap Paxos. Leslie Lamport and Mike Massa. Appeared in The International Conference on Dependable Systems and Networks (DSN 2004)
Cheap Paxos Leslie Lamport and Mike Massa Appeared in The International Conference on Dependable Systems and Networks (DSN 2004) Cheap Paxos Leslie Lamport and Mike Massa Microsoft Abstract Asynchronous
More informationConsensus and related problems
Consensus and related problems Today l Consensus l Google s Chubby l Paxos for Chubby Consensus and failures How to make process agree on a value after one or more have proposed what the value should be?
More informationDesigning Distributed Systems using Approximate Synchrony in Data Center Networks
Designing Distributed Systems using Approximate Synchrony in Data Center Networks Dan R. K. Ports Jialin Li Naveen Kr. Sharma Vincent Liu Arvind Krishnamurthy University of Washington CSE Today s most
More informationTAPIR. By Irene Zhang, Naveen Sharma, Adriana Szekeres, Arvind Krishnamurthy, and Dan Ports Presented by Todd Charlton
TAPIR By Irene Zhang, Naveen Sharma, Adriana Szekeres, Arvind Krishnamurthy, and Dan Ports Presented by Todd Charlton Outline Problem Space Inconsistent Replication TAPIR Evaluation Conclusion Problem
More informationByzantine fault tolerance. Jinyang Li With PBFT slides from Liskov
Byzantine fault tolerance Jinyang Li With PBFT slides from Liskov What we ve learnt so far: tolerate fail-stop failures Traditional RSM tolerates benign failures Node crashes Network partitions A RSM w/
More information10. Replication. CSEP 545 Transaction Processing Philip A. Bernstein. Copyright 2003 Philip A. Bernstein. Outline
10. Replication CSEP 545 Transaction Processing Philip A. Bernstein Copyright 2003 Philip A. Bernstein 1 Outline 1. Introduction 2. Primary-Copy Replication 3. Multi-Master Replication 4. Other Approaches
More informationDistributed systems. Lecture 6: distributed transactions, elections, consensus and replication. Malte Schwarzkopf
Distributed systems Lecture 6: distributed transactions, elections, consensus and replication Malte Schwarzkopf Last time Saw how we can build ordered multicast Messages between processes in a group Need
More informationPaxos Made Simple. Leslie Lamport, 2001
Paxos Made Simple Leslie Lamport, 2001 The Problem Reaching consensus on a proposed value, among a collection of processes Safety requirements: Only a value that has been proposed may be chosen Only a
More informationDistributed Systems 11. Consensus. Paul Krzyzanowski
Distributed Systems 11. Consensus Paul Krzyzanowski pxk@cs.rutgers.edu 1 Consensus Goal Allow a group of processes to agree on a result All processes must agree on the same value The value must be one
More informationThe Performance of Paxos and Fast Paxos
27º Simpósio Brasileiro de Redes de Computadores e Sistemas Distribuídos 291 The Performance of and Fast Gustavo M. D. Vieira 1, Luiz E. Buzato 1 1 Instituto de Computação, Unicamp Caixa Postal 6176 1383-97
More informationCS /15/16. Paul Krzyzanowski 1. Question 1. Distributed Systems 2016 Exam 2 Review. Question 3. Question 2. Question 5.
Question 1 What makes a message unstable? How does an unstable message become stable? Distributed Systems 2016 Exam 2 Review Paul Krzyzanowski Rutgers University Fall 2016 In virtual sychrony, a message
More informationGroup Replication: A Journey to the Group Communication Core. Alfranio Correia Principal Software Engineer
Group Replication: A Journey to the Group Communication Core Alfranio Correia (alfranio.correia@oracle.com) Principal Software Engineer 4th of February Copyright 7, Oracle and/or its affiliates. All rights
More informationBe General and Don t Give Up Consistency in Geo- Replicated Transactional Systems
Be General and Don t Give Up Consistency in Geo- Replicated Transactional Systems Alexandru Turcu, Sebastiano Peluso, Roberto Palmieri and Binoy Ravindran Replicated Transactional Systems DATA CONSISTENCY
More informationMDCC MULTI DATA CENTER CONSISTENCY. amplab. Tim Kraska, Gene Pang, Michael Franklin, Samuel Madden, Alan Fekete
MDCC MULTI DATA CENTER CONSISTENCY Tim Kraska, Gene Pang, Michael Franklin, Samuel Madden, Alan Fekete gpang@cs.berkeley.edu amplab MOTIVATION 2 3 June 2, 200: Rackspace power outage of approximately 0
More informationThe Distributed Coordination Engine (DConE) TECHNICAL WHITE PAPER
The Distributed Coordination Engine (DConE) TECHNICAL WHITE PAPER Table of Contents Introduction... 1 Distributed Transaction Processing with DConE...2 The Paxos Algorithm... 2 Achieving Consensus with
More informationBIG DATA AND CONSISTENCY. Amy Babay
BIG DATA AND CONSISTENCY Amy Babay Outline Big Data What is it? How is it used? What problems need to be solved? Replication What are the options? Can we use this to solve Big Data s problems? Putting
More informationCS October 2017
Atomic Transactions Transaction An operation composed of a number of discrete steps. Distributed Systems 11. Distributed Commit Protocols All the steps must be completed for the transaction to be committed.
More informationDistributed Consensus: Making Impossible Possible
Distributed Consensus: Making Impossible Possible QCon London Tuesday 29/3/2016 Heidi Howard PhD Student @ University of Cambridge heidi.howard@cl.cam.ac.uk @heidiann360 What is Consensus? The process
More informationAS distributed systems develop and grow in size,
1 hbft: Speculative Byzantine Fault Tolerance With Minimum Cost Sisi Duan, Sean Peisert, Senior Member, IEEE, and Karl N. Levitt Abstract We present hbft, a hybrid, Byzantine fault-tolerant, ted state
More informationErasure Coding in Object Stores: Challenges and Opportunities
Erasure Coding in Object Stores: Challenges and Opportunities Lewis Tseng Boston College July 2018, PODC Acknowledgements Nancy Lynch Muriel Medard Kishori Konwar Prakash Narayana Moorthy Viveck R. Cadambe
More informationCSE 544 Principles of Database Management Systems. Alvin Cheung Fall 2015 Lecture 14 Distributed Transactions
CSE 544 Principles of Database Management Systems Alvin Cheung Fall 2015 Lecture 14 Distributed Transactions Transactions Main issues: Concurrency control Recovery from failures 2 Distributed Transactions
More informationCoordination and Agreement
Coordination and Agreement Nicola Dragoni Embedded Systems Engineering DTU Informatics 1. Introduction 2. Distributed Mutual Exclusion 3. Elections 4. Multicast Communication 5. Consensus and related problems
More informationMaking Fast Consensus Generally Faster
Making Fast Consensus Generally Faster [Technical Report] Sebastiano Peluso Virginia Tech peluso@vt.edu Alexandru Turcu Virginia Tech talex@vt.edu Roberto Palmieri Virginia Tech robertop@vt.edu Giuliano
More informationDistributed Consensus: Making Impossible Possible
Distributed Consensus: Making Impossible Possible Heidi Howard PhD Student @ University of Cambridge heidi.howard@cl.cam.ac.uk @heidiann360 hh360.user.srcf.net Sometimes inconsistency is not an option
More informationCSE 486/586 Distributed Systems
CSE 486/586 Distributed Systems Mutual Exclusion Steve Ko Computer Sciences and Engineering University at Buffalo CSE 486/586 Recap: Consensus On a synchronous system There s an algorithm that works. On
More informationPaxos and Raft (Lecture 21, cs262a) Ion Stoica, UC Berkeley November 7, 2016
Paxos and Raft (Lecture 21, cs262a) Ion Stoica, UC Berkeley November 7, 2016 Bezos mandate for service-oriented-architecture (~2002) 1. All teams will henceforth expose their data and functionality through
More informationFailures, Elections, and Raft
Failures, Elections, and Raft CS 8 XI Copyright 06 Thomas W. Doeppner, Rodrigo Fonseca. All rights reserved. Distributed Banking SFO add interest based on current balance PVD deposit $000 CS 8 XI Copyright
More informationDistributed Systems. 10. Consensus: Paxos. Paul Krzyzanowski. Rutgers University. Fall 2017
Distributed Systems 10. Consensus: Paxos Paul Krzyzanowski Rutgers University Fall 2017 1 Consensus Goal Allow a group of processes to agree on a result All processes must agree on the same value The value
More informationEfficient and Scalable Replication of Services over Wide-Area Networks
Efficient and Scalable Replication of Services over Wide-Area Networks Thesis by Abdallah Abouzamazem In Partial Fulfillment of the Requirements For the Degree of Doctor of Philosophy University of Newcastle
More informationDistributed Systems. coordination Johan Montelius ID2201. Distributed Systems ID2201
Distributed Systems ID2201 coordination Johan Montelius 1 Coordination Coordinating several threads in one node is a problem, coordination in a network is of course worse: failure of nodes and networks
More informationDistributed Commit in Asynchronous Systems
Distributed Commit in Asynchronous Systems Minsoo Ryu Department of Computer Science and Engineering 2 Distributed Commit Problem - Either everybody commits a transaction, or nobody - This means consensus!
More informationApplications of Paxos Algorithm
Applications of Paxos Algorithm Gurkan Solmaz COP 6938 - Cloud Computing - Fall 2012 Department of Electrical Engineering and Computer Science University of Central Florida - Orlando, FL Oct 15, 2012 1
More informationJPaxos: State machine replication based on the Paxos protocol
JPaxos: State machine replication based on the Paxos protocol Jan Kończak 2, Nuno Santos 1, Tomasz Żurkowski 2, Paweł T. Wojciechowski 2, and André Schiper 1 1 EPFL, Switzerland 2 Poznań University of
More informationPractical Byzantine Fault Tolerance
Practical Byzantine Fault Tolerance Robert Grimm New York University (Partially based on notes by Eric Brewer and David Mazières) The Three Questions What is the problem? What is new or different? What
More informationSelf-healing Data Step by Step
Self-healing Data Step by Step Uwe Friedrichsen (codecentric AG) NoSQL matters Cologne, 29. April 2014 @ufried Uwe Friedrichsen uwe.friedrichsen@codecentric.de http://slideshare.net/ufried http://ufried.tumblr.com
More informationDistributed Consensus Protocols
Distributed Consensus Protocols ABSTRACT In this paper, I compare Paxos, the most popular and influential of distributed consensus protocols, and Raft, a fairly new protocol that is considered to be a
More informationVive La Différence: Paxos vs. Viewstamped Replication vs. Zab
IEEE TRANSACTIONS ON DEPENDABLE AND SECURE COMPUTING, MANUSCRIPT ID 1 Vive La Différence: Paxos vs. Viewstamped Replication vs. Zab Robbert van Renesse, Nicolas Schiper, Fred B. Schneider, Fellow, IEEE
More informationAssignment 12: Commit Protocols and Replication Solution
Data Modelling and Databases Exercise dates: May 24 / May 25, 2018 Ce Zhang, Gustavo Alonso Last update: June 04, 2018 Spring Semester 2018 Head TA: Ingo Müller Assignment 12: Commit Protocols and Replication
More informationRecall our 2PC commit problem. Recall our 2PC commit problem. Doing failover correctly isn t easy. Consensus I. FLP Impossibility, Paxos
Consensus I Recall our 2PC commit problem FLP Impossibility, Paxos Client C 1 C à TC: go! COS 418: Distributed Systems Lecture 7 Michael Freedman Bank A B 2 TC à A, B: prepare! 3 A, B à P: yes or no 4
More informationMulti-Ring Paxos /12/$ IEEE. Marco Primi University of Lugano (USI) Switzerland
Multi-Ring Paxos Parisa Jalili Marandi University of Lugano (USI) Switzerland Marco Primi University of Lugano (USI) Switzerland Fernando Pedone University of Lugano (USI) Switzerland Abstract This paper
Fast Atomic Multicast
Università della Svizzera italiana USI Technical Report Series in Informatics Fast Atomic Multicast Paulo R. Coelho 1, Nicolas Schiper 2, Fernando Pedone 1 1 Faculty of Informatics, Università della Svizzera
High performance recovery for parallel state machine replication
High performance recovery for parallel state machine replication Odorico M. Mendizabal and Fernando Luís Dotti and Fernando Pedone Universidade Federal do Rio Grande (FURG), Rio Grande, Brazil Pontifícia
Robust BFT Protocols
Robust BFT Protocols Sonia Ben Mokhtar, LIRIS, CNRS, Lyon Joint work with Pierre Louis Aublin, Grenoble university Vivien Quéma, Grenoble INP 18/10/2013 Who am I? CNRS researcher, LIRIS lab, DRIM research
Strong Consistency at Scale
Strong Consistency at Scale Carlos Eduardo Bezerra University of Lugano (USI) Switzerland Le Long Hoang University of Lugano (USI) Switzerland Fernando Pedone University of Lugano (USI) Switzerland Abstract
Practical Byzantine Fault Tolerance. Miguel Castro and Barbara Liskov
Practical Byzantine Fault Tolerance Miguel Castro and Barbara Liskov Outline 1. Introduction to Byzantine Fault Tolerance Problem 2. PBFT Algorithm a. Models and overview b. Three-phase protocol c. View-change
Scaling Byzantine Fault-tolerant Replication to Wide Area Networks
Scaling Byzantine Fault-tolerant Replication to Wide Area Networks Cristina Nita-Rotaru Dependable and Secure Distributed Systems Lab Department of Computer Science and CERIAS Purdue University Department
Consensus Problem. Pradipta De
Consensus Problem Slides are based on the book chapter from Distributed Computing: Principles, Paradigms and Algorithms (Chapter 14) by Kshemkalyani and Singhal Pradipta De pradipta.de@sunykorea.ac.kr
When You Don't Trust Clients: Byzantine Proposer Fast Paxos
2012 32nd IEEE International Conference on Distributed Computing Systems When You Don't Trust Clients: Byzantine Proposer Fast Paxos Hein Meling, Keith Marzullo, and Alessandro Mei Department of Electrical
Revisiting Fast Practical Byzantine Fault Tolerance
Revisiting Fast Practical Byzantine Fault Tolerance Ittai Abraham, Guy Gueta, Dahlia Malkhi VMware Research with: Lorenzo Alvisi (Cornell), Rama Kotla (Amazon), Jean-Philippe Martin (Verily) December 4,
Speculating Seriously. Rachid Guerraoui, EPFL
Speculating Seriously Rachid Guerraoui, EPFL The World is turning IT IT is turning distributed Everybody should come to disc/podc But some don't Indeed theory scares practitioners But wait, there is more
Distributed Systems 8L for Part IB
Distributed Systems 8L for Part IB Handout 3 Dr. Steven Hand 1 Distributed Mutual Exclusion In first part of course, saw need to coordinate concurrent processes / threads In particular considered how to
A Low-latency Consensus Algorithm for Geographically Distributed Systems
A Low-latency Consensus Algorithm for Geographically Distributed Systems Balaji Arun Thesis submitted to the Faculty of the Virginia Polytechnic Institute and State University in partial fulfillment of
High Performance State-Machine Replication
High Performance State-Machine Replication Parisa Jalili Marandi University of Lugano (USI) Switzerland Marco Primi University of Lugano (USI) Switzerland Fernando Pedone University of Lugano (USI) Switzerland
Local Recovery for High Availability in Strongly Consistent Cloud Services
IEEE TRANSACTION ON DEPENDABLE AND SECURE COMPUTING, VOL. X, NO. Y, JANUARY 2013 1 Local Recovery for High Availability in Strongly Consistent Cloud Services James W. Anderson, Hein Meling, Alexander Rasmussen,
Introduction to Distributed Systems Seif Haridi
Introduction to Distributed Systems Seif Haridi haridi@kth.se What is a distributed system? A set of nodes, connected by a network, which appear to its users as a single coherent system p1 p2. pn send
EMPIRICAL STUDY OF UNSTABLE LEADERS IN PAXOS LONG KAI THESIS
2013 Long Kai EMPIRICAL STUDY OF UNSTABLE LEADERS IN PAXOS BY LONG KAI THESIS Submitted in partial fulfillment of the requirements for the degree of Master of Science in Computer Science in the Graduate
Dfinity Consensus, Explored
Dfinity Consensus, Explored Ittai Abraham, Dahlia Malkhi, Kartik Nayak, and Ling Ren VMware Research {iabraham,dmalkhi,nkartik,lingren}@vmware.com Abstract. We explore a Byzantine Consensus protocol called
Evaluating BFT Protocols for Spire
Evaluating BFT Protocols for Spire Henry Schuh & Sam Beckley 600.667 Advanced Distributed Systems & Networks SCADA & Spire Overview High-Performance, Scalable Spire Trusted Platform Module Known Network
Replication in Distributed Systems
Replication in Distributed Systems Replication Basics Multiple copies of data kept in different nodes A set of replicas holding copies of a data Nodes can be physically very close or distributed all over
Fast Byzantine Consensus
Fast Byzantine Consensus Jean-Philippe Martin, Lorenzo Alvisi Department of Computer Sciences The University of Texas at Austin Email: {jpmartin, lorenzo}@cs.utexas.edu Abstract We present the first consensus
Agreement in Distributed Systems CS 188 Distributed Systems February 19, 2015
Agreement in Distributed Systems CS 188 Distributed Systems February 19, 2015 Page 1 Introduction We frequently want to get a set of nodes in a distributed system to agree Commitment protocols and mutual
Large-Scale Key-Value Stores Eventual Consistency Marco Serafini
Large-Scale Key-Value Stores Eventual Consistency Marco Serafini COMPSCI 590S Lecture 13 Goals of Key-Value Stores Export simple API put(key, value) get(key) Simpler and faster than a DBMS Less complexity,
Fast Follower Recovery for State Machine Replication
Fast Follower Recovery for State Machine Replication Jinwei Guo 1, Jiahao Wang 1, Peng Cai 1, Weining Qian 1, Aoying Zhou 1, and Xiaohang Zhu 2 1 Institute for Data Science and Engineering, East China