MENCIUS: BUILDING EFFICIENT
1 MENCIUS: BUILDING EFFICIENT STATE MACHINE FOR WANS By: Yanhua Mao, Flavio P. Junqueira, Keith Marzullo. Fabian Fuxa, Chun-Yu Hsiung. November 14, 2018
2 AGENDA 1. Motivation 2. Breakthrough 3. Rules of Mencius 4. Optimization of Mencius 5. Evaluation 6. Conclusion
3 WIDE AREA NETWORK (WAN) MODEL Model the system as n sites. Each site contains a server and some clients.
4 MOTIVATION: INSUFFICIENCY OF PAXOS Relies on a single leader. The leader server processes more messages (CPU). The unbalanced communication pattern limits throughput. Higher latency for clients at non-leader sites.
5 COORDINATED PAXOS To save bandwidth: from n * (n - 1) to 2 * n ACK messages. [Figure: message flow between the leader (Server 1) and Server 2: prep/Ack for leader election, then prop/acc/Learn to propose a value]
6 PAXOS: LIMITED THROUGHPUT All servers are mutually connected. Only the links to the leader server are used.
7 PAXOS: HIGHER LATENCY FOR OTHERS When the client is at the leader site, it takes 2 message transmissions to learn a value. When the client is at a non-leader site, it takes 4 message transmissions to learn a value.
8 PAXOS: HIGHER LATENCY FOR NON-LEADER SITE
9 MENCIUS IMPROVEMENT Rotate the leader: assign each slot to a server (blue: 0, 3, 6; green: 1, 4, 7; yellow: 2, 5, 8). For each slot, only the assigned server can propose a non-no_op value. All servers can propose no_op.
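The rotation on this slide can be sketched in a few lines. This is a minimal illustration, assuming servers are numbered 0 to n-1 (blue = 0, green = 1, yellow = 2) and that slots are handed out round-robin; the names `coordinator`, `owned_slots`, and `may_propose` are hypothetical and do not come from the slides.

```python
NO_OP = "no_op"  # placeholder value for a skipped slot


def coordinator(slot, n):
    """Server assigned to a slot under round-robin rotation (assumed)."""
    return slot % n


def owned_slots(server, n, count):
    """First `count` slots assigned to `server`."""
    return [server + k * n for k in range(count)]


def may_propose(server, slot, value, n):
    """Any server may propose no_op; only the assigned server may
    propose a real value for the slot."""
    return value == NO_OP or coordinator(slot, n) == server


if __name__ == "__main__":
    for name, server in (("blue", 0), ("green", 1), ("yellow", 2)):
        print(name, owned_slots(server, 3, 3))
```

With three servers this reproduces the assignment listed on the slide.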
10 ASSUMPTIONS Crashed servers recover. Unreliable failure detector: detects failed servers. Asynchronous FIFO channels: TCP.
11 THREE ACTIONS 1. Suggest: the leader proposes an ordinary value. 2. Skip: the leader itself skips its turn. 3. Revoke: another server takes over the run and proposes no_op (in the example, P2 is the leader initially but is considered failed).
12 FOUR ACTIONS TO HANDLE 1. Propose 2. Accept 3. Fill bubbles 4. Crashed server recovery
13 PROPOSE A server needs to know which slot to propose in, so it maintains its next propose slot. (blue: 0, 3, 6)
14 PROPOSE A server needs to know which slot to propose in, so it maintains its next propose slot. blue log: 0:v (blue: 0, 3, 6)
15 ACCEPT Adjust the next propose slot according to the received message. Preserve serializability. blue: Index = 0. Receive suggestion v1 for slot 1.
16 ACCEPT: CASE 1 The next propose slot is above the incoming message's slot. blue 0:v Index = 0 Index = 0 Receive suggestion v1 for slot 1
17 ACCEPT: CASE 1 The next propose slot is above the incoming message's slot. blue 0:v0 1:v Index = 0 Index = 0 Receive suggestion v1 for slot 1 Accept (1, v)
18 ACCEPT: CASE 2 The next propose slot is below the incoming message's slot. blue Index = 0 Index = 3 Receive suggestion v1 for slot 1 Accept (1, v)
19 ACCEPT: CASE 2 The next propose slot is below the incoming message's slot. blue 0:no_op 1:v Index = 0 Index = 3 Index = 3 SKIP Receive suggestion v1 for slot 1 Accept (1, v) Propose no_op for slot 0
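The two ACCEPT cases can be sketched as one function. This is an illustration only, assuming round-robin slot ownership (slot % n) and that the server's index always points at one of its own slots; `next_owned_slot_after` and `on_suggest` are hypothetical names, not part of the presented system.

```python
def next_owned_slot_after(server, slot, n):
    """Smallest slot strictly greater than `slot` that `server` owns."""
    k = slot + 1
    return k + (server - k) % n


def on_suggest(server, index, slot, n):
    """Handle a suggestion for `slot` at a server whose next propose
    slot is `index`. Returns (new_index, skipped_slots).

    Case 1: index is above the incoming slot, nothing to adjust.
    Case 2: index is below the incoming slot, so the server skips
            (proposes no_op for) its own slots in between and moves
            its index past the incoming slot.
    """
    if slot <= index:
        return index, []
    new_index = next_owned_slot_after(server, slot, n)
    skipped = list(range(index, new_index, n))  # own slots only
    return new_index, skipped


if __name__ == "__main__":
    # Case 2 from the slide: blue (server 0) at Index = 0 receives
    # a suggestion for slot 1.
    print(on_suggest(server=0, index=0, slot=1, n=3))
```

For the slide's example the function moves blue's index from 0 to 3 and reports slot 0 as skipped, matching "Propose no_op for slot 0".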
20 FILL BUBBLES A crashed server does not propose values. A slot commits only when there are no earlier bubbles. blue 0: v0 1 2: v2 3: v3 4 5:v Gaps at slots 1 and 4: cannot commit. Revoke!
21 REVOKE Another server holds an election and takes over the leadership. It proposes NO_OP for the slots. [Figure: P0, P1, P2]
22 FILL BUBBLES A crashed server does not broadcast SKIP. A slot commits only when there are no earlier bubbles. Revoke the slots assigned to the suspected crashed server. Fill the bubbles. blue 0: v0 1: no_op 2: v2 3: v3 4: no_op 5:v Gaps at slots 1 and 4: cannot commit. Revoke!
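The commit rule and the effect of filling bubbles can be sketched as follows. This is a deliberately simplified model: a real revocation runs a consensus round for each slot and may learn a value the suspected server had already suggested, whereas this sketch just writes no_op into empty slots. The function names and the dict-based log are assumptions made for illustration.

```python
NO_OP = "no_op"


def committable_prefix(log):
    """Number of leading slots that can be committed: everything
    before the first bubble (a slot with no learned value)."""
    slot = 0
    while slot in log:
        slot += 1
    return slot


def fill_bubbles(log, suspected, n, up_to):
    """Return a copy of `log` where the still-empty slots owned by the
    suspected server (round-robin, slot % n) up to `up_to` are no_op."""
    filled = dict(log)
    for slot in range(suspected, up_to + 1, n):
        filled.setdefault(slot, NO_OP)  # never overwrite a learned value
    return filled


if __name__ == "__main__":
    log = {0: "v0", 2: "v2", 3: "v3", 5: "v5"}  # gaps at slots 1 and 4
    print(committable_prefix(log))
    print(committable_prefix(fill_bubbles(log, suspected=1, n=3, up_to=5)))
```

Before revocation only slot 0 can commit; once slots 1 and 4 hold no_op, the whole prefix through slot 5 becomes committable.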
23 SERVER RECOVER The recovering server's next propose slot has been assigned NO_OP by others. blue / yellow: 0: v0 1: no_op 2: v2 3: v3 4: no_op 5:v5 6: v6 7 8. Green server: 0: v, Index = 1. Propose v1 for slot 1.
24 SERVER RECOVER The recovering server's next propose slot has been assigned NO_OP by others, so the value is proposed again. blue / yellow: 0: v0 1: no_op 2: v2 3: v3 4: no_op 5:v5 6: v6 7 8. Green server: 0: v, Index = 1. Propose v1 for slot 1. Learn that slots 1 and 4 are no_op.
25 SERVER RECOVER The recovering server's next propose slot has been assigned NO_OP by others, so the value is proposed again. blue / yellow: 0: v0 1: no_op 2: v2 3: v3 4: no_op 5:v5 6: v6 7 8. Green server: 0: v0 1: no_op 2 3 4: no_op 5 6 7: v1 8, Index = 1, then Index = 7. Propose v1 for slot 1. Learn that slots 1 and 4 are no_op. Propose v1 for slot 7.
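One way to arrive at the Index = 7 shown on this slide is sketched below. The slides show the result but not the rule, so the rule used here (jump to the first owned slot above the highest slot the recovering server has heard about) is an assumption, as is the name `index_after_recovery`.

```python
def index_after_recovery(server, n, known_log):
    """Next propose slot for a recovering server, assuming it moves to
    its first owned slot (slot % n == server) above every slot it has
    learned a value for."""
    highest = max(known_log, default=-1)
    k = highest + 1
    return k + (server - k) % n


if __name__ == "__main__":
    # What blue / yellow already hold on the slide (slots 0..6).
    others = {0: "v0", 1: "no_op", 2: "v2", 3: "v3",
              4: "no_op", 5: "v5", 6: "v6"}
    print(index_after_recovery(server=1, n=3, known_log=others))
```

Green (server 1) then re-proposes v1 at the returned slot instead of the revoked slot 1.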
26 OPTIMIZATION Worst case: only one server keeps proposing values; the other n - 1 servers are idle. v0 v3 Index = 0 Index = 3 Index = 3 NO_OP NO_OP NO_OP NO_OP Receive suggestion v1 for slot 1 Accept (1, v) Propose no_op for slot 0
27 OPTIMIZATION Worst case: only one server keeps proposing values; the other n - 1 servers are idle. Fact: we use FIFO channels. v0 v3 NO_OP NO_OP NO_OP NO_OP
28 ACCEPT INCLUDES SKIP Because channels are FIFO, the leader knows that server 1 has not proposed values for slots 1 and 4 before its ACK. blue 0: v0 1 2: v2 3: v3 4 5:v Propose value for slot 6 acc
29 ACCEPT INCLUDES SKIP Because channels are FIFO, the leader knows that server 1 has not proposed values for slots 1 and 4 before its ACK. blue 0: v0 1 2: v2 3: v3 4 5:v Propose value for slot 6 acc After ACCEPT, the green server updates its next propose slot to above 6. 0: v0 1:no_op 2: v2 3: v3 4:no_op 5:v5 6:v6 7 8
30 PROPOSE INCLUDES SKIP Because channels are FIFO, a server knows that the leader has not proposed values for slots 0 and 3 before its proposal for slot 6. green 0 1:v1 2: v2 3 4:v4 5:v Propose value for slot 6 After proposing, the leader updates its next propose slot to above 6. green 0: no_op 1:v1 2: v2 3: no_op 4:v4 5:v5 6:v6 7 8 (learned data for the green server)
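The implicit-skip inference on the last three slides can be sketched as a single helper. It assumes round-robin ownership (slot % n) and a FIFO channel from the sender, so that any earlier suggestion from that sender would already have arrived; `infer_skips` and the dict-based log are illustrative names, not the presenters' code.

```python
NO_OP = "no_op"


def infer_skips(log, sender, msg_slot, n):
    """After receiving a PROPOSE or ACCEPT for `msg_slot` from `sender`
    over a FIFO channel, treat every still-empty slot that `sender`
    owns below `msg_slot` as skipped (no_op)."""
    out = dict(log)
    for slot in range(sender, msg_slot, n):
        out.setdefault(slot, NO_OP)  # keep values already received
    return out


if __name__ == "__main__":
    # PROPOSE case: green's view before the leader (server 0)
    # proposes for slot 6.
    green = {1: "v1", 2: "v2", 4: "v4", 5: "v5"}
    print(sorted(infer_skips(green, sender=0, msg_slot=6, n=3).items()))
```

The same helper covers the ACCEPT case by passing the accepting server as `sender`.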
31 REVOKE MORE SLOTS OF A FAULTY SERVER Don't revoke one slot every time. A server can revoke more slots ahead. How many more slots is a tuned parameter. [Figure: log before and after revocation, ending with 0: v0 1: no_op 2: v2 3 4: no_op 5 6 7:no_op 8]
32 STILL NEED SKIP MESSAGE When there are two or more idle servers. [Figure: prop, acc, acc ... prop]
33 STILL NEED SKIP MESSAGE blue: 0: v0 1:no_op 2:no_op 3: v3 4:no_op 5:no_op 6:v6 7 8. green: 0: v0 1:no_op 2 3: v3 4:no_op 5 6:v6 7 8. yellow: 0: v0 1 2:no_op 3: v3 4 5:no_op 6:v6 7 8. The idle servers do not learn each other's skips, so they cannot commit slots 3 and 6. Limit the number of skipped slots by sending SKIP (tuned parameter α). Send SKIP periodically (tuned parameter τ).
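The two triggers for sending deferred SKIP messages can be sketched as a single predicate. The default values come from later slides (α = 20, τ = 50 ms); the function name, the millisecond unit, and the choice of "at least α" rather than "more than α" as the boundary are assumptions.

```python
def should_send_skip(outstanding, oldest_age_ms, alpha=20, tau_ms=50):
    """Decide whether deferred SKIP messages should be flushed.

    outstanding    -- number of SKIP messages currently deferred
    oldest_age_ms  -- time since the oldest deferred SKIP was created
    """
    if outstanding == 0:
        return False  # nothing to send
    return outstanding >= alpha or oldest_age_ms >= tau_ms


if __name__ == "__main__":
    print(should_send_skip(outstanding=5, oldest_age_ms=10))
    print(should_send_skip(outstanding=20, oldest_age_ms=10))
    print(should_send_skip(outstanding=1, oldest_age_ms=50))
```

The count trigger bounds how far idle servers can fall behind under load, and the timer trigger bounds the wait when traffic is light.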
34 OUT OF ORDER COMMIT DELAY A slot can commit only when all previous slots have committed. Delay occurs under concurrent suggestions (in the example, y is learned first, then x).
35 CONDITION FOR COMMIT DELAY Server 1: PROPOSE y before ACCEPT y. Server 0: sent LEARN before ACCEPT y. [Figure: timeline in which S0 proposes x at slot 0 and learns x with no commit delay, while S1 proposes y at slot 1, learns y, and then learns x]
36 CONDITION FOR x COMMIT BEFORE y Server 0: PROPOSE x after ACCEPT y. x cannot be ordered before y. [Figure: timeline in which P0 proposes x and learns x, while P1 proposes y at slot 1]
37 OUT OF ORDER COMMIT DELAY Commit delay happens only when a server sends an ACCEPT message to others between sending PROPOSE and LEARN. Commit delay is at most one communication cycle. [Figure: timeline of P0 proposing x and learning x while y is proposed concurrently]
38 CHOOSING α, τ, AND β Recall: α: send if α SKIP messages are outstanding (Accelerator 1). τ: send if time τ has passed since the outstanding SKIP message was created (Accelerator 1). β: p revokes q's proposals in the range [C_q, I_p + 2β] if C_q < I_p + 2β (Optimization 3).
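The β rule can be turned into a small range computation. The sketch below takes the condition exactly as written on this slide; it reads C_q as the smallest slot owned by q that p has not yet learned and I_p as p's next propose slot, which is an interpretation rather than something the slide states. Round-robin ownership (slot % n) and the name `revoke_range` are also assumptions.

```python
def revoke_range(c_q, i_p, beta, n):
    """Slots owned by the suspected server q that p revokes.

    c_q  -- smallest slot owned by q that p has not learned (assumed)
    i_p  -- p's next propose slot (assumed)
    Returns q's slots in [c_q, i_p + 2*beta], or [] when the slide's
    condition c_q < i_p + 2*beta does not hold.
    """
    upper = i_p + 2 * beta
    if c_q >= upper:
        return []
    # c_q is one of q's slots, so stepping by n stays on q's slots.
    return list(range(c_q, upper + 1, n))


if __name__ == "__main__":
    # Small beta for readability; the slides use beta = 100,000.
    print(revoke_range(c_q=1, i_p=3, beta=2, n=3))
```

With these toy numbers p revokes slots 1, 4, and 7 in one go, which is the pattern shown on the "revoke more slots" slide.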
39 CHOOSING τ Should be large enough to amortize SKIP messages, but too large == extra commit delay. Mencius: τ = 50 ms. Accelerator 1 generates at most 20 SKIP msg/s. Extra delay is at most 50 ms. Such delay can occur naturally anyway (packet loss, delay, etc.).
40 CHOOSING α Limits the number of outstanding SKIP messages before servers p and q catch up. If τ is large enough, SKIP messages can be combined into one. Reduces overhead by a factor of α. Mencius: α = 20, a 95% cost reduction.
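The two figures quoted for τ and α follow from simple arithmetic, shown here as a quick check. The helper names are hypothetical; the only inputs are the values on the slides.

```python
def max_skip_rate_per_second(tau_ms):
    """Upper bound on timer-triggered SKIP messages per second when
    one is sent at most every tau_ms milliseconds."""
    return 1000.0 / tau_ms


def batching_cost_reduction(alpha):
    """Fraction of SKIP messages saved when alpha of them are
    combined into a single message."""
    return 1.0 - 1.0 / alpha


if __name__ == "__main__":
    print(max_skip_rate_per_second(50))            # tau = 50 ms
    print(round(batching_cost_reduction(20), 2))   # alpha = 20
```

A 50 ms timer gives at most 20 messages per second, and combining 20 messages into one leaves 1/20 of the cost, a 95% reduction.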
41 CHOOSING β Large β: slow recoveries during false suspicion or failure. But the overhead of having a large β is negligible: update the index to the next available slot; on SUGGEST, other replicas skip turns and catch up (Rule 2). Mencius: β = 100,000. See paper for calculation details.
42 EVALUATION Mencius vs. traditional Paxos. DETER testbed, TCP, C++. API: PROPOSE(v), ONCOMMIT(v), ISCOMMUTE(u, v). Mencius only, out of order enabled. Nagle's algorithm. α = 20, τ = 50 ms, β = 100,000.
43 THROUGHPUT ρ = 4,000 (network-bound): Mencius: 1,550 ops (82.7% utilization); Paxos: 540 ops. ρ = 0 (CPU-bound): Paxos: 6,000 ops (leader: 100% utilization, others: 50%); Mencius: 9,000 ops (all 100% utilization!). Fewer registers == lower throughput.
44 THROUGHPUT Figure 5: Mencius uses available bandwidth even when channels are asymmetric A *: 20 Mbps B *: 15 Mbps C *: 10 Mbps Figure 6: Mencius is able to adapt to changing bandwidth
45 THROUGHPUT UNDER FAILURE 3 servers, network-bound Failure after 30 seconds
46 SCALABILITY
47 LATENCY 3 site clique topology Low to medium latency
48 LATENCY
49 OTHER OPTIMIZATIONS Batch requests: higher throughput, but higher latency. Eliminate Phase 3 and broadcast ACCEPT: for Paxos, cuts learning delay by 1; for Mencius, cuts the upper bound on delayed commit by 1; increases message complexity (decreasing throughput if CPU-bound). Broadcast the body of requests: reach consensus on a unique request ID; not effective if CPU-bound.
50 RELATED WORK Consensus Fast Paxos CoReFP Moving sequencer/leader Totem S protocol Atomic broadcast Zieliński M-Consensus High throughput consensus/fault scalability FSR PBFT Zyzzyva Steward
51 FUTURE WORK AND OPEN ISSUES Byzantine failures Coordinator allocation Sites with faulty servers
52 CONCLUSION High performance Higher throughput than Paxos (CPU- or network-bound) Better scalability Suitable for wide-area applications At least Paxos-like commit latency
53 THANK YOU Questions?
More informationExploiting Commutativity For Practical Fast Replication. Seo Jin Park and John Ousterhout
Exploiting Commutativity For Practical Fast Replication Seo Jin Park and John Ousterhout Overview Problem: replication adds latency and throughput overheads CURP: Consistent Unordered Replication Protocol
More informationConcepts. Techniques for masking faults. Failure Masking by Redundancy. CIS 505: Software Systems Lecture Note on Consensus
CIS 505: Software Systems Lecture Note on Consensus Insup Lee Department of Computer and Information Science University of Pennsylvania CIS 505, Spring 2007 Concepts Dependability o Availability ready
More informationRavana: Controller Fault-Tolerance in SDN
Ravana: Controller Fault-Tolerance in SDN Software Defined Networking: The Data Centre Perspective Seminar Michel Kaporin (Mišels Kaporins) Michel Kaporin 13.05.2016 1 Agenda Introduction Controller Failures
More informationLecture 14: Congestion Control"
Lecture 14: Congestion Control" CSE 222A: Computer Communication Networks Alex C. Snoeren Thanks: Amin Vahdat, Dina Katabi Lecture 14 Overview" TCP congestion control review XCP Overview 2 Congestion Control
More informationErasure Coding in Object Stores: Challenges and Opportunities
Erasure Coding in Object Stores: Challenges and Opportunities Lewis Tseng Boston College July 2018, PODC Acknowledgements Nancy Lynch Muriel Medard Kishori Konwar Prakash Narayana Moorthy Viveck R. Cadambe
More informationDistributed Systems COMP 212. Lecture 19 Othon Michail
Distributed Systems COMP 212 Lecture 19 Othon Michail Fault Tolerance 2/31 What is a Distributed System? 3/31 Distributed vs Single-machine Systems A key difference: partial failures One component fails
More informationDistributed Systems Exam 1 Review. Paul Krzyzanowski. Rutgers University. Fall 2016
Distributed Systems 2016 Exam 1 Review Paul Krzyzanowski Rutgers University Fall 2016 Question 1 Why does it not make sense to use TCP (Transmission Control Protocol) for the Network Time Protocol (NTP)?
More informationCS /15/16. Paul Krzyzanowski 1. Question 1. Distributed Systems 2016 Exam 2 Review. Question 3. Question 2. Question 5.
Question 1 What makes a message unstable? How does an unstable message become stable? Distributed Systems 2016 Exam 2 Review Paul Krzyzanowski Rutgers University Fall 2016 In virtual sychrony, a message
More informationApplication of SDN: Load Balancing & Traffic Engineering
Application of SDN: Load Balancing & Traffic Engineering Outline 1 OpenFlow-Based Server Load Balancing Gone Wild Introduction OpenFlow Solution Partitioning the Client Traffic Transitioning With Connection
More informationThe Timed Asynchronous Distributed System Model By Flaviu Cristian and Christof Fetzer
The Timed Asynchronous Distributed System Model By Flaviu Cristian and Christof Fetzer - proposes a formal definition for the timed asynchronous distributed system model - presents measurements of process
More informationConsensus. Chapter Two Friends. 2.3 Impossibility of Consensus. 2.2 Consensus 16 CHAPTER 2. CONSENSUS
16 CHAPTER 2. CONSENSUS Agreement All correct nodes decide for the same value. Termination All correct nodes terminate in finite time. Validity The decision value must be the input value of a node. Chapter
More informationConsensus for Non-Volatile Main Memory
1 Consensus for Non-Volatile Main Memory Huynh Tu Dang, Jaco Hofmann, Yang Liu, Marjan Radi, Dejan Vucinic, Fernando Pedone, and Robert Soulé University of Lugano, TU Darmstadt, and Western Digital 2 Traditional
More information