MDCC MULTI DATA CENTER CONSISTENCY. amplab. Tim Kraska, Gene Pang, Michael Franklin, Samuel Madden, Alan Fekete
|
|
- Cuthbert McCormick
- 5 years ago
- Views:
Transcription
1 MDCC MULTI DATA CENTER CONSISTENCY Tim Kraska, Gene Pang, Michael Franklin, Samuel Madden, Alan Fekete amplab
2 MOTIVATION 2
3 3
4 June 2, 200: Rackspace power outage of approximately 0 minutes {FAILURE}
5 June 2, 200: Rackspace power outage of approximately 0 minutes {FAILURE} April 21, 2011: AWS East outage of over 2 hours, with data loss {FAILURE} 5
6 June 2, 200: Rackspace power outage of approximately 0 minutes UNRELIABLE DATA CENTERS {FAILURE} April 21, 2011: AWS East outage of over 2 hours, with data loss {FAILURE} 6
7 SOLUTION: GEO-REPLICATION 7
8 HIGH NETWORK LATENCY 8
9 THE ENVIRONMENT UNRELIABLE DATA CENTERS + HIGH NETWORK LATENCY
10 OUR CHALLENGE FAST AND RELIABLE DATABASE UNRELIABLE DATA CENTERS + HIGH NETWORK LATENCY 10
11 MDCC MULTI DATA CENTER CONSISTENCY new distributed commit protocol STRONG CONSISTENCY ACROSS DATA CENTERS LOW LATENCY NO DATA LOSS 11
12 EXISTING SOLUTIONS Consistency Transactional Unit Commit Latency Data Loss Possible? Amazon Dynamo Eventual None 1 round trip Not possible Yahoo PNUTS Timeline per key Single key 1 round trip Possible COPS Causality Multi-record 1 round trip Possible MySQL (async) Serializable Static partition 1 round trip Possible Google Megastore Serializable Static partition 2 round trips Not possible Google Spanner Snapshot Isolation Partition 2 round trips Not possible Walter Parallel Snapshot Isolation Multi-record 2 round trips Not possible 12
13 EXISTING SOLUTIONS + MDCC Consistency Transactional Unit Commit Latency Data Loss Possible? Amazon Dynamo Eventual None 1 round trip Not possible Yahoo PNUTS Timeline per key Single key 1 round trip Possible COPS Causality Multi-record 1 round trip Possible MySQL (async) Serializable Static partition 1 round trip Possible Google Megastore Serializable Static partition 2 round trips Not possible Google Spanner Snapshot Isolation Partition 2 round trips Not possible Walter MDCC Parallel Snapshot Isolation Read-committed without lost updates Multi-record 2 round trips Not possible Multi-record 1* round trip Not possible 13
14 WHY MDCC WORKS OPTIMIZES FOR TWO KEY OBSERVATIONS ON WORKLOADS 1
15 WHY MDCC WORKS CONFLICTS ARE RARE everyone updates their own data 15
16 WHY MDCC WORKS CONFLICTS ARE RARE everyone updates their own data CONFLICTS COMMUTE order of concurrent updates isn t important 16
17 MDCC OUTLINE TWO-PHASE COMMIT: BACKGROUND MDCC DISTRIBUTED TRANSACTIONS OPTIMIZATION: CONFLICTS ARE RARE OPTIMIZATION: CONFLICTS COMMUTE EVALUATION 17
18 TWO-PHASE COMMIT: BACKGROUND 18
19 TWO-PHASE COMMIT (2PC) STANDARD METHOD FOR DISTRIBUTED TRANSACTIONS PREPARE PHASE + COMMIT PHASE SYSTEMS TYPICALLY USE 2PC BETWEEN PARTITIONS 1
20 AN EXAMPLE TRANSACTION buy 1 book-a ( in stock) buy 3 book-b ( in stock) 20
21 TWO-PHASE COMMIT book-a coordinator book-b 21
22 TWO-PHASE COMMIT book-a coordinator book-b coordinator initiates transaction commit by sending PREPARE to all s 22
23 TWO-PHASE COMMIT book-a coordinator book-b s vote yes or no for the commit 23
24 TWO-PHASE COMMIT book-a coordinator book-b coordinator makes final decision (commit/abort) can commit only when ALL votes are yes 2
25 TWO-PHASE COMMIT book-a coordinator all s must respond for commit book-b coordinator makes final decision (commit/abort) can commit only when ALL votes are yes 25
26 TWO-PHASE COMMIT book-a coordinator final decision by coordinator book-b coordinator makes final decision (commit/abort) can commit only when ALL votes are yes 26
27 TWO-PHASE COMMIT transaction committed book-a coordinator book-b coordinator stores final decision 27
28 TWO-PHASE COMMIT transaction committed book-a coordinator decision stored only coordinator book-b coordinator stores final decision 28
29 TWO-PHASE COMMIT transaction committed book-a coordinator coordinator failure could block protocol book-b coordinator stores final decision 2
30 TWO-PHASE COMMIT book-a coordinator 6 6 book-b coordinator informs s with COMMIT phase 6 30
31 MDCC DISTRIBUTED TRANSACTIONS 31
32 MDCC TRANSACTIONS book-a coordinator book-b 32
33 MDCC TRANSACTIONS leader-a book-a coordinator leader-b book-b separate Paxos instance per record 33
34 MDCC TRANSACTIONS leader-a book-a coordinator leader-b book-b each executes Paxos replicated state machine 3
35 MDCC TRANSACTIONS 3 leader-a book-a coordinator 6 leader-b book-b coordinator proposes options (potential updates) 35
36 MDCC TRANSACTIONS coordinator leader-a leader-b book-a book-b s validate the options 36
37 coordinator leader-a leader-b MDCC TRANSACTIONS book-a book-b check with all permutations of pending options
38 coordinator leader-a leader-b MDCC TRANSACTIONS book-a book-b check: write-write conflicts, integrity constraints
39 coordinator leader-a leader-b MDCC TRANSACTIONS book-a book-b s tag options as accepted or rejected
40 MDCC TRANSACTIONS 0 coordinator book-a book-b leader-a leader-b Paxos learners learn the result
41 MDCC TRANSACTIONS transaction committed coordinator leader-a leader-b book-a book-b commits when all options are learned
42 MDCC TRANSACTIONS 2 coordinator book-a book-b leader-a leader-b coordinator cannot change transaction outcome transaction committed
43 transaction committed coordinator leader-a leader-b MDCC TRANSACTIONS book-a book-b once learned, options are never unlearned
44 transaction committed coordinator leader-a leader-b MDCC TRANSACTIONS book-a book-b notify s to execute option, to make visible
45 coordinator leader-a leader-b MDCC TRANSACTIONS book-a book-b option visibility can bypass the coordinator
46 MDCC TRANSACTIONS 3 3 leader-a 3 3 book-a 3 3 coordinator leader-b book-b options enable read-committed isolation
47 MDCC VS. TWO-PHASE COMMIT MDCC makes distributed commit decision stores distributed commit decision can tolerate failures/delays TWO-PHASE COMMIT makes commit decision on coordinator stores commit decision on coordinator failures/delays can block protocol 7
48 OPTIMIZATION: CONFLICTS ARE RARE 8
49 MDCC: FAST PAXOS FIRST TO USE FAST PAXOS FOR TRANSACTIONS POSSIBLE TO BYPASS THE LEADER CAN REDUCE NUMBER OF MESSAGE ROUND-TRIPS, WHEN CONFLICTS ARE RARE
50 FAST PAXOS CLASSIC PAXOS clients propose through leader FAST PAXOS clients propose directly to s leader leader 50
51 THE FAST IN FAST PAXOS BYPASS THE LEADER CONSENSUS IN 1 ROUND-TRIP 51
52 THE SLOW IN FAST PAXOS CONCURRENT UPDATES CAN PREVENT CONSENSUS LEADER MUST STEP IN TO RESOLVE CONFLICTS 2 ADDITIONAL MESSAGE ROUNDS TO RESOLVE CONSENSUS MAY TAKE 3 ROUND-TRIPS 52
53 LARGER QUORUMS IN FAST PAXOS CLASSIC PAXOS N/2 responses needed FAST PAXOS 2N/3 + 1 responses needed leader leader 53
54 MDCC: FAST PAXOS IMPLICATIONS TRANSACTIONS CAN BYPASS THE LEADERS TRANSACTIONS CAN COMMIT IN 1 ROUND-TRIP, WHEN CONFLICTS ARE RARE 5
55 OPTIMIZATION: CONFLICTS COMMUTE 55
56 COMMUTATIVE UPDATES commutative updates do not need to be totally ordered 56
57 COMMUTATIVE UPDATES commutative updates do not need to be totally ordered
58 COMMUTATIVE UPDATES commutative updates do not need to be totally ordered =
59 COMMUTATIVE UPDATES commutative updates do not need to be totally ordered = =
60 MDCC: GENERALIZED PAXOS GENERALIZATION OF FAST PAXOS INSTANCES AGREE ON COMMAND STRUCTURES (C-STRUCTS) C-STRUCTS ARE SEQUENCES OF COMMANDS, NOT JUST A SINGLE COMMAND 60
61 MDCC: COMMUTATIVE UPDATES perfect for c-structs of Generalized Paxos can use Fast Paxos (fast commits) without worrying about the order! however, order matters with integrity constraints 61
62 EXAMPLE OF COMMUTATIVE UPDATES online bookstore: books in stock integrity constraint: stock 0 5 replicas Fast Paxos quorum size of 62
63 COMMUTATIVE UPDATES 63
64 COMMUTATIVE UPDATES 6
65 COMMUTATIVE UPDATES 1 book sold! 65
66 COMMUTATIVE UPDATES 1 book sold! 66
67 COMMUTATIVE UPDATES 2 books sold! 67
68 COMMUTATIVE UPDATES 2 books sold! 68
69 COMMUTATIVE UPDATES 3 books sold! 6
70 COMMUTATIVE UPDATES 3 books sold! 70
71 COMMUTATIVE UPDATES books sold! 71
72 COMMUTATIVE UPDATES books sold! 72
73 COMMUTATIVE UPDATES 5 books sold! 73
74 COMMUTATIVE UPDATES 5 books sold! VIOLATED INTEGRITY CONSTRAINTS! 7
75 SOLUTION: QUORUM DEMARCATION adjust the constraint (stock 0) to keep replicas consistent without coordination commits require quorums, so coordination can piggyback on quorum messages quorum demarcation: stock N Q N value 75
76 COMMUTATIVE UPDATES stock N Q N value =
77 COMMUTATIVE UPDATES 3 books sold! stock N Q N value =
78 COMMUTATIVE UPDATES 3 books sold! N Q stock value = 0.8 N 78
79 MDCC: COMMUTATIVE UPDATES FAST COMMITS EVEN WHEN CONFLICTS OCCUR, WHEN ORDER IS NOT IMPORTANT MAINTAIN CONSISTENCY AND FAST COMMITS, WITH QUORUM DEMARCATION 7
80 EVALUATION 80
81 MDCC EXPERIMENTAL SETUP implemented on SCADS, a distributed key-value store fully replicated to 5 Amazon EC2 regions (California, Virginia, Ireland, Singapore, Tokyo) TPC-W benchmark for general performance micro benchmark for protocol investigation 81
82 TPC-W BENCHMARK 82
83 WRITE RESPONSE TIME CDF Percentage of Transactions QW-3 QW- MDCC 2PC Megastore* Write Transaction Response Times, log-scale (ms) 83
84 MICRO BENCHMARK 8
85 VARIOUS MDCC CONFIGURATIONS MDCC Fast Multi Supports Multi-Paxos Supports Fast Paxos Optimizes Commutative Updates 85
86 WRITE RESPONSE TIME CDF Percentage of Transactions Write Transaction Response Times (ms) MDCC Fast Multi 2PC 86
87 VARYING CONFLICT RATE Commits/Aborts (in thousands) PC Multi Fast MDCC 2PC Multi Fast MDCC 2PC Multi Fast MDCC 2PC Multi Fast MDCC Commits Aborts 2PC Multi Fast MDCC 2PC Multi Fast MDCC 0% 50% 20% 10% Hotspot Size 5% 2% 87
88 CONCLUSION 88
89 MDCC - NEW COMMIT PROTOCOL CONFLICTS ARE RARE -OR- CONFLICTS COMMUTE STRONG CONSISTENCY ACROSS DATA CENTERS 1* ROUND-TRIP DURABLE COMMITS COMPARABLE PERFORMANCE TO EVENTUALLY CONSISTENT PROTOCOLS amplab THANK YOU! GENE PANG - GPANG@CS.BERKELEY.EDU 8
90 0
Janus. Consolidating Concurrency Control and Consensus for Commits under Conflicts. Shuai Mu, Lamont Nelson, Wyatt Lloyd, Jinyang Li
Janus Consolidating Concurrency Control and Consensus for Commits under Conflicts Shuai Mu, Lamont Nelson, Wyatt Lloyd, Jinyang Li New York University, University of Southern California State of the Art
More informationBuilding Consistent Transactions with Inconsistent Replication
Building Consistent Transactions with Inconsistent Replication Irene Zhang, Naveen Kr. Sharma, Adriana Szekeres, Arvind Krishnamurthy, Dan R. K. Ports University of Washington Distributed storage systems
More informationScalable Transaction Management with Snapshot Isolation for Distributed Systems J.Suba Lakshmi#1, K.Arulanandam*2,T.Varadarajan*3
Scalable Transaction Management with Snapshot Isolation for Distributed Systems J.Suba Lakshmi#1, K.Arulanandam*2,T.Varadarajan*3 #1Research Scholar, *2Assistant Professor,*3Assistant Professor Department
More informationTAPIR. By Irene Zhang, Naveen Sharma, Adriana Szekeres, Arvind Krishnamurthy, and Dan Ports Presented by Todd Charlton
TAPIR By Irene Zhang, Naveen Sharma, Adriana Szekeres, Arvind Krishnamurthy, and Dan Ports Presented by Todd Charlton Outline Problem Space Inconsistent Replication TAPIR Evaluation Conclusion Problem
More informationScalable Transactions for Scalable Distributed Database Systems. Gene Pang. A dissertation submitted in partial satisfaction of the
Scalable Transactions for Scalable Distributed Database Systems by Gene Pang A dissertation submitted in partial satisfaction of the requirements for the degree of Doctor of Philosophy in Computer Science
More informationSurviving congestion in geo-distributed storage systems
Surviving congestion in geo-distributed storage systems Brian Cho Marcos K. Aguilera University of Illinois at Urbana-Champaign Microsoft Research Silicon Valley Geo-distributed data centers Web applications
More informationPerformance and Forgiveness. June 23, 2008 Margo Seltzer Harvard University School of Engineering and Applied Sciences
Performance and Forgiveness June 23, 2008 Margo Seltzer Harvard University School of Engineering and Applied Sciences Margo Seltzer Architect Outline A consistency primer Techniques and costs of consistency
More informationCS6450: Distributed Systems Lecture 11. Ryan Stutsman
Strong Consistency CS6450: Distributed Systems Lecture 11 Ryan Stutsman Material taken/derived from Princeton COS-418 materials created by Michael Freedman and Kyle Jamieson at Princeton University. Licensed
More informationCS /15/16. Paul Krzyzanowski 1. Question 1. Distributed Systems 2016 Exam 2 Review. Question 3. Question 2. Question 5.
Question 1 What makes a message unstable? How does an unstable message become stable? Distributed Systems 2016 Exam 2 Review Paul Krzyzanowski Rutgers University Fall 2016 In virtual sychrony, a message
More informationDistributed Systems. 10. Consensus: Paxos. Paul Krzyzanowski. Rutgers University. Fall 2017
Distributed Systems 10. Consensus: Paxos Paul Krzyzanowski Rutgers University Fall 2017 1 Consensus Goal Allow a group of processes to agree on a result All processes must agree on the same value The value
More informationBuilding Consistent Transactions with Inconsistent Replication
DB Reading Group Fall 2015 slides by Dana Van Aken Building Consistent Transactions with Inconsistent Replication Irene Zhang, Naveen Kr. Sharma, Adriana Szekeres, Arvind Krishnamurthy, Dan R. K. Ports
More informationTransactions. CS 475, Spring 2018 Concurrent & Distributed Systems
Transactions CS 475, Spring 2018 Concurrent & Distributed Systems Review: Transactions boolean transfermoney(person from, Person to, float amount){ if(from.balance >= amount) { from.balance = from.balance
More informationCS October 2017
Atomic Transactions Transaction An operation composed of a number of discrete steps. Distributed Systems 11. Distributed Commit Protocols All the steps must be completed for the transaction to be committed.
More informationSHAFT: Supporting Transactions with Serializability and Fault-Tolerance in Highly-Available Datastores
: Supporting Transactions with Serializability and Fault-Tolerance in Highly-Available Datastores Yuqing Zhu, Yilei Wang Institute for Computing Technology, Chinese Academy of Sciences, Beijing 90, China
More informationTechnical Report TR Design and Evaluation of a Transaction Model with Multiple Consistency Levels for Replicated Data
Technical Report Department of Computer Science and Engineering University of Minnesota 4-192 Keller Hall 200 Union Street SE Minneapolis, MN 55455-0159 USA TR 15-009 Design and Evaluation of a Transaction
More informationTransaction Management using Causal Snapshot Isolation in Partially Replicated Databases
Transaction Management using Causal Snapshot Isolation in Partially Replicated Databases ABSTRACT We present here a transaction management protocol using snapshot isolation in partially replicated multi-version
More informationData Store Consistency. Alan Fekete
Data Store Consistency Alan Fekete Outline Overview Consistency Properties - Research paper: Wada et al, CIDR 11 Transaction Techniques - Research paper: Bailis et al, HotOS 13, VLDB 14 Each data item
More informationStrong Consistency & CAP Theorem
Strong Consistency & CAP Theorem CS 240: Computing Systems and Concurrency Lecture 15 Marco Canini Credits: Michael Freedman and Kyle Jamieson developed much of the original material. Consistency models
More informationDistributed systems. Lecture 6: distributed transactions, elections, consensus and replication. Malte Schwarzkopf
Distributed systems Lecture 6: distributed transactions, elections, consensus and replication Malte Schwarzkopf Last time Saw how we can build ordered multicast Messages between processes in a group Need
More informationCS Amazon Dynamo
CS 5450 Amazon Dynamo Amazon s Architecture Dynamo The platform for Amazon's e-commerce services: shopping chart, best seller list, produce catalog, promotional items etc. A highly available, distributed
More informationModeling, Analyzing, and Extending Megastore using Real-Time Maude
Modeling, Analyzing, and Extending Megastore using Real-Time Maude Jon Grov 1 and Peter Ölveczky 1,2 1 University of Oslo 2 University of Illinois at Urbana-Champaign Thanks to Indranil Gupta (UIUC) and
More informationThere Is More Consensus in Egalitarian Parliaments
There Is More Consensus in Egalitarian Parliaments Iulian Moraru, David Andersen, Michael Kaminsky Carnegie Mellon University Intel Labs Fault tolerance Redundancy State Machine Replication 3 State Machine
More informationConsensus a classic problem. Consensus, impossibility results and Paxos. Distributed Consensus. Asynchronous networks.
Consensus, impossibility results and Paxos Ken Birman Consensus a classic problem Consensus abstraction underlies many distributed systems and protocols N processes They start execution with inputs {0,1}
More informationSecurity (and finale) Dan Ports, CSEP 552
Security (and finale) Dan Ports, CSEP 552 Today Security: what if parts of your distributed system are malicious? BFT: state machine replication Bitcoin: peer-to-peer currency Course wrap-up Security Too
More informationSCALABLE CONSISTENCY AND TRANSACTION MODELS
Data Management in the Cloud SCALABLE CONSISTENCY AND TRANSACTION MODELS 69 Brewer s Conjecture Three properties that are desirable and expected from realworld shared-data systems C: data consistency A:
More informationSeminar on. Fault Tolerance in Cloud Computing
T. Cucinotta Real-Time Systems Laboratory (ReTiS) Scuola Superiore Sant Anna 1 Course: Concurrent & Distributed Systems December 12th, 2016 University of Pisa Seminar on Fault Tolerance in Cloud Computing
More informationDistributed Consensus: Making Impossible Possible
Distributed Consensus: Making Impossible Possible QCon London Tuesday 29/3/2016 Heidi Howard PhD Student @ University of Cambridge heidi.howard@cl.cam.ac.uk @heidiann360 What is Consensus? The process
More informationCS6450: Distributed Systems Lecture 15. Ryan Stutsman
Strong Consistency CS6450: Distributed Systems Lecture 15 Ryan Stutsman Material taken/derived from Princeton COS-418 materials created by Michael Freedman and Kyle Jamieson at Princeton University. Licensed
More informationDistributed System. Gang Wu. Spring,2018
Distributed System Gang Wu Spring,2018 Lecture4:Failure& Fault-tolerant Failure is the defining difference between distributed and local programming, so you have to design distributed systems with the
More informationBuilding Consistent Transactions with Inconsistent Replication
Building Consistent Transactions with Inconsistent Replication Irene Zhang Naveen Kr. Sharma Adriana Szekeres Arvind Krishnamurthy Dan R. K. Ports University of Washington {iyzhang, naveenks, aaasz, arvind,
More informationMegastore: Providing Scalable, Highly Available Storage for Interactive Services & Spanner: Google s Globally- Distributed Database.
Megastore: Providing Scalable, Highly Available Storage for Interactive Services & Spanner: Google s Globally- Distributed Database. Presented by Kewei Li The Problem db nosql complex legacy tuning expensive
More informationSDPaxos: Building Efficient Semi-Decentralized Geo-replicated State Machines
SDPaxos: Building Efficient Semi-Decentralized Geo-replicated State Machines Hanyu Zhao *, Quanlu Zhang, Zhi Yang *, Ming Wu, Yafei Dai * * Peking University Microsoft Research Replication for Fault Tolerance
More informationAgreement and Consensus. SWE 622, Spring 2017 Distributed Software Engineering
Agreement and Consensus SWE 622, Spring 2017 Distributed Software Engineering Today General agreement problems Fault tolerance limitations of 2PC 3PC Paxos + ZooKeeper 2 Midterm Recap 200 GMU SWE 622 Midterm
More informationReplicated State Machine in Wide-area Networks
Replicated State Machine in Wide-area Networks Yanhua Mao CSE223A WI09 1 Building replicated state machine with consensus General approach to replicate stateful deterministic services Provide strong consistency
More informationCS 425 / ECE 428 Distributed Systems Fall 2017
CS 425 / ECE 428 Distributed Systems Fall 2017 Indranil Gupta (Indy) Nov 7, 2017 Lecture 21: Replication Control All slides IG Server-side Focus Concurrency Control = how to coordinate multiple concurrent
More informationMINIMIZING TRANSACTION LATENCY IN GEO-REPLICATED DATA STORES
MINIMIZING TRANSACTION LATENCY IN GEO-REPLICATED DATA STORES Divy Agrawal Department of Computer Science University of California at Santa Barbara Joint work with: Amr El Abbadi, Hatem Mahmoud, Faisal
More informationLast time. Distributed systems Lecture 6: Elections, distributed transactions, and replication. DrRobert N. M. Watson
Distributed systems Lecture 6: Elections, distributed transactions, and replication DrRobert N. M. Watson 1 Last time Saw how we can build ordered multicast Messages between processes in a group Need to
More informationCrossStitch: An Efficient Transaction Processing Framework for Geo-Distributed Systems
CrossStitch: An Efficient Transaction Processing Framework for Geo-Distributed Systems Sharon Choy, Bernard Wong, Xu Cui, Xiaoyi Liu Cheriton School of Computer Science, University of Waterloo s2choy,
More informationTail Latency in ZooKeeper and a Simple Reimplementation
Tail Latency in ZooKeeper and a Simple Reimplementation Michael Graczyk Abstract ZooKeeper [1] is a commonly used service for coordinating distributed applications. ZooKeeper uses leader-based atomic broadcast
More informationIntegrity in Distributed Databases
Integrity in Distributed Databases Andreas Farella Free University of Bozen-Bolzano Table of Contents 1 Introduction................................................... 3 2 Different aspects of integrity.....................................
More informationLow Overhead Concurrency Control for Partitioned Main Memory Databases
Low Overhead Concurrency Control for Partitioned Main Memory Databases Evan Jones, Daniel Abadi, Samuel Madden, June 2010, SIGMOD CS 848 May, 2016 Michael Abebe Background Motivations Database partitioning
More informationCSE 544 Principles of Database Management Systems. Alvin Cheung Fall 2015 Lecture 14 Distributed Transactions
CSE 544 Principles of Database Management Systems Alvin Cheung Fall 2015 Lecture 14 Distributed Transactions Transactions Main issues: Concurrency control Recovery from failures 2 Distributed Transactions
More informationECE 6102 Exam 2. Multiple Choice. Problem 6. Problem 7. Problem 8. Total
ECE 6102 Exam 2 Instructions: You have 1 hour, 20 minutes to complete this test. The test is closed book and closed notes. No calculators, phones, or other electronic devices are allowed. Make sure to
More informationFailure models. Byzantine Fault Tolerance. What can go wrong? Paxos is fail-stop tolerant. BFT model. BFT replication 5/25/18
Failure models Byzantine Fault Tolerance Fail-stop: nodes either execute the protocol correctly or just stop Byzantine failures: nodes can behave in any arbitrary way Send illegal messages, try to trick
More informationConsensus, impossibility results and Paxos. Ken Birman
Consensus, impossibility results and Paxos Ken Birman Consensus a classic problem Consensus abstraction underlies many distributed systems and protocols N processes They start execution with inputs {0,1}
More informationBeyond FLP. Acknowledgement for presentation material. Chapter 8: Distributed Systems Principles and Paradigms: Tanenbaum and Van Steen
Beyond FLP Acknowledgement for presentation material Chapter 8: Distributed Systems Principles and Paradigms: Tanenbaum and Van Steen Paper trail blog: http://the-paper-trail.org/blog/consensus-protocols-paxos/
More informationEnhancing Throughput of
Enhancing Throughput of NCA 2017 Zhongmiao Li, Peter Van Roy and Paolo Romano Enhancing Throughput of Partially Replicated State Machines via NCA 2017 Zhongmiao Li, Peter Van Roy and Paolo Romano Enhancing
More informationConsistency in Distributed Systems
Consistency in Distributed Systems Recall the fundamental DS properties DS may be large in scale and widely distributed 1. concurrent execution of components 2. independent failure modes 3. transmission
More informationDistributed Systems. Day 13: Distributed Transaction. To Be or Not to Be Distributed.. Transactions
Distributed Systems Day 13: Distributed Transaction To Be or Not to Be Distributed.. Transactions Summary Background on Transactions ACID Semantics Distribute Transactions Terminology: Transaction manager,,
More informationCPS 512 midterm exam #1, 10/7/2016
CPS 512 midterm exam #1, 10/7/2016 Your name please: NetID: Answer all questions. Please attempt to confine your answers to the boxes provided. If you don t know the answer to a question, then just say
More informationExam 2 Review. Fall 2011
Exam 2 Review Fall 2011 Question 1 What is a drawback of the token ring election algorithm? Bad question! Token ring mutex vs. Ring election! Ring election: multiple concurrent elections message size grows
More informationCockroachDB on DC/OS. Ben Darnell, CTO, Cockroach Labs
CockroachDB on DC/OS Ben Darnell, CTO, Cockroach Labs Agenda A cloud-native database CockroachDB on DC/OS Why CockroachDB Demo! Cloud-Native Database What is Cloud-Native? Horizontally scalable Individual
More informationPNUTS: Yahoo! s Hosted Data Serving Platform. Reading Review by: Alex Degtiar (adegtiar) /30/2013
PNUTS: Yahoo! s Hosted Data Serving Platform Reading Review by: Alex Degtiar (adegtiar) 15-799 9/30/2013 What is PNUTS? Yahoo s NoSQL database Motivated by web applications Massively parallel Geographically
More informationReplications and Consensus
CPSC 426/526 Replications and Consensus Ennan Zhai Computer Science Department Yale University Recall: Lec-8 and 9 In the lec-8 and 9, we learned: - Cloud storage and data processing - File system: Google
More informationApplications of Paxos Algorithm
Applications of Paxos Algorithm Gurkan Solmaz COP 6938 - Cloud Computing - Fall 2012 Department of Electrical Engineering and Computer Science University of Central Florida - Orlando, FL Oct 15, 2012 1
More informationEECS 498 Introduction to Distributed Systems
EECS 498 Introduction to Distributed Systems Fall 2017 Harsha V. Madhyastha Replicated State Machines Logical clocks Primary/ Backup Paxos? 0 1 (N-1)/2 No. of tolerable failures October 11, 2017 EECS 498
More informationCausal Consistency and Two-Phase Commit
Causal Consistency and Two-Phase Commit CS 240: Computing Systems and Concurrency Lecture 16 Marco Canini Credits: Michael Freedman and Kyle Jamieson developed much of the original material. Consistency
More informationCSE 444: Database Internals. Section 9: 2-Phase Commit and Replication
CSE 444: Database Internals Section 9: 2-Phase Commit and Replication 1 Today 2-Phase Commit Replication 2 Two-Phase Commit Protocol (2PC) One coordinator and many subordinates Phase 1: Prepare Phase 2:
More informationAGREEMENT PROTOCOLS. Paxos -a family of protocols for solving consensus
AGREEMENT PROTOCOLS Paxos -a family of protocols for solving consensus OUTLINE History of the Paxos algorithm Paxos Algorithm Family Implementation in existing systems References HISTORY OF THE PAXOS ALGORITHM
More informationBig Data Infrastructure CS 489/698 Big Data Infrastructure (Winter 2017)
Big Data Infrastructure CS 489/698 Big Data Infrastructure (Winter 2017) Week 10: Mutable State (2/2) March 16, 2017 Jimmy Lin David R. Cheriton School of Computer Science University of Waterloo These
More informationEECS 498 Introduction to Distributed Systems
EECS 498 Introduction to Distributed Systems Fall 2017 Harsha V. Madhyastha Dynamo Recap Consistent hashing 1-hop DHT enabled by gossip Execution of reads and writes Coordinated by first available successor
More informationScalable Causal Consistency for Wide-Area Storage with COPS
Don t Settle for Eventual: Scalable Causal Consistency for Wide-Area Storage with COPS Wyatt Lloyd * Michael J. Freedman * Michael Kaminsky David G. Andersen * Princeton, Intel Labs, CMU The Key-value
More informationReplication in Distributed Systems
Replication in Distributed Systems Replication Basics Multiple copies of data kept in different nodes A set of replicas holding copies of a data Nodes can be physically very close or distributed all over
More informationRecovering from a Crash. Three-Phase Commit
Recovering from a Crash If INIT : abort locally and inform coordinator If Ready, contact another process Q and examine Q s state Lecture 18, page 23 Three-Phase Commit Two phase commit: problem if coordinator
More informationLow-Latency Multi-Datacenter Databases using Replicated Commit
Low-Latency Multi-Datacenter Databases using Replicated Commit Hatem Mahmoud, Faisal Nawab, Alexander Pucher, Divyakant Agrawal, Amr El Abbadi UCSB Presented by Ashutosh Dhekne Main Contributions Reduce
More informationDistributed Systems 11. Consensus. Paul Krzyzanowski
Distributed Systems 11. Consensus Paul Krzyzanowski pxk@cs.rutgers.edu 1 Consensus Goal Allow a group of processes to agree on a result All processes must agree on the same value The value must be one
More informationTransaction Management using Causal Snapshot Isolation in Partially Replicated Databases. Technical Report
Transaction Management using Causal Snapshot Isolation in Partially Replicated Databases Technical Report Department of Computer Science and Engineering University of Minnesota 4-192 Keller Hall 200 Union
More informationLarge-Scale Key-Value Stores Eventual Consistency Marco Serafini
Large-Scale Key-Value Stores Eventual Consistency Marco Serafini COMPSCI 590S Lecture 13 Goals of Key-Value Stores Export simple API put(key, value) get(key) Simpler and faster than a DBMS Less complexity,
More informationDistributed Systems Consensus
Distributed Systems Consensus Amir H. Payberah amir@sics.se Amirkabir University of Technology (Tehran Polytechnic) Amir H. Payberah (Tehran Polytechnic) Consensus 1393/6/31 1 / 56 What is the Problem?
More informationParallel Data Types of Parallelism Replication (Multiple copies of the same data) Better throughput for read-only computations Data safety Partitionin
Parallel Data Types of Parallelism Replication (Multiple copies of the same data) Better throughput for read-only computations Data safety Partitioning (Different data at different sites More space Better
More informationLow-Latency Multi-Datacenter Databases using Replicated Commits
Low-Latency Multi-Datacenter Databases using Replicated Commits Hatem A. Mahmoud, Alexander Pucher, Faisal Nawab, Divyakant Agrawal, Amr El Abbadi Universoty of California Santa Barbara, CA, USA {hatem,pucher,nawab,agrawal,amr}@cs.ucsb.edu
More informationExploiting Commutativity For Practical Fast Replication. Seo Jin Park and John Ousterhout
Exploiting Commutativity For Practical Fast Replication Seo Jin Park and John Ousterhout Overview Problem: consistent replication adds latency and throughput overheads Why? Replication happens after ordering
More informationDesign and Validation of Distributed Data Stores using Formal Methods
Design and Validation of Distributed Data Stores using Formal Methods Peter Ölveczky University of Oslo University of Illinois at Urbana-Champaign Based on joint work with Jon Grov, Indranil Gupta, Si
More informationGoogle Spanner - A Globally Distributed,
Google Spanner - A Globally Distributed, Synchronously-Replicated Database System James C. Corbett, et. al. Feb 14, 2013. Presented By Alexander Chow For CS 742 Motivation Eventually-consistent sometimes
More informationExam 2 Review. October 29, Paul Krzyzanowski 1
Exam 2 Review October 29, 2015 2013 Paul Krzyzanowski 1 Question 1 Why did Dropbox add notification servers to their architecture? To avoid the overhead of clients polling the servers periodically to check
More informationPNUTS and Weighted Voting. Vijay Chidambaram CS 380 D (Feb 8)
PNUTS and Weighted Voting Vijay Chidambaram CS 380 D (Feb 8) PNUTS Distributed database built by Yahoo Paper describes a production system Goals: Scalability Low latency, predictable latency Must handle
More informationDistributed Data Management Transactions
Felix Naumann F-2.03/F-2.04, Campus II Hasso Plattner Institut must ensure that interactions succeed consistently An OLTP Topic Motivation Most database interactions consist of multiple, coherent operations
More informationSpanner : Google's Globally-Distributed Database. James Sedgwick and Kayhan Dursun
Spanner : Google's Globally-Distributed Database James Sedgwick and Kayhan Dursun Spanner - A multi-version, globally-distributed, synchronously-replicated database - First system to - Distribute data
More informationFault-Tolerant Distributed Services and Paxos"
Fault-Tolerant Distributed Services and Paxos" INF346, 2015 2015 P. Kuznetsov and M. Vukolic So far " " Shared memory synchronization:" Wait-freedom and linearizability" Consensus and universality " Fine-grained
More informationDistributed Consensus: Making Impossible Possible
Distributed Consensus: Making Impossible Possible Heidi Howard PhD Student @ University of Cambridge heidi.howard@cl.cam.ac.uk @heidiann360 hh360.user.srcf.net Sometimes inconsistency is not an option
More informationStronger Semantics for Low-Latency Geo-Replicated Storage
To appear in Proceedings of the 10th USENIX Symposium on Networked Systems Design and Implementation (NSDI 13), Lombard, IL, April 2013 Stronger Semantics for Low-Latency Geo-Replicated Storage Wyatt Lloyd,
More informationProseminar Distributed Systems Summer Semester Paxos algorithm. Stefan Resmerita
Proseminar Distributed Systems Summer Semester 2016 Paxos algorithm stefan.resmerita@cs.uni-salzburg.at The Paxos algorithm Family of protocols for reaching consensus among distributed agents Agents may
More informationSpanner: Google's Globally-Distributed Database* Huu-Phuc Vo August 03, 2013
Spanner: Google's Globally-Distributed Database* Huu-Phuc Vo August 03, 2013 *OSDI '12, James C. Corbett et al. (26 authors), Jay Lepreau Best Paper Award Outline What is Spanner? Features & Example Structure
More informationDistributed Systems. Characteristics of Distributed Systems. Lecture Notes 1 Basic Concepts. Operating Systems. Anand Tripathi
1 Lecture Notes 1 Basic Concepts Anand Tripathi CSci 8980 Operating Systems Anand Tripathi CSci 8980 1 Distributed Systems A set of computers (hosts or nodes) connected through a communication network.
More informationDistributed Systems. Characteristics of Distributed Systems. Characteristics of Distributed Systems. Goals in Distributed System Designs
1 Anand Tripathi CSci 8980 Operating Systems Lecture Notes 1 Basic Concepts Distributed Systems A set of computers (hosts or nodes) connected through a communication network. Nodes may have different speeds
More informationLeader or Majority: Why have one when you can have both? Improving Read Scalability in Raft-like consensus protocols
Leader or Majority: Why have one when you can have both? Improving Read Scalability in Raft-like consensus protocols Vaibhav Arora, Tanuj Mittal, Divyakant Agrawal, Amr El Abbadi * and Xun Xue, Zhiyanan,
More informationLow Overhead Concurrency Control for Partitioned Main Memory Databases. Evan P. C. Jones Daniel J. Abadi Samuel Madden"
Low Overhead Concurrency Control for Partitioned Main Memory Databases Evan P. C. Jones Daniel J. Abadi Samuel Madden" Banks" Payment Processing" Airline Reservations" E-Commerce" Web 2.0" Problem:" Millions
More informationToday: Fault Tolerance
Today: Fault Tolerance Agreement in presence of faults Two army problem Byzantine generals problem Reliable communication Distributed commit Two phase commit Three phase commit Paxos Failure recovery Checkpointing
More informationToday: Fault Tolerance. Fault Tolerance
Today: Fault Tolerance Agreement in presence of faults Two army problem Byzantine generals problem Reliable communication Distributed commit Two phase commit Three phase commit Paxos Failure recovery Checkpointing
More informationHT-Paxos: High Throughput State-Machine Replication Protocol for Large Clustered Data Centers
1 HT-Paxos: High Throughput State-Machine Replication Protocol for Large Clustered Data Centers Vinit Kumar 1 and Ajay Agarwal 2 1 Associate Professor with the Krishna Engineering College, Ghaziabad, India.
More informationConcurrency Control II and Distributed Transactions
Concurrency Control II and Distributed Transactions CS 240: Computing Systems and Concurrency Lecture 18 Marco Canini Credits: Michael Freedman and Kyle Jamieson developed much of the original material.
More informationConsolidating Concurrency Control and Consensus for Commits under Conflicts
Consolidating Concurrency Control and Consensus for Commits under Conflicts Shuai Mu, Lamont Nelson, Wyatt Lloyd, and Jinyang Li New York University, University of Southern California Abstract Conventional
More informationAutomated Data Partitioning for Highly Scalable and Strongly Consistent Transactions
Automated Data Partitioning for Highly Scalable and Strongly Consistent Transactions Alexandru Turcu, Roberto Palmieri, Binoy Ravindran Virginia Tech SYSTOR 2014 Desirable properties in distribute transactional
More informationPaxos and Distributed Transactions
Paxos and Distributed Transactions INF 5040 autumn 2016 lecturer: Roman Vitenberg Paxos what is it? The most commonly used consensus algorithm A fundamental building block for data centers Distributed
More informationHow Eventual is Eventual Consistency?
Probabilistically Bounded Staleness How Eventual is Eventual Consistency? Peter Bailis, Shivaram Venkataraman, Michael J. Franklin, Joseph M. Hellerstein, Ion Stoica (UC Berkeley) BashoChats 002, 28 February
More informationCS 655 Advanced Topics in Distributed Systems
Presented by : Walid Budgaga CS 655 Advanced Topics in Distributed Systems Computer Science Department Colorado State University 1 Outline Problem Solution Approaches Comparison Conclusion 2 Problem 3
More informationData-Intensive Distributed Computing
Data-Intensive Distributed Computing CS 451/651 (Fall 2018) Part 7: Mutable State (2/2) November 13, 2018 Jimmy Lin David R. Cheriton School of Computer Science University of Waterloo These slides are
More informationWhy distributed databases suck, and what to do about it. Do you want a database that goes down or one that serves wrong data?"
Why distributed databases suck, and what to do about it - Regaining consistency Do you want a database that goes down or one that serves wrong data?" 1 About the speaker NoSQL team lead at Trifork, Aarhus,
More informationCSE 544 Principles of Database Management Systems. Magdalena Balazinska Winter 2015 Lecture 14 NoSQL
CSE 544 Principles of Database Management Systems Magdalena Balazinska Winter 2015 Lecture 14 NoSQL References Scalable SQL and NoSQL Data Stores, Rick Cattell, SIGMOD Record, December 2010 (Vol. 39, No.
More information