Strong Eventual Consistency and CRDTs
|
|
- Pauline Weaver
- 5 years ago
- Views:
Transcription
1 Strong Eventual Consistency and CRDTs Marc Shapiro, INRIA & LIP6 Nuno Preguiça, U. Nova de Lisboa Carlos Baquero, U. Minho Marek Zawirski, INRIA & UPMC
2 Large-scale replicated data structures Large, dynamic graph Incremental, parallel, asynchronous: - updates - processing Replication + eventual consistency + fast + simple Eventual Consistency Conflict-free objects = no synchronisation whatsoever Is this practical? 2
3 Consistency
4 Strong consistency Preclude conflict: Replicas update in same total order Any deterministic object Consensus Serialisation bottleneck way agreement Tolerates < n/2 faults Sequential, linearisable Simultaneous N- Very general Correct Doesn't scale 4
5 Eventual Consistency Conflict! Reconcile Update local + propagate No foreground synch Expose tentative state Eventual, reliable delivery On conflict Arbitrate Roll back Availability ++ Parallelism++ Latency -- Complexity ++ Consensus (in background) Designed for human collaboration Consensus moved to background 5 No (foreground) synchronisation Local updates; Speculate: replicas diverge Reliable broadcast, scalable If no conflicts, merge Otherwise, arbitrate; roll back inconsistent replicas Requires replicas to agree on arbitrations Consensus in background OK for human collaboration; not for high-performance computing
6 Strong Eventual Consistency Available, responsive More parallelism No conflicts Update local No + rollback propagate No synch Expose intermediate state Eventual, reliable delivery No conflict Unique outcome of concurrent updates No consensus: n-1 faults Not universal Fast, responsive Solves the CAP problem 6 No synchronisation Local updates; Speculate: replicas diverge Reliable broadcast, scalable Merge: unique, predictable outcome may not depend on: concurrent updates, order received Solves the CAP problem
7 Definition of EC delivered formally defined No (foreground) synchronisation Local speculative updates: replicas diverge Reconcile conflicts: arbitrate, merge, roll Eventual delivery: An update executed at a correct replica eventually executes at all correct replicas Termination: All update executions terminate Convergence: Correct replicas that have executed the same updates Availability eventually ++ reach Latency -- equivalent state (and stay) Complexity ++ Consensus (in background) 7 No (foreground) synchronisation Local updates Speculate: replicas diverge Reliable broadcast, scalable If no conflicts, merge Otherwise, reconcile; arbitrate, merge roll back inconsistent replicas Requires replicas to agree on arbitrations Consensus in background
8 Definition of SEC Eventual delivery: An update executed at some correct replica eventually executes at all correct replicas Termination: All update executions terminate Strong Convergence: Correct replicas that have executed the same updates have equivalent state No conflicts No rollback No consensus Limited 8 No (foreground) synchronisation Local updates Speculate: replicas diverge Reliable broadcast, scalable If no conflicts, merge Otherwise, reconcile; arbitrate, merge roll back inconsistent replicas Requires replicas to agree on arbitrations Consensus in background
9 SEC Sequential Consistency Consider Set-like object S such that: {true} add(e) {e S} {true} remove(e) {e S} {true} add(e) remove(e) {e S} Satisfies SEC conditions add(e); remove(e') {true} {e, e' S} add(e'); remove(e) Not sequentially consistent OR-Set add wins Conversely, SC is not SEC requires consensus 9 SEC is weaker than linearisability: it allows state changes after the return sequential consistency: see above Conversely, SC is stronger than SEC requires consensus It's a new kind of consistency
10 Conflict-free Replicated Data Types (CRDTs) Intuition: Conflicts are the problem Design data types with no conflicts CRDTs Available, fast Reconcile scalability + consistency Simple sufficient conditions Principled, correct 10
11 Object model & Sufficient conditions
12 Query client s s1 s2.q(b) S s1.q(a) s2 S s3 Local at source replica Client's choice Example: Amazon shopping cart is replicated unspecified client, e.g., Web frontend One or more load-balancer, failures may direct client to different replicas 12
13 State-based replication client s s1 s2 s2.u(b) S S s1.u(a) s1 M s2.m(s1) s2 s1.m(s2) M s2 s3 M s3.m(s2) Local at source s1.u(a), s2.u(b), Compute Update local payload Convergence: Episodically: send si payload On delivery: merge payloads m merge two valid states produce valid state no historical info available Inefficient if payload is large 13 State-based: Local queries, local updates Send full state; on receive, merge Conceptually simple File systems (NFS, Unison, Dynamo)
14 State-based: monotonic semilattice CRDT s s1 s2 s3 s1.u(a) S s2.u(b) S s1 M s2.m(s1) s2 M s3.m(s2) s1.m(s2) M s2 = Least Upper Bound LUB = merge If payload type forms a semi-lattice updates are increasing merge computes Least Upper Bound then replicas converge to LUB of last values Example: Payload = int, merge = max no reference to history 14
15 Operation-based replication s s1 s2 s2.u(b) s1.u(a) S S b b a s1.u(b) D a s2.u(a) D push to all replicas eventually push small updates - more efficient than state-based s3 D s3.u(b) D s3.u(a) At source: Eventually, at all replicas: prepare broadcast to all replicas update local replica 15 Operation-based: Local queries Replicated updates Log, send operations Reconcile non-commutative operations Vector clock: static network Collaborative editing, Bayou, PNUTS
16 Operation-based replication s s1 s2 s1.t(a'); s1.u(a) s2.t(b'); s2.u(b) S S b b a s1.u(b) D a s2.u(a) D push to all replicas eventually push small updates - more efficient than state-based s3 D s3.u(b) D s3.u(a) At source: Eventually, at all replicas: prepare broadcast to all replicas update local replica 16
17 Op-based: commute CRDT s s1 s1.u(a) S b s1.u(b) D a s2.u(a) s2 s2.u(b) S b a D s3 D s3.u(b) D s3.u(a) Delivery order ensures downstream precondition happened-before or weaker If: (Liveness) all replicas execute all operations in delivery order (Safety) concurrent operations all commute Then: replicas converge 17 Not same order at 1 and 2? OK if concurrent f and g commute Commutativity: sufficient necessary? (conjecture)
18 Monotonic semi-lattice commutative Systematic transformation Inefficient Hand-crafted op-based implementation 1. A state-based object can emulate an operation-based object, and vice-versa 2. State-based emulation of an op-based CRDT is a CRDT 3. Operation-based emulation of a state-based CRDT is a CRDT 18
19 Emulating operation-based with state-based RBi reliable broadcast emulated as CvRDT Replica payload at i: Grow-Set Msgs Grow-Set Dlvrd Operations: send(m): Msgs Msgs { m } deliver (Pre): choose m Msgs \ Dlvrd: Pre(m); Dlvrd Dlvrd { m }; m s s' s. Msgs s'. Msgs merge(s,s'): s. Msgs s'. Msgs RB = send, deliver Deliver(Pre) Pre = predicate (pre-cond) merge computes LUB it's a CvRDT 19
20 Comparison State-based: Update merge operation Simple data types State includes preceding updates; no separate historical information Operation-based: Update operation Higher level, more complex More powerful, more constraining Small messages State-based or op-based, as convenient 20 It will often be convenient to design using state-based Convert to deltas or op-based for efficiency at large scale
21 Composition and sharding A combination of independent CRDTs remains a CRDT Very large objects Independent shards Static: hash Statically-Sharded CRDT Each shard is a CRDT Update: single shard No cross-object invariants (Dynamic: requires consensus to rebalance) 21
22 The challenge: What interesting objects can we design with no synchronisation whatsoever?
23 Portfolio of CRDTs Register Last-Writer Wins Multi-Value Set Grow-Only 2P Observed-Remove Map Set of Registers Counter Unlimited Non-negative Graphs Directed Monotonic DAG Edit graph Sequence Edit sequence 23
24 Multi-master counter Increment / decrement Payload: P = [int, int, ], N = [int, int, ] value() = i P[i] i N[i] increment () = P[MyID]++ decrement () = N[MyID]++ merge(s,s') = like vector clock s s' = ([,max(s.p[i],s'.p[i]), ]i, [,max(s.n[i],s'.n[i]), ]i) Positive or negative 24
25 Multi-master counter Increment / decrement Payload: P = [int, int, ], N = [int, int, ] value() = i P[i] i N[i] increment () = P[MyID]++ decrement () = N[MyID]++ merge(s,s') = like vector clock s s' = ([,max(s.p[i],s'.p[i]), ]i, [,max(s.n[i],s'.n[i]), ]i) Positive or negative can't maintain global invariant such as s>0 25
26 Set design alternatives Sequential specification: {true} add(e) {e S} {true} remove(e) {e S} {true} add(e) remove(e) {????} linearisable? error state? last writer wins? add wins? remove wins? linearisable: sequential order equivalent to real-time order Requires consensus 26 Linearisability is too strong for SEC
27 Observed-Remove Set s s1 s2 s3 {} {} {} add(a) S {aα} add(a) S {aβ} {aβ} rmv (a) S {aβ, aα} M {aα} {aβ} Can never remove more tokens than {aα} {aα} exist Op order removed M M M {aβ} {aβ, aα} {aβ, aα} tokens have been previously added Payload: added, removed (element, unique-token) add(e) = A A {(e, α)} Remove: all unique elements observed remove(e) = R R { (e, ) A} lookup(e) = (e, ) A \ R merge (S, S') = (A A', R R') {true} add(e) remove(e) {e S} 27 In case of add+remove, OR-Set gives precedence to add
28 OR-Set Set: solves Dynamo Shopping Cart anomaly Optimisations: No tombstones Operation-based approach Snapshots Sharded 28
29 OR-Set + Snapshot Read consistent snapshot Despite concurrent, incremental updates Unique token = time (vector clock) α = Lamport (process i, counter t) UIDs identify snapshot version Snapshot: vector clock value Retain tombstones until not needed lookup(e, t) = (e, i, t') A : t'>t (e, i, t') R: t'>t 29
30 Graph design alternatives Graph = (V, E) where E V V Sequential specification: {v,v' V} addedge(v,v') { } { (v,v') E} removevertex(v) { } Concurrent: removevertex(v') addedge(v,v') linearisable? last writer wins? addedge wins? removevertex wins? etc. for our Web Search Engine application, removevertex wins Do not check precondition at add/remove 30
31 Directed Graph Payload = OR-Set V, OR-Set E Updates add/remove to V, E addvertex(v), removevertex(v) addedge(v,v'), removeedge(v,v') Do not enforce invariant a priori lookupedge(v,v') = (v,v') E v V v' V removevertex(v') addedge(v,v') removevertex wins 31
32 Graph + shards + snapshots Snapshot see OR-Set Sharding See OR-Set Do not enforce invariant a priori lookupedge(v,v') = (v,v') E v V v' V 32
33 Ongoing work
34 CRDTs for P2P & Cloud Computing ConcoRDanT: ANR Systematic study of conflict-free design space Theory and practice Characterise invariants Library of data types Not universal Conflict-free vs. conflict semantics Move consensus off critical path, non-critical ops 34 Other data types: Trivial: set with add, remove
35 CRDT + dataflow Web site 1 Web site 2 Set Whitelist crawl extract links Graph Graph spam detector Web site 3 Incremental, asynchronous processing Replicate, shard CRDTs near the edge Propagate updates dataflow Throttle according to QoS metrics (freshness, availability, cost, etc.) Scale: sharded Synchronous processing: snapshot, at centre Content DB Map URL to last 2 versions extract words Words Map word to Set of URLs 35
36 More scalability Incremental, asynchronous processing Operation-based: send small updates Dataflow-style notification network Push/throttle according to required QoS Sharding Composability Large-scale transactional processing Snapshot Isolation CRDTs appear ideal 36
37 Thought experiment (1) Web site 1 Map URL to text Content DB extract links Graph Links page rank Query 2 Query 1 Query 3 Web site 2 Web site 3 Key: Input crawl Canonical URL extract words computation Data (replicated) Object type (replicated) Map word to Set of URLs Words ranking indexer query resolver Ranked Map of word to list of URLs 37
38 Thought experiment (2) Links rank content DB, index: decentralised, replicated, close to the network edge Words ranking indexer crawl, extract Links Words Links crawl, extract Words crawl, extract Links Words crawl, extract Links Words 38 Crawl and extract content close to the source of data = partition geographically, replicate Compute page rank on consistent data: snapshot Fast results: replicate index widely near the edge
39 Contributions Strong Eventual Consistency (SEC) A solution to the CAP problem Formal definitions Two sufficient conditions Strong equivalence between the two SEC shown incomparable to sequential consistency CRDTs integer vectors, counters sets graphs 39
40 40
Strong Eventual Consistency and CRDTs
Strong Eventual Consistency and CRDTs Marc Shapiro, INRIA & LIP6 Nuno Preguiça, U. Nova de Lisboa Carlos Baquero, U. Minho Marek Zawirski, INRIA & UPMC Large-scale replicated data structures Large, dynamic
More informationConflict-free Replicated Data Types (CRDTs)
Conflict-free Replicated Data Types (CRDTs) for collaborative environments Marc Shapiro, INRIA & LIP6 Nuno Preguiça, U. Nova de Lisboa Carlos Baquero, U. Minho Marek Zawirski, INRIA & UPMC Conflict-free
More informationSelf-healing Data Step by Step
Self-healing Data Step by Step Uwe Friedrichsen (codecentric AG) NoSQL matters Cologne, 29. April 2014 @ufried Uwe Friedrichsen uwe.friedrichsen@codecentric.de http://slideshare.net/ufried http://ufried.tumblr.com
More informationConflict-free Replicated Data Types
Conflict-free Replicated Data Types Marc Shapiro, Nuno Preguiça, Carlos Baquero, Marek Zawirski To cite this version: Marc Shapiro, Nuno Preguiça, Carlos Baquero, Marek Zawirski. Conflict-free Replicated
More informationINF-5360 Presentation
INF-5360 Presentation Optimistic Replication Ali Ahmad April 29, 2013 Structure of presentation Pessimistic and optimistic replication Elements of Optimistic replication Eventual consistency Scheduling
More information(It s the invariants, stupid)
Consistency in three dimensions (It s the invariants, stupid) Marc Shapiro Cloud to the edge Social, web, e-commerce: shared mutable data Scalability replication consistency issues Inria & Sorbonne-Universités
More informationA comprehensive study of Convergent and Commutative Replicated Data Types
INSTITUT NATIONAL DE RECHERCHE EN INFORMATIQUE ET EN AUTOMATIQUE A comprehensive study of Convergent and Commutative Replicated Data Types Marc Shapiro, INRIA & LIP6, Paris, France Nuno Preguiça, CITI,
More informationEventual Consistency Today: Limitations, Extensions and Beyond
Eventual Consistency Today: Limitations, Extensions and Beyond Peter Bailis and Ali Ghodsi, UC Berkeley - Nomchin Banga Outline Eventual Consistency: History and Concepts How eventual is eventual consistency?
More informationCloud-backed applications. Write Fast, Read in the Past: Causal Consistency for Client- Side Applications. Requirements & challenges
Write Fast, Read in the Past: ausal onsistenc for lient- Side Applications loud-backed applications 1. Request 3. Repl 2. Process request & store update 4. Transmit update 50~200ms databas e app server
More informationSWIFTCLOUD: GEO-REPLICATION RIGHT TO THE EDGE
SWIFTCLOUD: GEO-REPLICATION RIGHT TO THE EDGE Annette Bieniusa T.U. Kaiserslautern Carlos Baquero HSALab, U. Minho Marc Shapiro, Marek Zawirski INRIA, LIP6 Nuno Preguiça, Sérgio Duarte, Valter Balegas
More informationJust-Right Consistency. Centralised data store. trois bases. As available as possible As consistent as necessary Correct by design
Just-Right Consistency As available as possible As consistent as necessary Correct by design Marc Shapiro, UMC-LI6 & Inria Annette Bieniusa, U. Kaiserslautern Nuno reguiça, U. Nova Lisboa Christopher Meiklejohn,
More informationLarge-Scale Geo-Replicated Conflict-free Replicated Data Types
Large-Scale Geo-Replicated Conflict-free Replicated Data Types Carlos Bartolomeu carlos.bartolomeu@tecnico.ulisboa.pt Instituto Superior Técnico (Advisor: Professor Luís Rodrigues) Abstract. Conflict-free
More informationExtending Eventually Consistent Cloud Databases for Enforcing Numeric Invariants
Extending Eventually Consistent Cloud Databases for Enforcing Numeric Invariants Valter Balegas, Diogo Serra, Sérgio Duarte Carla Ferreira, Rodrigo Rodrigues, Nuno Preguiça NOVA LINCS/FCT/Universidade
More informationReplication in Distributed Systems
Replication in Distributed Systems Replication Basics Multiple copies of data kept in different nodes A set of replicas holding copies of a data Nodes can be physically very close or distributed all over
More informationLarge-Scale Key-Value Stores Eventual Consistency Marco Serafini
Large-Scale Key-Value Stores Eventual Consistency Marco Serafini COMPSCI 590S Lecture 13 Goals of Key-Value Stores Export simple API put(key, value) get(key) Simpler and faster than a DBMS Less complexity,
More informationGenie. Distributed Systems Synthesis and Verification. Marc Rosen. EN : Advanced Distributed Systems and Networks May 1, 2017
Genie Distributed Systems Synthesis and Verification Marc Rosen EN.600.667: Advanced Distributed Systems and Networks May 1, 2017 1 / 35 Outline Introduction Problem Statement Prior Art Demo How does it
More informationDistributed Key Value Store Utilizing CRDT to Guarantee Eventual Consistency
Distributed Key Value Store Utilizing CRDT to Guarantee Eventual Consistency CPSC 416 Project Proposal n6n8: Trevor Jackson, u2c9: Hayden Nhan, v0r5: Yongnan (Devin) Li, x5m8: Li Jye Tong Introduction
More informationEventual Consistency Today: Limitations, Extensions and Beyond
Eventual Consistency Today: Limitations, Extensions and Beyond Peter Bailis and Ali Ghodsi, UC Berkeley Presenter: Yifei Teng Part of slides are cited from Nomchin Banga Road Map Eventual Consistency:
More informationConflict-Free Replicated Data Types (basic entry)
Conflict-Free Replicated Data Types (basic entry) Marc Shapiro Sorbonne-Universités-UPMC-LIP6 & Inria Paris http://lip6.fr/marc.shapiro/ 16 May 2016 1 Synonyms Conflict-Free Replicated Data Types (CRDTs).
More informationHorizontal or vertical scalability? Horizontal scaling is challenging. Today. Scaling Out Key-Value Storage
Horizontal or vertical scalability? Scaling Out Key-Value Storage COS 418: Distributed Systems Lecture 8 Kyle Jamieson Vertical Scaling Horizontal Scaling [Selected content adapted from M. Freedman, B.
More informationScaling Out Key-Value Storage
Scaling Out Key-Value Storage COS 418: Distributed Systems Logan Stafman [Adapted from K. Jamieson, M. Freedman, B. Karp] Horizontal or vertical scalability? Vertical Scaling Horizontal Scaling 2 Horizontal
More informationConflict-free Replicated Data Types
Conflict-free Replicated Data Types Marc Shapiro, Nuno Preguiça, Carlos Baquero, Marek Zawirski To cite this version: Marc Shapiro, Nuno Preguiça, Carlos Baquero, Marek Zawirski. Conflict-free Replicated
More informationDynamo: Key-Value Cloud Storage
Dynamo: Key-Value Cloud Storage Brad Karp UCL Computer Science CS M038 / GZ06 22 nd February 2016 Context: P2P vs. Data Center (key, value) Storage Chord and DHash intended for wide-area peer-to-peer systems
More informationConflict-free Replicated Data Types in Practice
Conflict-free Replicated Data Types in Practice Georges Younes Vitor Enes Wednesday 11 th January, 2017 HASLab/INESC TEC & University of Minho InfoBlender Motivation Background Background: CAP Theorem
More informationFrom strong to eventual consistency: getting it right. Nuno Preguiça, U. Nova de Lisboa Marc Shapiro, Inria & UPMC-LIP6
From strong to eventual consistency: getting it right Nuno Preguiça, U. Nova de Lisboa Marc Shapiro, Inria & UPMC-LIP6 Conflict-free Replicated Data Types ı Marc Shapiro 1,5, Nuno Preguiça 2,1, Carlos
More informationThis talk is about. Data consistency in 3D. Geo-replicated database q: Queue c: Counter { q c } Shared database q: Queue c: Counter { q c }
Data consistency in 3D (It s the invariants, stupid) Marc Shapiro Masoud Saieda Ardekani Gustavo Petri This talk is about Understanding consistency Primitive consistency mechanisms How primitives compose
More informationDynamo: Amazon s Highly Available Key-Value Store
Dynamo: Amazon s Highly Available Key-Value Store DeCandia et al. Amazon.com Presented by Sushil CS 5204 1 Motivation A storage system that attains high availability, performance and durability Decentralized
More informationCS Amazon Dynamo
CS 5450 Amazon Dynamo Amazon s Architecture Dynamo The platform for Amazon's e-commerce services: shopping chart, best seller list, produce catalog, promotional items etc. A highly available, distributed
More informationCSE-E5430 Scalable Cloud Computing Lecture 10
CSE-E5430 Scalable Cloud Computing Lecture 10 Keijo Heljanko Department of Computer Science School of Science Aalto University keijo.heljanko@aalto.fi 23.11-2015 1/29 Exam Registering for the exam is obligatory,
More informationDynamo: Amazon s Highly Available Key-value Store. ID2210-VT13 Slides by Tallat M. Shafaat
Dynamo: Amazon s Highly Available Key-value Store ID2210-VT13 Slides by Tallat M. Shafaat Dynamo An infrastructure to host services Reliability and fault-tolerance at massive scale Availability providing
More informationDynamo. Smruti R. Sarangi. Department of Computer Science Indian Institute of Technology New Delhi, India. Motivation System Architecture Evaluation
Dynamo Smruti R. Sarangi Department of Computer Science Indian Institute of Technology New Delhi, India Smruti R. Sarangi Leader Election 1/20 Outline Motivation 1 Motivation 2 3 Smruti R. Sarangi Leader
More informationLinearizability CMPT 401. Sequential Consistency. Passive Replication
Linearizability CMPT 401 Thursday, March 31, 2005 The execution of a replicated service (potentially with multiple requests interleaved over multiple servers) is said to be linearizable if: The interleaved
More informationSelective Hearing: An Approach to Distributed, Eventually Consistent Edge Computation
Selective Hearing: An Approach to Distributed, Eventually Consistent Edge Computation Christopher Meiklejohn Basho Technologies, Inc. Bellevue, WA Email: cmeiklejohn@basho.com Peter Van Roy Université
More informationCoordination-Free Computations. Christopher Meiklejohn
Coordination-Free Computations Christopher Meiklejohn LASP DISTRIBUTED, EVENTUALLY CONSISTENT COMPUTATIONS CHRISTOPHER MEIKLEJOHN (BASHO TECHNOLOGIES, INC.) PETER VAN ROY (UNIVERSITÉ CATHOLIQUE DE LOUVAIN)
More informationUsing Erlang, Riak and the ORSWOT CRDT at bet365 for Scalability and Performance. Michael Owen Research and Development Engineer
1 Using Erlang, Riak and the ORSWOT CRDT at bet365 for Scalability and Performance Michael Owen Research and Development Engineer 2 Background 3 bet365 in stats Founded in 2000 Located in Stoke-on-Trent
More informationDISTRIBUTED EVENTUALLY CONSISTENT COMPUTATIONS
LASP 2 DISTRIBUTED EVENTUALLY CONSISTENT COMPUTATIONS 3 EN TAL AV CHRISTOPHER MEIKLEJOHN 4 RESEARCH WITH: PETER VAN ROY (UCL) 5 MOTIVATION 6 SYNCHRONIZATION IS EXPENSIVE 7 SYNCHRONIZATION IS SOMETIMES
More informationWhy distributed databases suck, and what to do about it. Do you want a database that goes down or one that serves wrong data?"
Why distributed databases suck, and what to do about it - Regaining consistency Do you want a database that goes down or one that serves wrong data?" 1 About the speaker NoSQL team lead at Trifork, Aarhus,
More informationBloom: Big Systems, Small Programs. Neil Conway UC Berkeley
Bloom: Big Systems, Small Programs Neil Conway UC Berkeley Distributed Computing Programming Languages Data prefetching Register allocation Loop unrolling Function inlining Optimization Global coordination,
More informationDYNAMO: AMAZON S HIGHLY AVAILABLE KEY-VALUE STORE. Presented by Byungjin Jun
DYNAMO: AMAZON S HIGHLY AVAILABLE KEY-VALUE STORE Presented by Byungjin Jun 1 What is Dynamo for? Highly available key-value storages system Simple primary-key only interface Scalable and Reliable Tradeoff:
More informationTARDiS: A branch and merge approach to weak consistency
TARDiS: A branch and merge approach to weak consistency By: Natacha Crooks, Youer Pu, Nancy Estrada, Trinabh Gupta, Lorenzo Alvis, Allen Clement Presented by: Samodya Abeysiriwardane TARDiS Transactional
More informationDeterministic Threshold Queries of Distributed Data Structures
Deterministic Threshold Queries of Distributed Data Structures Lindsey Kuper and Ryan R. Newton Indiana University {lkuper, rrnewton}@cs.indiana.edu Abstract. Convergent replicated data types, or CvRDTs,
More informationMutual consistency, what for? Replication. Data replication, consistency models & protocols. Difficulties. Execution Model.
Replication Data replication, consistency models & protocols C. L. Roncancio - S. Drapeau Grenoble INP Ensimag / LIG - Obeo 1 Data and process Focus on data: several physical of one logical object What
More informationAvailability versus consistency. Eventual Consistency: Bayou. Eventual consistency. Bayou: A Weakly Connected Replicated Storage System
Eventual Consistency: Bayou Availability versus consistency Totally-Ordered Multicast kept replicas consistent but had single points of failure Not available under failures COS 418: Distributed Systems
More informationOptimized OR-Sets Without Ordering Constraints
Optimized OR-Sets Without Ordering Constraints Madhavan Mukund, Gautham Shenoy R, and S P Suresh Chennai Mathematical Institute, India {madhavan,gautshen,spsuresh}@cmi.ac.in Abstract. Eventual consistency
More informationEventual Consistency 1
Eventual Consistency 1 Readings Werner Vogels ACM Queue paper http://queue.acm.org/detail.cfm?id=1466448 Dynamo paper http://www.allthingsdistributed.com/files/ amazon-dynamo-sosp2007.pdf Apache Cassandra
More informationScaling KVS. CS6450: Distributed Systems Lecture 14. Ryan Stutsman
Scaling KVS CS6450: Distributed Systems Lecture 14 Ryan Stutsman Material taken/derived from Princeton COS-418 materials created by Michael Freedman and Kyle Jamieson at Princeton University. Licensed
More informationLasp: A Language for Distributed, Coordination-Free Programming
Lasp: A Language for Distributed, Coordination-Free Programming Christopher Meiklejohn Basho Technologies, Inc. cmeiklejohn@basho.com Peter Van Roy Université catholique de Louvain peter.vanroy@uclouvain.be
More informationIntroduction to Distributed Systems Seif Haridi
Introduction to Distributed Systems Seif Haridi haridi@kth.se What is a distributed system? A set of nodes, connected by a network, which appear to its users as a single coherent system p1 p2. pn send
More informationChapter 11 - Data Replication Middleware
Prof. Dr.-Ing. Stefan Deßloch AG Heterogene Informationssysteme Geb. 36, Raum 329 Tel. 0631/205 3275 dessloch@informatik.uni-kl.de Chapter 11 - Data Replication Middleware Motivation Replication: controlled
More informationBasic vs. Reliable Multicast
Basic vs. Reliable Multicast Basic multicast does not consider process crashes. Reliable multicast does. So far, we considered the basic versions of ordered multicasts. What about the reliable versions?
More informationCAP Theorem, BASE & DynamoDB
Indian Institute of Science Bangalore, India भ रत य व ज ञ न स स थ न ब गल र, भ रत DS256:Jan18 (3:1) Department of Computational and Data Sciences CAP Theorem, BASE & DynamoDB Yogesh Simmhan Yogesh Simmhan
More informationSCALABLE CONSISTENCY AND TRANSACTION MODELS
Data Management in the Cloud SCALABLE CONSISTENCY AND TRANSACTION MODELS 69 Brewer s Conjecture Three properties that are desirable and expected from realworld shared-data systems C: data consistency A:
More informationReplication and Consistency
Replication and Consistency Today l Replication l Consistency models l Consistency protocols The value of replication For reliability and availability Avoid problems with disconnection, data corruption,
More informationCS6450: Distributed Systems Lecture 11. Ryan Stutsman
Strong Consistency CS6450: Distributed Systems Lecture 11 Ryan Stutsman Material taken/derived from Princeton COS-418 materials created by Michael Freedman and Kyle Jamieson at Princeton University. Licensed
More informationEECS 498 Introduction to Distributed Systems
EECS 498 Introduction to Distributed Systems Fall 2017 Harsha V. Madhyastha Replicated State Machines Logical clocks Primary/ Backup Paxos? 0 1 (N-1)/2 No. of tolerable failures October 11, 2017 EECS 498
More informationCSE 544 Principles of Database Management Systems. Magdalena Balazinska Winter 2015 Lecture 14 NoSQL
CSE 544 Principles of Database Management Systems Magdalena Balazinska Winter 2015 Lecture 14 NoSQL References Scalable SQL and NoSQL Data Stores, Rick Cattell, SIGMOD Record, December 2010 (Vol. 39, No.
More informationBuilding Consistent Transactions with Inconsistent Replication
DB Reading Group Fall 2015 slides by Dana Van Aken Building Consistent Transactions with Inconsistent Replication Irene Zhang, Naveen Kr. Sharma, Adriana Szekeres, Arvind Krishnamurthy, Dan R. K. Ports
More informationCS6450: Distributed Systems Lecture 13. Ryan Stutsman
Eventual Consistency CS6450: Distributed Systems Lecture 13 Ryan Stutsman Material taken/derived from Princeton COS-418 materials created by Michael Freedman and Kyle Jamieson at Princeton University.
More informationDistributed Data Management Replication
Felix Naumann F-2.03/F-2.04, Campus II Hasso Plattner Institut Distributing Data Motivation Scalability (Elasticity) If data volume, processing, or access exhausts one machine, you might want to spread
More informationCS 138: Dynamo. CS 138 XXIV 1 Copyright 2017 Thomas W. Doeppner. All rights reserved.
CS 138: Dynamo CS 138 XXIV 1 Copyright 2017 Thomas W. Doeppner. All rights reserved. Dynamo Highly available and scalable distributed data store Manages state of services that have high reliability and
More informationWeak Consistency and Disconnected Operation in git. Raymond Cheng
Weak Consistency and Disconnected Operation in git Raymond Cheng ryscheng@cs.washington.edu Motivation How can we support disconnected or weakly connected operation? Applications File synchronization across
More information10. Replication. Motivation
10. Replication Page 1 10. Replication Motivation Reliable and high-performance computation on a single instance of a data object is prone to failure. Replicate data to overcome single points of failure
More informationWhat Came First? The Ordering of Events in
What Came First? The Ordering of Events in Systems @kavya719 kavya the design of concurrent systems Slack architecture on AWS systems with multiple independent actors. threads in a multithreaded program.
More informationComputing Parable. The Archery Teacher. Courtesy: S. Keshav, U. Waterloo. Computer Science. Lecture 16, page 1
Computing Parable The Archery Teacher Courtesy: S. Keshav, U. Waterloo Lecture 16, page 1 Consistency and Replication Today: Consistency models Data-centric consistency models Client-centric consistency
More informationIntroduction to Distributed Data Systems
Introduction to Distributed Data Systems Serge Abiteboul Ioana Manolescu Philippe Rigaux Marie-Christine Rousset Pierre Senellart Web Data Management and Distribution http://webdam.inria.fr/textbook January
More informationCISC 7610 Lecture 2b The beginnings of NoSQL
CISC 7610 Lecture 2b The beginnings of NoSQL Topics: Big Data Google s infrastructure Hadoop: open google infrastructure Scaling through sharding CAP theorem Amazon s Dynamo 5 V s of big data Everyone
More informationTechnical Report TR Design and Evaluation of a Transaction Model with Multiple Consistency Levels for Replicated Data
Technical Report Department of Computer Science and Engineering University of Minnesota 4-192 Keller Hall 200 Union Street SE Minneapolis, MN 55455-0159 USA TR 15-009 Design and Evaluation of a Transaction
More informationAbstract Specifications for Weakly Consistent Data
Abstract Specifications for Weakly Consistent Data Sebastian Burckhardt Microsoft Research Redmond, WA, USA Jonathan Protzenko Microsoft Research Redmond, WA, USA Abstract Weak consistency can improve
More informationDynamo: Amazon s Highly Available Key-value Store
Dynamo: Amazon s Highly Available Key-value Store Giuseppe DeCandia, Deniz Hastorun, Madan Jampani, Gunavardhan Kakulapati, Avinash Lakshman, Alex Pilchin, Swaminathan Sivasubramanian, Peter Vosshall and
More informationCS 655 Advanced Topics in Distributed Systems
Presented by : Walid Budgaga CS 655 Advanced Topics in Distributed Systems Computer Science Department Colorado State University 1 Outline Problem Solution Approaches Comparison Conclusion 2 Problem 3
More informationISSN Bulletin. of the. European Association for Theoretical Computer Science EATC S
IN 0252 9742 Bulletin of the European Association for Theoretical Computer cience EATC EATC Number 104 June 2011 Tʜ Dɪʀɪʙ Cɪɴɢ Cʟɴ ʙʏ Pɴɢɪ Fʀ Department of Computer cience, University of Crete P.O. Box
More informationTransaction Management using Causal Snapshot Isolation in Partially Replicated Databases
Transaction Management using Causal Snapshot Isolation in Partially Replicated Databases ABSTRACT We present here a transaction management protocol using snapshot isolation in partially replicated multi-version
More informationIntuitive distributed algorithms. with F#
Intuitive distributed algorithms with F# Natallia Dzenisenka Alena Hall @nata_dzen @lenadroid A tour of a variety of intuitivedistributed algorithms used in practical distributed systems. and how to prototype
More informationDatabases. Laboratorio de sistemas distribuidos. Universidad Politécnica de Madrid (UPM)
Databases Laboratorio de sistemas distribuidos Universidad Politécnica de Madrid (UPM) http://lsd.ls.fi.upm.es/lsd/lsd.htm Nuevas tendencias en sistemas distribuidos 2 Summary Transactions. Isolation.
More informationDesigning and Evaluating a Distributed Computing Language Runtime. Christopher Meiklejohn Université catholique de Louvain, Belgium
Designing and Evaluating a Distributed Computing Language Runtime Christopher Meiklejohn (@cmeik) Université catholique de Louvain, Belgium R A R B R A set() R B R A set() set(2) 2 R B set(3) 3 set() set(2)
More informationCSE 5306 Distributed Systems. Consistency and Replication
CSE 5306 Distributed Systems Consistency and Replication 1 Reasons for Replication Data are replicated for the reliability of the system Servers are replicated for performance Scaling in numbers Scaling
More informationSMAC: State Management for Geo-Distributed Containers
SMAC: State Management for Geo-Distributed Containers Jacob Eberhardt, Dominik Ernst, David Bermbach Information Systems Engineering Research Group Technische Universitaet Berlin Berlin, Germany Email:
More informationDistributed Systems (5DV147)
Distributed Systems (5DV147) Replication and consistency Fall 2013 1 Replication 2 What is replication? Introduction Make different copies of data ensuring that all copies are identical Immutable data
More informationPerformance and Forgiveness. June 23, 2008 Margo Seltzer Harvard University School of Engineering and Applied Sciences
Performance and Forgiveness June 23, 2008 Margo Seltzer Harvard University School of Engineering and Applied Sciences Margo Seltzer Architect Outline A consistency primer Techniques and costs of consistency
More informationG Bayou: A Weakly Connected Replicated Storage System. Robert Grimm New York University
G22.3250-001 Bayou: A Weakly Connected Replicated Storage System Robert Grimm New York University Altogether Now: The Three Questions! What is the problem?! What is new or different?! What are the contributions
More informationThere is a tempta7on to say it is really used, it must be good
Notes from reviews Dynamo Evalua7on doesn t cover all design goals (e.g. incremental scalability, heterogeneity) Is it research? Complexity? How general? Dynamo Mo7va7on Normal database not the right fit
More informationConsistency and Replication. Why replicate?
Consistency and Replication Today: Consistency models Data-centric consistency models Client-centric consistency models Lecture 15, page 1 Why replicate? Data replication versus compute replication Data
More informationCS /15/16. Paul Krzyzanowski 1. Question 1. Distributed Systems 2016 Exam 2 Review. Question 3. Question 2. Question 5.
Question 1 What makes a message unstable? How does an unstable message become stable? Distributed Systems 2016 Exam 2 Review Paul Krzyzanowski Rutgers University Fall 2016 In virtual sychrony, a message
More informationProviding Real-Time and Fault Tolerance for CORBA Applications
Providing Real-Time and Tolerance for CORBA Applications Priya Narasimhan Assistant Professor of ECE and CS University Pittsburgh, PA 15213-3890 Sponsored in part by the CMU-NASA High Dependability Computing
More informationDynamo Tom Anderson and Doug Woos
Dynamo motivation Dynamo Tom Anderson and Doug Woos Fast, available writes - Shopping cart: always enable purchases FLP: consistency and progress at odds - Paxos: must communicate with a quorum Performance:
More informationDistributed systems. Lecture 6: distributed transactions, elections, consensus and replication. Malte Schwarzkopf
Distributed systems Lecture 6: distributed transactions, elections, consensus and replication Malte Schwarzkopf Last time Saw how we can build ordered multicast Messages between processes in a group Need
More informationReplications and Consensus
CPSC 426/526 Replications and Consensus Ennan Zhai Computer Science Department Yale University Recall: Lec-8 and 9 In the lec-8 and 9, we learned: - Cloud storage and data processing - File system: Google
More informationVersioning, Consistency, and Agreement
Lee Lorenz, Brent Sheppard Jenkins, if I want another yes man, I ll build one! Versioning, Consistency, and Agreement COS 461: Computer Networks Spring 2010 (MW 3:00 4:20 in CS105) Mike Freedman hip://www.cs.princeton.edu/courses/archive/spring10/cos461/
More informationPresented By: Devarsh Patel
: Amazon s Highly Available Key-value Store Presented By: Devarsh Patel CS5204 Operating Systems 1 Introduction Amazon s e-commerce platform Requires performance, reliability and efficiency To support
More informationConsistency: Relaxed. SWE 622, Spring 2017 Distributed Software Engineering
Consistency: Relaxed SWE 622, Spring 2017 Distributed Software Engineering Review: HW2 What did we do? Cache->Redis Locks->Lock Server Post-mortem feedback: http://b.socrative.com/ click on student login,
More informationTolerating Latency in Replicated State Machines through Client Speculation
Tolerating Latency in Replicated State Machines through Client Speculation April 22, 2009 1, James Cowling 2, Edmund B. Nightingale 3, Peter M. Chen 1, Jason Flinn 1, Barbara Liskov 2 University of Michigan
More informationPriya Narasimhan. Assistant Professor of ECE and CS Carnegie Mellon University Pittsburgh, PA
OMG Real-Time and Distributed Object Computing Workshop, July 2002, Arlington, VA Providing Real-Time and Fault Tolerance for CORBA Applications Priya Narasimhan Assistant Professor of ECE and CS Carnegie
More information11. Replication. Motivation
11. Replication Seite 1 11. Replication Motivation Reliable and high-performance computation on a single instance of a data object is prone to failure. Replicate data to overcome single points of failure
More informationDrRobert N. M. Watson
Distributed systems Lecture 15: Replication, quorums, consistency, CAP, and Amazon/Google case studies DrRobert N. M. Watson 1 Last time General issue of consensus: How to get processes to agree on something
More informationConsistency in Distributed Systems
Consistency in Distributed Systems Recall the fundamental DS properties DS may be large in scale and widely distributed 1. concurrent execution of components 2. independent failure modes 3. transmission
More informationarxiv: v1 [cs.dc] 4 Mar 2016
Delta State Replicated Data Types Paulo Sérgio Almeida, Ali Shoker, and Carlos Baquero HASLab/INESC TEC and Universidade do Minho, Portugal arxiv:1603.01529v1 [cs.dc] 4 Mar 2016 Abstract. CRDTs are distributed
More informationEngineering Goals. Scalability Availability. Transactional behavior Security EAI... CS530 S05
Engineering Goals Scalability Availability Transactional behavior Security EAI... Scalability How much performance can you get by adding hardware ($)? Performance perfect acceptable unacceptable Processors
More informationTransaction Management using Causal Snapshot Isolation in Partially Replicated Databases. Technical Report
Transaction Management using Causal Snapshot Isolation in Partially Replicated Databases Technical Report Department of Computer Science and Engineering University of Minnesota 4-192 Keller Hall 200 Union
More informationTransactional storage for geo-replicated systems
Transactional storage for geo-replicated systems Yair Sovran Russell Power Marcos K. Aguilera Jinyang Li New York University Microsoft Research Silicon Valley ABSTRACT We describe the design and implementation
More informationDISTRIBUTED COMPUTER SYSTEMS
DISTRIBUTED COMPUTER SYSTEMS CONSISTENCY AND REPLICATION CONSISTENCY MODELS Dr. Jack Lange Computer Science Department University of Pittsburgh Fall 2015 Consistency Models Background Replication Motivation
More information