Conflict-free Replicated Data Types (CRDTs)
1 Conflict-free Replicated Data Types (CRDTs) for collaborative environments Marc Shapiro, INRIA & LIP6 Nuno Preguiça, U. Nova de Lisboa Carlos Baquero, U. Minho Marek Zawirski, INRIA & UPMC
2 Conflict-free objects for large-scale distribution. Shared mutable data: read, replicate; what about updates? A novel, principled approach: conflict-free objects. Target: a large, dynamic graph with incremental, parallel, asynchronous updates and processing. Can we design useful object types without any synchronisation whatsoever? Can we build practical systems from such objects?
7 Replication for beginners
8 Replicated data. Share data by replicating it at many locations. Performance: local reads. Availability: immune to network failure. Fault tolerance: replicate computation. Scalability: load balancing. Updates are pushed to all replicas, so concurrent updates conflict. Can we have consistency, fault tolerance and parallelism too?
9 Strong consistency. Preclude conflicts: all replicas execute updates in the same total order. Works for any deterministic object, but requires simultaneous N-way agreement (consensus): a serialisation bottleneck that tolerates < n/2 faults. Very general and correct, but doesn't scale.
16 Eventual consistency. Update locally and propagate: no foreground synchronisation; expose tentative state; eventual, reliable delivery. On conflict: arbitrate and roll back. Availability ++, parallelism ++, latency --, but complexity ++. Consensus is moved to the background.
25 Strong Eventual Consistency. Available and responsive, more parallelism, no conflicts, no rollback. Update locally and propagate: no synchronisation; expose intermediate state; eventual, reliable delivery. No conflict: deterministic outcome of concurrent updates. No consensus needed, so it tolerates n-1 faults. Not universal. Solves the CAP problem.
30 The challenge: What interesting objects can we design with no synchronisation whatsoever?
31 Portfolio of CRDTs. Register: Last-Writer-Wins, Multi-Value. Set: Grow-Only, 2P, Observed-Remove. Map: Set of Registers. Counter: Unlimited, Non-negative. Graphs: Directed, Monotonic DAG, Edit graph. Sequence: Edit sequence.
32 Set design alternatives. Sequential specification: {true} add(e) {e ∈ S}; {true} remove(e) {e ∉ S}. Linearisable: sequential order equivalent to real-time order, but that requires consensus. What should concurrent add(e) ∥ remove(e) leave behind: {????}? Linearisable? Add wins? Remove wins? Last writer wins? An error state?
33 Observed-Remove Set. Replicas s1, s2, s3 all start at {}. Concurrent add(a) at s1 and s2 creates distinct tagged elements aα and aβ; rmv(a) at a replica removes only the tags that replica has observed, so merging eventually converges all replicas. One can never remove more tokens than exist; operation order guarantees that removed tokens have previously been added. Payload: sets added A and removed R of (element, unique-token) pairs. add(e): A ← A ∪ {(e, α)}, α fresh. remove(e): R ← R ∪ {(e, −) ∈ A}, i.e. all unique tokens observed for e. lookup(e) = ∃(e, −) ∈ A \ R. merge(S, S') = (A ∪ A', R ∪ R'). Concurrently, {true} add(e) ∥ remove(e) {e ∈ S}: add wins.
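The payload and rules above can be sketched as a minimal state-based OR-Set; the class and method names are illustrative, not from the talk:

```python
# Minimal state-based OR-Set sketch: payload is a pair of tag sets (A, R).
import uuid

class ORSet:
    def __init__(self):
        self.added = set()    # A: {(element, unique_tag)}
        self.removed = set()  # R: tombstones

    def add(self, e):
        self.added.add((e, uuid.uuid4().hex))  # fresh unique tag

    def remove(self, e):
        # Remove only the tags observed at this replica.
        self.removed |= {(x, t) for (x, t) in self.added if x == e}

    def lookup(self, e):
        return any(x == e for (x, _) in self.added - self.removed)

    def merge(self, other):
        # Component-wise union is a least upper bound, so merge is
        # commutative, associative, and idempotent.
        self.added |= other.added
        self.removed |= other.removed
```

Replaying the slide's scenario, a remove concurrent with a re-add cannot see the re-add's fresh tag, so after merging both replicas the element is still present: add wins.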
42 OR-Set: solves the Dynamo shopping-cart anomaly. Optimisations: just mark tombstones; garbage-collect tombstones; an operation-based approach.
43 Graph design alternatives. Graph = (V, E) where E ⊆ V × V. Sequential specification: {v, v' ∈ V} addEdge(v, v') {(v, v') ∈ E}; {no edge incident on v} removeVertex(v) {v ∉ V}. Concurrent removeVertex(v') ∥ addEdge(v, v'): linearisable? addEdge wins? removeVertex wins? For our web search engine application, removeVertex wins; do not check the precondition at add/remove time.
44 Graph. Payload = OR-Set V, OR-Set E. Updates add or remove in V and E: addVertex(v), removeVertex(v), addEdge(v, v'), removeEdge(v, v'). Do not enforce the invariant a priori; instead, lookupEdge(v, v') = (v, v') ∈ E ∧ v ∈ V ∧ v' ∈ V. Under concurrent removeVertex(v') ∥ addEdge(v, v'), remove wins.
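The "remove wins" rule can be sketched with plain sets standing in for the two OR-Sets (names are illustrative): an edge is stored freely, but it is only visible while both endpoints are still in the vertex set.

```python
# Sketch of the a-posteriori edge check: dangling edges stay stored,
# but lookup_edge filters them out, so removeVertex "wins".
class Graph:
    def __init__(self):
        self.V = set()  # would be an OR-Set in the full design
        self.E = set()  # likewise

    def add_vertex(self, v): self.V.add(v)
    def remove_vertex(self, v): self.V.discard(v)  # dangling edges remain in E
    def add_edge(self, u, v): self.E.add((u, v))
    def remove_edge(self, u, v): self.E.discard((u, v))

    def lookup_edge(self, u, v):
        # Invariant checked at read time, not enforced at update time.
        return (u, v) in self.E and u in self.V and v in self.V
```

This is why no precondition needs checking at update time: a concurrent removeVertex simply makes the edge invisible.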
49 Co-operative editing. Deep (internal) view: a DAG representing insert order, which ensures consistent ordering at all replicas. {x, z ∈ graph_i ∧ x < z} add-between_i(x, y, z) {y ∈ graph_i ∧ x < y < z}: add inserts only between already-ordered elements; strong order takes precedence over weak; begin and end sentinels bound the document. The local constraint implies the graph is globally acyclic. This specification is implemented directly by WOOT. The clean surface view summarises the total order.
59 Continuum. Assign each element a unique real-number position, so there is always room to insert between two neighbours. Real numbers are not appropriate in practice, so approximate them by a tree.
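The continuum idea can be sketched with exact fractions standing in for real positions (function names are illustrative); the point is that between any two positions there is always another, which finite machine precision cannot deliver, hence the tree approximation on the next slide:

```python
# Sketch: every character gets a position strictly between its neighbours.
from fractions import Fraction

def between(lo, hi):
    return (lo + hi) / 2  # always room for one more

doc = []  # list of (position, char), kept sorted by position

def insert(doc, lo, hi, ch):
    doc.append((between(lo, hi), ch))
    doc.sort()

insert(doc, Fraction(0), Fraction(1), 'N')
insert(doc, Fraction(0), doc[0][0], 'I')   # before 'N'
insert(doc, doc[-1][0], Fraction(1), 'A')  # after 'N'
# The document reads off in position order: "INA".
```

With floats instead of Fractions, repeated insertion at the same spot would exhaust precision; a tree of positions avoids that.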
63 Treedoc: binary tree. Low arity: binary or quaternary. A binary naming tree is compact and self-adjusting, with logarithmic properties (logarithmic assuming the tree stays balanced). The infix traversal of the tree spells the document, e.g. = L I N R I A. add appends a leaf, non-destructively: existing IDs don't change. remove leaves a tombstone: IDs don't change.
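The identifier order Treedoc relies on can be sketched as infix comparison of binary tree paths. This is a simplification: real Treedoc identifiers also carry disambiguators for concurrent inserts at the same position.

```python
# Positions are bit tuples: () is the root, p+(0,) its left child,
# p+(1,) its right child. Infix (left-to-right) order spells the document.
import functools

def infix_less(p, q):
    """True if position p comes before q in infix traversal order."""
    for a, b in zip(p, q):
        if a != b:
            return a < b          # left subtree comes before right subtree
    if len(p) < len(q):           # p is an ancestor of q
        return q[len(p)] == 1     # q in p's right subtree => p comes first
    if len(p) > len(q):           # q is an ancestor of p
        return p[len(q)] == 0     # p in q's left subtree => p comes first
    return False

# Spell a document by infix traversal of labelled positions
# (labels chosen to match the slide's example word):
nodes = {(0, 0): 'L', (0,): 'I', (0, 1): 'N', (): 'R', (1,): 'I', (1, 1): 'A'}
cmp = functools.cmp_to_key(lambda p, q: -1 if infix_less(p, q) else 1)
word = ''.join(nodes[p] for p in sorted(nodes, key=cmp))
```

Because an insert between two positions only ever appends a new leaf, existing identifiers never change, which is what makes remove-as-tombstone safe.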
68 Layered Treedoc. Two layers: a sparse n-ary tree whose nodes are disambiguated by site identifiers (e.g. Site 34, Site 79, Site 22, Site 66), each node containing a dense binary tree. Edits go into the binary tree; concurrency is handled in the sparse tree.
74 The theory: two simple conditions for correctness without synchronisation.
75 Query. A query executes locally, at a single source replica of the client's choice (e.g. s1.q(a) at one replica, s2.q(b) at another). Example: the Amazon shopping cart is replicated; an unspecified client (e.g. a web front-end) is directed to one or another replica by a load balancer, and failures may direct the same client to different replicas.
78 State-based replication. Updates are local at the source (s1.u(a), s2.u(b), ...): compute, then update the local payload. Convergence: episodically, a replica sends its payload s_i to another; on delivery, the receiver merges the incoming payload into its own (e.g. s2.m(s1), s3.m(s2), s1.m(s2)). merge must combine two valid states into a valid state, with no historical information available. Inefficient if the payload is large.
85 State-based: monotonic semi-lattice ⇒ CRDT. If the payload type forms a semi-lattice, updates are increasing, and merge computes the Least Upper Bound (LUB), then replicas converge to the LUB of the last values, with no reference to history. Example: payload = int, merge = max.
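A standard illustration of the semi-lattice condition is a grow-only counter (a minimal sketch; not from the talk): the payload is a per-replica count map, updates only increase entries, and merge takes the entry-wise maximum, which is a least upper bound.

```python
# G-Counter: the payload forms a monotonic semi-lattice under
# entry-wise max, so merge is commutative, associative, idempotent.
class GCounter:
    def __init__(self, rid):
        self.rid = rid       # this replica's id
        self.counts = {}     # replica id -> local increment count

    def inc(self, n=1):
        # Updates are increasing: an entry only ever grows.
        self.counts[self.rid] = self.counts.get(self.rid, 0) + n

    def value(self):
        return sum(self.counts.values())

    def merge(self, other):
        # Least upper bound: entry-wise maximum.
        for r, c in other.counts.items():
            self.counts[r] = max(self.counts.get(r, 0), c)
```

Because merge is idempotent, delivering the same state twice is harmless, which is what makes episodic, unreliable state exchange sufficient.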
86 Operation-based replication. At the source: prepare the operation, then broadcast it to all replicas. Eventually, at all replicas: the operation is delivered downstream and updates the local replica (e.g. u(b) prepared at s2 is later applied as s1.u(b) and s3.u(b)). Pushing small updates is more efficient than shipping whole states.
89 Op-based: commutativity ⇒ CRDT. The delivery order ensures the downstream precondition (happened-before or weaker). If (liveness) all replicas execute all operations in delivery order, and (safety) concurrent operations all commute, then replicas converge.
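The safety condition can be demonstrated with the simplest commuting operations, counter increments and decrements (a toy sketch, not from the talk): any delivery order of the same set of concurrent operations yields the same state.

```python
# Concurrent operations that commute converge regardless of delivery order.
import itertools

def deliver(state, ops):
    # Apply operations in the given delivery order.
    for op in ops:
        state += op
    return state

ops = [+5, -3, +1]   # concurrent increments/decrements
# Every permutation of the delivery order reaches the same state:
states = {deliver(0, p) for p in itertools.permutations(ops)}
```

Non-commuting pairs (e.g. "set to 5" vs "add 1") are exactly what the delivery-order precondition must serialise, or what the data type design must avoid.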
90 Monotonic semi-lattice vs. commutativity. A systematic transformation exists in both directions, though the generic emulation is inefficient compared with a hand-crafted op-based implementation: 1. A state-based object can emulate an operation-based object, and vice versa. 2. State-based emulation of a CmRDT is a CvRDT. 3. Operation-based emulation of a CvRDT is a CmRDT.
91 Operation-based OR-Set. Payload: S = {(e, α), (e, β), (e', γ), ...} where α, β, ... are unique. Operations: lookup(e) = ∃α : (e, α) ∈ S. add(e): S ← S ∪ {(e, α)}, where α is fresh. remove(e): at the source, R = {(e, α) ∈ S}, the set of IDs associated with e as observed by the source; downstream, S ← S \ R. No tombstones. Concurrently, {true} add(e) ∥ remove(e) {e ∈ S}: add wins.
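The prepare/downstream split above can be sketched as follows (class and method names are illustrative; the shared counter is a stand-in for per-site unique-ID generation):

```python
# Op-based OR-Set sketch: remove ships the tags observed at the source,
# so a concurrent add (with a fresh tag) survives -- no tombstones needed.
import itertools

_fresh = itertools.count()   # stand-in for (site, counter) unique IDs

class OpORSet:
    def __init__(self):
        self.S = set()       # {(element, unique_tag)}

    def prepare_add(self, e):
        # At the source: pick a fresh tag; the op is then broadcast.
        return ('add', e, next(_fresh))

    def prepare_remove(self, e):
        # At the source: capture only the tags observed here.
        return ('rmv', frozenset(t for (x, t) in self.S if x == e))

    def effect(self, op):
        # Downstream: executed at every replica, including the source.
        if op[0] == 'add':
            self.S.add((op[1], op[2]))
        else:
            self.S = {(x, t) for (x, t) in self.S if t not in op[1]}

    def lookup(self, e):
        return any(x == e for (x, _) in self.S)
```

A remove prepared before a concurrent add does not mention the add's fresh tag, so both replicas converge with the element present: add wins, and no tombstone set is needed.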
96 Ongoing work
97 CRDTs for P2P & cloud computing (ConcoRDanT, ANR). A systematic study of the conflict-free design space, in theory and practice: characterise invariants; build a library of data types. CRDTs are not universal: conflict-free vs. conflict semantics; move consensus off the critical path, to non-critical operations.
98 CRDT + dataflow. Example pipeline: crawl web sites into a Set (whitelist); extract links into a Graph (with a spam detector); store content in a Map from URL to the last 2 versions; extract words into a Map from word to Set of URLs. Incremental, asynchronous processing: replicate and shard CRDTs near the edge; propagate updates by dataflow; throttle according to QoS metrics (freshness, availability, cost, etc.). Scale by sharding; synchronous processing (snapshots) happens at the centre.
99 OR-Set + snapshot. Read a consistent snapshot despite concurrent, incremental updates. Unique token = time (vector clock): α = Lamport pair (process i, counter t). UIDs identify the snapshot version; a snapshot is a vector-clock value. Retain tombstones until no longer needed. lookup(e, t) = ∃(e, i, t') ∈ A : t' ≤ t ∧ ∄(e, i, t'') ∈ R : t'' ≤ t.
100 Sharded OR-Set. Very large objects are split into independent shards, assigned statically by hash (dynamic rebalancing would require consensus). In a statically-sharded CRDT, each shard is a CRDT, an update touches a single shard, and there are no cross-shard invariants, so the combination remains a CRDT. A statically-sharded OR-Set is a combination of smaller OR-Sets; a snapshot takes a clock across shards.
101 Take-aways. A principled approach: Strong Eventual Consistency, with two sufficient conditions: state-based, a monotonic semi-lattice; operation-based, commutativity. Useful CRDTs: Register, Counter, Set, Map (KVS), Graph, Monotonic DAG, Sequence. Future work: snapshots, sharding, dataflow, a wee bit of synchronisation.
102 Portfolio of CRDTs. Register: Last-Writer-Wins, Multi-Value. Set: Grow-Only, 2P, Observed-Remove. Map: Set of Registers. Counter: Unlimited, Non-negative. Graphs: Directed, Monotonic DAG, Edit graph. Sequence: Edit sequence.
More informationExtending Eventually Consistent Cloud Databases for Enforcing Numeric Invariants
Extending Eventually Consistent Cloud Databases for Enforcing Numeric Invariants Valter Balegas, Diogo Serra, Sérgio Duarte Carla Ferreira, Rodrigo Rodrigues, Nuno Preguiça NOVA LINCS/FCT/Universidade
More informationFrom strong to eventual consistency: getting it right. Nuno Preguiça, U. Nova de Lisboa Marc Shapiro, Inria & UPMC-LIP6
From strong to eventual consistency: getting it right Nuno Preguiça, U. Nova de Lisboa Marc Shapiro, Inria & UPMC-LIP6 Conflict-free Replicated Data Types ı Marc Shapiro 1,5, Nuno Preguiça 2,1, Carlos
More informationScaling KVS. CS6450: Distributed Systems Lecture 14. Ryan Stutsman
Scaling KVS CS6450: Distributed Systems Lecture 14 Ryan Stutsman Material taken/derived from Princeton COS-418 materials created by Michael Freedman and Kyle Jamieson at Princeton University. Licensed
More informationDynamo: Key-Value Cloud Storage
Dynamo: Key-Value Cloud Storage Brad Karp UCL Computer Science CS M038 / GZ06 22 nd February 2016 Context: P2P vs. Data Center (key, value) Storage Chord and DHash intended for wide-area peer-to-peer systems
More informationEECS 498 Introduction to Distributed Systems
EECS 498 Introduction to Distributed Systems Fall 2017 Harsha V. Madhyastha Replicated State Machines Logical clocks Primary/ Backup Paxos? 0 1 (N-1)/2 No. of tolerable failures October 11, 2017 EECS 498
More informationDynamo: Amazon s Highly Available Key-value Store. ID2210-VT13 Slides by Tallat M. Shafaat
Dynamo: Amazon s Highly Available Key-value Store ID2210-VT13 Slides by Tallat M. Shafaat Dynamo An infrastructure to host services Reliability and fault-tolerance at massive scale Availability providing
More informationThis talk is about. Data consistency in 3D. Geo-replicated database q: Queue c: Counter { q c } Shared database q: Queue c: Counter { q c }
Data consistency in 3D (It s the invariants, stupid) Marc Shapiro Masoud Saieda Ardekani Gustavo Petri This talk is about Understanding consistency Primitive consistency mechanisms How primitives compose
More informationSelective Hearing: An Approach to Distributed, Eventually Consistent Edge Computation
Selective Hearing: An Approach to Distributed, Eventually Consistent Edge Computation Christopher Meiklejohn Basho Technologies, Inc. Bellevue, WA Email: cmeiklejohn@basho.com Peter Van Roy Université
More informationTARDiS: A branch and merge approach to weak consistency
TARDiS: A branch and merge approach to weak consistency By: Natacha Crooks, Youer Pu, Nancy Estrada, Trinabh Gupta, Lorenzo Alvis, Allen Clement Presented by: Samodya Abeysiriwardane TARDiS Transactional
More informationWhy distributed databases suck, and what to do about it. Do you want a database that goes down or one that serves wrong data?"
Why distributed databases suck, and what to do about it - Regaining consistency Do you want a database that goes down or one that serves wrong data?" 1 About the speaker NoSQL team lead at Trifork, Aarhus,
More informationBuilding Consistent Transactions with Inconsistent Replication
DB Reading Group Fall 2015 slides by Dana Van Aken Building Consistent Transactions with Inconsistent Replication Irene Zhang, Naveen Kr. Sharma, Adriana Szekeres, Arvind Krishnamurthy, Dan R. K. Ports
More informationCS 138: Dynamo. CS 138 XXIV 1 Copyright 2017 Thomas W. Doeppner. All rights reserved.
CS 138: Dynamo CS 138 XXIV 1 Copyright 2017 Thomas W. Doeppner. All rights reserved. Dynamo Highly available and scalable distributed data store Manages state of services that have high reliability and
More informationReplication in Distributed Systems
Replication in Distributed Systems Replication Basics Multiple copies of data kept in different nodes A set of replicas holding copies of a data Nodes can be physically very close or distributed all over
More informationDistributed Data Management Replication
Felix Naumann F-2.03/F-2.04, Campus II Hasso Plattner Institut Distributing Data Motivation Scalability (Elasticity) If data volume, processing, or access exhausts one machine, you might want to spread
More informationOptimized OR-Sets Without Ordering Constraints
Optimized OR-Sets Without Ordering Constraints Madhavan Mukund, Gautham Shenoy R, and S P Suresh Chennai Mathematical Institute, India {madhavan,gautshen,spsuresh}@cmi.ac.in Abstract. Eventual consistency
More informationCSE-E5430 Scalable Cloud Computing Lecture 10
CSE-E5430 Scalable Cloud Computing Lecture 10 Keijo Heljanko Department of Computer Science School of Science Aalto University keijo.heljanko@aalto.fi 23.11-2015 1/29 Exam Registering for the exam is obligatory,
More informationIntroduction to Distributed Data Systems
Introduction to Distributed Data Systems Serge Abiteboul Ioana Manolescu Philippe Rigaux Marie-Christine Rousset Pierre Senellart Web Data Management and Distribution http://webdam.inria.fr/textbook January
More informationCAP Theorem, BASE & DynamoDB
Indian Institute of Science Bangalore, India भ रत य व ज ञ न स स थ न ब गल र, भ रत DS256:Jan18 (3:1) Department of Computational and Data Sciences CAP Theorem, BASE & DynamoDB Yogesh Simmhan Yogesh Simmhan
More informationWeak Consistency and Disconnected Operation in git. Raymond Cheng
Weak Consistency and Disconnected Operation in git Raymond Cheng ryscheng@cs.washington.edu Motivation How can we support disconnected or weakly connected operation? Applications File synchronization across
More information10. Replication. Motivation
10. Replication Page 1 10. Replication Motivation Reliable and high-performance computation on a single instance of a data object is prone to failure. Replicate data to overcome single points of failure
More informationarxiv: v1 [cs.dc] 9 Jan 2018
Search on Secondary Attributes in Geo-Distributed Systems Dimitrios Vasilas 1,2, Marc Shapiro 1, and Bradley King 2 arxiv:1801.02974v1 [cs.dc] 9 Jan 2018 1 Inria & Sorbonne Universités, UPMC Univ Paris
More informationDynamo. Smruti R. Sarangi. Department of Computer Science Indian Institute of Technology New Delhi, India. Motivation System Architecture Evaluation
Dynamo Smruti R. Sarangi Department of Computer Science Indian Institute of Technology New Delhi, India Smruti R. Sarangi Leader Election 1/20 Outline Motivation 1 Motivation 2 3 Smruti R. Sarangi Leader
More informationDeterministic Threshold Queries of Distributed Data Structures
Deterministic Threshold Queries of Distributed Data Structures Lindsey Kuper and Ryan R. Newton Indiana University {lkuper, rrnewton}@cs.indiana.edu Abstract. Convergent replicated data types, or CvRDTs,
More informationRiak. Distributed, replicated, highly available
INTRO TO RIAK Riak Overview Riak Distributed Riak Distributed, replicated, highly available Riak Distributed, highly available, eventually consistent Riak Distributed, highly available, eventually consistent,
More informationA Framework for Transactional Consistency Models with Atomic Visibility
A Framework for Transactional Consistency Models with Atomic Visibility Andrea Cerone, Giovanni Bernardi, Alexey Gotsman IMDEA Software Institute, Madrid, Spain CONCUR - Madrid, September 1st 2015 Data
More informationChapter 11 - Data Replication Middleware
Prof. Dr.-Ing. Stefan Deßloch AG Heterogene Informationssysteme Geb. 36, Raum 329 Tel. 0631/205 3275 dessloch@informatik.uni-kl.de Chapter 11 - Data Replication Middleware Motivation Replication: controlled
More informationLasp: A Language for Distributed, Coordination-Free Programming
Lasp: A Language for Distributed, Coordination-Free Programming Christopher Meiklejohn Basho Technologies, Inc. cmeiklejohn@basho.com Peter Van Roy Université catholique de Louvain peter.vanroy@uclouvain.be
More informationSMAC: State Management for Geo-Distributed Containers
SMAC: State Management for Geo-Distributed Containers Jacob Eberhardt, Dominik Ernst, David Bermbach Information Systems Engineering Research Group Technische Universitaet Berlin Berlin, Germany Email:
More informationLinearizability CMPT 401. Sequential Consistency. Passive Replication
Linearizability CMPT 401 Thursday, March 31, 2005 The execution of a replicated service (potentially with multiple requests interleaved over multiple servers) is said to be linearizable if: The interleaved
More informationDISTRIBUTED EVENTUALLY CONSISTENT COMPUTATIONS
LASP 2 DISTRIBUTED EVENTUALLY CONSISTENT COMPUTATIONS 3 EN TAL AV CHRISTOPHER MEIKLEJOHN 4 RESEARCH WITH: PETER VAN ROY (UCL) 5 MOTIVATION 6 SYNCHRONIZATION IS EXPENSIVE 7 SYNCHRONIZATION IS SOMETIMES
More informationConsistency in Distributed Systems
Consistency in Distributed Systems Recall the fundamental DS properties DS may be large in scale and widely distributed 1. concurrent execution of components 2. independent failure modes 3. transmission
More informationCS Amazon Dynamo
CS 5450 Amazon Dynamo Amazon s Architecture Dynamo The platform for Amazon's e-commerce services: shopping chart, best seller list, produce catalog, promotional items etc. A highly available, distributed
More informationCS6450: Distributed Systems Lecture 13. Ryan Stutsman
Eventual Consistency CS6450: Distributed Systems Lecture 13 Ryan Stutsman Material taken/derived from Princeton COS-418 materials created by Michael Freedman and Kyle Jamieson at Princeton University.
More informationMutual consistency, what for? Replication. Data replication, consistency models & protocols. Difficulties. Execution Model.
Replication Data replication, consistency models & protocols C. L. Roncancio - S. Drapeau Grenoble INP Ensimag / LIG - Obeo 1 Data and process Focus on data: several physical of one logical object What
More informationAvailability versus consistency. Eventual Consistency: Bayou. Eventual consistency. Bayou: A Weakly Connected Replicated Storage System
Eventual Consistency: Bayou Availability versus consistency Totally-Ordered Multicast kept replicas consistent but had single points of failure Not available under failures COS 418: Distributed Systems
More informationAutomating the Choice of Consistency Levels in Replicated Systems
Automating the Choice of Consistency Levels in Replicated Systems Cheng Li, João Leitão, Allen Clement Nuno Preguiça, Rodrigo Rodrigues, Viktor Vafeiadis Max Planck Institute for Software Systems (MPI-SWS),
More informationCSE 544 Principles of Database Management Systems. Magdalena Balazinska Winter 2015 Lecture 14 NoSQL
CSE 544 Principles of Database Management Systems Magdalena Balazinska Winter 2015 Lecture 14 NoSQL References Scalable SQL and NoSQL Data Stores, Rick Cattell, SIGMOD Record, December 2010 (Vol. 39, No.
More informationCitrea and Swarm: partially ordered op logs in the browser
Citrea and Swarm: partially ordered op logs in the browser Victor Grishchenko Citrea LLC victor.grishchenko@gmail.com 25 March 2014 Abstract Principles of eventual consistency are normally applied in largescale
More informationDIVING IN: INSIDE THE DATA CENTER
1 DIVING IN: INSIDE THE DATA CENTER Anwar Alhenshiri Data centers 2 Once traffic reaches a data center it tunnels in First passes through a filter that blocks attacks Next, a router that directs it to
More informationReplicated State Machine in Wide-area Networks
Replicated State Machine in Wide-area Networks Yanhua Mao CSE223A WI09 1 Building replicated state machine with consensus General approach to replicate stateful deterministic services Provide strong consistency
More informationDYNAMO: AMAZON S HIGHLY AVAILABLE KEY-VALUE STORE. Presented by Byungjin Jun
DYNAMO: AMAZON S HIGHLY AVAILABLE KEY-VALUE STORE Presented by Byungjin Jun 1 What is Dynamo for? Highly available key-value storages system Simple primary-key only interface Scalable and Reliable Tradeoff:
More informationBasic vs. Reliable Multicast
Basic vs. Reliable Multicast Basic multicast does not consider process crashes. Reliable multicast does. So far, we considered the basic versions of ordered multicasts. What about the reliable versions?
More informationReplication and Consistency
Replication and Consistency Today l Replication l Consistency models l Consistency protocols The value of replication For reliability and availability Avoid problems with disconnection, data corruption,
More informationWhat Came First? The Ordering of Events in
What Came First? The Ordering of Events in Systems @kavya719 kavya the design of concurrent systems Slack architecture on AWS systems with multiple independent actors. threads in a multithreaded program.
More informationDynamo Tom Anderson and Doug Woos
Dynamo motivation Dynamo Tom Anderson and Doug Woos Fast, available writes - Shopping cart: always enable purchases FLP: consistency and progress at odds - Paxos: must communicate with a quorum Performance:
More informationTAPIR. By Irene Zhang, Naveen Sharma, Adriana Szekeres, Arvind Krishnamurthy, and Dan Ports Presented by Todd Charlton
TAPIR By Irene Zhang, Naveen Sharma, Adriana Szekeres, Arvind Krishnamurthy, and Dan Ports Presented by Todd Charlton Outline Problem Space Inconsistent Replication TAPIR Evaluation Conclusion Problem
More informationPushyDB. Jeff Chan, Kenny Lam, Nils Molina, Oliver Song {jeffchan, kennylam, molina,
PushyDB Jeff Chan, Kenny Lam, Nils Molina, Oliver Song {jeffchan, kennylam, molina, osong}@mit.edu https://github.com/jeffchan/6.824 1. Abstract PushyDB provides a more fully featured database that exposes
More informationCS 655 Advanced Topics in Distributed Systems
Presented by : Walid Budgaga CS 655 Advanced Topics in Distributed Systems Computer Science Department Colorado State University 1 Outline Problem Solution Approaches Comparison Conclusion 2 Problem 3
More informationThere is a tempta7on to say it is really used, it must be good
Notes from reviews Dynamo Evalua7on doesn t cover all design goals (e.g. incremental scalability, heterogeneity) Is it research? Complexity? How general? Dynamo Mo7va7on Normal database not the right fit
More informationExtreme Computing. NoSQL.
Extreme Computing NoSQL PREVIOUSLY: BATCH Query most/all data Results Eventually NOW: ON DEMAND Single Data Points Latency Matters One problem, three ideas We want to keep track of mutable state in a scalable
More informationACMS: The Akamai Configuration Management System. A. Sherman, P. H. Lisiecki, A. Berkheimer, and J. Wein
ACMS: The Akamai Configuration Management System A. Sherman, P. H. Lisiecki, A. Berkheimer, and J. Wein Instructor: Fabian Bustamante Presented by: Mario Sanchez The Akamai Platform Over 15,000 servers
More informationChapter 24 NOSQL Databases and Big Data Storage Systems
Chapter 24 NOSQL Databases and Big Data Storage Systems - Large amounts of data such as social media, Web links, user profiles, marketing and sales, posts and tweets, road maps, spatial data, email - NOSQL
More informationDatabases. Laboratorio de sistemas distribuidos. Universidad Politécnica de Madrid (UPM)
Databases Laboratorio de sistemas distribuidos Universidad Politécnica de Madrid (UPM) http://lsd.ls.fi.upm.es/lsd/lsd.htm Nuevas tendencias en sistemas distribuidos 2 Summary Transactions. Isolation.
More informationRecap. CSE 486/586 Distributed Systems Case Study: Amazon Dynamo. Amazon Dynamo. Amazon Dynamo. Necessary Pieces? Overview of Key Design Techniques
Recap CSE 486/586 Distributed Systems Case Study: Amazon Dynamo Steve Ko Computer Sciences and Engineering University at Buffalo CAP Theorem? Consistency, Availability, Partition Tolerance P then C? A?
More informationConsistency and Replication. Why replicate?
Consistency and Replication Today: Consistency models Data-centric consistency models Client-centric consistency models Lecture 15, page 1 Why replicate? Data replication versus compute replication Data
More informationIntroduction to Distributed Systems Seif Haridi
Introduction to Distributed Systems Seif Haridi haridi@kth.se What is a distributed system? A set of nodes, connected by a network, which appear to its users as a single coherent system p1 p2. pn send
More information#IoT #BigData. 10/31/14
#IoT #BigData Seema Jethani @seemaj @basho 1 10/31/14 Why should we care? 2 11/2/14 Source: http://en.wikipedia.org/wiki/internet_of_things Motivation for Specialized Big Data Systems Rate of data capture
More informationEventual Consistency 1
Eventual Consistency 1 Readings Werner Vogels ACM Queue paper http://queue.acm.org/detail.cfm?id=1466448 Dynamo paper http://www.allthingsdistributed.com/files/ amazon-dynamo-sosp2007.pdf Apache Cassandra
More informationSCALABLE CONSISTENCY AND TRANSACTION MODELS
Data Management in the Cloud SCALABLE CONSISTENCY AND TRANSACTION MODELS 69 Brewer s Conjecture Three properties that are desirable and expected from realworld shared-data systems C: data consistency A:
More information殷亚凤. Consistency and Replication. Distributed Systems [7]
Consistency and Replication Distributed Systems [7] 殷亚凤 Email: yafeng@nju.edu.cn Homepage: http://cs.nju.edu.cn/yafeng/ Room 301, Building of Computer Science and Technology Review Clock synchronization
More information11. Replication. Motivation
11. Replication Seite 1 11. Replication Motivation Reliable and high-performance computation on a single instance of a data object is prone to failure. Replicate data to overcome single points of failure
More informationCoordination-Free Computations. Christopher Meiklejohn
Coordination-Free Computations Christopher Meiklejohn LASP DISTRIBUTED, EVENTUALLY CONSISTENT COMPUTATIONS CHRISTOPHER MEIKLEJOHN (BASHO TECHNOLOGIES, INC.) PETER VAN ROY (UNIVERSITÉ CATHOLIQUE DE LOUVAIN)
More informationPresented By: Devarsh Patel
: Amazon s Highly Available Key-value Store Presented By: Devarsh Patel CS5204 Operating Systems 1 Introduction Amazon s e-commerce platform Requires performance, reliability and efficiency To support
More informationCS /15/16. Paul Krzyzanowski 1. Question 1. Distributed Systems 2016 Exam 2 Review. Question 3. Question 2. Question 5.
Question 1 What makes a message unstable? How does an unstable message become stable? Distributed Systems 2016 Exam 2 Review Paul Krzyzanowski Rutgers University Fall 2016 In virtual sychrony, a message
More informationCORFU: A Shared Log Design for Flash Clusters
CORFU: A Shared Log Design for Flash Clusters Authors: Mahesh Balakrishnan, Dahlia Malkhi, Vijayan Prabhakaran, Ted Wobber, Michael Wei, John D. Davis EECS 591 11/7/18 Presented by Evan Agattas and Fanzhong
More informationComputing Parable. The Archery Teacher. Courtesy: S. Keshav, U. Waterloo. Computer Science. Lecture 16, page 1
Computing Parable The Archery Teacher Courtesy: S. Keshav, U. Waterloo Lecture 16, page 1 Consistency and Replication Today: Consistency models Data-centric consistency models Client-centric consistency
More informationReplication and Consistency. Fall 2010 Jussi Kangasharju
Replication and Consistency Fall 2010 Jussi Kangasharju Chapter Outline Replication Consistency models Distribution protocols Consistency protocols 2 Data Replication user B user C user A object object
More informationDISTRIBUTED COMPUTER SYSTEMS
DISTRIBUTED COMPUTER SYSTEMS CONSISTENCY AND REPLICATION CONSISTENCY MODELS Dr. Jack Lange Computer Science Department University of Pittsburgh Fall 2015 Consistency Models Background Replication Motivation
More informationRecap. CSE 486/586 Distributed Systems Case Study: Amazon Dynamo. Amazon Dynamo. Amazon Dynamo. Necessary Pieces? Overview of Key Design Techniques
Recap Distributed Systems Case Study: Amazon Dynamo CAP Theorem? Consistency, Availability, Partition Tolerance P then C? A? Eventual consistency? Availability and partition tolerance over consistency
More informationIPA: Invariant-Preserving Applications for Weakly Consistent Replicated Databases
IPA: Invariant-Preserving Applications for Weakly Consistent Replicated Databases Valter Balegas NOVA LINCS, FCT, Universidade NOVA de Lisboa v.sousa@campus.fct.unl.pt Rodrigo Rodrigues INESC-ID, Instituto
More informationZHT: Const Eventual Consistency Support For ZHT. Group Member: Shukun Xie Ran Xin
ZHT: Const Eventual Consistency Support For ZHT Group Member: Shukun Xie Ran Xin Outline Problem Description Project Overview Solution Maintains Replica List for Each Server Operation without Primary Server
More informationDrRobert N. M. Watson
Distributed systems Lecture 15: Replication, quorums, consistency, CAP, and Amazon/Google case studies DrRobert N. M. Watson 1 Last time General issue of consensus: How to get processes to agree on something
More informationCS November 2017
Bigtable Highly available distributed storage Distributed Systems 18. Bigtable Built with semi-structured data in mind URLs: content, metadata, links, anchors, page rank User data: preferences, account
More informationTransactional storage for geo-replicated systems
Transactional storage for geo-replicated systems Yair Sovran Russell Power Marcos K. Aguilera Jinyang Li New York University Microsoft Research Silicon Valley ABSTRACT We describe the design and implementation
More informationDistributed Systems. Characteristics of Distributed Systems. Lecture Notes 1 Basic Concepts. Operating Systems. Anand Tripathi
1 Lecture Notes 1 Basic Concepts Anand Tripathi CSci 8980 Operating Systems Anand Tripathi CSci 8980 1 Distributed Systems A set of computers (hosts or nodes) connected through a communication network.
More information