Data synchronization Conflict resolution strategies

1 Data synchronization Conflict resolution strategies Waldemar Korłub Department of Computer Architecture Faculty of Electronics, Telecommunications and Informatics Gdansk University of Technology November 6, 2017 Waldemar Korłub (ETI PG) Sync November 6, / 57

2 Motivations
- From the user's point of view it is usually desirable to have data stored both locally (quick access, off-line operations) and on the remote server (backup, syncing across devices).
- When the same data is modified on multiple devices, a conflict may occur during synchronization:
  - users with multiple devices, using different appliances to modify the same data, e.g. utility apps (note taking, TODO management) and games!
  - multiple users modifying the same data, e.g. warehouse management systems, apps for shipping companies

3 Strong consistency vs eventual consistency
Strong consistency:
- after data is changed, all access attempts on all devices in a distributed system immediately return a value that reflects the change
- some write operations might not succeed (e.g. when not all nodes holding replicas of the data are available to propagate the change)
- e.g. clustered SQL database, ACID (Atomicity, Consistency, Isolation, Durability)
Eventual consistency:
- after data is changed, access attempts will eventually return a value that reflects the change
- before all replicas are synchronized, some nodes see the updated value and others see the old value
- e.g. DNS servers, BASE (Basically Available, Soft state, Eventual consistency)

4 Mobile setting and the CAP theorem
CAP theorem:
- C: consistency (understood as strong consistency)
- A: availability
- P: partition tolerance
Choose any two!
Mobile application setting:
- no guarantees about network availability, communication loss highly probable: partition tolerance required
- users expect the app to be operational at all times, regardless of network status or other circumstances: availability required
- that leaves no love for consistency: strong consistency has to be given up

5 Disclaimer: Conflict resolution in a distributed environment is a hard problem in computer science

6 But some cases are easier than others...

7 Sample game
- player collects coins
- coins can be used to unlock new levels
- coins can be used to unlock new characters
- player can choose which character to play with
- there are highscores showing the player's 5 best scores
- there are global leaderboards showing the best results of players from all around the world
- there are power-ups that the player can collect during the game (e.g. increased speed, temporary immunity)
- there are achievements (e.g. for collecting all available power-ups)

8 Simple case #1: Newer is always better
- In some cases a newer version of the data is always preferred over the older one.
- E.g. user's preferences: when a player chooses a new character to play with, the new choice always invalidates the previous one.
- Conflict resolution is based on a timestamp stored alongside the data. Take care of timezone differences!
- $t$: timestamp, $d$: actual data
- data on the device: $(d, t)$
- data in the cloud: $(d', t')$
- conflict resolution: the pair with the newer timestamp
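The timestamp rule can be sketched in a few lines. This is a minimal illustration, not the lecture's implementation: the `Versioned` type, the `resolve_newer` name and the tie-breaking choice (keep the remote copy) are assumptions. Using timezone-aware timestamps is one way to handle the timezone caveat, since such values compare as instants regardless of their UTC offset.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Versioned:
    """A piece of data paired with the moment it was last changed."""
    data: str
    timestamp: datetime  # assumed to be timezone-aware


def resolve_newer(local: Versioned, remote: Versioned) -> Versioned:
    """Return the pair with the newer timestamp.

    On a tie the remote copy is kept; that choice is arbitrary.
    """
    return local if local.timestamp > remote.timestamp else remote


# 12:00 at UTC+1 is 11:00 UTC, so the cloud copy (11:30 UTC) is newer
# even though its wall-clock reading looks earlier.
local = Versioned("knight", datetime(2017, 11, 6, 12, 0,
                                     tzinfo=timezone(timedelta(hours=1))))
remote = Versioned("wizard", datetime(2017, 11, 6, 11, 30,
                                      tzinfo=timezone.utc))
winner = resolve_newer(local, remote)
```

Comparing naive local wall-clock times instead would pick the wrong winner in this example, which is exactly the pitfall the slide warns about.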

9 Simple case #2: One version of data is clearly better than the other
- E.g. highscore: the higher the score, the better.
- To resolve the conflict choose the greatest value.
- $h$: highscore
- data on the device: $h$
- data in the cloud: $h'$
- conflict resolution: $\max(h, h')$

10 Simple case #3: Remote data is always better
- Data that the user does not have ownership of.
- E.g. global leaderboard, TOP100 players from all around the world:
  - the player does not own this data and cannot change it (at least not directly: to influence the global leaderboard the player has to score a high result, but even then the new leaderboard is generated and owned by the game operator)
  - a new version is generated on the server based on highscores of individual players
  - data is never changed locally; if the server-side version is different, it replaces locally cached data
- E.g. local cache, avatars of other players
- data on the device: $A$
- data in the cloud: $A'$
- conflict resolution: $A'$

11 Simple case #4: Local data is always better
- Device-specific information, e.g. device-specific preferences.
- data on the device: $A$
- data in the cloud: $A'$
- conflict resolution: $A$

12 Simple case #5a: Union of conflicting datasets
- E.g. the set of unlocked levels: a player can unlock different levels on different devices; when the data is synchronized, the player should have access to all unlocked levels, no matter on which device a particular level was unlocked.
- $L$: the set of all levels
- data on the device: $A = \{l : l \in L \text{ and } l \text{ is unlocked on the local device}\}$
- data in the cloud: $A' = \{l : l \in L \text{ and } l \text{ is unlocked on any other device}\}$
- conflict resolution: $A \cup A'$, union(A, A')

13 Simple case #5b: Logical disjunction (OR operator)
- Calculate the union using the OR operator on a vector of traits.
- $n$: the number of levels
- data on the device: $L = [l_1, l_2, \dots, l_n]$, where $l_i = 1$ if the $i$-th level is unlocked on the local device and $l_i = 0$ otherwise
- data in the cloud: $L' = [l'_1, l'_2, \dots, l'_n]$
- conflict resolution: bitwise-or(L, L')
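Cases #5a and #5b can be illustrated side by side. The sketch below assumes levels are identified by small integers and that bit i of the mask stands for level i; the function names are made up for the example.

```python
def merge_unlocked(local: set, remote: set) -> set:
    """Case #5a: a level unlocked anywhere stays unlocked everywhere."""
    return local | remote


def merge_unlocked_bits(local: int, remote: int) -> int:
    """Case #5b: the same merge on bit vectors, one bit per level."""
    return local | remote


# Levels 1 and 2 unlocked locally, levels 2 and 5 unlocked elsewhere.
merged_set = merge_unlocked({1, 2}, {2, 5})            # {1, 2, 5}
merged_bits = merge_unlocked_bits(0b00011, 0b10010)    # 0b10011
```

Both variants are commutative and idempotent, so repeating a merge or merging in a different order gives the same result.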

14 Simple case #6a: Intersection of conflicting datasets
- E.g. the set of power-ups that were not yet encountered by the player: when the player collects a new power-up, it is removed from the set; when all power-ups are collected, the user gains an achievement.
- Collected power-ups should be counted no matter on which device they were encountered, so the resolved set of remaining power-ups becomes the intersection of the conflicting sets.
- $P$: the set of all power-ups
- data on the device: $A = \{p : p \in P \text{ and } p \text{ is not marked as collected on the device}\}$
- data in the cloud: $A' = \{p : p \in P \text{ and } p \text{ is not marked as collected on other devices}\}$
- conflict resolution: $A \cap A'$, intersection(A, A')

15 Simple case #6b: Logical conjunction (AND operator)
- Calculate the intersection using the AND operator on a vector of traits.
- $n$: the number of power-ups
- data on the device: $P = [p_1, p_2, \dots, p_n]$, where $p_i = 1$ if the $i$-th power-up was not marked as collected and $p_i = 0$ otherwise
- data in the cloud: $P' = [p'_1, p'_2, \dots, p'_n]$
- conflict resolution: bitwise-and(P, P')
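The intersection-based merge from cases #6a and #6b might look as follows. The power-up names and the `achievement_unlocked` helper are illustrative assumptions.

```python
def merge_remaining(local: set, remote: set) -> set:
    """Case #6a: a power-up is still missing only if it is missing everywhere."""
    return local & remote


def merge_remaining_bits(local: int, remote: int) -> int:
    """Case #6b: the same merge on bit vectors (1 = not collected yet)."""
    return local & remote


def achievement_unlocked(remaining: set) -> bool:
    """The achievement is granted once no power-up remains uncollected."""
    return not remaining


# "speed" was collected elsewhere, "magnet" was collected on this device.
remaining = merge_remaining({"speed", "shield"}, {"shield", "magnet"})
remaining_bits = merge_remaining_bits(0b011, 0b110)    # 0b010
```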

16 Mixing simple cases: #2 + #5
- #2: one version of data is clearly better than the other
- #5: merge by union
- E.g. highscores of the 5 best results: highscores from all devices should be taken into account when building the leaderboard (a kind of union), but when the data is merged only the top 5 scores are stored (some data is clearly better than the other).
- $a_i$: the $i$-th score
- data on the device: $A = (a_1, \dots, a_5)$
- data in the cloud: $A' = (a'_1, \dots, a'_5)$
- conflict resolution: truncate(sort(union(A, A')), 5), sorted from the highest score down
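A minimal sketch of truncate(sort(union(A, A')), 5). It assumes each list holds plain integer scores and takes the union on values, so a score that appears on both sides is counted once; if equal scores from different games must both be kept, the entries would need an extra identifier.

```python
def merge_highscores(local: list, remote: list, limit: int = 5) -> list:
    """Union the two score lists, sort descending, keep the top `limit`."""
    combined = set(local) | set(remote)
    return sorted(combined, reverse=True)[:limit]


top5 = merge_highscores([900, 700, 500, 300, 100],
                        [800, 700, 600, 200, 50])
# top5 == [900, 800, 700, 600, 500]
```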

17 Harder cases
- Not all kinds of data fall into one of the aforementioned categories (unfortunately).
- In harder cases we may try to reduce the complexity by decomposing the problem into a set of subproblems, each of which falls into one of the simple scenarios (the all-time-favourite divide and conquer approach).
- If such a decomposition is not possible, a custom conflict resolution strategy needs to be developed.

18 Case study: maintaining the number of collected coins
- The player collects coins during the game.
- The overall coin count should reflect coins gathered on all devices.
- How to store information about coins so that the overall count can be maintained?

19 Approach I: store the total count, a.k.a. an obvious fail
- $c$: coins count
- data on the device: $c$
- data in the cloud: $c'$
- conflict resolution: ?
Sequence of events (table columns: Event, Device A, Device B, Remote server, Actual amount):
- Initial conditions
- Collect 30 coins on device A
- Collect 20 coins on device B
- Device A syncs
- Device B syncs: conflict
- Device B resolves the conflict by adding its coins count to the remote count
- Collect 10 coins on device A
- Device A syncs: conflict
- Device A resolves the conflict by adding its coins count to the remote count: fail!
The problem: there is no way to tell which coins were already taken into account.

20 Approach II: store the total count and the delta
- $d$: delta since last sync
- data on the device: $(c, d)$
- data in the cloud (just the total count): $c'$
- conflict resolution: resolved data for the device: $(c' + d, 0)$; resolved data for the cloud: $c' + d$
Event | Device A | Device B | Remote server | Actual amount
Initial conditions | (0, 0) | (0, 0) | - | 0
Collect 30 coins on device A | (30, 30) | (0, 0) | - | 30
Collect 20 coins on device B | (30, 30) | (20, 20) | - | 50
Device A syncs | (30, 0) | (20, 20) | 30 | 50
Device B syncs: conflict | (30, 0) | (20, 20) | 30 | 50
Device B resolves the conflict | (30, 0) | (50, 0) | 50 | 50
Collect 10 coins on device A | (40, 10) | (50, 0) | 50 | 60
Device A syncs: conflict | (40, 10) | (50, 0) | 50 | 60
Device A resolves the conflict: success! | (60, 0) | (50, 0) | 60 | 60
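The total-plus-delta rule can be replayed in code. This is a sketch under simplifying assumptions: the `Device` and `Server` classes are invented for the example, the network never fails, and a sync applies the resolution rule in a single step.

```python
class Server:
    """Keeps only the overall total."""
    def __init__(self):
        self.total = 0


class Device:
    """Keeps the pair (c, d): known total and delta since the last sync."""
    def __init__(self):
        self.total = 0
        self.delta = 0

    def collect(self, coins: int) -> None:
        self.total += coins
        self.delta += coins

    def sync(self, server: Server) -> None:
        # Resolution rule: cloud becomes c' + d, device becomes (c' + d, 0).
        server.total += self.delta
        self.total = server.total
        self.delta = 0


server, a, b = Server(), Device(), Device()
a.collect(30)
b.collect(20)
a.sync(server)    # server: 30
b.sync(server)    # server: 50, device B: (50, 0)
a.collect(10)
a.sync(server)    # server: 60, device A: (60, 0)
```

The sketch works only because every `sync` call completes; it has no way to model a lost response.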

21 Approach II: counterexample
Event | Device A | Device B | Remote server | Actual amount
Initial conditions | (30, 0) | (50, 0) | 50 | 50
Collect 10 coins on device A | (40, 10) | (50, 0) | 50 | 60
Device A syncs: conflict | (40, 10) | (50, 0) | 50 | 60
Device A resolves the conflict (yay!), but communication is disrupted when device A sends the data to the remote server | (40, 10) or (60, 0)? | (50, 0) | ??? | 60
If device A does not receive a response from the server, it is not possible to determine what happened:
- the server might not have received the data: the remote count remains 50 and we have to apply the delta during the next synchronization attempt
- or the server has received the data and sent a response, but the response did not reach the device: the remote count is 60 and the local delta should be set to zero, as it was already applied on the server

22 A naive and faulty solution
No matter what happened, we can store the correct total on the device and simply update the server during the next synchronization attempt, right? Wrong...
Event | Device A | Device B | Remote server | Actual amount
Initial conditions | (30, 0) | (50, 0) | 50 | 50
Collect 10 coins on device A | (40, 10) | (50, 0) | 50 | 60
Device A syncs: conflict | (40, 10) | (50, 0) | 50 | 60
Device A resolves the conflict (yay!), but communication is disrupted when device A sends the data to the remote server [1] | (60, 0) | (50, 0) | 50 | 60
Collect 25 coins on device B | (60, 0) | (75, 25) | 50 | 85
Device B syncs: conflict | (60, 0) | (75, 25) | 50 | 85
Device B resolves the conflict | (60, 0) | (75, 0) | 75 | 85
Device A tries to update the server: fail | (60, 0) | (75, 0) | 75 | 85
The problem: there is no way to tell whether the delta was already applied or not.
[1] Let's assume that the data did not reach the server at this point.

23 Identifying deltas
To reliably determine which deltas were applied we need to identify them using GUIDs:
- $g_i$: GUID
- data on the device: $(c, ((g_1, d_1), \dots, (g_m, d_m), \dots, (g_n, d_n)))$, $1 \le m \le n$, with $c = \sum_{i=1}^{n} d_i$
- data in the cloud: $(c', ((g_1, d_1), \dots, (g_m, d_m), (g_{n+1}, d_{n+1}), \dots, (g_{n+k}, d_{n+k})))$, with $c' = \sum_{i=1}^{m} d_i + \sum_{i=n+1}^{n+k} d_i$
- conflict resolution: $\left( \sum_{i=1}^{n+k} d_i, \; ((g_1, d_1), \dots, (g_{n+k}, d_{n+k})) \right)$
This is a merge by union!
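A minimal sketch of the merge by union over GUID-identified deltas. It assumes each side stores its deltas as a dictionary from GUID string to amount; `new_delta` and `merge_deltas` are hypothetical helper names, and `uuid.uuid4` from the Python standard library stands in for whatever GUID generator the app uses.

```python
import uuid


def new_delta(amount: int) -> tuple:
    """Create a delta tagged with a fresh GUID."""
    return (str(uuid.uuid4()), amount)


def merge_deltas(local: dict, remote: dict) -> tuple:
    """Union the two delta sets by GUID and recompute the total."""
    merged = {**remote, **local}   # the same GUID carries the same delta
    return sum(merged.values()), merged


# g1 is known to both sides, g2 only locally, g3 only in the cloud.
device_deltas = {"g1": 30, "g2": 10}
cloud_deltas = {"g1": 30, "g3": 20}
total, merged = merge_deltas(device_deltas, cloud_deltas)   # total == 60
```

Because the union ignores GUIDs it has already seen, running the merge again with the same inputs does not change the total.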

25 GUID-deltas and disrupted communication
If the communication is disrupted during the synchronization phase, there is no possibility of data misrepresentation:
- if the resolved data does not reach the server, the missing deltas will be applied during the next synchronization attempt (during the next merge by union the missing elements will be included in the resolved set by the union operator)
- if the changes reach the server, a merge by union will be performed once more during the next synchronization attempt, and elements already present in the server-side set of deltas will be retained in the resolved set; deltas will not be duplicated, as they are identified by GUIDs

26 Additional concerns
The conflict resolution strategy based on GUID-deltas works as expected, but:
- the more changes are made to the data, the bigger the set of deltas grows
- the bigger the set of deltas is...
  - ...the more costly the union operation becomes
  - ...the more space is required to store the data
  - ...the more data needs to be transmitted between the server and the mobile device
Possible optimizations, or how to make use of the server-side app:
- resolve conflicts on the server and send only the calculated totals to devices: a device no longer has to keep track of deltas created on other devices
- only send deltas that were created since the last synchronization
- remove deltas that were already taken into account in the total count

27 Managing deltas included in the total count
On the mobile device:
- when a device sends a set of deltas to the server and receives a response containing the current total amount of coins, the deltas may be considered registered by the server and included in the total count maintained on the server; registered deltas can be removed from local storage
- if the device does not receive a response, the set of deltas might not have been registered on the server; the deltas should be kept in local storage and resubmitted during the next synchronization attempt

28 Managing deltas included in the total count
What about the server-side app?
- the server-side app only needs to keep track of deltas that the devices might not yet consider registered
- the server-side app does not know whether the response to the last synchronization attempt reached the device or not; if the response does not reach the device, the deltas will be considered potentially unregistered
- the server-side app needs to keep track of the deltas submitted during the last synchronization attempt from a particular device
- deltas submitted during previous attempts and not resubmitted during the last attempt are considered by the device as already registered: the server-side app does not need to keep track of those any more
- when a synchronization request is received from a particular device, the server-side app can remove all deltas created on that device that are not present in the current request; the deltas from the current request have to be stored until the next sync attempt

29 Summary of data kept on devices and on the server
Mobile device storage:
- the current total fetched from the server
- a set of deltas created since the last successful sync
Server-side storage:
- the current total reflecting deltas from all devices
- for each device: the set of deltas included in the last synchronization request from the given device
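The storage layout summarized above can be turned into a small simulation. This is a sketch under assumptions, not the lecture's code: `Server`, `Device`, `pending` and the `response_lost` flag (which imitates a response that never reaches the device) are all invented for illustration.

```python
import uuid


class Server:
    def __init__(self):
        self.total = 0
        self.last_request = {}   # device id -> {guid: amount}

    def sync(self, device_id: str, deltas: dict) -> int:
        seen = self.last_request.get(device_id, {})
        for guid, amount in deltas.items():
            if guid not in seen:          # apply each delta only once
                self.total += amount
        # Deltas missing from this request were acknowledged by the device,
        # so only the current request has to be remembered.
        self.last_request[device_id] = dict(deltas)
        return self.total


class Device:
    def __init__(self, device_id: str):
        self.device_id = device_id
        self.total = 0
        self.pending = {}        # deltas created since the last successful sync

    def collect(self, amount: int) -> None:
        self.pending[str(uuid.uuid4())] = amount

    def sync(self, server: Server, response_lost: bool = False) -> None:
        sent = dict(self.pending)
        total = server.sync(self.device_id, sent)
        if response_lost:
            return               # keep the deltas and resubmit them next time
        self.total = total
        for guid in sent:
            del self.pending[guid]


server, a = Server(), Device("A")
a.collect(30)
a.sync(server, response_lost=True)   # registered, but device A cannot know
a.collect(10)
a.sync(server)                       # resubmits 30 (ignored) and adds 10
```

After the second sync the server total is 40 rather than 70, because the resubmitted delta is recognized by its GUID.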

30 Et voilà: Synchronization strategy is now complete!

31 But isn't it a bit too complicated for such a simple scenario as coin counting?

32 What is the problem?
- on the server side we need to maintain a set of recent deltas for each and every device, because we don't know whether a particular delta is considered by the device as registered
- in the mobile app we need to handle communication errors, because they might mean that a set of deltas was not registered
- and we need GUIDs all over the place to fulfill the aforementioned requirements
It seems that the whole problem is about determining whether a particular delta was taken into account. Imagine there is a way to know it for sure...

33 Towards Conflict Resolution Strategy 2.0
- each device is only concerned about its own deltas
- the server-side app incorporates deltas supplied by different devices into the overall total
- only the overall total is propagated to other devices; individual deltas remain on the server
- each device only needs to know if its own deltas were registered; the server-side app takes care of deltas from other devices
The total coin count calculated on the server is a sum of all deltas:
- $G$: the set of all GUIDs
- $D$: the set of all deltas
- $f : G \to D$ maps GUIDs to deltas
- $c$: the overall total
$c = \sum_{g \in G} f(g) = f(g_1) + f(g_2) + \dots + f(g_{|G|})$

34 Let's play some maths
Changing the order of the operands of a sum does not change the final result:
- $M = \{m_1, \dots, m_n\}$: the set of mobile devices, $|M| = n$
- $G_i$: the set of GUIDs generated on the device $m_i$
- $G = G_1 \cup G_2 \cup \dots \cup G_n$, $\forall i \neq j : G_i \cap G_j = \emptyset$
$c = \sum_{g \in G} f(g) = \underbrace{\sum_{g \in G_1} f(g)}_{\text{subtotal for } m_1} + \underbrace{\sum_{g \in G_2} f(g)}_{\text{subtotal for } m_2} + \dots + \underbrace{\sum_{g \in G_n} f(g)}_{\text{subtotal for } m_n}$

35 Conflict resolution strategy 2.0
- the server-side app needs deltas from all devices to calculate the subtotals for each individual device and incorporate them in the overall total (which is then propagated to other devices)
- if the server-side app only needs the subtotals, we don't really need to send individual deltas: we can just send the subtotal calculated on the device instead!
- when we send a subtotal, we know that all deltas from a given device are taken into account in that subtotal
- the problem of knowing which deltas were registered is no more, and we don't need individual deltas on the device any more
- to calculate the subtotal for a given device we only need a single counter that is updated on the device when operations are performed
- to synchronize the data we only send the current state of the local subtotal counter instead of a set of deltas

36 Conflict resolution strategy 2.0
The overall total coin count is composed of subtotals:
- the subtotal for a given device is device-specific information: no other device nor the server-side app can influence this value, so it can be synchronized using the "local always wins" strategy
- subtotals for other devices are the kind of data that the given device does not have ownership of (and cannot change): they can be synchronized using the "remote always wins" strategy
We've decomposed the data into parts that can be synchronized using two simple strategies.

37 Conflict resolution strategy 2.0
The approach described above yields an extremely simple conflict resolution strategy:
- $M = \{m_1, \dots, m_n\}$: the set of mobile devices, $|M| = n$
- $c_i$: subtotal for device $m_i$
- data on the device $k$: $((m_1, c_1), \dots, (m_k, c_k), \dots, (m_n, c_n))$, one bucket for each device
- data in the cloud: $((m_1, c'_1), \dots, (m_k, c'_k), \dots, (m_n, c'_n))$
- conflict resolution on device $k$: $((m_1, c'_1), \dots, (m_k, c_k), \dots, (m_n, c'_n))$: remote wins for every bucket except $(m_k, c_k)$, where local wins

38 Conflict resolution strategy 2.0: verification
Event | Device A | Device B | Remote server | Actual amount
Initial conditions | (A:0, B:0) | (A:0, B:0) | - | 0
Collect 30 coins on device A | (A:30, B:0) | (A:0, B:0) | - | 30
Collect 20 coins on device B | (A:30, B:0) | (A:0, B:20) | - | 50
Device A syncs | (A:30, B:0) | (A:0, B:20) | (A:30, B:0) | 50
Device B syncs: conflict | (A:30, B:0) | (A:0, B:20) | (A:30, B:0) | 50
Device B resolves conflict | (A:30, B:0) | (A:30, B:20) | (A:30, B:20) | 50
Collect 10 coins on device A | (A:40, B:0) | (A:30, B:20) | (A:30, B:20) | 60
Device A syncs: conflict | (A:40, B:0) | (A:30, B:20) | (A:30, B:20) | 60
Device A resolves conflict, communication disrupted | (A:40, B:20) | (A:30, B:20) | (A:30, B:20) or (A:40, B:20)? DOESN'T MATTER! | 60

39 Conflict resolution strategy 2.0: making better use of the server
- no need to store all subtotals on each and every device: store only the overall total and the subtotal for the given device
- on the server, store subtotals from all devices
- when an updated subtotal arrives, respond with the updated overall total
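The server-assisted variant can be sketched as follows. The class names and the `response_lost` flag are assumptions made for the example; the server keeps one subtotal bucket per device and each device keeps only its own subtotal plus the last overall total it has seen.

```python
class Server:
    def __init__(self):
        self.subtotals = {}      # device id -> subtotal reported by that device

    def sync(self, device_id: str, subtotal: int) -> int:
        # "Local always wins": the device is the only writer of its bucket.
        self.subtotals[device_id] = subtotal
        return sum(self.subtotals.values())


class Device:
    def __init__(self, device_id: str):
        self.device_id = device_id
        self.subtotal = 0        # coins collected on this device
        self.total = 0           # overall total as last seen by this device

    def collect(self, coins: int) -> None:
        self.subtotal += coins
        self.total += coins

    def sync(self, server: Server, response_lost: bool = False) -> None:
        total = server.sync(self.device_id, self.subtotal)
        if not response_lost:
            # "Remote always wins" for everything this device does not own.
            self.total = total


server, a, b = Server(), Device("A"), Device("B")
a.collect(30)
b.collect(20)
a.sync(server)
b.sync(server)                        # device B now sees 50
a.collect(10)
a.sync(server, response_lost=True)    # the server has A:40 either way
a.sync(server)                        # retrying is harmless
```

Sending a subtotal is idempotent: repeating the request overwrites the bucket with the same value, which is why a lost response needs no special handling.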

40 Conflict resolution strategy 2.0: the goodness
- no need to keep track of individual deltas
- no need to use GUIDs for deltas
- no need to keep track of which deltas were applied
- no need to bother about storage size and removing applied deltas
- no need to bother about transferred data size and sending only the required deltas
- no need to handle communication errors

41 When the strategies mentioned above are not enough, things get rather ugly...

42 Partial ordering of operations
- there are many scenarios where the order of operations and the causal relations between them influence the final result
- it may seem at first that those cases can be handled by adding timestamps to operations

43 An app for a delivery company
Let's consider an app for a delivery company:
- information about shipments is stored in a centralized database
- when the sender fills in a consignment note on the website of the shipping company, a new shipment is added to the database
- information about a shipment can be edited by:
  - facilities/delivery hubs that handle the shipment
  - the delivery man with a mobile device who picks up the shipment from the sender
  - the delivery man with another mobile device who brings the shipment to the recipient
  - customer service/hotline employees who can be called by the recipient in case of issues

44 An ambiguous scenario
Delivery status information for shipment #:
Event | Timestamp
Sender fills a consignment note | 1
Shipment picked up from sender | 5
Shipment arrived at source facility A | 10
...shipment travels between delivery hubs... |
Shipment arrived at destination facility B | 50
Shipment released for delivery to the recipient |
Delivery man syncs information about the shipment to his mobile device | 70
Recipient calls the hotline and asks for a delivery to a different address |
Hotline employee modifies the delivery address | 80
Delivery man sets the status of the shipment to: DELIVERED | 100
Any problem with this scenario?

46 An ambiguous scenario
Where was the shipment delivered?
- if the information about the new delivery address was not synchronized to the mobile device, the shipment was delivered to the original address
- if it was synchronized, the shipment was delivered to the new address
- timestamps only indicate that the shipment delivery event happened after the address modification event
- we don't know whether the data was synchronized or not
- we don't know if the last event took the previous one into account

47 Insufficiency of timestamps
- timestamps can be used to determine the precedence of events, but they do not provide information about causal relations, e.g.:
  - is one event a cause of another event?
  - is one event an effect of another event?
  - are two events independent?
  - were all previous events taken into account when the new event was created?
- to determine causal relations we need more information, e.g. logical clocks:
  - Lamport timestamps
  - Vector clocks
  - Matrix clocks

48 The ancestor
- We need to determine which version of the data was an ancestor for the change made on the mobile device.
- A change meaningful for other devices happens when a given device synchronizes its state with the server.
- Internal operations are not meaningful for other devices (why?): think about a centralized VCS.
- We can mark a version of the data with the number of synchronization events (commits) that led to that particular version.

49 Lamport timestamp
A Lamport timestamp is a logical clock for determining the ordering of events that involve interacting components:
- each component holds a local clock
- each message carries the clock of the sender
- when a message arrives and the receiver's clock is lower than the clock on the received message, the receiver's clock is advanced to the message's clock + 1; otherwise nothing is done
Synchronization events can be ordered using the happens-before relation:
- $a$, $b$: synchronization events
- $L(x)$: Lamport timestamp for event $x$
- $a \to b$: event $a$ happens before event $b$
If $a \to b$ then $L(a) < L(b)$. It's an implication, not an equivalence!
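A minimal sketch of such a logical clock. Note the assumption: it uses the common textbook formulation in which every local event ticks the clock and a receive sets it to max(local, received) + 1, so the clock also advances when both values are equal. The class and method names are illustrative.

```python
class LamportClock:
    def __init__(self):
        self.time = 0

    def tick(self) -> int:
        """Record a local event."""
        self.time += 1
        return self.time

    def send(self) -> int:
        """Sending is an event; the returned value travels with the message."""
        return self.tick()

    def receive(self, message_time: int) -> int:
        """Jump past both the local clock and the sender's clock."""
        self.time = max(self.time, message_time) + 1
        return self.time


a, b = LamportClock(), LamportClock()
t_send = a.send()             # a: 1
t_recv = b.receive(t_send)    # b: 2, so L(send) < L(receive)
b.tick()                      # b: 3
t_back = a.receive(b.send())  # b sends at 4, a jumps to 5
```

The guarantee runs in one direction only: a smaller timestamp does not prove that one event happened before the other.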

51 Components of the delivery tracking system
Components of the system correspond to groups of entities that can modify information about a shipment:
- s: the sender who fills in the original consignment note
- h: facilities/delivery hubs that handle the shipment
- p: the delivery man with a mobile device who picks up the shipment from the sender
- d: the delivery man with another mobile device who brings the shipment to the recipient
- e: customer service/hotline employees who can be called by the recipient in case of issues

52 Determining causal relations with Lamport timestamps
Let's assume that the new address was not synchronized to the mobile device before the delivery. Each operation is assumed to be synchronized immediately unless stated otherwise.
Table columns: Event, clocks of components h, p, d, e, and X (the version on the server).
Events:
- Sender fills a consignment note
- Shipment picked up from sender
- Shipment arrived at source facility A
- ...shipment travels between delivery hubs...
- Shipment arrived at destination facility B
- Shipment released for delivery to the recipient
- Delivery man syncs data from his device
- Recipient calls the hotline and changes the delivery address
- Hotline employee modifies the delivery address
- Delivery man sets the status of the shipment to: DELIVERED (does not synchronize immediately): 8 2 9*
- Mobile device of the delivery man tries to sync: 8 2 9*

53 Lamport timestamps
- we know that someone else interacted with the data in the meantime
- we still don't know who it was and what they did...
- ...but in this case what we already know may be enough

54 Vector clocks
A vector clock for a system that consists of n components (nodes, devices, processes) is an n-element vector of logical clocks, one clock per component. Each component maintains a local copy of the vector clock and updates it as follows:
- initially all logical clocks are set to zero
- for each internal operation, a component increments its own logical clock by one
- when a component sends a message, it attaches its vector clock to the payload data
- the receiving component increments its own logical clock by one and updates the other logical clocks by taking, for each i, the maximum of the i-th value in its local vector clock and the i-th value in the vector clock attached to the message
$V(x)$: vector clock for event $x$
$a \to b \iff V(a) < V(b)$
Now, that's an equivalence!
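The update rules can be sketched with dictionaries keyed by component name. This is an illustrative sketch: `local_event` and `receive` are made-up names, missing entries are treated as zero, and the functions return new dictionaries instead of mutating their inputs.

```python
def local_event(clock: dict, me: str) -> dict:
    """Internal operation: increment only the component's own entry."""
    updated = dict(clock)
    updated[me] = updated.get(me, 0) + 1
    return updated


def receive(clock: dict, me: str, incoming: dict) -> dict:
    """Merge an incoming vector clock into the local one."""
    merged = {
        key: max(clock.get(key, 0), incoming.get(key, 0))
        for key in clock.keys() | incoming.keys()
    }
    # The receiver's own entry advances by one relative to its local value.
    merged[me] = clock.get(me, 0) + 1
    return merged


# Delivery man's device (component d) pulls the hotline's change, then
# records the delivery as an internal operation.
device = {"s": 1, "h": 7, "p": 2, "d": 2, "e": 0}
server = {"s": 1, "h": 7, "p": 2, "d": 2, "e": 2}
after_sync = receive(device, "d", server)
after_delivery = local_event(after_sync, "d")
```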

56 Determining causal relations with vector clock
Let's assume that the new address was not synchronized to the mobile device before the delivery.
Events:
- Sender fills a consignment note
- Shipment picked up from sender
- Shipment arrived at source facility A
- ...shipment travels between delivery hubs...
- Shipment arrived at destination facility B
- Shipment released for delivery to the recipient
- Delivery man syncs information about the shipment to his device
- Recipient calls the hotline and changes the delivery address
- Hotline employee modifies the delivery address
- Delivery man sets the status of the shipment to: DELIVERED
- Mobile device of the delivery man tries to sync: conflict! {s:1,h:7,p:2,d:2,e:2} vs {s:1,h:7,p:2,d:3,e:0}; now we know that the new address was not synchronized
Vector clocks, in order of appearance (each component maintains its own local copy of the vector clock, but for the sake of clarity only the vector of the component that generates the event is shown):
{s:1,h:0,p:0,d:0,e:0}
{s:1,h:0,p:2,d:0,e:0}
{s:1,h:2,p:2,d:0,e:0}
{s:1,h:7,p:2,d:0,e:0}
{s:1,h:7,p:2,d:2,e:0}
{s:1,h:7,p:2,d:2,e:2}
{s:1,h:7,p:2,d:3,e:0}
{s:1,h:7,p:2,d:3,e:0}

57 Descendant vector clocks
In order for vector clock B to be considered a descendant of vector clock A, each logical clock in vector clock A must be lower than or equal to the corresponding logical clock in B.
A: {s:1,h:7,p:2,d:2,e:2}
B: {s:1,h:7,p:2,d:3,e:0}
B is not a descendant of A. The value e:0 in vector clock B indicates that the change made by the hotline employee was not synced to the mobile device before the shipment was delivered.
But how do we resolve this conflict?
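The descendant test is a component-wise comparison. A small sketch follows, with hypothetical function names and with missing components treated as zero.

```python
def descends(b: dict, a: dict) -> bool:
    """True if b is a descendant of a: every clock in a is <= its peer in b."""
    return all(a.get(key, 0) <= b.get(key, 0) for key in a.keys() | b.keys())


def in_conflict(a: dict, b: dict) -> bool:
    """Neither clock descends from the other: the changes are concurrent."""
    return not descends(a, b) and not descends(b, a)


server = {"s": 1, "h": 7, "p": 2, "d": 2, "e": 2}
stale_device = {"s": 1, "h": 7, "p": 2, "d": 3, "e": 0}
fresh_device = {"s": 1, "h": 7, "p": 2, "d": 4, "e": 2}

conflict = in_conflict(server, stale_device)   # True: e:0 < e:2 and d:2 < d:3
clean = descends(fresh_device, server)         # True: the address was synced
```

Detecting the conflict is all this check does; deciding what to do with the two versions still requires an application-specific rule.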

58 Determining causal relations with vector clock
If the new delivery address was synchronized to the mobile device early enough, the scenario would look as follows.
Events:
- Sender fills a consignment note
- Shipment picked up from sender
- Shipment arrived at source facility A
- ...shipment travels between delivery hubs...
- Shipment arrived at destination facility B
- Shipment released for delivery to the recipient
- Delivery man syncs information about the shipment to his device
- Recipient calls the hotline and changes the delivery address
- Hotline employee modifies the delivery address
- Delivery man sets the status of the shipment to: DELIVERED
{s:1,h:7,p:2,d:4,e:2} is a descendant of {s:1,h:7,p:2,d:2,e:2}. And so the shipment was delivered to the requested address :)
Vector clocks, in order of appearance (each component maintains its own local copy of the vector clock, but for the sake of clarity only the vector of the component that generates the event is shown):
{s:1,h:0,p:0,d:0,e:0}
{s:1,h:0,p:2,d:0,e:0}
{s:1,h:2,p:2,d:0,e:0}
{s:1,h:7,p:2,d:0,e:0}
{s:1,h:7,p:2,d:2,e:0}
{s:1,h:7,p:2,d:2,e:2}
{s:1,h:7,p:2,d:4,e:2}

59 Summary

Lamport timestamps:
what I know about myself (about events I'm involved in)

Vector clocks:
what I know about myself
what I know about others

Matrix clocks:
what I know about myself
what I know about others
what I know about what others know

60 Questions

Any questions?

61 Thanks

Thank you for your attention!


More information

Ananta: Cloud Scale Load Balancing. Nitish Paradkar, Zaina Hamid. EECS 589 Paper Review

Ananta: Cloud Scale Load Balancing. Nitish Paradkar, Zaina Hamid. EECS 589 Paper Review Ananta: Cloud Scale Load Balancing Nitish Paradkar, Zaina Hamid EECS 589 Paper Review 1 Full Reference Patel, P. et al., " Ananta: Cloud Scale Load Balancing," Proc. of ACM SIGCOMM '13, 43(4):207-218,

More information

GOSSIP ARCHITECTURE. Gary Berg css434

GOSSIP ARCHITECTURE. Gary Berg css434 GOSSIP ARCHITECTURE Gary Berg css434 WE WILL SEE Architecture overview Consistency models How it works Availability and Recovery Performance and Scalability PRELIMINARIES Why replication? Fault tolerance

More information

Strong Consistency & CAP Theorem

Strong Consistency & CAP Theorem Strong Consistency & CAP Theorem CS 240: Computing Systems and Concurrency Lecture 15 Marco Canini Credits: Michael Freedman and Kyle Jamieson developed much of the original material. Consistency models

More information

Process Synchroniztion Mutual Exclusion & Election Algorithms

Process Synchroniztion Mutual Exclusion & Election Algorithms Process Synchroniztion Mutual Exclusion & Election Algorithms Paul Krzyzanowski Rutgers University November 2, 2017 1 Introduction Process synchronization is the set of techniques that are used to coordinate

More information

Computing Parable. The Archery Teacher. Courtesy: S. Keshav, U. Waterloo. Computer Science. Lecture 16, page 1

Computing Parable. The Archery Teacher. Courtesy: S. Keshav, U. Waterloo. Computer Science. Lecture 16, page 1 Computing Parable The Archery Teacher Courtesy: S. Keshav, U. Waterloo Lecture 16, page 1 Consistency and Replication Today: Consistency models Data-centric consistency models Client-centric consistency

More information

From eventual to strong consistency. Primary-Backup Replication. Primary-Backup Replication. Replication State Machines via Primary-Backup

From eventual to strong consistency. Primary-Backup Replication. Primary-Backup Replication. Replication State Machines via Primary-Backup From eventual to strong consistency Replication s via - Eventual consistency Multi-master: Any node can accept operation Asynchronously, nodes synchronize state COS 418: Distributed Systems Lecture 10

More information

Paxos and Replication. Dan Ports, CSEP 552

Paxos and Replication. Dan Ports, CSEP 552 Paxos and Replication Dan Ports, CSEP 552 Today: achieving consensus with Paxos and how to use this to build a replicated system Last week Scaling a web service using front-end caching but what about the

More information