Replication Data replication, consistency models & protocols C. L. Roncancio - S. Drapeau Grenoble INP Ensimag / LIG - Obeo 1 Data and process Focus on data: several physical of one logical object What for? Performances, availability, reliability Scalability Application requirements @Drapeau & Roncancio 2 Execution Model There are physical of logical objects in the system. Operations are specified on logical objects, but translated to operate on physical objects. One-copy equivalence The effect of transactions performed by clients on replicated objects should be the same as if they had been performed on a single set of objects. Write(x) x Logical data item Write(x 1 ) Write(x 2 ) Write(x n ) Difficulties Transparency of replication Global coherency Mutual consistency among Transactional coherency Mutual consistency Cost Impact on the availability M. T. Özsu & P. Valduriez x 1 x 2 x n Physical data item (replicas, ) @Drapeau & Roncancio 4 Widely used! Distributed systems! Parallelism: fragmentation (sharding) & replication! Contexts with few conflicting updates! Data warehouses, read only systems.. Mutual consistency, what for? Weak High No Many protocols with different " performances " fault tolerance guaranties " transparency " Programming difficulties " Consistency Performance Availability/reliability Strong Medium Full @Drapeau & Roncancio 5 Drapeau & Roncancio, notes de cours 6 1
Several aspects Application Notions from distributed systems Consistency model Consistency perceived by the programmers/users Replication Replication Code Code non non fonctionnel fonctionnel Concurrency Duplication control Gestion Duplication de la concurrence Gestion Tolérance de la aux concurrence fautes Tolérance Etc. aux fautes Other aspects Etc. Fault tolerance @Drapeau & Roncancio 7 Consistency protocol How the system insures de consistency model Degree of synchronization of the Update propagation, invalidation @Drapeau & Roncancio 8 Types of replication Symmetric replication Symmetric Updates on any copy Asymmetric Updates on some selected Other are read-only Snapshots Materialized views Master1 u3(i) Master2 u3(i) Master3 u3(i) @Drapeau & Roncancio 9 @Drapeau & Roncancio 10 Asymmetric replication (1) Asymmetric replication (2) slave2 Slave1 master1 slave3 Master Slave2 master2 slave1 slave3 @Drapeau & Roncancio 11 temps @Drapeau & Roncancio 12 2
Update propagation (1) Update propagation (2) Eager Update of the in a single transaction Synchronous Lazy Update of the in separate transactions Asynchronous Eager Symmetric master1 master2 Asymmetric master slave Lazy master1 master2 master slave @Drapeau & Roncancio 13 @Drapeau & Roncancio 14 One-copy serializability (1SR) The execution of the transactions is equivalent to a sequential execution on non replicated data Eager and lazy strategies @Drapeau & Roncancio 15 Mutual consistency Strong consistency When reading any copy the user gets a value including all the preceding updates Weak consistency Copies may not have the last value but all updates will be eventually propagated to all the Eventual consistency: will converge to a single value in a finite time @Drapeau & Roncancio 16 Update propagation strategies Eager Full synchronization Synchronization of available Quorum Lazy Master / slaves Master / secondary Lazy asymmetric (independent) Full synchronization All the are updated in a synchronous way Atomic update of the 2PC, Concurrency control requirements @Drapeau & Roncancio 17 @Drapeau & Roncancio 18 3
Synchronization of available All the available are updated in a synchronous way Asynchronous update of the unavailable Invariant: available are up-to-date Use of 2PC Some are not up-to-date Quorum Synchronous update of a quorum of Other are updated in asynchronously QR + QW > N QW > N/2 Impatiente Invariant: a quorum of is up-to-date @Drapeau & Roncancio 19 @Drapeau & Roncancio 20 Master/slaves Updates on the master copy Asynchronous update of the slaves Invariant: master copy up-to-date Master/secondary Updates on the master copy Synchronous update of a secondary copy (fault tolerance of the master) Asynchronous update of the other slaves Static / dynamic @Drapeau & Roncancio 21 @Drapeau & Roncancio 22 Lazy asymmetric (independent) Updates everywhere Divergence of the Reconciliation algorithms Invariant no warranty on the value of the Applications in large scale and mobile systems Paresseuse Update detection Log based Analyzing the system logs Event based General approach Part of the transaction @Drapeau & Roncancio 23 @Drapeau & Roncancio 24 4
Update propagation Approaches Value: total, delta Transaction replication Initiative Push / Pull Replication & adaptation Applications Non functional context transactional, shared memory, cache, cloud Replication protocols Partial, full Static, dynamic @Drapeau & Roncancio 25 @Drapeau & Roncancio 26 Examples adapting consistency The price of the items (strong consistency) Data about the products sold (eventual with constraints) Credit card information (strong consistency) Data of customer profiles (eventual with constraints) Records on user s preferences (weak consistency) Logging information (weak consistency) Examples adapting consistency Time Policy Numeric Policies Demarcation policy Pay only when it matters @Drapeau & Roncancio 27 @Drapeau & Roncancio 28 CAP theorem CAP theorem (2) Brewer s conjencture, Symposium on Principles of Distributed Computing, PODC, 2000 Proved by Seth Gilbert & Nancy Lynch, 2002 Consistency: all nodes see the same data at the same time Availability: a guarantee that every request receives a response about whether it was successful or failed Partition Tolerance: the system continues to operate despite arbitrary message loss or failure of part of the system Distributed systems cannot satisfy all three of these guarantees at the same time. @Drapeau & Roncancio 29 @Drapeau & Roncancio 30 5
BASE Transactions Basically Available, Soft state, Eventual consistency Basically Available : response to any request but, that response could still be failure to obtain the requested data or the data may be in an inconsistent Soft state: the state of the system could change over time, so even during times without input there may be changes going on due to eventual consistency Eventual consistency: the system will eventually become consistent once it stops receiving input Eventual consistency If no new updates are made to the object, eventually all access will return the last updated value Read your writes Monotonic read & Monotonic write @Drapeau & Roncancio 32 Conclusion on replication Important issue Performances, availability, fault tolerance Transparency Main aspects of replication What, when, where Consistency model @Drapeau & Roncancio 34 6