Distributed Systems (5DV147)

Similar documents
Module 7 - Replication

Chapter 7 Consistency And Replication

Replication in Distributed Systems

Important Lessons. A Distributed Algorithm (2) Today's Lecture - Replication

殷亚凤. Consistency and Replication. Distributed Systems [7]

DISTRIBUTED SYSTEMS Principles and Paradigms Second Edition ANDREW S. TANENBAUM MAARTEN VAN STEEN. Chapter 7 Consistency And Replication

DISTRIBUTED COMPUTER SYSTEMS

CSE 5306 Distributed Systems. Consistency and Replication

Consistency & Replication

Replication and Consistency

Chapter 4: Distributed Systems: Replication and Consistency. Fall 2013 Jussi Kangasharju

Consistency and Replication

Replication and Consistency. Fall 2010 Jussi Kangasharju

Module 7 File Systems & Replication CS755! 7-1!

Lecture 6 Consistency and Replication

Consistency and Replication (part b)

Basic vs. Reliable Multicast

CSE 5306 Distributed Systems

Consistency and Replication

Consistency and Replication 1/62

Consistency and Replication

Distributed Systems. replication Johan Montelius ID2201. Distributed Systems ID2201

Consistency and Replication. Some slides are from Prof. Jalal Y. Kawash at Univ. of Calgary

Distributed Systems: Consistency and Replication

Consistency and Replication. Why replicate?

Replication. Consistency models. Replica placement Distribution protocols

Consistency and Replication 1/65

Computing Parable. The Archery Teacher. Courtesy: S. Keshav, U. Waterloo. Computer Science. Lecture 16, page 1

MYE017 Distributed Systems. Kostas Magoutis

Consistency and Replication. Why replicate?

Distributed Systems (5DV020) Group communication. Fall Group communication. Fall Indirect communication

Distributed Systems Principles and Paradigms. Chapter 07: Consistency & Replication

CS555: Distributed Systems [Fall 2017] Dept. Of Computer Science, Colorado State University

Distributed Systems. Lec 12: Consistency Models Sequential, Causal, and Eventual Consistency. Slide acks: Jinyang Li

Distributed Systems. Aleardo Manacero Jr.

Memory Consistency. Minsoo Ryu. Department of Computer Science and Engineering. Hanyang University. Real-Time Computing and Communications Lab.

Replication & Consistency Part II. CS403/534 Distributed Systems Erkay Savas Sabanci University

Replication of Data. Data-Centric Consistency Models. Reliability vs. Availability

Linearizability CMPT 401. Sequential Consistency. Passive Replication

11. Replication. Motivation

CS 455: INTRODUCTION TO DISTRIBUTED SYSTEMS [DISTRIBUTED MUTUAL EXCLUSION] Frequently asked questions from the previous class survey

Consistency & Replication in Distributed Systems

Distributed File Systems. CS432: Distributed Systems Spring 2017

CS455: Introduction to Distributed Systems [Spring 2018] Dept. Of Computer Science, Colorado State University

Today CSCI Data Replication. Example: Distributed Shared Memory. Data Replication. Data Consistency. Instructor: Abhishek Chandra

Important Lessons. Today's Lecture. Two Views of Distributed Systems

Distributed Systems. 09. State Machine Replication & Virtual Synchrony. Paul Krzyzanowski. Rutgers University. Fall Paul Krzyzanowski

Control. CS432: Distributed Systems Spring 2017

Replication architecture

Replica Placement. Replica Placement

CS Amazon Dynamo

CS555: Distributed Systems [Fall 2017] Dept. Of Computer Science, Colorado State University

Last Class: Web Caching. Today: More on Consistency

Distributed Information Processing

CS555: Distributed Systems [Fall 2017] Dept. Of Computer Science, Colorado State University

Distributed Information Processing

CS555: Distributed Systems [Fall 2017] Dept. Of Computer Science, Colorado State University

Eventual Consistency. Eventual Consistency

Distributed Systems. Catch-up Lecture: Consistency Model Implementations

Module 8 Fault Tolerance CS655! 8-1!

Middleware and Interprocess Communication

Distributed systems. Lecture 6: distributed transactions, elections, consensus and replication. Malte Schwarzkopf

Today CSCI Consistency Protocols. Consistency Protocols. Data Consistency. Instructor: Abhishek Chandra. Implementation of a consistency model

Distributed Systems. Characteristics of Distributed Systems. Lecture Notes 1 Basic Concepts. Operating Systems. Anand Tripathi

Distributed Systems. Characteristics of Distributed Systems. Characteristics of Distributed Systems. Goals in Distributed System Designs

June Gerd Liefländer System Architecture Group Universität Karlsruhe, System Architecture Group

CS455: Introduction to Distributed Systems [Spring 2018] Dept. Of Computer Science, Colorado State University

Replication Brian Nielsen

Replications and Consensus

DISTRIBUTED SYSTEMS. Second Edition. Andrew S. Tanenbaum Maarten Van Steen. Vrije Universiteit Amsterdam, 7'he Netherlands PEARSON.

Consistency: Relaxed. SWE 622, Spring 2017 Distributed Software Engineering

10. Replication. Motivation

Middleware and Distributed Systems. System Models. Dr. Martin v. Löwis

Consistency. CS 475, Spring 2018 Concurrent & Distributed Systems

Data Modeling and Databases Ch 14: Data Replication. Gustavo Alonso, Ce Zhang Systems Group Department of Computer Science ETH Zürich

Distributed Systems COMP 212. Revision 2 Othon Michail

Last Class: Consistency Models. Today: Implementation Issues

Distributed Systems (5DV147)

DYNAMO: AMAZON S HIGHLY AVAILABLE KEY-VALUE STORE. Presented by Byungjin Jun

Transactions with Replicated Data. Distributed Software Systems

Implementation Issues. Remote-Write Protocols

Last Class:Consistency Semantics. Today: More on Consistency

Chapter 1: Distributed Systems: What is a distributed system? Fall 2013

Distributed Systems. Fundamentals Shared Store. Fabienne Boyer Reprise du cours d Olivier Gruber. Université Joseph Fourier. Projet ERODS (LIG)

CONSISTENCY IN DISTRIBUTED SYSTEMS

Chapter 6 Synchronization (1)

Presented By: Devarsh Patel

CS555: Distributed Systems [Fall 2017] Dept. Of Computer Science, Colorado State University

CS 425 / ECE 428 Distributed Systems Fall 2017

Database Replication: A Tutorial

SCALABLE CONSISTENCY AND TRANSACTION MODELS

Large-Scale Key-Value Stores Eventual Consistency Marco Serafini

DISTRIBUTED MUTEX. EE324 Lecture 11

The UNIVERSITY of EDINBURGH. SCHOOL of INFORMATICS. CS4/MSc. Distributed Systems. Björn Franke. Room 2414

Module 8 - Fault Tolerance

Distributed Systems. Lec 11: Consistency Models. Slide acks: Jinyang Li, Robert Morris

Last time. Distributed systems Lecture 6: Elections, distributed transactions, and replication. DrRobert N. M. Watson

Cache Coherence in Distributed and Replicated Transactional Memory Systems. Technical Report RT/4/2009

CS /15/16. Paul Krzyzanowski 1. Question 1. Distributed Systems 2016 Exam 2 Review. Question 3. Question 2. Question 5.

GOSSIP ARCHITECTURE. Gary Berg css434

Transcription:

Distributed Systems (5DV147) Replication and consistency Fall 2013 1

Replication 2

What is replication? Introduction Make different copies of data ensuring that all copies are identical Immutable data trivial Often updated data tricky, can be expensive to maintain copies identical Replication requirements Replication transparency (illusion of single copy) Consistency 3

Why replication? Introduction Reliability Fault tolerance (redundancy, switching to other replica) Protection against corrupted data (majority) Availability Performance Scalability (divide the work) Load balancing Reducing access latency (data closer to process) Caching 4

Problems that you may find Introduction Problems if multiple clients access replicas Concurrent access, rather than exclusive Operations are interleaved How do we ensure correctness? Replica placement Servers Content Overhead required to keep replicas up to date Global synchronization (Atomic operations) Relaxed atomicity constraint, but copies will not always be the same Depends on access and update patterns of data 5

Example Interleaving setbalance B (x,1) setbalance A (y,2) Client 1 readbalance A (y) 2 readbalance A (x) 0 Client2 Local replica of Client 1 is B Local replica of Client 2 is A 6

Correctness of interleaving Interleaving Basic correctness property An interleaved sequence of operations must meet the specification of a single correct copy of the object(s) Sequential consistency property Order of operations is consistent with the program order in which each individual process executed them Linearizability property Order of operations is consistent with the real times at which the operations occurred during execution 7

Example of interleaved operations for 2 clients: C1: A, B, C C2: d, e, f Real Order during execution: A, B, d, C, e, f Interleaving An interleaving with sequential consistency: A, B, d, e, f, C Interleaving with linearizability: A, B, d, C, e, f Fall 2012 8

Passive replication Replication: models One primary replica manager, many backup replicas If primary fails, backups can take its place (election!) Implements linearizability if: A failing primary is replaced by a unique backup C FE Backups agree on which operations were performed before primary crashed View-synchronous group C FE communication! Primary RM RM Backup RM Backup Figure adapted from Instructor s Guide for Coulouris, Dollimore, Kindberg and Blair, Distributed Systems: Concepts and Design Edn. 5 Pearson Education 2012 based on Figure 18.3 9

Steps of passive replication 1. Request Front end issues request with unique ID 2. Coordination Primary checks if request has been carried out, if so, returns cached response 3. Execution Perform operation, cache results 4. Agreement Primary sends updated state to backups 5. Response Primary sends result to front end, which forwards to the client C C 1 1 FE FE 5 Primary RM 2 3 Replication: models RM 4 Backup RM Backup What happens if the primary RM crashes? Before agreement After agreement 10

Active replication Replication: models More distributed All replica managers carry out all operations Requests to RM are totally ordered Front ends issue one request at a time (FIFO) Implements sequential consistency C FE RM RM RM FE C Figure adapted from Instructor s Guide for Coulouris, Dollimore, Kindberg and Blair, Distributed Systems: Concepts and Design Edn. 5 Pearson Education 2012 based on Figure 18.4 11

Steps of active replication Replication: models 1. Request Front end adds unique identifier to request, multicasts to RMs 2. Coordination Totally ordered request delivery to RMs 3. Execution Each RM executes request 4. Agreement Not needed 5. Response All RMs respond to front end, front end interprets response and forwards response to client C 1 FE 5 5 5 2 2 2 RM 3 RM 3 RM 3 FE 1 C 12

Replication: models Comparing active and passive replication Both handle crash failures (but differently) Only active can handle arbitrary failures Optimizations? Send reads to backups in passive Lose linearizability property! Send reads to specific RM in active Lose fault tolerance Exploit commutativity of requests to avoid ordering requests in active 13

Consistency 14

Consistency problem Consistency Replication improves reliability and performance but when a replica is updated, it becomes different from the others so we need to propagate updates in a way that temporal inconsistencies are not noticed however this may degrade performance severely 15

Consistency models Consistency Is a contract between processes and a data store if processes agree to obey certain rules, the store promises to work correctly (Tanenbaum and van Steen, 2002) What to expect when reading and updating shared date (while others do the same) Data-centric models (system-wide) Client-centric models (single client) 16

Data-centric consistency models 17

Strict consistency Data centric consistency Every read of x returns a value corresponding to the result of the most recent write to x True replication transparency, every process receives a response that is consistent with the real time All writes are instantaneously visible to all process Assumes absolute global time Due to message latency, strict consistency is difficult to implement 18

Data centric consistency Strict consistency example: A: W(x) a A: W(x) a B: R(x) a B: R(x) NIL R(x) a Strictly consistent Not strictly consistent In general, A:write t (x,a) then B:read t (x,a) ; t >t (regardless on the number of replicas of x) Figure adapted from Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, (c) 2002 Prentice-Hall, Inc.- based on Figure 6.5 19

Linearizability Data centric consistency Interleaving of reads and writes into a single total order that respects the local ordering of the operations of every process A trace is consistent when every read returns the latest write preceding the read A trace is linearizable when It is consistent If t 1, t 2 are the times at which p i and p j perform operations, and t 1 < t 2, then the consistent trace must satisfy the condition that t 1 < t 2 20

Data centric consistency Linearizability example: A: B: W(x) 1 W(y) 1 R(y) 1 R(x) 1 The linearizable trace is A:W(x,1), B:W(y,1), A:R(y,1), B:R(x,1) 21

Sequential consistency Data centric consistency The result of any execution is the same as if the (read and write) operations by all processes on the data store were executed in some sequential order and the operations of each individual process appear in this sequence in the order specified by its program (Lamport 1979) Requires that interleaving preserving local temporal order of reads and writes are consistent traces Is not concerned with real time All processes see the same interleaving of operations 22

Data centric consistency Sequential consistency example Sequentially consistent Not sequentially consistent W 2 (x)b, R 3 (x)b, R 4 (x)b, W 1 (x)a, R 3 (x)a, R 4 (x)a W 2 (x)b, R 3 (x)b,??? Figure adapted from Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, (c) 2002 Prentice-Hall, Inc.- based on Figure 6.6 23

Causal consistency Data centric consistency All writes that are causally related must be seen by every process in the same order, and reads must be consistent with this order Writes that are not causally related to one another (concurrent) can be seen in any order No constraints on the order of values read by a process if writes are not causally related 24

Data centric consistency Example Causally consistent Not causally consistent Figure adapted from Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, (c) 2002 Prentice-Hall, Inc.- based on Figure 6.10 25

Data centric consistency Example Causally consistent Figure adapted from Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, (c) 2002 Prentice-Hall, Inc.- based on Figure 6.9 26

Consistency Description Data centric consistency Strict Linearizability Sequential Causal Absolute time ordering on all shared accesses, essentially impossible to implement it in distributed systems All processes see all shared accesses in the same order. Accesses are ordered based on a global timestamp. Good for reasoning about correctness of concurrent programs but not really used for building programs All processes see all shared accesses in the same order. Accesses are not ordered in time. Feasible and popular but has poor performance All processes see causally-related shared accesses in the same order. There is no globally agreed upon view of the order of operations Figure adapted from Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, (c) 2002 Prentice-Hall, Inc.- based on Figure 6.18.a 27

Update propagation and consistency protocols 28

Update propagation Update propagation Price of replication Keep replicas consistent Replica updates Atomic updates (all replicas need to be identical), maintain all replicas equal Maintaining replicas consistent may also generate scalability problems 29

Update propagation Update propagation Update notifications Good for low read-to-write ratios Transfer data from one copy to another Good for high read-to-write-ratios Propagate the update operation to other copies Active replication Push Propagation initiated by server Good for high read-to-write ratios Pull Client requests server to send updates Good for low read-to-write ratios 30

Consistency protocols Consistency Protocols Describes an implementation of a specific consistency model Sequential consistency Passive replication remote-write protocols and local-write protocols (primary-based) Active replication quorum-based protocols 31

Primary-based protocol: remote-write Consistency Protocols Updates are blocking operations non-blocking operations improve performance but, problem Fault tolerance Figure adapted from Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, (c) 2002 Prentice-Hall, Inc.- based on Figure 6.28 32

Primary-based protocol: local-write Consistency Protocols Primary migrates between processes that wish to perform an operation Optimization carry out multiple successive writes locally Figure adapted from Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, (c) 2002 Prentice-Hall, Inc.- based on Figure 6.30 33

Active replication: quorum-based Consistency Protocols Clients need to request and acquire permission from replicas before reading (read quorum) or writing (write quorum) Each data item contains a version number Read/write requires agreement of a majority Constraints for read (N R ) and write (N W ) quorums 1. N R + N W > N 2. N W > N/2 34

Quorum-based example Consistency Protocols Correct choice of N R & N W write-write conflict ROWA (read one, write all) Figure adapted from Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, (c) 2002 Prentice-Hall, Inc.- based on Figure 6.33 35

Client-centric consistency models 36

Eventual consistency Client centric consistency Maintains consistency for individual clients, not considering that data may be shared by several clients If updates are infrequent, eventually all replicas will obtain the update and become identical Good if clients always access the same replica Delay resolving conflicts Assume that clients connect to different replicas and that differences in those replicas should be transparent Several variations 37

Client centric consistency Figure adapted from Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, (c) 2002 Prentice-Hall, Inc.- based on Figure 6.19 38

Monotonic-read consistency Client centric consistency If a process has seen a value of (data item) x at a certain time, it will never see an older version of x at a later time 39

Monotonic-write consistency Client centric consistency A write to data item x is completed before any successive write to x by the same process 40

Read-your-writes consistency Client centric consistency A process will never see a previous value of x after a write to that data item x 41

Write-follow-reads consistency Client centric consistency When writing to x following a previous read by the same process, is guaranteed to take place on the same or a more recent value of x that was read 42

more about groups 43

Group views groups Contain the set of group members at a given point in time Failed identified processes are not in the view Events occur in views Views are delivered when membership changes View-synchronous group communication Based on view delivery, we can know which messages must have been delivered (within a view) 44

View delivery requirements groups Order View changes always occur in the same order at all processes Integrity If a process delivers a view then that process is part of that view Non-triviality Processes that have joined a group and communicate indefinitely are members of the same group Membership service should eventually reflect network partitions 45

View-synchronous group communication groups Correct processes deliver the same set of messages in any given view Messages are delivered at most once Correct processes always deliver messages they send: If delivering to q fails, the next view excludes q Fall 2012 46

Summary (1) Summary What is replication and why is it necessary Correctness of interleaving Basic, sequential consistency, and linearizability General replication phases Request, coordination, execution, agreement, response Types of ordering adapted to replication FIFO, Causal, Total 47

Summary (2) Passive replication (implements linearizability) A single primary replica manager and one or more backup replica managers Active replication (implements sequential consistency) Independent replica managers executing all operations Updates propagation Update notifications, data, or operations, and push vs. pull Consistency protocols (implementation of consistency model) Primary-based protocols (passive) Remote-write Local-write Quorum-based protocols (active) Summary 48

Summary (3) Summary Consistency models Data-centric models (strict, linearizability, sequential, causal), differ In how restrictive they are How complex their implementations are Ease of programming Performance Client-centric models Eventual consistency Monotonic reads, monotonic writes, read your writes, writes follow reads 49

Summary (4) Summary More about groups Static vs. dynamic groups and primary partition vs. partitionable groups Group views Current list of members Events occur in views Views are delivered when membership changes Requirements for delivering views View synchronous group communication 50

Next Lecture Cassandra 51