
Replication
INF 5040, autumn 2008
Lecturer: Roman Vitenberg

Replication architecture

[Figure: clients issue requests to front ends, which forward them to the replicas of the service.]

Why replication I?

Better performance:
- Multiple servers offer the same service, allowing parallel processing of client requests.
- Geographical distribution: creating copies of data/objects closer to the clients leads to smaller network delay and possibly reduced network traffic.

Why replication II?

Better availability (continuous operation despite failures of individual components):
- For many services it is important that availability with acceptable response time approaches 100%, even though server processes may fail, parts of the network may fail, and data may get corrupted.
- Example: a 5% chance of a server failure within a given period means that two independent servers give 99.75% availability.
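As a quick sketch of the arithmetic behind the example: with n independent servers that each fail with probability p during the period, the service is unavailable only if all of them fail, so

$$ A = 1 - p^{n}, \qquad A = 1 - 0.05^{2} = 0.9975 = 99.75\%. $$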

Challenges of replication

Replication requires complex mechanisms:
- Placement of replicas (and search for them)
- Propagation of data (e.g., updates) among the replicas
- Consistency maintenance
- Monitoring and failover mechanisms
These protocols also consume bandwidth, and some of the complexity is exposed to the clients: it is impossible to achieve complete replication transparency.

Placement of replicas

[Figure: replicas organized in rings, from permanent replicas at the core through server-initiated and client-initiated caches out to the clients.]

Placement of replicas

Permanent replicas:
- Clusters of servers
- Geographically dispersed web mirrors (e.g., Akamai)
Server-initiated caches:
- Placement of hosting servers and placement of caches
- Help handle flash crowds in the Web
Client-initiated caches:
- Enterprise proxies or web browser caches

Propagation of updates among the replicas

Push-based propagation:
- A replica pushes the update to the others.
- It may push the new data or the parameters of the update operation.
Pull-based propagation:
- A replica requests another replica to send the newest data it has.

Issue                   Push-based              Pull-based
State at the server     List of client caches   A server to pull data from
Messages sent           Update                  Poll and update
Freshness of replicas   Eager propagation       On demand
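A minimal sketch of the pull-based variant (class and method names are my own illustration, not from the lecture): a cache polls the primary for a newer version before serving a read.

```python
# Pull-based propagation sketch: the cache asks whether newer data
# exists and fetches it on demand.

class Primary:
    def __init__(self):
        self.version = 0
        self.data = None

    def write(self, data):
        self.version += 1          # each update bumps the version
        self.data = data

    def fetch(self, known_version):
        # Return fresh (version, data) only if the caller is stale.
        if known_version < self.version:
            return self.version, self.data
        return known_version, None

class PullCache:
    def __init__(self, primary):
        self.primary = primary
        self.version = 0
        self.data = None

    def read(self):
        # Poll-and-update: ask the primary whether newer data exists.
        version, data = self.primary.fetch(self.version)
        if data is not None:
            self.version, self.data = version, data
        return self.data

primary = Primary()
cache = PullCache(primary)
primary.write("v1")
assert cache.read() == "v1"   # the cache pulls the fresh value on demand
```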

Propagation of updates among the replicas

Pushing data vs. pushing updates:
- Pushing updates (operations) reduces traffic, but requires more processing power on each replica and requires deterministic operations.
Hybrid push-pull approaches:
- Lease-based propagation
- Pushing invalidations: the replica that performs the update notifies the other replicas; a replica informed that a newer version is available fetches the new version at a later point (see the sketch after the example below).

Lack of consistency

Consider two replicas A and B of an account database:
- Client 1: deposit_B(x, 100); deposit_A(y, 100)
- Client 2: balance_A(y) returns 100; balance_A(x) returns 0
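A minimal sketch of the pushing-invalidations scheme mentioned above (the classes and methods are my own illustration): the updating replica pushes only an invalidation notice, and peers fetch the new version lazily on their next read.

```python
# Push-based invalidation sketch: push the notification eagerly,
# pull the data on demand.

class Replica:
    def __init__(self):
        self.peers = []
        self.version = 0
        self.data = None
        self.valid = True

    def update(self, data):
        self.version += 1
        self.data = data
        self.valid = True
        for peer in self.peers:
            peer.invalidate(self.version)   # push only the notification

    def invalidate(self, version):
        if version > self.version:
            self.valid = False              # mark the local copy stale

    def read(self, source):
        if not self.valid:
            # Fetch the fresh version only now, when it is needed.
            self.version, self.data = source.version, source.data
            self.valid = True
        return self.data

a, b = Replica(), Replica()
a.peers = [b]
a.update("x=100")
assert b.read(a) == "x=100"   # b pulls the data only on demand
```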

Consistency

A consistency model is a contract between the application developer and the replicated service provider:
- The provider guarantees that the data will be updated according to some criteria.
- The application developer must design applications with these criteria in mind.
There is a consistency-efficiency-simplicity triangle: strengthening one of the three typically comes at the expense of the others.

[Figure: triangle with Consistency, Efficiency, and Simplicity at its corners.]

Sequential consistency

Intuition: for each interleaved execution of the system there should exist a sequential execution that:
- performs the same operations as the original execution,
- fulfills the specification of a single object copy, and
- orders the operations in agreement with the happened-before order.

Example revisited

- Client 1: deposit_B(x, 100); deposit_A(y, 100)
- Client 2: balance_A(y) returns 100; balance_A(x) returns 0
This is not sequentially consistent, because there is no corresponding sequential execution of a non-replicated system.

More examples

[Figure: further timelines with clients C1-C4 issuing Deposit and Balance operations, some of which admit an equivalent sequential execution and some of which do not.]
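A brute-force way to make the definition concrete (my own illustration, not part of the lecture): enumerate all interleavings that respect each client's program order and test whether any of them replays correctly on a single copy. Accounts are assumed to start with balance 0.

```python
# Sequential-consistency check for the bank example by exhaustive search.

def legal_interleavings(histories):
    # histories: one operation list per client; yield all merges that
    # preserve each client's program order.
    def rec(indices, acc):
        if all(i == len(h) for i, h in zip(indices, histories)):
            yield acc
            return
        for c, h in enumerate(histories):
            if indices[c] < len(h):
                nxt = indices.copy()
                nxt[c] += 1
                yield from rec(nxt, acc + [h[indices[c]]])
    yield from rec([0] * len(histories), [])

def replays_on_single_copy(seq):
    # Does this interleaving match the spec of one non-replicated store?
    state = {}
    for op in seq:
        if op[0] == "deposit":             # ("deposit", acct, amount)
            _, acct, amt = op
            state[acct] = state.get(acct, 0) + amt
        else:                              # ("balance", acct, observed)
            _, acct, observed = op
            if state.get(acct, 0) != observed:
                return False
    return True

def sequentially_consistent(histories):
    return any(replays_on_single_copy(s) for s in legal_interleavings(histories))

# The lecture's example: client 2 sees y = 100 but x = 0.
client1 = [("deposit", "x", 100), ("deposit", "y", 100)]
client2 = [("balance", "y", 100), ("balance", "x", 0)]
print(sequentially_consistent([client1, client2]))  # False
```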

Active replication (replicated state machine)

The idea: every replica sees exactly the same set of messages in the same order and executes them in that order.
Benefits:
- Every server is able to respond to client queries with up-to-date data.
- Immediate fail-over.
Limitations:
- Waste of resources, since all replicas are doing the same work.
- Update (operation) propagation only, which requires determinism.
Different implementation levels:
- Machine instruction level (or VM), e.g., Tandem
- Logical state (software-based active replication)

Passive replication (primary-backup replication)

One server plays a special primary role:
- It performs all the updates and may propagate them to backup replicas eagerly or lazily.
- It maintains the most up-to-date state.
- Backup servers may take over some of the load of processing client requests, but only if stale results are acceptable.
Properties:
- Implementable without deterministic operations; typically easier to implement than active replication.
- Less network traffic during normal operation, but longer recovery with possible data loss.
- Several sub-schemes: cold backup, warm backup, hot standby.
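A minimal sketch of the replicated-state-machine idea (illustrative names, not lecture code): every replica applies the same totally ordered command log with a deterministic step function, so all replicas end up in the same state.

```python
# Active replication sketch: identical ordered input + determinism
# => identical replica states.

class Account:
    def __init__(self):
        self.balances = {}

    def apply(self, cmd):
        # Deterministic: the outcome depends only on state and command.
        op, acct, amount = cmd
        if op == "deposit":
            self.balances[acct] = self.balances.get(acct, 0) + amount
        return self.balances.get(acct, 0)

# Total-order delivery (e.g., via group communication) is assumed:
# all replicas receive exactly this sequence.
log = [("deposit", "x", 100), ("deposit", "y", 100)]

replicas = [Account() for _ in range(3)]
for cmd in log:
    for r in replicas:
        r.apply(cmd)

# Any replica can now answer queries with the same up-to-date state.
assert all(r.balances == {"x": 100, "y": 100} for r in replicas)
```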

Primary-backup replication (cold backup)

Only the primary is active:
- It periodically checkpoints its state to backup storage (stable storage or shared storage, e.g., a SAN).
- When the primary fails, the backup is started, loads the state from storage, and takes over.
Slow recovery:
- The backup must be brought up from scratch (run applications, obtain resources, etc.).
- Either the backup replays the last actions from a log file, or it may miss the updates since the most recent checkpoint.
This is the most resource-efficient scheme, and it is possible to keep several backups to survive multiple failures.

Primary-backup replication (other than cold backup)

Warm backup:
- The backup is (at least) partially alive, so the recovery phase is faster.
- It typically still requires replaying the last transactions, or losing the last few updates.
Hot standby (leader/follower):
- The backup is also up and is constantly updated about the state of the primary.
Local-write scheme:
- The primary migrates between the servers; commonly used in mobile systems.
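A toy sketch of cold-backup checkpointing (my own illustration): the primary periodically serializes its state; on failover the backup reloads the last checkpoint, so updates made after it are lost unless a log is replayed.

```python
# Cold-backup sketch: checkpoint to (stand-in) stable storage,
# recover from the last checkpoint after a crash.

import json
import os
import tempfile

checkpoint_path = os.path.join(tempfile.gettempdir(), "primary.ckpt")

class Primary:
    def __init__(self):
        self.state = {}

    def update(self, key, value):
        self.state[key] = value

    def checkpoint(self):
        with open(checkpoint_path, "w") as f:
            json.dump(self.state, f)     # stable-storage stand-in

def recover_backup():
    with open(checkpoint_path) as f:
        return json.load(f)              # state as of the last checkpoint

p = Primary()
p.update("x", 100)
p.checkpoint()
p.update("y", 100)                       # never checkpointed
# ... primary crashes here ...
state = recover_backup()
assert state == {"x": 100}               # the y update is lost
```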

Quorum-based replication

Typically used with data-replacing updates, since such updates can be performed even on old, non-updated replicas:
- An update is performed on a majority of the replicas.
- A query is sent to a majority of the replicas; the replies include both versions and values.
- The client picks the reply with the highest version; the replica that sent it is guaranteed to be the most up-to-date one.
The scheme can be generalized:
- Write quorums Sw = {Sw_1, ..., Sw_n}, where each Sw_i is a set of replicas and any two write quorums intersect: for all i, j, Sw_i ∩ Sw_j ≠ ∅.
- A client picks some i and performs the update on all replicas in Sw_i.
- Read quorums Sr = {Sr_1, ..., Sr_n}, where every read quorum intersects every write quorum: for all i, j, Sr_i ∩ Sw_j ≠ ∅.

From multicast to reliable group communication

[Figure: a process multicasts a message to the group; all members, including the sender, receive it.]
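A minimal sketch of majority-quorum reads and writes with versioned, data-replacing updates (illustrative classes; for brevity the new version is taken from all replicas rather than from a read quorum, which is a simplification):

```python
# Quorum sketch: overlapping read and write quorums plus version
# numbers let the reader identify the latest value.

class Replica:
    def __init__(self):
        self.version = 0
        self.value = None

def write(replicas, write_quorum, value):
    # A data-replacing update: install the value with a fresh version
    # on every replica in one write quorum.
    new_version = max(r.version for r in replicas) + 1   # simplification
    for r in write_quorum:
        r.version, r.value = new_version, value

def read(read_quorum):
    # Query a read quorum and pick the reply with the highest version;
    # quorum intersection guarantees it reflects the latest write.
    freshest = max(read_quorum, key=lambda r: r.version)
    return freshest.value

replicas = [Replica() for _ in range(5)]
write(replicas, replicas[:3], "x=100")   # majority write quorum
print(read(replicas[2:]))                # any majority read quorum -> "x=100"
```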

From multicast to reliable group communication

Group membership service:
- Dynamic maintenance of groups
- Failure detection
- Distribution of information about changes in the membership
- Address expansion: one address for multicasting to the entire group
Reliable delivery:
- Acknowledgement of message reception and message retransmissions
- Stability detection: learning when all members of the group have received the message

What is still missing for active replication support?

Replicas should receive the same events in the same order. The problem is synchronization between membership changes and message delivery:
- P1 receives m before it learns about a membership change.
- P2 receives m after it learns about a membership change.

[Figure: four processes P1-P4 on a timeline, with messages A and B delivered in different orders relative to a membership change.]

Group communication therefore provides view synchrony and ordered delivery.

View synchrony

A view is an epoch of the system's evolution between two consecutive changes of membership:
- The evolution of the system can be seen as a global sequence of views.
- Within each view the system gives the illusion of being static.

[Figure: five processes A-E move through a sequence of views, first sharing the view ABCDE and later continuing in smaller views such as ABC and CDE.]

View synchrony

Synchronization:
- Processes deliver views and messages in the same sequence of events.
- If two different processes deliver m, they deliver it in the same view.
Delivering the same set of messages:
- If a process p delivers m in view v(g) and later delivers view v(g'), then every process q that delivers both v(g) and v(g') delivers m in v(g). This implies retransmitting missing messages.
- If p delivers m in v(g) and a process q does not deliver m in v(g), then the next view p delivers will not include q.
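One way to make the same-set-of-messages property concrete is a small checker over per-process event traces (entirely my own illustration; it simplifies "later delivers another view" to "the view is not the last one the process delivered"):

```python
# View-synchrony trace check: processes that both deliver a view and
# survive into the next one must have delivered the same messages in it.

def messages_per_view(trace):
    # trace: list of events for one process: ("view", id) or ("msg", m).
    order, delivered, current = [], {}, None
    for ev in trace:
        if ev[0] == "view":
            current = ev[1]
            order.append(current)
            delivered[current] = set()
        else:
            delivered[current].add(ev[1])
    return order, delivered

def view_synchronous(traces):
    per_proc = [messages_per_view(t) for t in traces]
    for i, (order_i, del_i) in enumerate(per_proc):
        for order_j, del_j in per_proc[i + 1:]:
            # Compare every view that both processes delivered and
            # followed with another view.
            for v in set(order_i[:-1]) & set(order_j[:-1]):
                if del_i[v] != del_j[v]:
                    return False
    return True

p1 = [("view", 1), ("msg", "m"), ("view", 2)]
p2 = [("view", 1), ("view", 2)]          # missed m in view 1
print(view_synchronous([p1, p2]))        # False: same-view sets differ
```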

Illustration (from the book)

[Figure: view-synchronous executions from the textbook, showing which message deliveries are allowed and disallowed across view changes.]

Ordered message delivery

Some rule (a binary relation with the standard ordering properties) establishes that two messages m1 and m2 sent in the system are ordered: m1 < m2. There are two variants of ordered message delivery:
- Unreliable ordered delivery: if a process delivers both m1 and m2, it must deliver m2 after m1.
- Reliable ordered delivery: if a process delivers m2, it must already have delivered m1. This requires delaying message delivery until earlier messages arrive, and the implementation may need a lot of space for message buffering.

Commonly used orderings

- FIFO: messages from the same sender are delivered in the order they were sent.
- Causal: two messages are ordered if they are related by the happened-before relation. Many applications require message delivery in an order that preserves cause and effect: publish/subscribe (netnews), email, control systems, root-cause determination.
- Total: all messages are delivered in the same order by all the processes in the group; useful for implementing the state machine abstraction.

Implementing ordered delivery

The implementation does not deliver a message before it knows that the ordering requirements are satisfied.

[Figure: incoming messages are placed in a hold-back queue and moved to the delivery queue for processing once their delivery conditions are satisfied.]
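A minimal sketch of the hold-back queue for FIFO ordering (my own illustration, not lecture code): messages carry a per-sender sequence number and are held back until all earlier messages from that sender have been delivered.

```python
# Hold-back queue sketch: release messages to the delivery queue only
# when their FIFO delivery condition is satisfied.

class FifoReceiver:
    def __init__(self):
        self.next_seq = {}     # sender -> next expected sequence number
        self.hold_back = {}    # (sender, seq) -> message
        self.delivered = []    # the application-visible delivery queue

    def receive(self, sender, seq, msg):
        self.hold_back[(sender, seq)] = msg
        # Move messages from the hold-back queue to the delivery queue
        # as long as the next expected one is available.
        expected = self.next_seq.get(sender, 0)
        while (sender, expected) in self.hold_back:
            self.delivered.append(self.hold_back.pop((sender, expected)))
            expected += 1
        self.next_seq[sender] = expected

r = FifoReceiver()
r.receive("p1", 1, "second")   # arrives out of order: held back
r.receive("p1", 0, "first")    # releases both messages in FIFO order
assert r.delivered == ["first", "second"]
```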