Replication. Feb 10, 2016 CPSC 416

How'd we get here? Failures in single systems; fault-tolerance techniques added redundancy (ECC memory, RAID, etc.). Conceptually, ECC and RAID both put a master in front of the redundancy to mask it from clients: ECC is handled by the memory controller, and RAID looks like a single, very reliable hard drive behind a (special) controller.

Simpler examples... Replicated web sites, e.g., Yahoo! or Amazon: DNS-based load balancing (DNS returns multiple IP addresses for each name), and hardware load balancers put multiple machines behind each IP address. (Diagram.)

Read-only content is easy to replicate: just make multiple copies of it. Performance boost 1: you get to use multiple servers to handle the load. Performance boost 2: locality; we'll see this later when we discuss CDNs, which can often direct a client to a replica near it. Availability boost: you can fail over (done both at the DNS level, which is slower because clients cache DNS answers, and at the front-end hardware level).

But for read-write data... we must replicate writes, typically with some degree of consistency.

What consistency model? Just like in filesystems, you want to look at the consistency model you supply. Real-life example: Google mail. Sending mail is replicated to ~2 physically separated datacenters (users hate it when they think they sent mail and it got lost); the send will pause while this replication happens. Q: How long would this take with 2-phase commit? In the wide area? Marking mail as read is only replicated in the background: you can mark it read, the replication can fail, and you'll have no clue (re-reading an already-read email once in a while is no big deal). Weaker consistency is cheaper if you can get away with it.

Strict transactional consistency (you saw this before). Sequentially consistent: if client a executes operations {a1, a2, a3, ...} and client b executes {b1, b2, b3, ...}, then you could create some serialized version, as if the operations had been performed through a single central server: a1, b1, b2, a2, ... (or whatever). Note this is not transactional consistency: we didn't enforce preserving happens-before across clients, just each client's own program order.
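To make "just per-program order" concrete, here is a tiny sketch (in Go; not from the slides, and the helper name and operation labels are made up) that checks whether a proposed global serialization preserves each client's program order:

```go
package main

import "fmt"

// isSequentiallyConsistent reports whether a proposed global order of
// operations respects each client's own program order. Operations are
// labelled "a1", "b2", etc.: the letter names the client, the digit is the
// operation's position in that client's program.
func isSequentiallyConsistent(global []string) bool {
	next := map[byte]byte{} // client letter -> next expected position digit
	for _, op := range global {
		client, pos := op[0], op[1]
		expected, seen := next[client]
		if !seen {
			expected = '1' // each client's program starts at operation 1
		}
		if pos != expected {
			return false // violates this client's program order
		}
		next[client] = pos + 1
	}
	return true
}

func main() {
	fmt.Println(isSequentiallyConsistent([]string{"a1", "b1", "b2", "a2"})) // true: a valid serialization
	fmt.Println(isSequentiallyConsistent([]string{"a2", "b1", "a1", "b2"})) // false: a2 appears before a1
}
```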

Failure model. We'll assume for today that failures and disconnections are relatively rare events: they may happen pretty often, but, say, any server is up more than 90% of the time. We'll come back later and look at disconnected-operation models (e.g., the Coda file system, which allows clients to work offline).

Tools we'll assume: a group membership manager, which allows replica nodes to join and leave, and a failure detector, e.g., process-pair monitoring, etc.
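A minimal sketch of the kind of failure detector being assumed, based on heartbeats and a timeout; the `Detector` type and the timeout policy are illustrative, not something the course specifies:

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// Detector suspects a replica of having failed if no heartbeat has been
// received from it within `timeout`. This is the weak building block the
// replication schemes below lean on.
type Detector struct {
	mu       sync.Mutex
	lastSeen map[string]time.Time
	timeout  time.Duration
}

func NewDetector(timeout time.Duration) *Detector {
	return &Detector{lastSeen: make(map[string]time.Time), timeout: timeout}
}

// Heartbeat records that we heard from node just now.
func (d *Detector) Heartbeat(node string) {
	d.mu.Lock()
	defer d.mu.Unlock()
	d.lastSeen[node] = time.Now()
}

// Suspected reports whether node has been silent for longer than the timeout.
func (d *Detector) Suspected(node string) bool {
	d.mu.Lock()
	defer d.mu.Unlock()
	last, ok := d.lastSeen[node]
	return !ok || time.Since(last) > d.timeout
}

func main() {
	d := NewDetector(500 * time.Millisecond)
	d.Heartbeat("backup-1")
	time.Sleep(600 * time.Millisecond)
	fmt.Println(d.Suspected("backup-1")) // true: backup-1 has missed its heartbeat window
}
```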

Goal: provide a service; survive the failure of up to f replicas; provide the same service as a non-replicated version (except more reliable, and perhaps with different performance). (A lot like your assignment 4, where f = r-1, except without durable storage.)

We'll cover two approaches. Primary-backup: operations are handled by the primary, which streams copies to the backup(s); replicas are passive. Good: simple protocol. Bad: clients must participate in recovery. Quorum consensus: designed to have fast response time even under failures; replicas are active and participate in the protocol; there is no master, per se. Good: clients don't even see the failures. Bad: more complex.

Primary-Backup. Clients talk to a primary. The primary handles requests, atomically and idempotently: it executes them, sends the request to the backups, the backups reply "OK", and the primary then ACKs to the client.
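Roughly, in Go-flavoured form (the `Backup` interface and `Primary` type are assumptions of this sketch, not the course's actual interfaces), the synchronous flow looks like this:

```go
package main

import (
	"errors"
	"fmt"
)

// Backup is whatever lets the primary push an operation to one replica;
// in a real system this would be an RPC stub.
type Backup interface {
	Apply(op string) error
}

// Primary executes each client operation locally, forwards it to every
// backup, and only ACKs the client once all backups have ACKed.
type Primary struct {
	state   []string
	backups []Backup
}

func (p *Primary) Handle(op string) error {
	p.state = append(p.state, op) // execute on the primary
	for _, b := range p.backups {
		if err := b.Apply(op); err != nil {
			// A backup did not ACK: we can't tell the client "done" without
			// giving up strong consistency (that's the next slide).
			return errors.New("backup did not acknowledge: " + err.Error())
		}
	}
	return nil // all backups ACKed; now reply OK to the client
}

// inMemoryBackup is a stand-in backup for the example.
type inMemoryBackup struct{ state []string }

func (b *inMemoryBackup) Apply(op string) error {
	b.state = append(b.state, op)
	return nil
}

func main() {
	p := &Primary{backups: []Backup{&inMemoryBackup{}, &inMemoryBackup{}}}
	fmt.Println(p.Handle("put x=1")) // <nil>: the client is ACKed only after both backups ACK
}
```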

Primary-backup. Note: if you don't care about strong consistency (e.g., the mail "read" flag), you can reply to the client before reaching agreement with the backups (sometimes called "asynchronous replication"). This looks cool; what's the problem? What do we do if a replica has failed? We wait... how long? Until it's marked dead. Primary-backup has a strong dependency on the failure detector. This is OK for some services, not OK for others. Advantage: with N servers, you can tolerate the loss of N-1 copies.
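For contrast, a sketch of the asynchronous variant mentioned above: the primary ACKs the client first and replicates in the background, so a primary failure can silently lose the update (the `asyncPrimary` type and the channel-based hand-off are made up for illustration):

```go
package main

import "fmt"

// asyncPrimary replies to the client as soon as the operation is applied
// locally and streams it to the backups in the background. If the primary
// dies before a backup hears about the op, the op is silently lost:
// fine for a "mark read" flag, not fine for "send mail".
type asyncPrimary struct {
	state  []string
	toSend chan string // ops queued for a background replicator
}

func (p *asyncPrimary) Handle(op string) {
	p.state = append(p.state, op) // execute locally
	select {
	case p.toSend <- op: // hand off to the background replicator
	default: // queue full; this sketch just drops the op, which is exactly the risk
	}
	// Reply OK to the client here, before any backup has ACKed.
}

func main() {
	p := &asyncPrimary{toSend: make(chan string, 16)}
	p.Handle("mark read msg-42")
	fmt.Println(len(p.state), len(p.toSend)) // 1 1: the client is done, the backups are not
}
```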

Implementing primary-backup. Remember logging (if you've taken databases)? A common technique for replication in databases and filesystem-like things: stream the log to the backup. The backups don't have to actually apply the changes before replying, just make the log durable (i.e., on disk). You have to replay the log before you can be online again, but that's pretty cheap.
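A rough sketch of the log-shipping idea, assuming a toy "key value" record format: the backup fsyncs each record before ACKing and only replays the log when it has to take over:

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
)

// logBackup is the backup in a log-shipping scheme: it appends each received
// log record durably (here, a file that is fsync'd) and ACKs without applying it.
type logBackup struct {
	logFile *os.File
	state   map[string]string // only rebuilt when we replay
}

// Receive makes the record durable, then ACKs by returning nil.
func (b *logBackup) Receive(record string) error {
	if _, err := fmt.Fprintln(b.logFile, record); err != nil {
		return err
	}
	return b.logFile.Sync() // the ACK promises durability, not application
}

// Replay is run when the backup has to take over: apply the whole log.
func (b *logBackup) Replay() error {
	b.state = make(map[string]string)
	if _, err := b.logFile.Seek(0, 0); err != nil {
		return err
	}
	sc := bufio.NewScanner(b.logFile)
	for sc.Scan() {
		if parts := strings.SplitN(sc.Text(), " ", 2); len(parts) == 2 {
			b.state[parts[0]] = parts[1]
		}
	}
	return sc.Err()
}

func main() {
	f, _ := os.CreateTemp("", "backup-log-")
	defer os.Remove(f.Name())
	b := &logBackup{logFile: f}
	b.Receive("x 1") // the primary streams log records; the backup just persists them
	b.Receive("x 2")
	b.Replay()                // on failover, replay the log before serving
	fmt.Println(b.state["x"]) // prints 2
}
```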

p-b: Did it happen? (Diagram: Client sends "Commit!" to the Primary; the Primary logs it and forwards "Commit!" to the Backup; the Backup logs it and replies "OK!"; the Primary replies "OK!" to the Client.) Failure here: the commit is logged only at the primary. The primary dies? The client must re-send to the backup.

p-b: Happened twice. (Same diagram, but the failure occurs later: the commit is already logged at the backup.) The primary dies? The client must check with the backup, or its re-send would apply the commit twice. (Seems a lot like at-most-once / at-least-once...)
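One common way out, sketched here under the assumption that clients tag requests with unique IDs (not something the slide specifies), is for the backup to deduplicate re-sent commits so each operation is applied at most once:

```go
package main

import "fmt"

// A replica that applies each client request at most once: the client tags
// every request with an ID, and the replica remembers which IDs it has
// already applied, so a re-sent "Commit!" is ACKed without re-executing it.
type replica struct {
	applied map[string]bool
	log     []string
}

func (r *replica) applyOnce(reqID, op string) bool {
	if r.applied[reqID] {
		return false // duplicate: the commit already happened, just re-ACK
	}
	r.applied[reqID] = true
	r.log = append(r.log, op)
	return true
}

func main() {
	b := &replica{applied: map[string]bool{}}
	fmt.Println(b.applyOnce("req-7", "commit tx")) // true: applied
	fmt.Println(b.applyOnce("req-7", "commit tx")) // false: the client's re-send is deduplicated
}
```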

Problems with p-b. It's not a great solution if you want very tight response times even when something has failed: you must wait for the failure detector. For that, quorum-based schemes are used. As the name implies, the requirement is different: to handle f failures, you must have 2f + 1 replicas. Why? So that a majority is still alive.
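The arithmetic behind "2f + 1", as a quick sketch:

```go
package main

import "fmt"

// To tolerate f failures, a quorum scheme needs n = 2f+1 replicas and a
// majority quorum of q = f+1: any two majorities overlap in at least one
// replica, and a majority is still alive after f replicas fail.
func quorum(f int) (n, q int) {
	return 2*f + 1, f + 1
}

func main() {
	for f := 1; f <= 3; f++ {
		n, q := quorum(f)
		fmt.Printf("f=%d: replicas=%d, quorum=%d, two-quorum overlap>=%d\n",
			f, n, q, 2*q-n) // 2q-n = 1 for majority quorums
	}
}
```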