Today CSCI 5105: Recovery, CAP Theorem. Instructor: Abhishek Chandra

Recovery
- Operations to be performed to move from an erroneous state to an error-free state
- Backward recovery: go back to a previous correct state
  - E.g.: packet retransmission
- Forward recovery: go to a new correct state
  - E.g.: error-correction codes

Recovery Techniques
- Checkpointing
- Message logging
- Rebooting
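To make the backward/forward distinction concrete, here is a small Python sketch (a toy scenario, not from the slides): backward recovery detects the error and returns to a previous correct state by retransmitting, while forward recovery carries enough redundancy (here a simple 3x repetition code) to move directly to a correct state without retransmission.

```python
import random

def unreliable_channel(payload):
    """Simulated channel that sometimes corrupts the message."""
    if random.random() < 0.3:
        return payload[:-1] + "?"     # corrupt the last character
    return payload

def backward_recovery(payload):
    """Detect corruption (via a stored checksum) and retransmit:
    go back to a previous correct state and try again."""
    expected = hash(payload)
    while True:
        received = unreliable_channel(payload)
        if hash(received) == expected:
            return received           # eventually a clean copy gets through

def forward_recovery(received_triplets):
    """3x repetition code: a majority vote per bit corrects a single flipped copy,
    moving forward to a correct state without retransmission."""
    return [1 if sum(bits) >= 2 else 0 for bits in received_triplets]

print(backward_recovery("hello"))                  # 'hello'
print(forward_recovery([(1, 1, 1), (0, 1, 0)]))    # [1, 0]
```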

Checkpointing
- Periodically store state on stable storage (mirrored/RAID disks, etc.)
- At error recovery, go back to the last checkpointed state
- Problem: How do we roll back so that all processes go back to a consistent global state?

Cuts in Global State Space
- Cut: a partition of events representing a global state
- The set of the last recorded event for each process

Distributed Snapshot
- Consistent cut: receipt of a message m in the cut => sending of m is also in the cut
  - If event a is in the cut, then all b such that b -> a are in the cut
- Distributed snapshot: a consistent global state of the distributed system
- Recovery line: the most recent distributed snapshot

Independent Checkpointing
- Each process checkpoints periodically, independently of other processes
- Upon a failure, work backwards to locate a consistent cut
- Domino effect: cascading rollbacks
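As a rough illustration of recovery lines and the domino effect, here is a sketch under a simplified model (not the lecture's formalization of cuts) in which each checkpoint records only per-peer send/receive counts: starting from each process's latest checkpoint, roll a receiver back whenever it has recorded the receipt of a message that the sender's checkpoint has not yet sent.

```python
def find_recovery_line(checkpoints):
    """checkpoints[p] = list of dicts, oldest first:
       {"sent": {q: count, ...}, "recv": {q: count, ...}}"""
    # Start from each process's most recent checkpoint.
    line = {p: len(cps) - 1 for p, cps in checkpoints.items()}

    def cp(p):
        return checkpoints[p][line[p]]

    changed = True
    while changed:
        changed = False
        for q in checkpoints:                      # candidate receiver
            for p in checkpoints:                  # candidate sender
                if p == q:
                    continue
                sent = cp(p)["sent"].get(q, 0)
                recv = cp(q)["recv"].get(p, 0)
                if recv > sent:                    # q received a message p has not yet sent
                    line[q] -= 1                   # roll q back one checkpoint (may cascade)
                    if line[q] < 0:
                        raise RuntimeError("rolled back to the initial state (domino effect)")
                    changed = True
    return line                                    # checkpoint index to restore, per process

# Example: P1's latest checkpoint has received 2 messages from P0,
# but P0's latest checkpoint has only sent 1, so P1 must roll back.
cps = {
    "P0": [{"sent": {"P1": 0}, "recv": {}}, {"sent": {"P1": 1}, "recv": {}}],
    "P1": [{"sent": {}, "recv": {"P0": 1}}, {"sent": {}, "recv": {"P0": 2}}],
}
print(find_recovery_line(cps))   # {'P0': 1, 'P1': 0}
```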

Coordinated Checkpointing
- Processes synchronize before checkpointing locally
- Synchronization techniques: two-phase protocol, incremental snapshot
- Two-phase protocol:
  - One process sends a checkpoint request
  - Each recipient checkpoints its current state and queues up new local messages
  - Each recipient sends back a checkpoint-done message

Incremental Snapshot
- Checkpoint only among processes that have been causally related since the last checkpoint
- Identify the causally related processes incrementally
- Apply a two-phase commit among these processes

Message Logging
- Checkpointing is expensive: coordination, writing to stable storage
- Too few checkpoints => can lose a lot of state; need a lot of recomputation and message passing
- Message logging:
  - Take infrequent checkpoints
  - Log messages between checkpoints
  - Recovery: replay the messages since the last checkpoint
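A minimal single-process sketch of the checkpoint-plus-message-log idea, assuming the process state is just a dictionary and the only non-deterministic events are incoming messages (the class and method names are made up for illustration): on recovery, restore the last checkpoint and replay the messages logged after it.

```python
import copy

class LoggedProcess:
    def __init__(self):
        self.state = {"count": 0}
        self.checkpoint = copy.deepcopy(self.state)   # stands in for stable storage
        self.log = []                                 # messages delivered since the checkpoint

    def handle(self, msg):
        # Deterministic state update driven by the message.
        self.state["count"] += msg["delta"]

    def deliver(self, msg):
        self.log.append(msg)          # log the message when it is delivered
        self.handle(msg)

    def take_checkpoint(self):
        self.checkpoint = copy.deepcopy(self.state)
        self.log = []                 # only messages after the checkpoint need replaying

    def recover(self):
        self.state = copy.deepcopy(self.checkpoint)
        for msg in self.log:          # replay in the original delivery order
            self.handle(msg)

p = LoggedProcess()
p.deliver({"delta": 1})
p.take_checkpoint()
p.deliver({"delta": 2})
saved = p.state["count"]              # 3
p.state = {"count": -999}             # simulate a crash corrupting volatile state
p.recover()
assert p.state["count"] == saved
```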

Piecewise Deterministic Model
- The execution of each process takes place in a series of intervals
- Within each interval, the execution is deterministic
  - E.g.: a sequence of instructions, message sending
- The start of each interval is a non-deterministic event
  - E.g.: receipt of a message
- Can replay the intervals if we log the non-deterministic events

Orphan Process
- A process whose state becomes inconsistent because of another process's crash/recovery
- It depends on messages that were unlogged at the crashed process
- Goal: prevent orphan processes
- When to log messages?

Orphan Process Definition
- Stable message: a message that cannot be lost
- DEP(m): processes dependent on message m (receivers of m, and processes causally dependent on m)
- COPY(m): processes holding a non-stable copy of m
- Orphan process: P is in DEP(m), but no process is left in COPY(m)

Message-Logging Schemes
- Avoid orphan processes: ensure that if no process is in COPY(m), then no process is in DEP(m)
- Pessimistic logging: ensures the above property at the time of message sending
  - At most one dependent process for any non-stable message m
- Optimistic logging: ensures the above property after a crash
  - Roll back all orphan processes to a state where they are no longer in DEP(m)
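A small sketch of the DEP/COPY bookkeeping, assuming we already know DEP(m) and COPY(m) for every non-stable message m: after a crash, a surviving process is an orphan if it depends on some m of which no copy survived, so m can never be replayed.

```python
def find_orphans(dep, copies, crashed):
    """dep[m], copies[m]: sets of process ids; crashed: set of crashed process ids."""
    orphans = set()
    for m in dep:
        surviving_copies = copies.get(m, set()) - crashed
        if not surviving_copies:                       # m is lost for good
            orphans |= dep[m] - crashed                # its surviving dependents are orphans
    return orphans

# Example: Q delivered m1, which was unlogged at its sender P; P crashes, so Q is an orphan.
dep    = {"m1": {"Q"}, "m2": {"R"}}
copies = {"m1": {"P"}, "m2": {"P", "R"}}
print(find_orphans(dep, copies, crashed={"P"}))        # {'Q'}
```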

Rebooting
- Localize the fault and reboot the faulty component
- Applied to software components
- Requirement: modularity, decoupling
- Basic idea: the fault depends on a rare, transient event
- Recursive rebooting (sketched after the CAP slides below):
  - Try shutting down the smallest faulty component first
  - Continue rebooting successively larger components

CAP Theorem
- C: Consistency
- A: Availability
- P: Tolerance to network partitions
- "2 of 3" rule: can have only 2 of these properties

CAP Theorem Revisited
- Is the "2 of 3" rule misleading?
- What is meant by: consistency, availability, partitioning?

CAP Choices: Factors
- Partitioning duration and frequency
  - E.g.: can have all 3 while the network is connected
- Type of data, operations, users
  - E.g.: Yahoo! PNUTS provides stronger A via local single-user master data
  - E.g.: Facebook prefers stronger C by maintaining a central master copy
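Returning to recursive rebooting: a minimal sketch, assuming a hypothetical component tree with a reboot operation and an external health check (neither is an API from the lecture). The smallest suspect component is rebooted first; if the fault persists, successively larger enclosing components are rebooted.

```python
class Component:
    def __init__(self, name, parent=None):
        self.name, self.parent = name, parent

    def reboot(self):
        print(f"rebooting {self.name}")

def recursive_reboot(faulty, is_healthy):
    """Reboot `faulty`, then successively larger enclosing components,
    until `is_healthy()` reports that the fault is gone."""
    component = faulty
    while component is not None:
        component.reboot()
        if is_healthy():
            return component          # smallest reboot that cured the fault
        component = component.parent  # escalate to the enclosing component
    raise RuntimeError("fault persists even after a full system reboot")

# Example: the fault only clears once the enclosing 'service' is rebooted.
system  = Component("system")
service = Component("service", parent=system)
worker  = Component("worker-3", parent=service)
attempts = {"n": 0}
def is_healthy():
    attempts["n"] += 1
    return attempts["n"] >= 2         # stand-in for a real health check
print(recursive_reboot(worker, is_healthy).name)   # service
```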

Partition Management
- How to handle partitions when they occur?
- Key questions:
  - How to detect partitions?
  - How to operate during a partition?
  - How to recover after re-connection?

Partition Detection
- How would a node typically detect partitions?
- Related to communication latency
  - Network latency
  - Can be defined based on application latency requirements
- No global definition of partitioning
  - Different nodes may (or may not) detect a network partition

Partition Mode
- How should the two sides operate under a partition?
- Allow some operations (e.g., those that do not conflict or could be resolved easily)
- Delay some and prohibit some (e.g., those that need to be globally consistent)
- Maintain operation history (e.g., version vectors)

Partition Recovery
- Make state consistent on both sides
  - Roll forward from a state before the partition
  - Use the logs to apply/merge operations
- How to merge conflicts?
  - Use version vectors
  - Might need manual intervention
  - Automated if we allow only limited operations (e.g., only commutative operations)
- Compensate for mistakes
  - Cancel a duplicate operation (e.g., refund the user if charged twice)
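A small sketch of version vectors as used in partition recovery, assuming each replica keeps a counter per replica id: comparing vectors distinguishes the case where one side's state strictly dominates the other (safe to adopt) from genuinely concurrent updates (a conflict to merge manually or via an application-specific rule), and the reconciled state is tagged with the pointwise maximum.

```python
def dominates(a, b):
    """True if vector a has seen everything vector b has seen."""
    return all(a.get(r, 0) >= n for r, n in b.items())

def compare(a, b):
    if dominates(a, b) and dominates(b, a):
        return "equal"
    if dominates(a, b):
        return "a_newer"
    if dominates(b, a):
        return "b_newer"
    return "concurrent"          # conflicting updates were made on both sides of the partition

def merge(a, b):
    """Pointwise maximum: the vector to attach to the merged/reconciled state."""
    return {r: max(a.get(r, 0), b.get(r, 0)) for r in set(a) | set(b)}

# Example: both sides updated independently during the partition.
left  = {"A": 3, "B": 1}
right = {"A": 2, "B": 2}
print(compare(left, right))      # concurrent -> resolve the conflict, then tag the result with:
print(merge(left, right))        # {'A': 3, 'B': 2}
```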