Advanced Database Management System (CoSc3052): Database Recovery Techniques


Purpose of Database Recovery
- To bring the database into a consistent state after a failure occurs.
- To ensure the transaction properties of atomicity and durability.
- After a failure, the DBMS recovery manager is responsible for bringing the system into a consistent state before transactions can resume.

Types of Failure
- Transaction failure: Transactions may fail because of errors, incorrect input, deadlock, incorrect synchronization, etc.
- System failure: The system may fail because of an application error, operating system fault, RAM failure, etc.
- Media failure: Disk head crash, power disruption, etc.

The Log File
- Holds the information that is necessary for the recovery process.
- Records all relevant operations in the order in which they occur.
- Is an append-only file.
- Holds various types of log records (or log entries).

Log File Entries
Types of records (entries) in the log file:
- [start_transaction, T]: Records that transaction T has started execution.
- [write_item, T, X, old_value, new_value]: T has changed the value of item X from old_value to new_value.
- [read_item, T, X]: T has read the value of item X (not needed in many cases).
- [end_transaction, T]: T has ended execution.
- [commit, T]: T has completed successfully, and committed.
- [abort, T]: T has been aborted.

The Log File
For a write_item log entry, the old value of the item before modification (BFIM, BeFore IMage) and the new value after modification (AFIM, AFter IMage) are stored. The BFIM is needed for UNDO; the AFIM is needed for REDO. A sample log is given below. Back P and Next P point to the previous and next log records of the same transaction.

T ID  Back P  Next P  Operation  Data item  BFIM     AFIM
T1    0       1       Begin
T1    1       4       Write      X          X = 100  X = 200
T2    0       8       Begin
T1    2       5       Write      Y          Y = 50   Y = 100
T1    4       7       Read       M          M = 200  M = 200
T3    0       9       Read       N          N = 400  N = 400
T1    5       nil     End

Database Cache
- Database Cache: A set of main memory buffers; each buffer typically holds the contents of one disk block. It stores the disk blocks that contain the data items being read and written by database transactions.
- Data Item Address: (disk block address, offset, size in bytes).
- Cache Table: A table of entries of the form (buffer addr, disk block addr, modified bit, pin/unpin bit, ...) that indicates which disk blocks are currently in the cache buffers.

Data items to be modified are first copied into the database cache by the Cache Manager (CM); after modification they are flushed (written) back to disk. The flushing is controlled by the Modified and Pin-Unpin bits.
- Pin-Unpin: If a buffer is pinned, it cannot be written back to disk until it is unpinned.
- Modified: Indicates that one or more data items in the buffer have been changed.
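The log entries and the Back P chaining described above can be sketched in a few lines of Python. This is only an illustration: the names LogRecord, Log, append, and chain are hypothetical, and an in-memory list stands in for the append-only file on disk.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterator, List, Optional


@dataclass(frozen=True)
class LogRecord:
    kind: str                   # e.g. "start_transaction", "write_item", "commit"
    tid: str                    # transaction id
    item: Optional[str] = None  # data item name (write_item only)
    bfim: Any = None            # before image, used by UNDO
    afim: Any = None            # after image, used by REDO
    back: Optional[int] = None  # position of the previous record of the same transaction


class Log:
    """Append-only log; a Python list stands in for the log file (assumption)."""

    def __init__(self) -> None:
        self.records: List[LogRecord] = []
        self._last: Dict[str, int] = {}  # tid -> position of its latest record

    def append(self, kind: str, tid: str, item: Optional[str] = None,
               bfim: Any = None, afim: Any = None) -> int:
        rec = LogRecord(kind, tid, item, bfim, afim, self._last.get(tid))
        self.records.append(rec)
        pos = len(self.records) - 1
        self._last[tid] = pos
        return pos

    def chain(self, tid: str) -> Iterator[LogRecord]:
        """Yield the records of one transaction, newest first, via back pointers."""
        pos = self._last.get(tid)
        while pos is not None:
            yield self.records[pos]
            pos = self.records[pos].back


log = Log()
log.append("start_transaction", "T1")
log.append("write_item", "T1", "X", 100, 200)
log.append("start_transaction", "T2")
log.append("write_item", "T1", "Y", 50, 100)
t1_items = [rec.item for rec in log.chain("T1")]
```

Following the back pointers lets UNDO visit only the records of the transaction being rolled back instead of scanning the whole log.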

Data Update
- Immediate Update: A data item modified in the cache can be written back to disk before the transaction commits.
- Deferred Update: A modified data item in the cache cannot be written back to disk until after the transaction commits (the buffer is pinned).
- Shadow Update: The modified version of a data item does not overwrite its disk copy but is written at a separate disk location (new version).
- In-place Update: The disk version of the data item is overwritten by the cache version.

UNDO and REDO Recovery Actions
To maintain atomicity and durability, some transactions may have their operations redone or undone during recovery.
- UNDO (roll-back) is needed for transactions that have not yet committed.
- REDO (roll-forward) is needed for committed transactions whose writes may not yet have been flushed from cache to disk.
- Undo: Restore all BFIMs from the log to the database on disk. UNDO proceeds backward in the log (from the most recent to the oldest UNDO).
- Redo: Restore all AFIMs from the log to the database on disk. REDO proceeds forward in the log (from the oldest to the most recent REDO).

Write-Ahead Logging Protocol
The information needed for recovery must be written to the log file on disk before changes are made to the database on disk. The Write-Ahead Logging (WAL) protocol consists of two rules:
- For Undo: Before a data item's AFIM is flushed to the database on disk (overwriting the BFIM), its BFIM must be written to the log and the log must be saved to disk.
- For Redo: Before a transaction executes its commit operation, all its AFIMs must be written to the log and the log must be saved to stable storage.

Checkpointing
Checkpointing is used to minimize the REDO operations during recovery. The following steps define a checkpoint operation:
1) Suspend execution of transactions temporarily.
2) Force-write modified buffers from cache to disk.
3) Write a [checkpoint] record to the log and save the log to disk. This record also includes other information, such as the list of active transactions at the time of the checkpoint.
4) Resume normal transaction execution.
During recovery, redo is required only for transactions that have committed after the last [checkpoint] record in the log.
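The UNDO and REDO actions described above can be illustrated with a small Python sketch. It assumes log records are plain tuples of the form ("write_item", T, X, BFIM, AFIM) and that the database on disk is a dictionary; the function names undo and redo and the sample data are hypothetical.

```python
def undo(db, log, tids):
    """Restore BFIMs for the given transactions, scanning the log backward."""
    for rec in reversed(log):
        if rec[0] == "write_item" and rec[1] in tids:
            _, _, item, bfim, _ = rec
            db[item] = bfim


def redo(db, log, tids):
    """Reapply AFIMs for the given transactions, scanning the log forward."""
    for rec in log:
        if rec[0] == "write_item" and rec[1] in tids:
            _, _, item, _, afim = rec
            db[item] = afim


log = [
    ("start_transaction", "T1"),
    ("write_item", "T1", "X", 100, 200),
    ("start_transaction", "T2"),
    ("write_item", "T2", "Y", 50, 75),
    ("write_item", "T1", "X", 200, 300),
    ("commit", "T1"),
]

# Assumed state of the disk at the crash: T1's first write and T2's write
# were flushed, but T1's second write was still only in the cache.
disk = {"X": 200, "Y": 75}

redo(disk, log, {"T1"})   # T1 committed: roll forward
undo(disk, log, {"T2"})   # T2 never committed: roll back
after_first_pass = dict(disk)

# Running the same actions again leaves the database unchanged.
redo(disk, log, {"T1"})
undo(disk, log, {"T2"})
```

The sketch only works because the log already holds every BFIM and AFIM it needs, which is exactly what the two WAL rules guarantee.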

Checkpointing (continued)
Steps 1 and 4 of the checkpoint operation (suspending and resuming transactions) are not realistic. A variation called fuzzy checkpointing allows transactions to continue execution during the checkpointing process.

Other Database Recovery Concepts: Steal/No-Steal and Force/No-Force
These policies specify how database cache buffers are flushed to the database on disk:
- Steal: Cache buffers updated by a transaction may be flushed to disk before the transaction commits (recovery may require UNDO).
- No-Steal: Cache buffers cannot be flushed until after the transaction commits (NO-UNDO). Buffers are pinned until transactions commit.
- Force: The buffers updated by a transaction are flushed (forced) to disk before the transaction commits (NO-REDO).
- No-Force: Some cache flushing may be deferred until after the transaction commits (recovery may require REDO).
These give rise to four different ways of handling recovery:
- Steal/No-Force (Undo/Redo)
- Steal/Force (Undo/No-redo)
- No-Steal/No-Force (Redo/No-undo)
- No-Steal/Force (No-undo/No-redo)

Deferred Update (NO-UNDO/REDO) Recovery Protocol
The system must impose the NO-STEAL rule. The recovery subsystem analyzes the log and creates two lists:
- Active transaction list: All active (uncommitted) transaction ids are entered in this list.
- Committed transaction list: Transactions committed after the last checkpoint are entered in this list.
During recovery, transactions in the committed list are redone; transactions in the active list are ignored (because of the NO-STEAL rule, none of their writes have been applied to the database on disk).
Some transactions may be redone twice; this does not create inconsistency because REDO is idempotent, that is, one REDO for an AFIM is equivalent to multiple REDOs for the same AFIM.
- Advantage: Only REDO is needed during recovery.
- Disadvantage: Many buffers may be pinned while waiting for the transactions that updated them to commit, so the system may run out of cache buffers when new transactions request them.
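A minimal sketch of deferred-update recovery, under the same assumptions as before: log records are tuples, the disk is a dictionary, and a ("checkpoint",) tuple marks a checkpoint. The function name recover_deferred and the sample data are hypothetical. The redo pass scans the whole log because a transaction that was active at the checkpoint may have written items before it.

```python
def recover_deferred(disk, log):
    """NO-UNDO/REDO recovery: redo committed transactions, ignore active ones."""
    # Position of the last checkpoint record (-1 if there is none).
    last_cp = max((i for i, r in enumerate(log) if r[0] == "checkpoint"), default=-1)

    started, finished, committed = set(), set(), set()
    for i, rec in enumerate(log):
        kind = rec[0]
        if kind == "start_transaction":
            started.add(rec[1])
        elif kind in ("commit", "abort"):
            finished.add(rec[1])
            if kind == "commit" and i > last_cp:
                committed.add(rec[1])   # committed after the last checkpoint
    active = started - finished         # no commit or abort record found

    # Redo pass: apply AFIMs of the committed list in log order.
    for rec in log:
        if rec[0] == "write_item" and rec[1] in committed:
            disk[rec[2]] = rec[4]
    return committed, active


log = [
    ("start_transaction", "T1"),
    ("write_item", "T1", "A", 10, 20),
    ("commit", "T1"),
    ("checkpoint",),
    ("start_transaction", "T2"),
    ("write_item", "T2", "B", 5, 6),
    ("commit", "T2"),
    ("start_transaction", "T3"),
    ("write_item", "T3", "C", 1, 2),
]
# Assumed disk state at the crash: T1 was flushed by the checkpoint; under
# NO-STEAL, nothing written by the uncommitted T3 reached the disk.
disk = {"A": 20, "B": 5, "C": 1}
committed, active = recover_deferred(disk, log)
```

T1 is skipped because it committed before the checkpoint, T2 is redone, and T3 needs no action at all.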

UNDO/NO-REDO Recovery Protocol
In this method, the FORCE rule is imposed by the system (the AFIMs of a transaction are flushed to the database on disk, under Write-Ahead Logging, before the transaction commits). Transactions in the active list are undone; transactions in the committed list are ignored (because of the FORCE rule, all their changes are already written to the database on disk).
- Advantage: During recovery, only UNDO is needed.
- Disadvantages: The commit of a transaction is delayed until all its changes are force-written to disk. Some buffers may be written to disk multiple times if they are updated by several transactions.

UNDO/REDO Recovery Protocol
Recovery can require both UNDO of some transactions and REDO of other transactions (this corresponds to STEAL/NO-FORCE). It is used most often in practice because of the disadvantages of the other two methods. To minimize REDO, checkpointing is used. The recovery performs:
- Undo of a transaction if it is in the active transaction list.
- Redo of a transaction if it is in the list of transactions that committed since the last checkpoint.

Shadow Paging (NO-UNDO/NO-REDO)
The AFIM does not overwrite its BFIM but is recorded at another place (new version) on the disk. Thus, a data item can have its AFIM and BFIM (the shadow copy of the data item) at two different places on the disk.
[Figure: a database holding X, Y, X' and Y'. X and Y are the shadow (old) copies of the data items; X' and Y' are the current (new) copies.]

Shadow Paging: Example
[Figure not recoverable from the transcript.]
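Shadow paging can be sketched as follows. The slides only say that the new version is written at another place on disk; the two page directories used here (shadow_dir and current_dir) are a common way to realize that and are an assumption of this sketch, as are the class and method names.

```python
class ShadowPagedDB:
    """Toy shadow-paging store: writes never overwrite the old copy in place."""

    def __init__(self, pages):
        self.disk = dict(enumerate(pages))                    # slot -> contents
        self.shadow_dir = {i: i for i in range(len(pages))}   # committed mapping
        self.current_dir = dict(self.shadow_dir)              # working mapping
        self._next_slot = len(pages)

    def write(self, page, value):
        # The AFIM goes to a fresh slot; the BFIM stays where it was.
        slot = self._next_slot
        self._next_slot += 1
        self.disk[slot] = value
        self.current_dir[page] = slot

    def read(self, page):
        return self.disk[self.current_dir[page]]

    def commit(self):
        # Installing the current directory as the new shadow directory
        # is the single step that makes the transaction's writes permanent.
        self.shadow_dir = dict(self.current_dir)

    def recover(self):
        # After a crash, fall back to the last committed directory.
        self.current_dir = dict(self.shadow_dir)


db = ShadowPagedDB(["x0", "y0"])
db.write(0, "x1")        # uncommitted change
db.recover()             # crash before commit: old copy is still intact
value_after_crash = db.read(0)

db.write(1, "y1")
db.commit()
db.recover()             # crash after commit: new copy is already installed
value_after_commit = db.read(1)
```

Recovery neither undoes nor redoes anything: it just picks the directory that was last committed, which is why the scheme is classified as NO-UNDO/NO-REDO.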

The ARIES Recovery Algorithm
ARIES is used in practice. It is based on:
- WAL (Write-Ahead Logging).
- Repeating history during redo: ARIES retraces all actions of the database system prior to the crash to reconstruct the correct database state.
- Logging changes during undo: This prevents ARIES from repeating completed undo operations if a failure occurs during recovery and causes the recovery process to restart.

The ARIES recovery algorithm consists of three steps:
1) Analysis: Identifies the dirty (updated) page buffers in the cache and the set of transactions active at the time of the crash. The set of transactions that committed after the last checkpoint is determined, and the appropriate point in the log where redo is to start is also determined.
2) Redo: The necessary redo operations are applied.
3) Undo: The log is scanned backward and the operations of transactions active at the time of the crash are undone in reverse order.

The Log and Log Sequence Number (LSN)
A log record is written for (a) a data update (write), (b) a transaction commit, (c) a transaction abort, (d) an undo, and (e) a transaction end. In the case of an undo, a compensating log record is written.
A unique LSN is associated with every log record. The LSN increases monotonically and indicates the disk address of the log record it is associated with. In addition, each data page stores the LSN of the latest log record corresponding to a change to that page.
A log record stores:
1) The LSN of the previous log record for the same transaction, which links the log records of each transaction.
2) The transaction ID.
3) The type of log record.
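The LSN bookkeeping can be sketched as below. The names LogRecord, LogManager, and needs_redo are hypothetical, and a simple counter stands in for the disk address that a real LSN would encode. The comparison in needs_redo is the usual way the LSN stored in a page is used during redo; the slides do not spell it out.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class LogRecord:
    lsn: int                    # unique, monotonically increasing
    prev_lsn: Optional[int]     # previous record of the same transaction
    tid: str                    # transaction ID
    rec_type: str               # "update", "commit", "abort", "clr", "end"
    page_id: Optional[str] = None
    bfim: Any = None
    afim: Any = None


class LogManager:
    def __init__(self) -> None:
        self.records: List[LogRecord] = []
        self.last_lsn: Dict[str, int] = {}   # tid -> LSN of its latest record

    def append(self, tid: str, rec_type: str, page_id: Optional[str] = None,
               bfim: Any = None, afim: Any = None) -> LogRecord:
        lsn = len(self.records) + 1          # counter in place of a disk address
        rec = LogRecord(lsn, self.last_lsn.get(tid), tid, rec_type,
                        page_id, bfim, afim)
        self.records.append(rec)
        self.last_lsn[tid] = lsn
        return rec


def needs_redo(page_lsn: int, rec: LogRecord) -> bool:
    """A logged change is missing from the page if the page's LSN is older."""
    return page_lsn < rec.lsn


lm = LogManager()
r1 = lm.append("T1", "update", "P1", 10, 20)
r2 = lm.append("T2", "update", "P2", 5, 6)
r3 = lm.append("T1", "commit")
```

Here r3.prev_lsn points back to r1, so the records of T1 can be walked without touching those of T2.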

The Log and Log Sequence Number (LSN) (continued)
For a write operation, additional information is logged:
1) The page ID of the page that includes the item
2) The length of the updated item
3) Its offset from the beginning of the page
4) The BFIM of the item
5) The AFIM of the item

The Transaction Table and the Dirty Page Table
For efficient recovery, the following tables are also stored in the log during checkpointing:
- Transaction table: Contains an entry for each active transaction, with information such as the transaction ID, the transaction status, and the LSN of the most recent log record for the transaction.
- Dirty Page table: Contains an entry for each dirty page (buffer) in the cache, which includes the page ID and the LSN corresponding to the earliest update to that page.

Fuzzy Checkpointing
To allow the system to continue to execute transactions, ARIES uses fuzzy checkpointing. A checkpointing process does the following:
1) Writes a begin_checkpoint record in the log, then forces updated (dirty) buffers to disk.
2) Writes an end_checkpoint record in the log, along with the contents of the transaction table and the dirty page table.
3) Writes the LSN of the begin_checkpoint record to a special file. This special file is accessed during recovery to locate the last checkpoint information.

ARIES Recovery
The following steps are performed for recovery:
1) Analysis phase: Starts at the begin_checkpoint record and proceeds to the end_checkpoint record. It accesses the transaction table and dirty page table that were appended to the log. During this phase some other records may be written to the log and the transaction table may be modified. The analysis phase compiles the set of redo and undo operations to be performed and then ends.
2) Redo phase: Starts from the point in the log up to which all dirty pages have been flushed, and moves forward. Operations of committed transactions are redone.
3) Undo phase: Starts from the end of the log and proceeds backward while performing the appropriate undo operations. For each undo it writes a compensating record in the log. Recovery completes at the end of the undo phase.
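The three phases can be sketched in Python under heavy simplifications, all of which are assumptions of this example: there is no checkpoint, so analysis scans the whole log; each page holds a single value; log records are dictionaries; and the redo pass repeats history for every logged change whose LSN is newer than the page's LSN. The function name aries_recover and the sample data are hypothetical, and details of real ARIES such as undo-next pointers in compensating records are left out.

```python
def aries_recover(log, pages, page_lsn):
    """Simplified ARIES-style recovery; returns the set of undone transactions."""
    # 1) Analysis: rebuild the transaction table and the dirty page table.
    txn_table = {}   # tid -> LSN of its latest record (unfinished transactions)
    dirty = {}       # page -> LSN of the earliest logged change to it
    for rec in log:
        if rec["type"] in ("update", "clr"):
            txn_table[rec["tid"]] = rec["lsn"]
            dirty.setdefault(rec["page"], rec["lsn"])
        elif rec["type"] in ("commit", "end"):
            txn_table.pop(rec["tid"], None)
    losers = set(txn_table)

    # 2) Redo: move forward from the earliest dirty-page LSN and reapply
    #    every change that the page on disk has not seen yet.
    redo_start = min(dirty.values(), default=None)
    for rec in log:
        if redo_start is None or rec["lsn"] < redo_start:
            continue
        if rec["type"] in ("update", "clr") and page_lsn.get(rec["page"], 0) < rec["lsn"]:
            pages[rec["page"]] = rec["afim"]
            page_lsn[rec["page"]] = rec["lsn"]

    # 3) Undo: scan backward, restore BFIMs of loser transactions, and log a
    #    compensating record (CLR) for each undo.
    next_lsn = log[-1]["lsn"] + 1 if log else 1
    for rec in reversed(list(log)):          # snapshot: CLRs are appended below
        if rec["type"] == "update" and rec["tid"] in losers:
            pages[rec["page"]] = rec["bfim"]
            page_lsn[rec["page"]] = next_lsn
            log.append({"lsn": next_lsn, "tid": rec["tid"], "type": "clr",
                        "page": rec["page"], "afim": rec["bfim"]})
            next_lsn += 1
    for tid in sorted(losers):               # mark each loser as finished
        log.append({"lsn": next_lsn, "tid": tid, "type": "end"})
        next_lsn += 1
    return losers


log = [
    {"lsn": 1, "tid": "T1", "type": "update", "page": "P1", "bfim": 10, "afim": 20},
    {"lsn": 2, "tid": "T2", "type": "update", "page": "P2", "bfim": 5, "afim": 6},
    {"lsn": 3, "tid": "T1", "type": "commit"},
    {"lsn": 4, "tid": "T2", "type": "update", "page": "P1", "bfim": 20, "afim": 30},
]
pages = {"P1": 20, "P2": 5}     # assumed disk state: only LSN 1 was flushed
page_lsn = {"P1": 1}
undone = aries_recover(log, pages, page_lsn)
```

Because the undo phase logs its own work, running aries_recover a second time on the extended log finds no loser transactions and leaves the pages unchanged.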

Recovery in Multi-database Transactions (Two-Phase Commit)
A multidatabase transaction can access several databases, e.g., an airline database, a car rental database, and a credit card database. The transaction commits only when all of these databases agree to commit, individually, the part of the transaction they were executing. This commit scheme is referred to as two-phase commit (2PC). If any one of these nodes fails or cannot commit its part of the transaction, then the whole transaction is aborted. Each node recovers the transaction under its own recovery protocol.
- Phase 1: The coordinator (usually an application program running in the middle tier of a 3-tier architecture) sends a Ready-to-commit? query to each participating database, then waits for replies. A participating database replies Ready-to-commit only after saving all actions in its local log on disk.
- Phase 2: If the coordinator receives Ready-to-commit signals from all participating databases, it sends Commit to all; otherwise, it sends Abort to all.
This protocol can survive most types of crashes.
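The two phases can be sketched as a coordinator function calling participant objects. The class Participant, its methods, and the database names are hypothetical; in-memory method calls stand in for network messages, appending to a list stands in for forcing the local log to disk, and timeouts and coordinator crashes are not modeled.

```python
class Participant:
    """One participating database (toy model)."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.log = []            # stands in for the local log on disk
        self.state = "active"

    def prepare(self):
        # Phase 1: reply Ready-to-commit only after logging locally.
        if self.can_commit:
            self.log.append("ready")
            return True
        return False

    def commit(self):
        self.log.append("commit")
        self.state = "committed"

    def abort(self):
        self.log.append("abort")
        self.state = "aborted"


def two_phase_commit(participants):
    # Phase 1: ask every participant whether it is ready to commit.
    votes = [p.prepare() for p in participants]
    decision = all(votes)
    # Phase 2: send the same decision to every participant.
    for p in participants:
        if decision:
            p.commit()
        else:
            p.abort()
    return "commit" if decision else "abort"


ok = [Participant("airline"), Participant("car_rental"), Participant("credit_card")]
outcome_ok = two_phase_commit(ok)

bad = [Participant("airline"), Participant("car_rental", can_commit=False)]
outcome_bad = two_phase_commit(bad)
```

In the second run a single negative vote is enough to abort the transaction at every participant, including the one that had already voted ready.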