Page 1. Goals of Today s Lecture. The ACID properties of Transactions. Transactions

Similar documents
Page 1. Goals of Today s Lecture" Two Key Questions" Goals of Transaction Scheduling"

Page 1. Goals of Todayʼs Lecture" Two Key Questions" Goals of Transaction Scheduling"

Review. Review. Carnegie Mellon Univ. Dept. of Computer Science /615 - DB Applications. Lecture #21: Concurrency Control (R&G ch.

Lock Granularity and Consistency Levels (Lecture 7, cs262a) Ali Ghodsi and Ion Stoica, UC Berkeley February 7, 2018

CS 4604: Introduc0on to Database Management Systems. B. Aditya Prakash Lecture #17: Transac0ons 2: 2PL and Deadlocks

Page 1. CS194-3/CS16x Introduction to Systems. Lecture 8. Database concurrency control, Serializability, conflict serializability, 2PL and strict 2PL

Concurrency control 12/1/17

Page 1. Goals for Today" What is a Database " Key Concept: Structured Data" CS162 Operating Systems and Systems Programming Lecture 13.

Concurrency Control! Snapshot isolation" q How to ensure serializability and recoverability? " q Lock-Based Protocols" q Other Protocols"

Carnegie Mellon Univ. Dept. of Computer Science /615 - DB Applications. Last Class. Last Class. Faloutsos/Pavlo CMU /615

Transaction Management & Concurrency Control. CS 377: Database Systems

WarmGup#Exercise# CS#133:#Databases# TwoGPhase#Locking#(2PL)# Goals#for#Today#

CAS CS 460/660 Introduction to Database Systems. Transactions and Concurrency Control 1.1

Transaction Management and Concurrency Control. Chapter 16, 17

Concurrency Control. Concurrency Control Ensures interleaving of operations amongst concurrent transactions result in serializable schedules

Intro to Transactions

Transaction Management: Concurrency Control

Database System Concepts

Slides Courtesy of R. Ramakrishnan and J. Gehrke 2. v Concurrent execution of queries for improved performance.

Today s Class. Carnegie Mellon Univ. Dept. of Computer Science /615 - DB Applications. Formal Properties of Schedules. Conflicting Operations

CSE 344 MARCH 5 TH TRANSACTIONS

Transaction Overview and Concurrency Control

Transaction Processing: Concurrency Control. Announcements (April 26) Transactions. CPS 216 Advanced Database Systems

CSE 444: Database Internals. Lectures Transactions

Goal of Concurrency Control. Concurrency Control. Example. Solution 1. Solution 2. Solution 3

CSC 261/461 Database Systems Lecture 21 and 22. Spring 2017 MW 3:25 pm 4:40 pm January 18 May 3 Dewey 1101

Transaction Processing. Introduction to Databases CompSci 316 Fall 2018

References. Transaction Management. Database Administration and Tuning 2012/2013. Chpt 14 Silberchatz Chpt 16 Raghu

CMSC 424 Database design Lecture 22 Concurrency/recovery. Mihai Pop

Introduction to Data Management CSE 344

CSE 444: Database Internals. Lectures 13 Transaction Schedules

Introduction to Transaction Management

Page 1. Quiz 18.1: Flow-Control" Goals for Today" Quiz 18.1: Flow-Control" CS162 Operating Systems and Systems Programming Lecture 18 Transactions"

Introduction to Data Management CSE 344

CS377: Database Systems Concurrency Control. Li Xiong Department of Mathematics and Computer Science Emory University

UNIT 4 TRANSACTIONS. Objective

Transaction Management Overview

Concurrency Control & Recovery

Overview of Transaction Management

Chapter 22. Transaction Management

Concurrency Control & Recovery

Including Aborts in Serializability. Conflict Serializable Schedules. Recall Conflicts. Conflict Equivalent

Transactions and Concurrency Control

Transactions. Kathleen Durant PhD Northeastern University CS3200 Lesson 9

TRANSACTION MANAGEMENT

Database Systems CSE 414

Transactions and Concurrency Control

Intro to Transaction Management

What are Transactions? Transaction Management: Introduction (Chap. 16) Major Example: the web app. Concurrent Execution. Web app in execution (CS636)

Transaction Processing Concurrency control

Concurrency Control. R &G - Chapter 19

CHAPTER: TRANSACTIONS

TRANSACTION PROCESSING CONCEPTS

Transaction Management. Chapter 14

Transaction Management: Introduction (Chap. 16)

Introduction to Data Management. Lecture #18 (Transactions)

CSC 261/461 Database Systems Lecture 24

CSE 344 MARCH 21 ST TRANSACTIONS

Concurrency Control. Chapter 17. Comp 521 Files and Databases Spring

Lecture 21 Concurrency Control Part 1

CSE 344 MARCH 9 TH TRANSACTIONS

Transaction Management Overview. Transactions. Concurrency in a DBMS. Chapter 16

Database Management Systems 2010/11

5/17/17. Announcements. Review: Transactions. Outline. Review: TXNs in SQL. Review: ACID. Database Systems CSE 414.

UNIT-IV TRANSACTION PROCESSING CONCEPTS

Database Systems CSE 414

Lecture 20 Transactions

Introduction to Data Management CSE 414

Database Management System

Introduction to Data Management. Lecture #24 (Transactions)

Conflict Equivalent. Conflict Serializability. Example 1. Precedence Graph Test Every conflict serializable schedule is serializable

The transaction. Defining properties of transactions. Failures in complex systems propagate. Concurrency Control, Locking, and Recovery

Concurrency Control. Chapter 17. Comp 521 Files and Databases Fall

h p:// Authors: Tomáš Skopal, Irena Holubová Lecturer: Mar n Svoboda, mar

Announcements. Motivating Example. Transaction ROLLBACK. Motivating Example. CSE 444: Database Internals. Lab 2 extended until Monday

UNIT IV TRANSACTION MANAGEMENT

Introduction to Data Management CSE 344

Intro to DB CHAPTER 15 TRANSACTION MNGMNT

Distributed Database Management System UNIT-2. Concurrency Control. Transaction ACID rules. MCA 325, Distributed DBMS And Object Oriented Databases

Transaction Processing: Concurrency Control ACID. Transaction in SQL. CPS 216 Advanced Database Systems. (Implicit beginning of transaction)

CS322: Database Systems Transactions

Concurrency Control. Chapter 17. Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 1

L i (A) = transaction T i acquires lock for element A. U i (A) = transaction T i releases lock for element A

Introduction to Data Management CSE 344

doc. RNDr. Tomáš Skopal, Ph.D.

Introduction TRANSACTIONS & CONCURRENCY CONTROL. Transactions. Concurrency

Lectures 8 & 9. Lectures 7 & 8: Transactions

Lock-based Concurrency Control

CSE 344 MARCH 25 TH ISOLATION

Transactions. Lecture 8. Transactions. ACID Properties. Transaction Concept. Example of Fund Transfer. Example of Fund Transfer (Cont.

Database Management Systems

Operating Systems. Operating Systems Sina Meraji U of T

Introduction to Data Management CSE 344

Checkpoints. Logs keep growing. After every failure, we d have to go back and replay the log. This can be time consuming. Checkpoint frequently

Database Systems. Announcement

Database Systems CSE 414

Chapter 14: Transactions

Roadmap of This Lecture

Lecture 21. Lecture 21: Concurrency & Locking

6.830 Lecture Transactions October 23, 2017

Transcription:

Goals of Today s Lecture CS162 Operating Systems and Systems Programming Lecture 19 Transactions, Two Phase Locking (2PL), Two Phase Commit (2PC) Finish Transaction scheduling Two phase locking (2PL) and strict 2PL Two-phase commit (2PC) November 13, 2013 Anthony D. Joseph and John Canny http://inst.eecs.berkeley.edu/~cs162 Note: Some slides and/or pictures in the following are adapted from lecture notes by Mike Franklin. Lec 19.2 The ACID properties of Transactions Atomicity: all actions in the transaction happen, or none happen Consistency: transactions maintain data integrity, e.g., Balance cannot be negative Cannot reschedule meeting on February 30 Isolation: execution of one transaction is isolated from that of all others; no problems from concurrency Transactions Group together a set of updates so that they execute atomically. Ensure that the database is in a consistent state before and after the transaction: To move money from account A to B: Debit A (read(a), write(a)), and Credit B (read(b), write(b)) Use locks to prevent conflicts with other clients. Durability: if a transaction commits, its effects persist despite crashes Lec 19.3 Lec 19.4 Page 1

Goals of Transaction Scheduling Maximize system utilization, i.e., concurrency Interleave operations from different transactions Preserve transaction semantics Semantically equivalent to a serial schedule, i.e., one transaction runs at a time T1: R, W, R, W T2: R, W, R, R, W Two Key Questions 1) Is a given schedule equivalent to a serial execution of transactions? (color codes the transaction T1 or T2) Schedule: R, R, W, W, R, R, R, W, W Serial schedule (T1, then T2): Serial schedule (T2, then T1): R, : W, R, W, R, W, R, R, W R, W, R, R, W, R, W, R, W Serial schedule (T1, then T2): R, W, R, W, R, W, R, R, W Serial schedule (T2, then T1): R, W, R, R, W, R, W, R, W 2) How do you come up with a schedule equivalent to a serial schedule? Lec 19.5 Lec 19.6 Transaction Scheduling Serial schedule: A schedule that does not interleave the operations of different transactions Transactions run serially (one at a time) Equivalent schedules: For any storage/database state, the effect (on storage/database) and output of executing the first schedule is identical to the effect of executing the second schedule Serializable schedule: A schedule that is equivalent to some serial execution of the transactions Intuitively: with a serializable schedule you only see things that could happen in situations where you were running transactions one-at-a-time Conflict Serializable Schedules Two operations conflict if they Belong to different transactions Are on the same data At least one of them is a write T1 R(X) T2 W(X,b) T1 W(X,a) T2 W(X,b) Two schedules are conflict equivalent iff: Involve same operations of same transactions Every pair of conflicting operations is ordered the same way Schedule S is conflict serializable if S is conflict equivalent to some serial schedule Lec 19.7 Lec 19.8 Page 2

Conflict Equivalence Intuition (cont d) If you can transform an interleaved schedule by swapping consecutive non-conflicting operations of different transactions into a serial schedule, then the original schedule is conflict serializable Is this schedule serializable? T1:R(A), W(A) T2: R(A),W(A), Dependency Graph Dependency graph: Transactions represented as nodes Edge from Ti to Tj:» an operation of Ti conflicts with an operation of Tj» Ti appears earlier than Tj in the schedule Theorem: Schedule is conflict serializable if and only if its dependency graph is acyclic Is it conflict serializable? Why? Lec 19.9 Lec 19.10 Example Example Conflict serializable schedule: Conflict that is not serializable: T1:R(A),W(A), R(B),W(B) T1:R(A),W(A), R(B),W(B) T2: R(A),W(A), R(B),W(B) A B T1 T2 Dependency graph T2: R(A),W(A),R(B),W(B) A T1 T2 Dependency graph B No cycle! Cycle: The output of T1 depends on T2, and viceversa Lec 19.11 Lec 19.12 Page 3

Notes on Conflict Serializability Conflict Serializability doesn t allow all schedules that you would consider correct This is because it is strictly syntactic - it doesn t consider the meanings of the operations or the data In practice, Conflict Serializability is what gets used, because it can be done efficiently Note: in order to allow more concurrency, some special cases do get implemented, such as for travel reservations, Two-phase locking (2PL) is how we implement it Serializability Conflict Serializability Following schedule is not conflict serializable Dependency graph T1:R(A), W(A), A A T2: W(A), T1 T2 A A T3: W(A) T3 However, the schedule is serializable since its output is equivalent with the following serial schedule T1:R(A),W(A), T2: W(A), T3: WA Lec 19.13 Note: deciding whether a schedule is serializable (not conflict-serializable) is NP-complete Lec 19.14 Locks Locks to control access to data Two types of locks: shared (S) lock multiple concurrent transactions allowed to operate on data exclusive (X) lock only one transaction can operate on data at a time Lock Compatibility Matrix S X S X Two-Phase Locking (2PL) 1) Each transaction must obtain: S (shared) or X (exclusive) lock on data before reading, X (exclusive) lock on data before writing 2) A transaction can not request additional locks once it releases any locks Thus, each transaction has a growing phase followed by a shrinking phase Lock Point! Avoid deadlock by acquiring locks in some lexicographic order # Locks Held 4 3 2 1 Growing Phase Shrinking Phase 0 1 3 5 7 9 11 13 15 17 19 Time Lec 19.15 Lec 19.16 Page 4

Two-Phase Locking (2PL) 2PL guarantees that the dependency graph of a schedule is acyclic. For every pair of transactions with a conflicting lock, one acquires is first ordering of those two total ordering. Therefore 2PL-compatible schedules are conflict serializable. Note: 2PL can still lead to deadlocks since locks are acquired incrementally. An important variant of 2PL is strict 2PL, where all locks are released at the end of the transaction. Lock Management Lock Manager (LM) handles all lock and unlock requests LM contains an entry for each currently held lock When lock request arrives see if anyone else holds a conflicting lock If not, create an entry and grant the lock Else, put the requestor on the wait queue Locking and unlocking are atomic operations Lock upgrade: share lock can be upgraded to exclusive lock Lec 19.17 Lec 19.18 Example T1 transfers $50 from account A to account B T1:Read(A),A:=A 50,Write(A),Read(B),B:=B+50,Write(B) Is this a 2PL Schedule? 1 Lock_X(A) <granted> 2 Read(A) Lock_S(A) 3 A: = A-50 4 Write(A) T2 outputs the total of accounts A and B T2:Read(A),Read(B),PRINT(A+B) Initially, A = $1000 and B = $2000 What are the possible output values? 3000, 2950, 3050 Lec 19.19 5 Unlock(A) <granted> 6 Read(A) 7 Unlock(A) 8 Lock_S(B) <granted> 9 Lock_X(B) 10 Read(B) 11 <granted> Unlock(B) 12 PRINT(A+B) 13 Read(B) 14 B := B +50 15 Write(B) 16 Unlock(B) No, and it is not serializable Lec 19.20 Page 5

Is this a 2PL Schedule? 1 Lock_X(A) <granted> 2 Read(A) Lock_S(A) 3 A: = A-50 4 Write(A) 5 Lock_X(B) <granted> 6 Unlock(A) <granted> 7 Read(A) 8 Lock_S(B) 9 Read(B) 10 B := B +50 11 Write(B) 12 Unlock(B) <granted> 13 Unlock(A) 14 Read(B) 15 Unlock(B) 16 PRINT(A+B) Yes, so it is serializable Lec 19.21 Cascading Aborts Example: T1 aborts Note: this is a 2PL schedule T1:X(A),R(A),W(A),X(B),~X(A) R(B),W(B),abort T2: X(A),R(A),W(A),~X(A) Rollback of T1 requires rollback of T2, since T2 reads a value written by T1 Solution: Strict Two-phase Locking (Strict 2PL): same as 2PL except All locks held by a transaction are released only when the transaction completes Lec 19.22 Strict 2PL (cont d) All locks held by a transaction are released only when the transaction completes In effect, shrinking phase is delayed until: a) Transaction has committed (commit log record on disk), or b) Decision has been made to abort the transaction (then locks can be released after rollback) Lec 19.23 Is this a Strict 2PL schedule? 1 Lock_X(A) <granted> 2 Read(A) Lock_S(A) 3 A: = A-50 4 Write(A) 5 Lock_X(B) <granted> 6 Unlock(A) <granted> 7 Read(A) 8 Lock_S(B) 9 Read(B) 10 B := B +50 11 Write(B) 12 Unlock(B) <granted> 13 Unlock(A) 14 Read(B) 15 Unlock(B) 16 PRINT(A+B) No: Cascading Abort Possible Lec 19.24 Page 6

Is this a Strict 2PL schedule? Administrivia 1 Lock_X(A) <granted> 2 Read(A) Lock_S(A) 3 A: = A-50 4 Write(A) Project 3 code due 11:59pm on Thursday 11/21. 5 Lock_X(B) <granted> 6 Read(B) Project 3 group evals due 11:59pm on Friday 11/22. 7 B := B +50 8 Write(B) 9 Unlock(A) 10 Unlock(B) <granted> 11 Read(A) 12 Lock_S(B) <granted> 13 Read(B) 14 PRINT(A+B) 15 Unlock(A) 16 Unlock(B) Lec 19.25 Lec 19.26 Quiz 19.1: Transactions Q1: True _ False _ It is possible for two read operations to conflict Q2: True _ False _ A strict 2PL schedule does not avoid cascading aborts Q3: True _ False _ 2PL leads to deadlock if schedule not conflict serializable 5min Break Q4: True _ False _ A conflict serializable schedule is always serializable Q5: True _ False _ The following schedule is serializable T1:R(A),W(A), R(B), W(B) T2: R(A), W(A), R(B),W(B) Lec 19.27 Lec 19.28 Page 7

Quiz 19.1: Transactions Q1: True _ False X_ It is possible for two read operations to conflict Q2: True _ False _ X A strict 2PL schedule does not avoid cascading aborts Q3: True X_ False _ 2PL leads to deadlock if schedule not conflict serializable Q4: True _ X False _ A conflict serializable schedule is always serializable Q5: True X_ False _ The following schedule is serializable T1:R(A),W(A), R(B), W(B) T2: R(A), W(A), R(B),W(B) Deadlock Recall: if a schedule is not conflict-serializable, 2PL leads to deadlock, i.e., Cycles of transactions waiting for each other to release locks Recall: two ways to deal with deadlocks Deadlock prevention Deadlock detection Many systems punt problem by using timeouts instead Associate a timeout with each lock If timeout expires release the lock What is the problem with this solution? Lec 19.29 Lec 19.30 Deadlock Prevention Deadlock Detection Prevent circular waiting Assign priorities based on timestamps. Assume Ti wants a lock that Tj holds. Two policies are possible: Wait-Die: If Ti is older, Ti waits for Tj; otherwise Ti aborts (wait chain is acyclic going forward in time) Wound-wait: If Ti is older, Tj aborts; otherwise Ti waits (wait chain is acyclic going backward in time) If a transaction re-starts, make sure it gets its original timestamp Why? Allow deadlocks to happen but check for them and fix them if found Create a wait-for graph: Nodes are transactions There is an edge from Ti to Tj if Ti is waiting for Tj to release a lock Periodically check for cycles in the waits-for graph If cycle detected find a transaction whose removal will break the cycle and kill it Lec 19.31 Lec 19.32 Page 8

Example: Deadlock Detection (Continued) T1: S(A),S(D), S(B) T2: X(B), X(C) T3: S(D),S(C), X(A) T4: X(B) T1 T2 Durability and Atomicity How do you make sure transaction results persist in the face of failures (e.g., disk failures)? Replicate database Commit transaction to each replica What happens if you have failures during a transaction commit? Need to ensure atomicity: either transaction is committed on all replicas or none at all T4 T3 Lec 19.33 Lec 19.34 Two Phase (2PC) Commit 2PC Algorithm 2PC is a distributed protocol High-level problem statement If no node fails and all nodes are ready to commit, then all nodes Otherwise at all nodes Developed by Turing award winner Jim Gray (first Berkeley CS PhD, 1969) One N workers (replicas) High level algorithm description Coordinator asks all workers if they can commit If all workers reply VOTE-, then broadcasts GLOBAL-, Otherwise broadcasts GLOBAL- Workers obey the GLOBAL messages Lec 19.35 Lec 19.36 Page 9

Coordinator Algorithm Detailed Algorithm Worker Algorithm Failure Free Example Execution Coordinator sends VOTE REQ to all workers If receive VOTE from all N workers, send GLOBAL to all workers If doesn t receive VOTE from all N workers, send GLOBAL to all workers Wait for VOTE REQ from If ready, send VOTE to If not ready, send VOTE to And immediately abort If receive GLOBAL then commit If receive GLOBAL then abort worker 1 worker 2 worker 3 VOTE REQ VOTE GLOBAL time Lec 19.37 Lec 19.38 State Machine of Coordinator State Machine of Workers Coordinator implements simple state machine Recv: VOTE Send: GLOBAL WAIT Recv: START Send: VOTE REQ Recv: VOTE Send: GLOBAL Recv: VOTE REQ Send: VOTE READY Recv: GLOBAL Recv: VOTE REQ Send: VOTE Recv: GLOBAL Lec 19.39 Lec 19.40 Page 10

Dealing with Worker Failures Example of Worker Failure How to deal with worker failures? Failure only affects states in which the node is waiting for messages Coordinator only waits for votes in WAIT state In WAIT, if doesn t receive N votes, it times out and sends GLOBAL- Recv: VOTE Send: GLOBAL WAIT Recv: START Send: VOTE REQ Recv: VOTE Send: GLOBAL worker 1 worker 2 VOTE REQ WAIT COMM timeout VOTE GLOBAL worker 3 time Lec 19.41 Lec 19.42 Dealing with Coordinator Failure Example of Coordinator Failure #1 How to deal with failures? worker waits for VOTE-REQ in» Worker can time out and abort ( handles it) worker waits for GLOBAL-* message in READY» If fails, workers must BLOCK waiting for to recover and send GLOBAL_* message Recv: VOTE REQ Send: VOTE READY Recv: VOTE REQ Send: VOTE worker 1 worker 2 VOTE REQ READY COMM timeout timeout VOTE Recv: GLOBAL Recv: GLOBAL worker 3 timeout Lec 19.43 Lec 19.44 Page 11

Example of Coordinator Failure #2 READY Remembering Where We Were (Durability) All nodes use stable storage* to store which state they are in worker 1 worker 2 worker 3 VOTE REQ VOTE COMM restarted block waiting for GLOBAL Upon recovery, it can restore state and resume: Coordinator aborts in, WAIT, or Coordinator commits in Worker aborts in, READY, Worker commits in Worker asks Coordinator in READY * - stable storage is non-volatile storage (e.g. backed by disk) that guarantees atomic writes. Lec 19.45 Lec 19.46 Blocking for Coordinator to Recover A worker waiting for global decision can ask fellow workers about their state If another worker is in or state then must have sent GLOBAL-* Thus, worker can safely abort or commit, respectively If another worker is still in state then both workers can decide to abort If all workers are in ready, need to BLOCK (don t know if wanted to abort or commit) Recv: VOTE REQ Send: VOTE READY Recv: VOTE REQ Send: VOTE Recv: GLOBAL Recv: GLOBAL Quiz 19.2: Distributed Execution Q1: True _ False _ Strict 2PL schedules prevent deadlock Q2: 2PC in a distributed system ensures (tick all that apply): True _ False _ Atomicity True _ False _ Consistency True _ False _ Isolation True _ False _ Durability Q3: True _ False _ 2PC prevents workers from blocking during a commit. Q4: True _ False _ The maintains its state after a power failure. Lec 19.47 Lec 19.48 Page 12

Quiz 19.2: Distributed Execution Q1: True _ False X_ Strict 2PL schedules prevent deadlock Q2: 2PC in a distributed system ensures (tick all that apply): True X_ False _ Atomicity True _ False X_ Consistency True _ False X_ Isolation True X_ False _ Durability Q3: True _ False X_ 2PC prevents workers from blocking during a commit. Q4: True X_ False _ The maintains its state after a power failure. Summary Correctness criterion for transactions is Serializability In practice, we use Conflict Serializability, which is somewhat more restrictive but easy to enforce Two phase locking (2PL) and strict 2PL Ensure conflict-serializability for R/W operations Deadlocks can be either detected or prevented Two-phase commit (2PC) Ensure atomicity and durability: a transaction is committed/aborted either by all replicas or by none of them Lec 19.49 Lec 19.50 Page 13