CMPT 354: Database System I. Lecture 11. Transaction Management


Why this lecture
For a DB application developer:
- What if a crash occurs, the power goes out, etc.?
- Single user → multiple users

Outline
- Transaction Basics
  - Definition
  - Motivation for Transactions
  - ACID Properties
- Concurrency Control
  - Scheduling
  - Anomaly Types
  - Conflict Serializability

Transactions: Basic Definition
A transaction ("TXN") is a sequence of one or more operations (reads or writes) which reflects a single real-world transition. In the real world, a TXN either happened completely or not at all.
Examples:
- Transfer money between accounts
- Purchase a group of products
- Register for a class (either waitlisted or allocated)

Transactions in SQL
- In ad-hoc SQL, the default is that each statement is one transaction.
- In a program, multiple statements can be grouped together as a transaction:

    START TRANSACTION
        UPDATE Bank SET amount = amount - 100 WHERE name = 'Bob'
        UPDATE Bank SET amount = amount + 100 WHERE name = 'Joe'
    COMMIT
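To make the grouping concrete, here is a minimal sketch of the same transfer run against an in-memory SQLite database from Python. The table contents and starting balances are assumptions made up for illustration, and SQLite spells the opening statement BEGIN rather than START TRANSACTION.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode, so the
# transaction boundaries below are exactly the ones we issue ourselves.
conn = sqlite3.connect(":memory:", isolation_level=None)

# Hypothetical table mirroring the slide's Bank(name, amount); balances are assumed.
conn.execute("CREATE TABLE Bank (name TEXT PRIMARY KEY, amount INTEGER)")
conn.execute("INSERT INTO Bank VALUES ('Bob', 500), ('Joe', 200)")

# Group the two updates into one transaction.
conn.execute("BEGIN")
conn.execute("UPDATE Bank SET amount = amount - 100 WHERE name = 'Bob'")
conn.execute("UPDATE Bank SET amount = amount + 100 WHERE name = 'Joe'")
conn.execute("COMMIT")

balances = dict(conn.execute("SELECT name, amount FROM Bank"))
print(balances)
```

Either both updates become visible at COMMIT or, if the transaction is rolled back, neither does.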

Motivation for Transactions
Grouping user actions (reads & writes) into transactions helps with two goals:
1. Recovery & Durability: keeping the DBMS data consistent and durable in the face of crashes, aborts, system shutdowns, etc.
2. Concurrency: achieving better performance by parallelizing TXNs without creating anomalies

Motivation 1: Recovery & Durability
Recovery & durability of user data is essential for reliable DBMS usage.
- The DBMS may experience crashes (e.g. power outages)
- Individual TXNs may be aborted (e.g. by the user)
Idea: make sure that TXNs are either durably stored in full, or not at all; keep a log to be able to roll back TXNs.

Protection against crashes / aborts
Client 1:

    INSERT INTO SmallProduct(name, price)
        SELECT pname, price FROM Product WHERE price <= 0.99
    -- Crash / abort!
    DELETE FROM Product WHERE price <= 0.99

What goes wrong?

Protection against crashes / aborts
Client 1:

    START TRANSACTION
        INSERT INTO SmallProduct(name, price)
            SELECT pname, price FROM Product WHERE price <= 0.99
        DELETE FROM Product WHERE price <= 0.99
    COMMIT OR ROLLBACK

Now we'd be fine!
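The following sketch simulates the failure between the two statements, using SQLite from Python. The sample rows, the move_small_products helper, and the use of an exception to stand in for a crash or abort are all assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)  # explicit BEGIN/COMMIT
conn.execute("CREATE TABLE Product (pname TEXT, price REAL)")
conn.execute("CREATE TABLE SmallProduct (name TEXT, price REAL)")
# Assumed sample data: two cheap products and one expensive one.
conn.execute("INSERT INTO Product VALUES ('gum', 0.50), ('pen', 0.99), ('book', 12.00)")

def move_small_products(conn, fail_midway):
    """Hypothetical helper: copy cheap products, then delete them, as one TXN."""
    conn.execute("BEGIN")
    try:
        conn.execute("INSERT INTO SmallProduct(name, price) "
                     "SELECT pname, price FROM Product WHERE price <= 0.99")
        if fail_midway:
            raise RuntimeError("simulated abort between the two statements")
        conn.execute("DELETE FROM Product WHERE price <= 0.99")
        conn.execute("COMMIT")
    except RuntimeError:
        conn.execute("ROLLBACK")  # undo the INSERT so no product is duplicated

def counts(conn):
    small = conn.execute("SELECT COUNT(*) FROM SmallProduct").fetchone()[0]
    prod = conn.execute("SELECT COUNT(*) FROM Product").fetchone()[0]
    return small, prod

move_small_products(conn, fail_midway=True)
after_abort = counts(conn)    # nothing moved, nothing duplicated
move_small_products(conn, fail_midway=False)
after_commit = counts(conn)   # both cheap products moved
print(after_abort, after_commit)
```

Without the transaction, the aborted run would leave the cheap products in both tables.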

Motivation 2: Concurrency
Concurrent execution of user programs is essential for good DBMS performance.
- Disk accesses may be frequent and slow: optimize for throughput (number of TXNs completed), trading off latency (time for any one TXN)
- Users should still be able to execute TXNs as if in isolation and such that consistency is maintained
Idea: have the DBMS handle running several user TXNs concurrently, in order to keep the CPUs humming.

Multiple users: single statements
Client 1:

    UPDATE Employee SET Salary = Salary + 1000

Client 2:

    UPDATE Employee SET Salary = Salary * 2

Two managers attempt to increase employee salaries concurrently. What could go wrong?

Multiple users: single statements
Client 1:

    START TRANSACTION
        UPDATE Employee SET Salary = Salary + 1000
    COMMIT

Client 2:

    START TRANSACTION
        UPDATE Employee SET Salary = Salary * 2
    COMMIT

Now it works like a charm. We'll see how / why later.

Transaction Properties: ACID
- Atomic: the state shows either all the effects of a TXN, or none of them
- Consistent: a TXN moves from a state where integrity holds to another where integrity holds
- Isolated: the effect of TXNs is the same as TXNs running one after another (i.e. it looks like batch mode)
- Durable: once a TXN has committed, its effects remain in the database
ACID continues to be a source of great debate!

ACID: Atomic
A TXN's activities are atomic: all or nothing.
Intuitively: in the real world, a transaction is something that would either occur completely or not at all.
Two possible outcomes for a TXN:
- It commits: all the changes are made
- It aborts: no changes are made

ACID: Consistent
The tables must always satisfy user-specified constraints.
Examples:
- Account number is unique
- Stock amount can't be negative
- Sum of debits and of credits is 0
How consistency is achieved:
- The programmer makes sure a TXN takes a consistent state to a consistent state
- The system makes sure that the TXN is atomic
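As an illustration of the "stock amount can't be negative" example, the sketch below declares the constraint in SQLite and rolls the whole transaction back when an update would violate it. The Stock table, the item name, and the quantities are assumptions, not part of the lecture.

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)  # explicit BEGIN/COMMIT
# Hypothetical table: the CHECK clause encodes "stock amount can't be negative".
conn.execute(
    "CREATE TABLE Stock (item TEXT PRIMARY KEY, amount INTEGER CHECK (amount >= 0))"
)
conn.execute("INSERT INTO Stock VALUES ('widget', 5)")

conn.execute("BEGIN")
try:
    conn.execute("UPDATE Stock SET amount = amount - 3 WHERE item = 'widget'")
    # The second withdrawal would leave -1, which the constraint rejects.
    conn.execute("UPDATE Stock SET amount = amount - 3 WHERE item = 'widget'")
    conn.execute("COMMIT")
    outcome = "committed"
except sqlite3.IntegrityError:
    conn.execute("ROLLBACK")  # atomicity: the first withdrawal is undone too
    outcome = "rolled back"

amount = conn.execute("SELECT amount FROM Stock WHERE item = 'widget'").fetchone()[0]
print(outcome, amount)
```

The database moves from one consistent state to another, or stays where it was.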

ACID: Isolated
A transaction executes concurrently with other transactions.
Isolation: the effect is as if each transaction executes in isolation from the others.
E.g. a TXN should not be able to observe changes from other transactions while it runs.

ACID: Durable
The effect of a TXN must continue to exist ("persist") after the TXN:
- And after the whole program has terminated
- And even if there are power failures, crashes, etc.
This means: write the data to disk.
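A small sketch of the "after the whole program has terminated" case: commit to an on-disk SQLite file, close the connection, and read the data back through a fresh connection. The file location and table contents are assumptions, and this does not exercise real power failures or crashes.

```python
import os
import sqlite3
import tempfile

# Hypothetical database file in a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "bank.db")

conn = sqlite3.connect(path)
conn.execute("CREATE TABLE Bank (name TEXT, amount INTEGER)")
conn.execute("INSERT INTO Bank VALUES ('Bob', 400)")
conn.commit()   # after this point the insert must survive
conn.close()    # stands in for the program terminating

# A new connection sees the committed data because it was written to disk.
conn2 = sqlite3.connect(path)
rows = conn2.execute("SELECT name, amount FROM Bank").fetchall()
conn2.close()
print(rows)
```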

A Note: ACID is contentious!
- Many debates over ACID, both historically and currently
- Many newer "NoSQL" DBMSs relax ACID
- In turn, "NewSQL" now reintroduces ACID compliance to NoSQL-style DBMSs
ACID is an extremely important & successful paradigm, but still debated!

Transaction Management (Big Picture)
- Definition: a transaction is a list of writes and reads against storage, for example {Read, Read, Write}
- Two big problems:
  1. Support multiple transactions at the same time
  2. Make sure the data stored is reliable
- Techniques:
  1. Concurrency Control
  2. Database Recovery
- Properties: Atomicity, Consistency, Isolation, Durability

Concurrency: Isolation & Consistency
The DBMS must handle concurrency such that:
1. Isolation is maintained: users must be able to execute each TXN as if they were the only user
   - The DBMS handles the details of interleaving various TXNs
2. Consistency is maintained: TXNs must leave the DB in a consistent state
   - The DBMS handles the details of enforcing integrity constraints

Example: consider two TXNs
T1:

    START TRANSACTION
        UPDATE Accounts SET Amt = Amt + 100 WHERE Name = 'A'
        UPDATE Accounts SET Amt = Amt - 100 WHERE Name = 'B'
    COMMIT

T2:

    START TRANSACTION
        UPDATE Accounts SET Amt = Amt * 1.06
    COMMIT

T1 transfers $100 from B's account to A's account. T2 credits both accounts with a 6% interest payment.

Example: consider two TXNs
We can look at the TXNs in a timeline view. Serial execution (time runs left to right):
    T1: A += 100, B -= 100
    T2:                      A *= 1.06, B *= 1.06
T1 transfers $100 from B's account to A's account, then T2 credits both accounts with a 6% interest payment.

Example: consider two TXNs
The TXNs could occur in either order; the DBMS allows both! (time runs left to right):
    T2: A *= 1.06, B *= 1.06
    T1:                        A += 100, B -= 100
T2 credits both accounts with a 6% interest payment, then T1 transfers $100 from B's account to A's account.

Example: consider two TXNs
The DBMS can also interleave the TXNs (time runs left to right):
    T2: A *= 1.06             B *= 1.06
    T1:            A += 100              B -= 100
T2 credits A's account with a 6% interest payment, then T1 transfers $100 to A's account. T2 credits B's account with a 6% interest payment, then T1 transfers $100 from B's account.

Example: consider two TXNs
The DBMS can also interleave the TXNs in other ways.
[Figure: timeline interleaving T1's actions (A += 100, B -= 100) with T2's actions (A *= 1.06, B *= 1.06)]
Is it correct?

Why Interleave TXNs?
Interleaving TXNs might lead to anomalous outcomes, so why do it?
Several important reasons:
- Individual TXNs might be slow: we don't want to block other users while they run!
- Disk access may be slow: let some TXNs use the CPUs while others are accessing the disk!
All of these concern large differences in performance.

Ignore all issues? At Facebook, only 0.0004% of results returned are inconsistent But,

Interleaving & Isolation
The DBMS has the freedom to interleave TXNs. However, it must pick an interleaving, or schedule, such that isolation and consistency are maintained.
- It must be as if the TXNs had executed serially!
"With great power comes great responsibility": the DBMS must pick a schedule which maintains isolation & consistency.

Scheduling examples
Starting balance: A = $50, B = $200
T1: A += 100, B -= 100
T2: A *= 1.06, B *= 1.06

Serial schedule T1, T2:
    Result: A = $159, B = $106
Serial schedule T2, T1:
    Result: A = $153, B = $112
Interleaved schedule 1 (T1 updates each account before T2 does):
    Result: A = $159, B = $106. Same result as serial T1, T2!
Interleaved schedule 2 (T1 updates A before T2, but T2 updates B before T1):
    Result: A = $159, B = $112. Different result than serial T1, T2, and ALSO different than serial T2, T1!

Interleaved schedule 2 gives a result different from any serial order. We say that it is not serializable.
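The balances in these scheduling examples can be reproduced with a short simulation. This is only a sketch: the run helper and the exact interleavings are assumptions chosen to match the results on the slides (A starts at $50, B at $200).

```python
# Each action: (transaction, account, update applied to that account's balance).
T1 = [("T1", "A", lambda x: x + 100), ("T1", "B", lambda x: x - 100)]
T2 = [("T2", "A", lambda x: x * 1.06), ("T2", "B", lambda x: x * 1.06)]

def run(schedule, start):
    """Apply the actions in schedule order and return the final balances."""
    state = dict(start)
    for _txn, account, update in schedule:
        state[account] = update(state[account])
    # Round to cents so floating-point noise does not affect comparisons.
    return {acct: round(bal, 2) for acct, bal in state.items()}

start = {"A": 50, "B": 200}
results = {
    "serial T1,T2": run(T1 + T2, start),
    "serial T2,T1": run(T2 + T1, start),
    # Assumed interleavings that reproduce the balances shown in the lecture.
    "interleaved 1": run([T1[0], T2[0], T1[1], T2[1]], start),
    "interleaved 2": run([T1[0], T2[0], T2[1], T1[1]], start),
}
for name, balances in results.items():
    print(name, balances)
```

Interleaved schedule 1 matches serial T1, T2, while interleaved schedule 2 matches neither serial order.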

Scheduling Definitions
- A serial schedule is one that does not interleave the actions of different transactions.
- A serializable schedule is a schedule that is equivalent to some serial schedule.
- Schedule 1 and Schedule 2 are equivalent if, for any database state, the effect on the DB of executing Schedule 1 is identical to the effect of executing Schedule 2.

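Serializable?
T1: A += 100, B -= 100
T2: A *= 1.06, B *= 1.06
Serial schedules:
    T1, T2: A becomes 1.06*(A+100), B becomes 1.06*(B-100)
    T2, T1: A becomes 1.06*A + 100, B becomes 1.06*B - 100
Interleaved schedule under test:
    A becomes 1.06*(A+100), B becomes 1.06*(B-100)
Same as a serial schedule (T1, T2) for all possible values of A, B = serializable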

Serializable?
T1: A += 100, B -= 100
T2: A *= 1.06, B *= 1.06
Serial schedules:
    T1, T2: A becomes 1.06*(A+100), B becomes 1.06*(B-100)
    T2, T1: A becomes 1.06*A + 100, B becomes 1.06*B - 100
Interleaved schedule under test:
    A becomes 1.06*(A+100), B becomes 1.06*B - 100
Not equivalent to any serial schedule = not serializable
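Because every update in this example has the form x → a*x + b, the "for all possible values of A, B" check can be done exactly by composing coefficients instead of trying sample balances. The sketch below is one way to do that for this example only; the (txn, index) schedule encoding and the helper names are assumptions.

```python
from fractions import Fraction
from itertools import permutations

RATE = Fraction(106, 100)  # exact 1.06, avoids floating-point error

# Each update x -> a*x + b is stored as (account, (a, b)).
TXNS = {
    "T1": [("A", (Fraction(1), Fraction(100))), ("B", (Fraction(1), Fraction(-100)))],
    "T2": [("A", (RATE, Fraction(0))), ("B", (RATE, Fraction(0)))],
}

def effect(schedule):
    """Compose the updates of a schedule given as (txn, action index) pairs.

    Returns, per account, the (a, b) of the overall map x -> a*x + b.
    """
    maps = {"A": (Fraction(1), Fraction(0)), "B": (Fraction(1), Fraction(0))}
    for txn, i in schedule:
        account, (a, b) = TXNS[txn][i]
        a0, b0 = maps[account]
        maps[account] = (a * a0, a * b0 + b)  # new(old(x)) = a*(a0*x + b0) + b
    return maps

def is_serializable(schedule):
    """True if the schedule has the same effect as some serial order."""
    for order in permutations(TXNS):
        serial = [(t, i) for t in order for i in range(len(TXNS[t]))]
        if effect(serial) == effect(schedule):
            return True
    return False

# Assumed interleavings matching the two results discussed in the lecture.
schedule_ok = [("T1", 0), ("T2", 0), ("T1", 1), ("T2", 1)]
schedule_bad = [("T1", 0), ("T2", 0), ("T2", 1), ("T1", 1)]
print(is_serializable(schedule_ok), is_serializable(schedule_bad))
```

Two maps of this form agree on every input exactly when their coefficients agree, which is why comparing (a, b) pairs is enough here. It would not work for arbitrary transactions.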

What else can go wrong with interleaving?
Various anomalies which break isolation / serializability:
- Often referred to by name
- Occur because of / with certain "conflicts" between interleaved TXNs

The DBMS's view of the schedule
T1: A += 100, B -= 100
T2: A *= 1.06, B *= 1.06
Each action in the TXNs reads a value from global memory and then writes one back to it. For example, A += 100 is seen as R(A), W(A), and B -= 100 as R(B), W(B).
Scheduling order matters!

Conflict Types
Two actions conflict if they are part of different TXNs, involve the same variable, and at least one of them is a write.
Thus, there are three types of conflicts:
- Read-Write conflicts (RW)
- Write-Read conflicts (WR)
- Write-Write conflicts (WW)
Why no "RR conflict"?
Interleaving anomalies occur with / because of these conflicts between TXNs (but these conflicts can occur without causing anomalies!).
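The conflict definition translates directly into a pairwise check. In this sketch a schedule is a list of (txn, op, var) tuples, which is an assumed encoding, and the sample schedule is made up for illustration.

```python
from itertools import combinations

def conflicts(schedule):
    """Return (earlier, later, kind) for every pair of conflicting actions.

    Two actions conflict if they belong to different TXNs, touch the same
    variable, and at least one of them is a write.
    """
    found = []
    for a, b in combinations(schedule, 2):  # pairs keep schedule order
        (txn_a, op_a, var_a), (txn_b, op_b, var_b) = a, b
        if txn_a != txn_b and var_a == var_b and "W" in (op_a, op_b):
            found.append((a, b, op_a + op_b))  # kind is "RW", "WR" or "WW"
    return found

# Assumed example: T1 and T2 each read and write A, then B.
schedule = [
    ("T1", "R", "A"), ("T1", "W", "A"),
    ("T2", "R", "A"), ("T2", "W", "A"),
    ("T1", "R", "B"), ("T1", "W", "B"),
    ("T2", "R", "B"), ("T2", "W", "B"),
]
kinds = sorted(kind for _a, _b, kind in conflicts(schedule))
print(kinds)
```

Two reads never appear in the result: neither changes the value, so their relative order cannot affect the outcome.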

Classic Anomalies with Interleaved Execution
"Unrepeatable read". Example:
1. T1 reads some data from A
2. T2 writes to A and commits
3. Then, T1 reads from A again and now gets a different / inconsistent value
Occurring with / because of a RW conflict

Classic Anomalies with Interleaved Execution
"Dirty read" / reading uncommitted data. Example:
1. T1 writes some data to A
2. T2 reads from A, then writes back to A & commits
3. T1 then aborts. Now T2's result is based on an obsolete / inconsistent value
Occurring with / because of a WR conflict
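The three dirty-read steps can be traced with plain variables. This is a toy sketch with no real DBMS involved; the starting value, the value T1 writes, and T2's "+ 50" computation are all assumptions.

```python
committed_A = 100              # assumed committed value of A before either TXN

# 1. T1 writes A = 0 but has not committed yet.
uncommitted_A = 0

# 2. T2 reads T1's uncommitted value (the WR conflict), writes back A + 50,
#    and commits.
t2_read = uncommitted_A
t2_write = t2_read + 50

# 3. T1 aborts, so its write should never have been visible to anyone.
#    Had T2 run in isolation, it would have read the committed value instead.
expected_write = committed_A + 50

print(t2_write, expected_write)
```

T2's committed result (50) is based on a value that, after the abort, never officially existed. The isolated result would have been 150.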

Classic Anomalies with Interleaved Execution
"Lost update". Example:
1. T1 blind writes some data to A
2. T2 blind writes to A and B, then commits
3. T1 then blind writes to B and commits. Now we have T1's value for B and T2's value for A, which is not equivalent to any serial schedule!
Occurring because of a WW conflict
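To see why no serial order produces this mix, the sketch below replays the blind writes and records which TXN wrote each variable last. Storing the writer's name as the value is an assumption made to keep the trace readable.

```python
def run(schedule):
    """Replay blind writes; each variable ends up holding its last writer."""
    db = {"A": None, "B": None}
    for txn, var in schedule:
        db[var] = txn
    return db

interleaved = [("T1", "A"), ("T2", "A"), ("T2", "B"), ("T1", "B")]
serial_t1_t2 = [("T1", "A"), ("T1", "B"), ("T2", "A"), ("T2", "B")]
serial_t2_t1 = [("T2", "A"), ("T2", "B"), ("T1", "A"), ("T1", "B")]

final = run(interleaved)
print(final, run(serial_t1_t2), run(serial_t2_t1))
```

Either serial order leaves both variables written by the same TXN, while the interleaving leaves A from T2 and B from T1.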

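Schedules
Serial schedule: [Figure: timeline of T1's and T2's R/W actions on A and B, one TXN after the other]
Serializable schedule: [Figure: timeline of T1's and T2's R/W actions on A and B, interleaved]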

Conflict Serializable Schedule
The classes of schedules are nested, from largest to smallest:
All schedules ⊃ Serializable schedules ⊃ Conflict serializable schedules ⊃ Serial schedules

Conflicts
Two actions conflict if all of these conditions hold:
i) they are part of different TXNs,
ii) they involve the same variable,
iii) at least one of them is a write.
[Figure: schedule of T1's and T2's R/W actions on A and B, with one conflict marked]

Exercise
Find all the other conflicts in the schedule.

Exercise: Answer
[Figure: the same schedule with all conflicts marked]

Conflict serializable
Schedule S is conflict serializable if S is conflict equivalent to some serial schedule.
Two schedules are conflict equivalent if:
- They involve the same actions of the same TXNs
- Every pair of conflicting actions of two TXNs is ordered in the same way
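Both conditions for conflict equivalence can be checked mechanically. In this sketch schedules are lists of (txn, op, var) tuples and each action is assumed to occur at most once per TXN; the three sample schedules are made up for illustration.

```python
from itertools import combinations

def conflict_pairs(schedule):
    """Set of (earlier, later) pairs of conflicting actions."""
    pairs = set()
    for a, b in combinations(schedule, 2):  # pairs keep schedule order
        if a[0] != b[0] and a[2] == b[2] and "W" in (a[1], b[1]):
            pairs.add((a, b))
    return pairs

def conflict_equivalent(s1, s2):
    """Same actions of the same TXNs, and every conflicting pair ordered alike."""
    same_actions = sorted(s1) == sorted(s2)
    return same_actions and conflict_pairs(s1) == conflict_pairs(s2)

def txn(name):
    """Hypothetical helper: a TXN that reads and writes A, then B."""
    return [(name, "R", "A"), (name, "W", "A"), (name, "R", "B"), (name, "W", "B")]

t1, t2 = txn("T1"), txn("T2")
serial = t1 + t2
interleaved_ok = t1[:2] + t2[:2] + t1[2:] + t2[2:]   # T1 first on both A and B
interleaved_bad = t1[:2] + t2[:2] + t2[2:] + t1[2:]  # T2 first on B

print(conflict_equivalent(serial, interleaved_ok),
      conflict_equivalent(serial, interleaved_bad))
```

The first interleaving is conflict equivalent to the serial schedule T1, T2. The second reverses the order of the conflicting actions on B, so it is not.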

Are they conflict equivalent?
[Figure: two schedules of T1's and T2's R/W actions; one schedule is missing an action]
NO: "They involve the same actions of the same TXNs" does not hold.

Are they conflict equivalent?
[Figure: two schedules of T1's and T2's R/W actions on A and B]
NO: "Every pair of conflicting actions of two TXNs is ordered in the same way" does not hold.

Are they conflict equivalent?
[Figure: two schedules of T1's and T2's R/W actions on A and B]
YES:
- They involve the same actions of the same TXNs
- Every pair of conflicting actions of two TXNs is ordered in the same way

The Conflict Graph
Consider a graph where the nodes are TXNs, and there is an edge Ti → Tj if any action in Ti precedes and conflicts with any action in Tj.
[Figure: schedule of T1's and T2's R/W actions on A and B, with its conflict graph]
Theorem: a schedule is conflict serializable if and only if its conflict graph is acyclic.

Is this schedule conflict serializable?
[Figure: schedule of T1's and T2's R/W actions on A and B]
1. Find all conflicts
2. Model the schedule as a conflict graph
3. Check whether the graph has a cycle
   - Yes → not conflict serializable
   - No → conflict serializable
Answer for this schedule: NO
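The three steps can be sketched as follows: build the conflict graph from a schedule, then test it for a cycle with a depth-first search. The (txn, op, var) encoding and the two sample schedules are assumptions, since the schedule drawn on the slide is not reproduced here.

```python
from itertools import combinations

def conflict_graph(schedule):
    """Edges (Ti, Tj) where an action of Ti precedes and conflicts with one of Tj."""
    edges = set()
    for a, b in combinations(schedule, 2):  # a comes before b in the schedule
        if a[0] != b[0] and a[2] == b[2] and "W" in (a[1], b[1]):
            edges.add((a[0], b[0]))
    return edges

def has_cycle(edges):
    """Depth-first search for a cycle in a directed graph given as edge pairs."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set())
    visiting, done = set(), set()

    def dfs(node):
        visiting.add(node)
        for nxt in graph[node]:
            if nxt in visiting or (nxt not in done and dfs(nxt)):
                return True  # reached a node that is still on the current path
        visiting.discard(node)
        done.add(node)
        return False

    return any(dfs(n) for n in graph if n not in done)

def is_conflict_serializable(schedule):
    return not has_cycle(conflict_graph(schedule))

def txn(name):
    """Hypothetical helper: a TXN that reads and writes A, then B."""
    return [(name, "R", "A"), (name, "W", "A"), (name, "R", "B"), (name, "W", "B")]

t1, t2 = txn("T1"), txn("T2")
good = t1[:2] + t2[:2] + t1[2:] + t2[2:]  # only T1 -> T2 edges
bad = t1[:2] + t2[:2] + t2[2:] + t1[2:]   # T1 -> T2 on A, T2 -> T1 on B
print(is_conflict_serializable(good), is_conflict_serializable(bad))
```

In the second schedule the edges T1 → T2 and T2 → T1 form a cycle, so by the theorem it is not conflict serializable.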

Isolation Levels
- Transactions in SQLite are serializable (https://www.sqlite.org/isolation.html)
- Isolation levels in SQL Server (https://docs.microsoft.com/en-us/sql/t-sql/statements/set-transaction-isolation-level-transact-sql?view=sql-server-2017)
Check out CMPT 454

Concurrency Control Algorithms
- Locking (e.g. 2-phase locking)
- Timestamp ordering
- Multi-version concurrency control (MVCC)
Check out CMPT 454

Summary
- Transaction Basics
  - Definition
  - Motivation for Transactions
  - ACID Properties
- Concurrency Control
  - Scheduling
  - Anomaly Types
  - Conflict Serializability

Acknowledgements
Some lecture slides were copied from or inspired by the following course materials:
- W4111: Introduction to Databases by Eugene Wu at Columbia University
- CSE344: Introduction to Data Management by Dan Suciu at University of Washington
- CMPT354: Database System I by John Edgar at Simon Fraser University
- CS186: Introduction to Database Systems by Joe Hellerstein at UC Berkeley
- CS145: Introduction to Databases by Peter Bailis at Stanford
- CS 348: Introduction to Database Management by Grant Weddell at University of Waterloo