Concurrency Control. Data Base Management Systems. Inherently Concurrent Systems: The requirements


Concurrency Control

Inherently Concurrent Systems
These are systems that respond to and manage simultaneous activities in their external environment, which are inherently concurrent. They may be broadly classified as:
- Real-time Systems
- DBMS (Transaction Processing Systems)
- Operating Systems

The requirements
- There is a need to support separate activities.
- There is a need to ensure that these activities access and update common data without interference.
- There is a need for the results of transactions to be recorded permanently and securely before the user is told that the operation has been done.

Data Base Management Systems
Concurrency control is the controlling of Transactions that operate on the same db simultaneously. It gives shorter response times by running several transactions in parallel.
We have seen that a Transaction is defined as a sequence of operations (read, write, update) that transforms the db from one consistent state to another consistent state. BUT this does not always happen, so there is a need for rules. Here are the rules:
1. A Transaction must be protected against inconsistencies caused by other transactions.
2. If a Transaction terminates abnormally or runs into unforeseen problems, the updates of the transaction must be canceled so that the db is left in a consistent state.

Examples
- Two or more users accessing the same db as a repository
- Two or more people reading the same book
- Two or more people accessing the same bank account
- Two or more TV remote controls
- Two or more garage door openers
- Playing chess, one against two or more
- Cooking two or more course meals simultaneously
- Two or more users using the same compiler to compile their programs

Concurrent Atomic Transactions

Serializability and Recoverability
In order to have control over concurrency we need to schedule transactions in such a way as to avoid any interference between them when they compete during execution. A simple single-user solution is to allow only one Transaction to execute at a time: T1 is committed before T2 begins its execution. The objective of a multi-user DBMS is to maximize the degree of concurrency or parallelism in the system, so Transactions should be able to run concurrently without interfering with each other. (Real-Time Systems and Operating Systems share the same objectives.) The ultimate goal is then to use serializability as a means of identifying those executions of transactions that are guaranteed to preserve the consistency of the db.

Schedule
A Schedule is a sequence of the operations by a set of concurrent transactions that preserves the order of the operations in each of the individual transactions. Alternatively, schedules are execution sequences that represent the chronological order in which instructions are executed in the system.

Serializability
An execution is serial when a new transaction is not started until the previous transaction has finished.
A serial schedule is a schedule where the operations of each transaction are executed consecutively, without any interleaved operations from other transactions.
A non-serial schedule is a schedule where the operations from a set of concurrent transactions are interleaved.
When Transactions execute concurrently, the result is correct only if it is the same as the result of some serial execution. Such a schedule is called serializable. To ensure integrity and consistency of the db, schedules should be serializable.
The ordering of the read and write operations is important:
1. If two Transactions only read a Data Item (DI), they do not conflict and the order is not important.
2. If two Transactions either read or write completely separate DIs, they do not conflict and the order is not important.
3. If one Transaction writes a DI and another either reads or writes the same DI, the order of execution is important.
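The three ordering rules above reduce to a single test on a pair of operations. The following is a minimal sketch in Python; the `Op` record and the `conflicts` function are hypothetical names introduced for illustration and do not come from the lecture.

```python
from typing import NamedTuple


class Op(NamedTuple):
    """One operation in a schedule (illustrative structure, not from the notes)."""
    txn: str   # transaction name, e.g. "T1"
    kind: str  # "r" for read, "w" for write
    item: str  # name of the data item (DI)


def conflicts(a: Op, b: Op) -> bool:
    """Order matters only if two different transactions touch the same DI
    and at least one of the two operations is a write."""
    return a.txn != b.txn and a.item == b.item and "w" in (a.kind, b.kind)


# Rule 1: both only read the same DI, so there is no conflict.
print(conflicts(Op("T1", "r", "X"), Op("T2", "r", "X")))
# Rule 2: separate DIs, so there is no conflict.
print(conflicts(Op("T1", "w", "X"), Op("T2", "w", "Y")))
# Rule 3: one writes, the other reads the same DI, so they conflict.
print(conflicts(Op("T1", "w", "X"), Op("T2", "r", "X")))
```

Operations of the same transaction are never reported as conflicting here, because a schedule must preserve their order in any case.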

Concurrency Control Techniques
These techniques include the following protocols:
- Locking Based Protocols
  - Locking
  - Two Phase Locking (2PL)
  - Graph-Based Protocols (Index or Tree Structures)
- Time Stamping Protocols
  - Basic Time Stamping
  - Thomas' Write Rule
  - Multi-version Time stamping
- Granularity of Data Items
- Multi-version Read Consistency (Oracle9i, 2004)

Lock Based Protocols
Access to a data item (DI) is done in a mutually exclusive (ME) manner: while one Transaction accesses a DI, no other Transaction can modify that DI. The most common way to achieve this is to hold a lock on that DI. Depending on the system, the locked unit may be the entire db (very bad), a file or a Relation (??), a record or a tuple (OK), or a field (good).
Mutually Exclusive means that if a Transaction holds a lock on a DI, no other Transaction can interfere. This safeguards against incorrect results, which in turn could compromise the consistency of the db.

Locks
There are various types of lock modes:
Shared: If a transaction Ti has obtained a shared mode lock (SL) on data item DI, then Ti can read but not write. So, only READ.
Exclusive: If Ti has obtained an exclusive mode lock (XL) on data item DI, then Ti can READ and WRITE.

Lock Compatibility
          SL       XL
  SL      TRUE     FALSE
  XL      FALSE    FALSE

NOTE: SL is compatible with SL but not with XL. Therefore, at any time several SLs can be held simultaneously by different Transactions on a particular DI, tuple or file. A subsequent XL request has to wait until the currently held XL or SLs are released.

Wait
A Transaction Ti must first request a lock on a DI or a file in order to access it. If the file or DI is already locked by another Transaction Tj in an incompatible mode, then Ti must wait until all incompatible locks held by other Ts have been explicitly released. Tj will eventually release (unlock) the DI that it locked at an earlier time. Ergo, Ti will be granted its request.

Example
Consider two accounts A and B accessed by two transactions T1 and T2. T1 transfers $50 from B to A. T2 displays the sum A + B. Suppose A = $100 and B = $200. Consider the following four (4) examples:
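The compatibility matrix and the wait rule can be sketched as a small lock table. This is an illustrative Python sketch under stated assumptions: the names `COMPATIBLE` and `LockTable` are hypothetical, a refused request simply returns `False` instead of blocking the caller, and a transaction that is the only holder of a DI is allowed to upgrade its own SL to an XL (an assumption the notes do not discuss).

```python
# Compatibility of (mode already held, mode requested), as in the matrix above.
COMPATIBLE = {
    ("SL", "SL"): True,
    ("SL", "XL"): False,
    ("XL", "SL"): False,
    ("XL", "XL"): False,
}


class LockTable:
    """Toy lock table: maps each data item to the modes held per transaction."""

    def __init__(self):
        self.held = {}  # item -> {transaction: mode}

    def request(self, txn, item, mode):
        """Grant the lock and return True, or return False if txn must wait."""
        holders = self.held.setdefault(item, {})
        for other, other_mode in holders.items():
            if other != txn and not COMPATIBLE[(other_mode, mode)]:
                return False  # an incompatible lock is held by another T
        if holders.get(txn) != "XL":  # never downgrade an XL already held
            holders[txn] = mode
        return True

    def release(self, txn, item):
        """Unlock: drop whatever lock txn holds on item, if any."""
        self.held.get(item, {}).pop(txn, None)


locks = LockTable()
print(locks.request("T1", "A", "SL"))  # granted
print(locks.request("T2", "A", "SL"))  # granted, SL is compatible with SL
print(locks.request("T3", "A", "XL"))  # refused, T3 has to wait
```

A real lock manager would queue the waiting request and wake it up on release; that bookkeeping is left out here to keep the compatibility test visible.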

Example L1

  Time   T1               T2
         B = B - 50
         Write(B)
         Lock_XL(A)
         A = A + 50
         Write(A)
                          Lock_SL(A)
                          Lock_SL(B)
                          Print(A+B)

This is not good because, although it is serializable, there is no concurrency. If T1 and T2 are executed serially (either T1T2 or T2T1), the result A + B = $300 is the same. If they are executed concurrently, results such as the following are possible!!

Example L2

  Time   T1               T2
         B = B - 50
         Write(B)
                          Lock_SL(A)
                          Lock_SL(B)
                          Print(A+B)
         Lock_XL(A)
         A = A + 50
         Write(A)

The result is A + B = 250 (WRONG). T1 has unlocked B too early and, as a result, T2 has seen an inconsistent state!!!
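The arithmetic behind Examples L1 and L2 can be replayed with a few lines of Python. This is only a sketch: locks are left out entirely, the step names such as `"T1:debit_B"` are hypothetical labels, and the point is just the value that T2 prints for each ordering, starting from A = $100 and B = $200.

```python
def run(schedule, a=100, b=200):
    """Apply the steps in order and return the sum that T2 prints."""
    db = {"A": a, "B": b}
    printed = None
    for step in schedule:
        if step == "T1:debit_B":      # B = B - 50; Write(B)
            db["B"] -= 50
        elif step == "T1:credit_A":   # A = A + 50; Write(A)
            db["A"] += 50
        elif step == "T2:print_sum":  # Print(A + B)
            printed = db["A"] + db["B"]
    return printed


serial_t1_t2 = ["T1:debit_B", "T1:credit_A", "T2:print_sum"]
serial_t2_t1 = ["T2:print_sum", "T1:debit_B", "T1:credit_A"]
interleaved = ["T1:debit_B", "T2:print_sum", "T1:credit_A"]  # as in Example L2

print(run(serial_t1_t2))  # 300
print(run(serial_t2_t1))  # 300
print(run(interleaved))   # 250, T2 sees the debit but not the credit
```

Both serial orders give $300, so any schedule that prints $250 cannot be equivalent to a serial one.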

Example L3
Now delay unlocking until the end of both Transactions.

  Time   T1               T2
         B = B - 50
         Write(B)
         Lock_XL(A)
         A = A + 50
         Write(A)
                          Lock_SL(A)
                          Lock_SL(B)
                          Print(A+B)

Here T2 has to wait for T1 to release the lock on B (last line of T1 and 3rd line of T2). Notice the value of A.

Example L4
Furthermore, consider the following partial schedule (a variation of L3):

  Time   T1               T2
         B = B - 50
         Write(B)
                          Lock_SL(A)
                          Lock_SL(B)
         Lock_XL(A)

Note: Since T1 holds an XL on B and T2 is requesting a Shared mode Lock on B, T2 is waiting for T1 to unlock B. Since T2 holds an SL on A and T1 is requesting an XL on A, T1 is waiting for T2 to unlock A. This is a DEADLOCK! Neither of the Transactions can proceed normally!!!
When deadlock occurs, the system must roll back one of the two Transactions. Once a Transaction has been rolled back, the data items that were locked by that Transaction are unlocked and become available to other transactions, which in turn may continue with normal execution. Avoid STARVATION.

Deadlock
A standstill, an impasse, or a stalemate of two or more Transactions. It arises when one Transaction holds a lock (or a resource) on a DI and needs another DI that is held by a second Transaction, which in turn needs the DI held by the first Transaction.

Deadlock Detection and Prevention

a) Deadlock Prevention
A simple and widely used approach to deadlock prevention is the TIMEOUT. Here a Transaction that requests a lock will wait only for a period of time defined by the system, called a quantum (pl. quanta). If the lock has not been granted within this period, the lock request has timed out. When this occurs, the DBMS:
i. assumes that the Transaction is in a deadlock, although in reality this may not be the case,
ii. aborts the transaction (i.e. rolls it back),
iii. automatically restarts the Transaction (as a new one).
This is a very simple and practical solution to the deadlock problem and it is employed by most of the commercial DBMSs. Unfortunately, it can be extremely time consuming.
N.B. Operating Systems, such as Unix, employ a similar scheme, in which the OS chooses a victim (one of the two deadlocked processes) and applies (i), (ii), (iii) above. See below for Recovery from a deadlock.

b) Deadlock Detection
One way to detect deadlocks is to build a wait-for graph. A related idea is the DBMS graph, used by a special-case locking protocol in which prior knowledge is used in order to avoid deadlocks. See the topic: Graph Based Locking Protocols.

c) Recovery from Deadlock
The DBMS has to select one or more Transactions as victims. This selection should depend on several issues:
i. Choice of a victim: this should depend on how long the T has been running (less time is better), how many DIs it has updated (better if few), and how many DIs it still has to update (better if many).
ii. How far a Transaction has to roll back (less is better).
iii. Avoiding Starvation. When the same T is always chosen as the victim, it can never complete. Solution: a counter that counts the number of times that T has been a victim.
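Deadlock detection with a wait-for graph amounts to looking for a cycle among the waiting Transactions. The following Python sketch assumes the graph is given as a dictionary from each Transaction to the set of Transactions it is waiting for; the function name `has_deadlock` and this representation are illustrative choices, not part of the lecture.

```python
def has_deadlock(waits_for):
    """Return True if the wait-for graph contains a cycle.

    waits_for maps a transaction to the set of transactions it waits for.
    Uses a depth-first search with the usual white/grey/black colouring:
    meeting a grey node again means the current path has closed a cycle.
    """
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {}

    def visit(t):
        colour[t] = GREY
        for u in waits_for.get(t, ()):
            state = colour.get(u, WHITE)
            if state == GREY:
                return True
            if state == WHITE and visit(u):
                return True
        colour[t] = BLACK
        return False

    return any(colour.get(t, WHITE) == WHITE and visit(t)
               for t in list(waits_for))


# Example L4: T2 waits for T1 (lock on B) and T1 waits for T2 (lock on A).
print(has_deadlock({"T1": {"T2"}, "T2": {"T1"}}))  # True
# T1 waits for T2, but T2 waits for nobody: no deadlock.
print(has_deadlock({"T1": {"T2"}, "T2": set()}))   # False
```

Once a cycle is found, the victim-selection criteria listed above decide which Transaction on the cycle is rolled back.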

Two-Phase Locking (2PL)
Another way to ensure serializability is with two-phase locking. Transactions issue lock and unlock requests in 2 phases:
1. Growing phase: A T may obtain locks but may not release any lock (this is the initial phase).
2. Shrinking phase: A T may release locks but may not obtain any new locks.
Notes:
i. Examples L1 and L2 above are not two-phase locking but L3 and L4 are.
ii. Although L4 uses two-phase locking, it is also deadlocked!

Graph Based Locking Protocols
As mentioned above, one way to solve the Deadlock problem and to ensure serializability is to build a simple model, from the requirements specification (prior knowledge), of which data items (DIs) will be accessed and how.
A DBMS graph is a directed acyclic graph G = (N, E) that consists of a set of nodes N (the DIs) and a set of directed edges E. It is rooted like a tree.
A wait-for graph, by contrast, has the Transactions as its nodes and is used to detect deadlock.
Construction of a wait-for graph:
i. Create a node for each Transaction.
ii. Create a directed edge from Ti to Tj if Ti is waiting to lock an item that is currently locked by Tj.
Deadlock exists if and only if the wait-for graph contains a cycle.
Rules of the graph-based protocol:
i. The first lock by Ti may be on any DI.
ii. After that, a DI can be locked by Ti only if the parent of that DI is currently locked by Ti.
iii. DIs may be unlocked at any time.
iv. A DI that has been locked and unlocked by Ti cannot subsequently be relocked by Ti.

Example
Consider the following lock and unlock instructions of four Transactions T1, T2, T3 and T4 that lock and unlock DIs A, B, C, D, E, F, G, H, J, I as in the Graph.
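Whether a single Transaction obeys the two phases can be checked mechanically from its sequence of lock and unlock requests. A minimal Python sketch, assuming each request is written as a hypothetical `(action, item)` pair:

```python
def is_two_phase(ops):
    """Check one transaction's requests against the two-phase rule.

    ops is a list of ("lock" | "unlock", item) pairs in issue order.
    The first unlock ends the growing phase; any lock after that point
    violates two-phase locking.
    """
    shrinking = False
    for action, _item in ops:
        if action == "unlock":
            shrinking = True
        elif action == "lock" and shrinking:
            return False
    return True


# All locks first, then all unlocks: two-phase.
print(is_two_phase([("lock", "B"), ("lock", "A"),
                    ("unlock", "B"), ("unlock", "A")]))  # True
# B is released before A is locked: not two-phase.
print(is_two_phase([("lock", "B"), ("unlock", "B"),
                    ("lock", "A"), ("unlock", "A")]))    # False
```

The check says nothing about deadlock: a Transaction can pass it and still be part of a deadlock.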

T1   T2   T3   T4
Lock_XL(D) Lock_XL(D) Lock_XL(E) Unlock(E) Lock_XL(D) Lock_XL(G) Lock_XL(H) Lock_XL(E) Lock_XL(H) Unlock(D) Unlock(E) Unlock(D) Lock_XL(J) Unlock(H) Unlock(H) Unlock(J) Unlock(D) Unlock(G)

This can be serialized so that there is a deadlock-free solution. It can be shown:

T1   T2   T3   T4
Lock_XL(E) Unlock(E) Lock_XL(D) Lock_XL(G) Unlock(D) Lock_XL(D) Lock_XL(H) Unlock(D) Unlock(H) Lock_XL(J) Unlock(J) Lock_XL(E) Unlock(E) Lock_XL(D) Lock_XL(H) Unlock(D) Unlock(H) Unlock(G)

Advantages: Unlocking may occur earlier, i.e. there is less waiting time and an increase in concurrency. Also there are no Rollbacks, since the protocol is deadlock free. Less time, etc.
Disadvantages: In some cases it is necessary for a T to lock DIs that the T does not access (this is BAD). For example, suppose a T needs to access DIs A and J in the graph. It must lock not only A and J, but also B, D and H. This increases locking overhead (waiting time), which may decrease concurrency.
Note: There are schedules that are not possible under 2PL but are possible under the Graph based Protocol, and vice versa.
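The locking rules of the graph-based protocol can be checked for one Transaction at a time. The graph itself is not reproduced in the text, so the `PARENT` map below is an assumption, chosen only to agree with the remark that a T accessing A and J must also lock B, D and H; the function name is hypothetical as well. The sketch treats any second lock on the same DI by the same T as a violation.

```python
# Assumed tree (child -> parent); the real graph is not shown in the notes.
PARENT = {"B": "A", "C": "A", "D": "B", "E": "B", "G": "D", "H": "D", "J": "H"}


def follows_tree_protocol(ops, parent=PARENT):
    """Check one transaction's ("lock" | "unlock", item) requests.

    Rule i:   the first lock may be on any DI.
    Rule ii:  every later lock needs the parent DI to be locked right now.
    Rule iv:  a DI may not be locked a second time by the same transaction.
    """
    held, released = set(), set()
    first_lock = True
    for action, item in ops:
        if action == "lock":
            if item in held or item in released:
                return False  # relocking is not allowed
            if not first_lock and parent.get(item) not in held:
                return False  # parent is not currently locked
            held.add(item)
            first_lock = False
        else:
            if item not in held:
                return False  # cannot unlock what is not held
            held.remove(item)
            released.add(item)
    return True


# Reaching J from A means locking the whole path A, B, D, H, J.
path = [("lock", x) for x in "ABDHJ"]
print(follows_tree_protocol(path))                          # True
print(follows_tree_protocol([("lock", "A"), ("lock", "J")]))  # False
```

A Transaction that starts lower in the tree, for instance with D, never needs to lock A or B, which is where the extra concurrency comes from.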

Timestamp Based Protocols
In the last two protocols, the order between every pair of conflicting transactions is determined at execution time by the first lock that both transactions request that involves incompatible modes. The problem is that Deadlock may still exist (for example under 2PL, as in Example L4).
Timestamp: A unique identifier created by the DBMS that indicates the relative starting time of a Transaction.
Timestamping: A concurrency control protocol that orders Ts in such a way that older Ts, that is Ts with smaller timestamps, get priority in the event of conflict.
Thus, in the timestamp protocol, the serializability order is determined by selecting an ordering among Ts in advance, i.e. before execution. A fixed unique timestamp, TS(Ti), is associated with each Ti. It is assigned by the system before the execution of Ti. If a new Tj enters the system after Ti has been assigned a TS, then TS(Ti) < TS(Tj).

Implementation
There are two methods to implement Timestamps:
i. The System Clock is used as the Timestamp; i.e., TS(Ti) = "the value of the system clock when Ti enters the system."
ii. A logical counter is used that is incremented after a new timestamp has been assigned; i.e., TS(Ti) = "the value of the counter when Ti enters the system."

Serializability order
The TSs determine the serializability order: if TS(Ti) < TS(Tj), then the system must ensure that the produced schedule is equivalent to a serial schedule in which Ti appears before Tj. To implement this we need to associate two timestamp values with each DI:
1) W-Timestamp: denotes the largest TS of any T which has successfully executed a Write(DI).
2) R-Timestamp: denotes the largest TS of any T which has successfully executed a Read(DI).
Notes
i. These are updated whenever a Read(DI) or Write(DI) is executed.
ii. If Ti is rolled back as a result of issuing a read or write operation, it is assigned a new timestamp and restarted.
iii. Usually a TS is assigned to a T immediately before its first instruction.

Example
(Figure: the points at which a TS is assigned to T1 and to T2.)
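The logical-counter method is the easier of the two to sketch, because a counter gives unique and strictly increasing values by construction. The class name `TimestampCounter` and its method are hypothetical; this is a single-threaded illustration only.

```python
import itertools


class TimestampCounter:
    """Hands out timestamps 1, 2, 3, ... in order of arrival."""

    def __init__(self):
        self._counter = itertools.count(1)

    def assign(self):
        """Return the next timestamp; the counter advances after each call."""
        return next(self._counter)


clock = TimestampCounter()
ts_ti = clock.assign()  # Ti enters the system first
ts_tj = clock.assign()  # Tj enters the system later
print(ts_ti < ts_tj)    # True, so Ti is the older transaction
```

With the system clock instead of a counter, two Transactions arriving within the same clock tick could receive equal values, so some tie-breaking would be needed to keep timestamps unique.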

Timestamp Ordering: The Scheduler
From above: each DI, I, has 2 TSs, a W-TS(I) and an R-TS(I), which hold the highest TSs of the Ts that carried out a write and a read operation, respectively, on I. The Scheduler receives requests for access to DIs, I, as read(I, ts) and write(I, ts), where ts is the Timestamp of the requesting Transaction.

The protocol
The scheduler accepts or rejects a request as follows:
Read(I, ts): if ts < W-TS(I) then the request is rejected and the T is killed (Aborted, Rollback); otherwise the request is accepted and R-TS(I) is set equal to the greater of R-TS(I) and ts, i.e. max(R-TS(I), ts).
Write(I, ts): if ts < W-TS(I) or ts < R-TS(I) then the request is rejected and the T is killed (Aborted, Rollback); otherwise the request is accepted and W-TS(I) is set equal to ts.
N.B. In practice, NO Transaction can read or write a DI written by a T with a greater Timestamp (TS), and no Transaction can write a DI that has been read by a T with a greater TS.

Example T2
Suppose R-TS(I) is equal to 7, and W-TS(I) is equal to 5.

  Request to Scheduler   Reply from Scheduler   New Values/Comments
  read(I, 6)             OK                     R-TS(I) still 7 (no change)
  read(I, 8)             OK                     R-TS(I) = 8
  read(I, 9)             OK                     R-TS(I) = 9
  write(I, 8)            NO                     conflict: killed - rollback
  write(I, 11)           OK                     W-TS(I) = 11
  read(I, 10)            NO                     conflict: killed - rollback

Note: Initially DI, I, had been read by the T with the highest ts (7), and written by the T with the highest ts (5).

Example T3
T1 adds and prints the contents of A and B. T2 transfers 50 from B to A.
T1: read(B), read(A), print(A+B)
T2: read(B), B=B-50, write(B), read(A), A=A+50, write(A), print(A+B)

  Request            Reply   New Values
  read(t1, B, 1)     OK      r-ts(I) = 1
  read(t2, B, 2)     OK      r-ts(I) = 2
  write(t2, B, 3)    OK      w-ts(I) = 3
  read(ti, A, 4)     OK      r-ts(I) = 4
  write(t2, A, 5)    OK      w-ts(I) = 5
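The accept/reject rules of the scheduler fit in a few lines, and Example T2 can be replayed against them. This Python sketch tracks a single data item; the class name `ItemTimestamps` and its attributes are hypothetical, and a rejected request just returns `False` where a real scheduler would abort and roll back the Transaction.

```python
class ItemTimestamps:
    """R-TS and W-TS of one data item under basic timestamp ordering."""

    def __init__(self, r_ts=0, w_ts=0):
        self.r_ts = r_ts  # largest ts of any accepted read
        self.w_ts = w_ts  # largest ts of any accepted write

    def read(self, ts):
        """Reject a read of a value written by a younger transaction."""
        if ts < self.w_ts:
            return False
        self.r_ts = max(self.r_ts, ts)
        return True

    def write(self, ts):
        """Reject a write if a younger transaction already read or wrote."""
        if ts < self.w_ts or ts < self.r_ts:
            return False
        self.w_ts = ts
        return True


# Replay of Example T2: R-TS(I) = 7 and W-TS(I) = 5 at the start.
item = ItemTimestamps(r_ts=7, w_ts=5)
for kind, ts in [("read", 6), ("read", 8), ("read", 9),
                 ("write", 8), ("write", 11), ("read", 10)]:
    accepted = getattr(item, kind)(ts)
    print(kind, ts, "OK" if accepted else "NO", item.r_ts, item.w_ts)
```

The loop prints OK, OK, OK, NO, OK, NO, matching the replies in the Example T2 table, and ends with R-TS(I) = 9 and W-TS(I) = 11.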