EE324 INTRO. TO DISTRIBUTED SYSTEMS LECTURE 13 TRANSACTIONS
Midterm Midterm grading will take about a week and a half. Assignment 3 will be out. On Thursday there will be an in-class session to prepare you for the assignment.
Last lecture Distributed mutex
Lamport's Shared Priority Queue Each process i locally maintains Qi (its own version of the priority queue). To execute the critical section, you must have replies from all other processes AND your request must be at the front of Qi. When you have all replies: all other processes are aware of your request (because the request happens before the response), and you are aware of any earlier requests (assuming messages from the same process are not reordered).
Lamport's Shared Priority Queue
To enter the critical section at process i:
  Stamp your request with the current time T
  Add the request to Qi
  Broadcast REQUEST(T) to all processes
  Wait for all replies and for T to reach the front of Qi
To leave:
  Pop the head of Qi
  Broadcast RELEASE to all processes
On receipt of REQUEST(T') from process j:
  Add T' to Qi
  If waiting for a REPLY from j for an earlier request T, wait until j replies to you
  Otherwise REPLY
On receipt of RELEASE:
  Pop the head of Qi
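The entry condition above can be sketched in a few lines of Python. This is a minimal illustration, not the lecture's reference implementation: the class `Node` and its method names are hypothetical, messages are delivered by direct method calls (so they are never lost or reordered), and every node replies immediately, so the "delay your reply" rule is not modelled. The replay uses the starting clocks of the three-node example (Node1 at 40, Node2 at 11, Node3 at 14).

```python
import heapq


class Node:
    """One process in a simplified Lamport shared-priority-queue mutex."""

    def __init__(self, pid, clock):
        self.pid = pid
        self.clock = clock      # Lamport logical clock
        self.queue = []         # local queue Qi of (timestamp, pid), kept as a heap
        self.request = None     # our own pending request, if any
        self.replies = set()    # pids that have replied to our pending request

    def tick(self, seen=0):
        # Lamport clock rule: move past any timestamp seen, then advance by one.
        self.clock = max(self.clock, seen) + 1
        return self.clock

    def make_request(self):
        self.request = (self.tick(), self.pid)
        heapq.heappush(self.queue, self.request)
        self.replies = set()
        return self.request     # would be broadcast as REQUEST(T)

    def on_request(self, req):
        self.tick(req[0])       # receive event
        heapq.heappush(self.queue, req)
        return self.tick()      # timestamp carried by our REPLY

    def on_reply(self, sender, timestamp):
        self.tick(timestamp)
        self.replies.add(sender)

    def can_enter(self, n):
        # Need a reply from every other process AND our request at the head of Qi.
        return (self.request is not None
                and len(self.replies) == n - 1
                and self.queue[0] == self.request)

    def pop_head(self):
        # Used when leaving (before broadcasting RELEASE) and on receipt of RELEASE.
        head = heapq.heappop(self.queue)
        if head == self.request:
            self.request = None


# Replay: Node3 requests while Node1 and Node2 are idle.
n1, n2, n3 = Node(1, 40), Node(2, 11), Node(3, 14)
req = n3.make_request()
n3.on_reply(1, n1.on_request(req))   # Node1 queues the request and replies
n3.on_reply(2, n2.on_request(req))   # Node2 queues the request and replies
print(req, n3.can_enter(3), n3.clock)
```

Tuples compare by timestamp first and process id second, which gives the total order the shared queue relies on.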
Shared priority queue
Node1 (Q: <15,3>)
  time  Action
  40    (start)
  41    Recv <15,3>
  42    Reply to <15,3>
Node2 (Q: <15,3>)
  time  Action
  11    (start)
  16    Recv <15,3>
  17    Reply to <15,3>
Node3 (Q: <15,3>)
  time  Action
  14    (start)
  15    Request <15,3>
Shared priority queue
Node1 (Q: <15,3>)
  time  Action
  40    (start)
  41    Recv <15,3>
  42    Reply to <15,3>
Node2 (Q: <15,3>)
  time  Action
  11    (start)
  16    Recv <15,3>
  17    Reply to <15,3>
Node3 (Q: <15,3>)
  time  Action
  14    (start)
  15    Request <15,3>
  43    Recv reply 1
  44    Recv reply 2
  45    Run critical section
Shared priority queue
Node1 (Q: <15,3>, <43,1>)
  time  Action
  40    (start)
  41    Recv <15,3>
  42    Reply to <15,3>
  43    Request <43,1>
Node2 (Q: <15,3>, <18,2>)
  time  Action
  11    (start)
  16    Recv <15,3>
  17    Reply to <15,3>
  18    Request <18,2>
Node3 (Q: <15,3>, <18,2>, <43,1>)
  time  Action
  14    (start)
  15    Request <15,3>
  43    Recv reply 1
  44    Recv reply 2
  45    Run critical section
  46    Recv <43,1>
  47    Reply to 1
  48    Recv <18,2>
Shared priority queue
Node1 (Q: <15,3>, <43,1>)
  time  Action
  40    (start)
  41    Recv <15,3>
  42    Reply to <15,3>
  43    Request <43,1>
Node2 (Q: <15,3>, <18,2>, <43,1>)
  time  Action
  11    (start)
  16    Recv <15,3>
  17    Reply to <15,3>
  18    Request <18,2>
  ?0    Recv reply from 3
  ?1    Recv <43,1>; delay reply because <18,2> is my earlier request
Node3 (Q: <15,3>, <18,2>, <43,1>)
  time  Action
  14    (start)
  15    Request <15,3>
  43    Recv reply 1
  44    Recv reply 2
  45    Run critical section
  46    Recv <43,1>
  47    Reply to 1
  48    Recv <18,2>
  49    Reply to 2
Shared priority queue
Node1 (Q: <15,3>, <18,2>, <43,1>)
  time  Action
  40    (start)
  41    Recv <15,3>
  42    Reply to <15,3>
  43    Request <43,1>
        Recv <18,2>
        Reply to <18,2>
Node2 (Q: <15,3>, <18,2>, <43,1>)
  time  Action
  11    (start)
  16    Recv <15,3>
  17    Reply to <15,3>
  18    Request <18,2>
  ?0    Recv reply from 3
  ?1    Recv <43,1>
        Recv reply from 1
Node3 (Q: <15,3>, <18,2>, <43,1>)
  time  Action
  14    (start)
  15    Request <15,3>
  43    Recv reply 1
  44    Recv reply 2
  45    Run critical section
  46    Recv <43,1>
  47    Reply to 1
  48    Recv <18,2>
  49    Reply to 2
Shared Queue approach Everyone eventually sees the same ordering, ordered by Lamport's clock. Disadvantages: very unreliable (any process failure halts progress); 3(N-1) messages per entry/exit. Advantages: fair; short synchronization delay.
Lamport's Shared Priority Queue Advantages: fair; short synchronization delay. Disadvantages: very unreliable (any process failure halts progress); 3(N-1) messages per entry/exit.
Today We want to look at distributed transactions, but first we need to understand transactions in a single machine.
Today's Lecture Reading: CDK5 16.2 to 16.4. Transaction basics. Locking and deadlock in transactions.
Transactions A group of operations often represents a unit of work. A transaction is the fundamental abstraction for grouping operations into a single unit of work. begin: begins the transaction. commit: attempts to complete the transaction. rollback / abort: aborts the transaction.
Transactions A transaction is a sequence of server operations that is guaranteed by the server to be atomic in the presence of multiple clients and server crashes. It is free from interference by operations being performed on behalf of other concurrent clients. Either all of the operations must be completed successfully or they must have no effect at all in the presence of server crashes.
Transactions: The ACID Properties The four desirable properties for reliable handling of concurrent transactions. (The alternative definition of transactions.) Atomicity: all or nothing. Consistency: each transaction, if executed by itself, maintains the correctness of the database. Isolation (serializability): each transaction runs as if alone. Durability: once a transaction is done, it stays done and cannot be undone.
Bank Operations
Operations of the Account interface:
  deposit(amount)         deposit amount in the account
  withdraw(amount)        withdraw amount from the account
  getbalance() -> amount  return the balance of the account
  setbalance(amount)      set the balance of the account to amount
A client's banking transaction:
  bool xfer(Account src, Account dest, long x) {
      Transaction t = begin();
      if (src.getbalance() >= x) {
          src.setbalance(src.getbalance() - x);
          dest.setbalance(dest.getbalance() + x);
          return t.commit();
      }
      t.abort();
      return FALSE;
  }
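To make the begin / commit / abort pattern concrete, here is a minimal single-threaded Python sketch of the transfer. It assumes an undo log as the rollback mechanism, which is only one possible way to get all-or-nothing behaviour. The `Transaction` and `Account` classes are hypothetical stand-ins, not an API from the lecture or the textbook.

```python
class Account:
    def __init__(self, balance):
        self.balance = balance


class Transaction:
    """Toy transaction: records old values so abort() can undo every write."""

    def __init__(self):
        self.undo_log = []          # (account, balance before the write)

    def set_balance(self, account, amount):
        self.undo_log.append((account, account.balance))
        account.balance = amount

    def commit(self):
        self.undo_log.clear()       # writes become permanent
        return True

    def abort(self):
        # Undo writes in reverse order so the oldest value wins.
        for account, old_balance in reversed(self.undo_log):
            account.balance = old_balance
        self.undo_log.clear()


def xfer(src, dest, x):
    t = Transaction()
    if src.balance >= x:
        t.set_balance(src, src.balance - x)
        t.set_balance(dest, dest.balance + x)
        return t.commit()
    t.abort()
    return False


a, b = Account(100), Account(50)
print(xfer(a, b, 30), a.balance, b.balance)
print(xfer(a, b, 1000), a.balance, b.balance)
```

The sketch covers atomicity only. Isolation between concurrent transactions and durability across crashes need the machinery discussed in the rest of the lecture.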
The transactional model Applications are coded in a stylized way: begin the transaction, perform a series of read and update operations, and terminate by commit or abort. Terminology: the application is the transaction manager. The data manager is presented with operations from concurrently active transactions. It schedules them in an interleaved but serializable order.
Transaction and Data Managers [Figure: transactions issue read and update operations to the data (and lock) managers.] Transactions are stateful: a transaction knows about database contents and updates.
Transaction life histories
Successful          Aborted by client   Aborted by server
opentransaction     opentransaction     opentransaction
operation           operation           operation
operation           operation           operation
                                        server aborts transaction
operation           operation           operation ERROR reported to client
closetransaction    aborttransaction
opentransaction() -> trans: starts a new transaction and delivers a unique TID trans. This identifier will be used in the other operations in the transaction.
closetransaction(trans) -> (commit, abort): ends a transaction. A commit return value indicates that the transaction has committed; an abort return value indicates that it has aborted.
aborttransaction(trans): aborts the transaction.
Transactional Execution Log As the transaction runs, it creates a history of its actions. Suppose we were to write down the sequence of operations it performs. The data manager does this, one by one. This yields a schedule: the operations and the order in which they executed. From it we can infer the order in which transactions ran. Scheduling is called concurrency control.
Figure 16.5 The lost update problem
Transaction T:                          Transaction U:
balance = b.getbalance();               balance = b.getbalance();
b.setbalance(balance*1.1);              b.setbalance(balance*1.1);
a.withdraw(balance/10)                  c.withdraw(balance/10)

balance = b.getbalance();     $200
                                        balance = b.getbalance();     $200
                                        b.setbalance(balance*1.1);    $220
b.setbalance(balance*1.1);    $220
a.withdraw(balance/10)        $80
                                        c.withdraw(balance/10)        $280
Instructor's Guide for Coulouris, Dollimore, Kindberg and Blair, Distributed Systems: Concepts and Design Edn. 5 Pearson Education 2012
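The arithmetic behind the lost update can be replayed deterministically. This is a small Python sketch, not code from the textbook: the helper `raise_by_10_percent` is a hypothetical name, and it uses integer arithmetic in place of `balance*1.1` to avoid floating-point rounding. Only account b is modelled, since that is where the update is lost.

```python
def raise_by_10_percent(balance):
    # Integer stand-in for balance * 1.1 (assumes whole-dollar balances).
    return balance * 11 // 10


# Interleaved: both transactions read b before either one writes it.
b = 200
t_balance = b                        # T reads $200
u_balance = b                        # U reads $200
b = raise_by_10_percent(u_balance)   # U writes $220
b = raise_by_10_percent(t_balance)   # T writes $220, overwriting U's update
interleaved = b

# Serial: T runs to completion, then U.
b = 200
b = raise_by_10_percent(b)           # T
b = raise_by_10_percent(b)           # U sees T's result
serial = b

print(interleaved, serial)
```

Two 10% raises should compound, but the interleaved run ends at the same value as a single raise: one update has been lost.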
Figure 16.6 The inconsistent retrievals problem
Transaction V:                Transaction W:
a.withdraw(100)               abranch.branchtotal()
b.deposit(100)

a.withdraw(100);      $100
                              total = a.getbalance()          $100
                              total = total+b.getbalance()    $300
                              total = total+c.getbalance()
b.deposit(100)        $300
Concurrency control Motivation: without concurrency control, we have lost updates, inconsistent retrievals, etc. Concurrency control schemes are designed to allow two or more transactions to be executed correctly while maintaining serial equivalence. Serial equivalence is the correctness criterion: the schedule produced by a concurrency control scheme should be equivalent to a serial schedule in which transactions are executed one after the other. Schemes: locking, optimistic concurrency control, timestamp-based concurrency control.
Serially Equivalent Interleaving Means that the effect of the interleaved execution is indistinguishable from some possible serial execution of the committed transactions. For example: T1 and T2 are interleaved, but it looks like T2 ran before T1. The idea is that transactions can be coded to be correct if run in isolation, and yet will run correctly when executed concurrently (and hence gain a speedup).
Need for serially equivalent interleaving
T1: R1(X) R1(Y) W1(X) commit1
T2: R2(X) W2(X) W2(Y) commit2
DB: R1(X) R2(X) W2(X) R1(Y) W1(X) W2(Y) commit1 commit2
Data manager interleaves operations to improve concurrency.
Need for serially equivalent interleaving
T1: R1(X) R1(Y) W1(X) commit1
T2: R2(X) W2(X) W2(Y) commit2
DB: R1(X) R2(X) W2(X) R1(Y) W1(X) W2(Y) commit2 commit1
Unsafe! Not serially equivalent.
Problem: transactions may interfere. Here, T2 changes X, hence T1 should have either run first (read and write) or after (reading the changed value).
Serially equivalent interleaving
T1: R1(X) R1(Y) W1(X) commit1
T2: R2(X) W2(X) W2(Y) commit2
DB: R2(X) W2(X) R1(X) W1(X) W2(Y) R1(Y) commit2 commit1
Data manager interleaves operations to improve concurrency but schedules them so that it looks as if one transaction ran at a time. This schedule looks like T2 ran first.
Conflicting operations
A pair of operations conflicts when their combined effect depends on the ordering.
Read and write operation conflict rules:
Operations of different transactions   Conflict   Reason
read    read                           No         Because the effect of a pair of read operations does not depend on the order in which they are executed
read    write                          Yes        Because the effect of a read and a write operation depends on the order of their execution
write   write                          Yes        Because the effect of a pair of write operations depends on the order of their execution
Serial equivalence property For two transactions to be serially equivalent, it is necessary and sufficient that all pairs of conflicting operations of the two transactions be executed in the same order at all of the objects they both access.
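The property can be checked mechanically. The Python sketch below records, for every pair of conflicting operations on the same object, which transaction went first, and then looks for a cycle in those orderings. For two transactions this is exactly the "same order at all objects" test. For more than two it uses the usual precedence-graph generalization, which goes beyond what the slide states. The function names and the `(transaction, op, object)` tuple encoding are assumptions made for illustration.

```python
def conflicts(op1, op2):
    # Read/read is the only pair whose combined effect is order-independent.
    return "W" in (op1, op2)


def precedence_edges(schedule):
    """Return {(Ti, Tj)}: Ti performed a conflicting operation before Tj."""
    edges = set()
    for i, (t1, op1, obj1) in enumerate(schedule):
        for t2, op2, obj2 in schedule[i + 1:]:
            if t1 != t2 and obj1 == obj2 and conflicts(op1, op2):
                edges.add((t1, t2))
    return edges


def is_serially_equivalent(schedule):
    graph = {}
    for before, after in precedence_edges(schedule):
        graph.setdefault(before, set()).add(after)

    visiting, done = set(), set()

    def has_cycle(node):
        if node in done:
            return False
        if node in visiting:
            return True          # back edge: conflicting orders disagree
        visiting.add(node)
        found = any(has_cycle(nxt) for nxt in graph.get(node, ()))
        visiting.discard(node)
        done.add(node)
        return found

    return not any(has_cycle(node) for node in list(graph))


# The two interleavings of T1 and T2 from the preceding slides.
unsafe = [(1, "R", "X"), (2, "R", "X"), (2, "W", "X"),
          (1, "R", "Y"), (1, "W", "X"), (2, "W", "Y")]
safe = [(2, "R", "X"), (2, "W", "X"), (1, "R", "X"),
        (1, "W", "X"), (2, "W", "Y"), (1, "R", "Y")]
print(is_serially_equivalent(unsafe), is_serially_equivalent(safe))
```

In the unsafe schedule the conflicts on X and Y order the transactions both ways, while in the safe one every conflict puts T2 before T1.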
Recovery from abort Servers must record all the effects of committed transactions and none of the effects of aborted transactions. Aborted transactions can cause dirty reads and premature writes.
A dirty read when transaction T aborts
Transaction T:                          Transaction U:
a.getbalance()                          a.getbalance()
a.setbalance(balance + 10)              a.setbalance(balance + 20)

balance = a.getbalance()       $100
a.setbalance(balance + 10)     $110
                                        balance = a.getbalance()       $110
                                        a.setbalance(balance + 20)     $130
                                        commit transaction
abort transaction
U uses the result of an uncommitted transaction!
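The same numbers can be replayed as a short worked example. This Python sketch is only an illustration of the arithmetic: the variable names are hypothetical, and there is no real transaction machinery behind them.

```python
a = 100                      # committed balance before T and U start

# T: balance = a.getbalance(); a.setbalance(balance + 10)   (not yet committed)
t_balance = a
a = t_balance + 10           # $110, visible to others although T may still abort

# U: reads T's uncommitted value, adds 20, and commits.
u_balance = a                # dirty read: $110
a = u_balance + 20           # $130
u_committed = a

# T now aborts. Had U seen only committed data, it would have produced:
expected_without_t = 100 + 20

print(u_committed, expected_without_t)
```

U has committed a balance that includes T's deposit even though T never happened, and U cannot be undone because it has already committed.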
Today's Lecture Transaction basics Locking and deadlock
Schemes for Concurrency control Locking: the server attempts to gain an exclusive lock on any object that is about to be used by one of its operations in a transaction. Different lock types can be used (read/write, for example). Two-phase locking. Optimistic concurrency control. Timestamp-based concurrency control.
What about the locks? Unlike other kinds of distributed systems, transactional systems typically lock the data they access. They obtain these locks as they run: before accessing x, get a lock on x. Usually we assume that the application knows enough to get the right kind of lock. It is not good to get a read lock if you'll later need to update the object. In clever applications, one lock will often cover many objects.
Locking rule Suppose that transaction T will access object x. We need to know that first, T gets a lock that covers x. What does coverage entail? We need to know that if any other transaction T' tries to access x, it will attempt to get the same lock.
Examples of lock coverage We could have one lock per object, or one lock for the whole database (a global lock), or one lock for a category of objects. In a tree, we could have one lock for the whole tree, associated with the root. In a table, we could have one lock per row, or one for each column, or one for the whole table. All transactions must use the same rules! And if you will update the object, the lock must be a write lock, not a read lock.
Global lock?
Only let one transaction run at a time. Poor solution: performance issues.
  bool xfer(Account src, Account dest, long x) {
      lock();
      if (src.getbalance() >= x) {
          src.setbalance(src.getbalance() - x);
          dest.setbalance(dest.getbalance() + x);
          unlock();
          return TRUE;
      }
      unlock();
      return FALSE;
  }
Per-Object Locking
Other transactions can execute concurrently, as long as they don't read or write the src or dest accounts.
  bool xfer(Account src, Account dest, long x) {
      lock(src);
      if (src.getbalance() >= x) {
          src.setbalance(src.getbalance() - x);
          unlock(src);
          lock(dest);
          dest.setbalance(dest.getbalance() + x);
          unlock(dest);
          return TRUE;
      }
      unlock(src);
      return FALSE;
  }
See any problem?
Read/Write locks We can use different types of locks to increase concurrency: read/write locks. They need to respect the conflict rules.
Read/Write locks: Lock compatibility
For one object             Lock requested
                           read     write
Lock already set   none    OK       OK
                   read    OK       wait
                   write   wait     wait
Operation conflict rules:
1. If a transaction T has already performed a read operation on a particular object, then a concurrent transaction U must not write that object until T commits or aborts.
2. If a transaction T has already performed a write operation on a particular object, then a concurrent transaction U must not read or write that object until T commits or aborts.
Strict Two-Phase Locking Strict two-phase locking: a transaction holds every lock it acquires until it commits or aborts. All of its locks are then released automatically.
Why does strict 2PL imply serializability? Suppose that T' will perform an operation that conflicts with an operation that T has done: T' will update data item X that T read or updated, or T updated item Y and T' will read or update it. T must have had a lock on X/Y that conflicts with the lock that T' wants. T won't release it until it commits or aborts. So T' will wait until T commits or aborts.
Use of locks in strict two-phase locking
1. When an operation accesses an object within a transaction:
   (a) If the object is not already locked, it is locked and the operation proceeds.
   (b) If the object has a conflicting lock set by another transaction, the transaction must wait until it is unlocked.
   (c) If the object has a non-conflicting lock set by another transaction, the lock is shared and the operation proceeds.
   (d) If the object has already been locked in the same transaction, the lock will be promoted if necessary and the operation proceeds. (Where promotion is prevented by a conflicting lock, rule (b) is used.)
   Lock promotion: getting a more exclusive lock (e.g., promoting a read lock to a write lock).
2. When a transaction is committed or aborted, the server unlocks all objects it locked for the transaction.
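Rules 1(a) to 1(d) and rule 2 can be sketched as a small lock table. The Python below is a hedged, single-threaded illustration: the `LockManager` class and its method names are hypothetical, and instead of blocking, `acquire` returns `False` when the transaction would have to wait. A real server would queue the request.

```python
class LockManager:
    """Toy read/write lock table following the strict two-phase locking rules."""

    def __init__(self):
        self.locks = {}   # object -> {"mode": "read" | "write", "holders": set}

    def acquire(self, txn, obj, mode):
        """Return True if the lock is granted, False if txn must wait."""
        entry = self.locks.get(obj)
        if entry is None:                                   # rule (a)
            self.locks[obj] = {"mode": mode, "holders": {txn}}
            return True
        holders = entry["holders"]
        if txn in holders:                                  # rule (d)
            if mode == "write" and entry["mode"] == "read":
                if len(holders) > 1:
                    return False    # promotion blocked by other readers: rule (b)
                entry["mode"] = "write"
            return True
        if entry["mode"] == "read" and mode == "read":      # rule (c)
            holders.add(txn)
            return True
        return False                                        # rule (b)

    def release_all(self, txn):
        # Rule 2: called only at commit or abort, never earlier.
        for obj in list(self.locks):
            holders = self.locks[obj]["holders"]
            holders.discard(txn)
            if not holders:
                del self.locks[obj]


lm = LockManager()
print(lm.acquire("T", "x", "read"),    # granted: x was unlocked
      lm.acquire("U", "x", "read"),    # granted: read locks are shared
      lm.acquire("U", "x", "write"))   # must wait: T still holds a read lock
```

Because `release_all` is the only way to drop a lock, no transaction can release one lock and then acquire another, which is what makes the scheme two-phase and strict.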
Deadlock with write locks
Transaction T                                 Transaction U
Operations        Locks                       Operations        Locks
a.deposit(100);   write lock A
                                              b.deposit(200)    write lock B
b.withdraw(100)   waits for U's lock on B
                                              a.withdraw(200);  waits for T's lock on A
Dealing with Deadlock in two-phase locking Deadlock prevention: acquire all needed locks in a single atomic operation, or acquire locks in a particular order. Often impractical: transactions may not know which locks they will need in the future.
Dealing with Deadlock in two-phase locking Deadlock detection: keep a graph of locks held. Check for cycles periodically or each time an edge is added. Cycles can be eliminated by aborting transactions. Timeouts ("ignoring"): abort transactions when the time expires. Most transactions are short. Long-lived ones are probably deadlocked, so abort and retry.
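Deadlock detection: The wait-for graph [Figure: lock A is held by T and U waits for it; lock B is held by U and T waits for it. The wait-for graph therefore has the edges T -> U and U -> T, which form a cycle.]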
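Checking a wait-for graph for cycles can be sketched with a depth-first search. This Python example is illustrative only: the function name `find_cycle` and the dictionary encoding (transaction -> set of transactions it waits for) are assumptions, and the simple recursive search is meant for small graphs, not for production use.

```python
def find_cycle(wait_for):
    """Return a list of transactions that wait on each other in a cycle, or None."""

    def visit(node, path):
        if node in path:
            return path[path.index(node):]      # the cycle closes at `node`
        for blocker in wait_for.get(node, ()):
            cycle = visit(blocker, path + [node])
            if cycle:
                return cycle
        return None

    for start in wait_for:
        cycle = visit(start, [])
        if cycle:
            return cycle
    return None


# T waits for U's lock on B; U waits for T's lock on A.
deadlocked = {"T": {"U"}, "U": {"T"}}
print(find_cycle(deadlocked))
print(find_cycle({"T": {"U"}}))
```

Once a cycle is found, aborting any one transaction on it removes its edges and breaks the deadlock. Which victim to choose is a policy decision not covered here.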
Timeouts
Transaction T                                      Transaction U
Operations        Locks                            Operations        Locks
a.deposit(100);   write lock A
                                                   b.deposit(200)    write lock B
b.withdraw(100)   waits for U's lock on B
                                                   a.withdraw(200);  waits for T's lock on A
                  (timeout elapses)
                  T's lock on A becomes vulnerable,
                  unlock A, abort T
                                                   a.withdraw(200);  write locks A
                                                                     unlock A, B
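A timeout-based policy can be sketched with Python's standard `threading.Lock`, whose `acquire` method accepts a `timeout` argument. This is a simplified variant, not the scheme in the table: here the waiting transaction gives up on its own after the timeout, whereas in the table a lock becomes vulnerable and its holder is aborted. The helper `run_with_timeout` and its parameters are hypothetical names.

```python
import threading


def run_with_timeout(first, second, work, timeout=0.05):
    """Hold `first`, try to get `second`; abort (return False) if the wait times out."""
    with first:
        if not second.acquire(timeout=timeout):
            return False            # abort: leaving the with-block unlocks `first`
        try:
            work()
            return True             # commit
        finally:
            second.release()


lock_a, lock_b = threading.Lock(), threading.Lock()
log = []

lock_b.acquire()                    # stands in for U holding its write lock on B
aborted_run = run_with_timeout(lock_a, lock_b, lambda: log.append("T ran"))
lock_b.release()                    # U finishes and unlocks B
retry_run = run_with_timeout(lock_a, lock_b, lambda: log.append("T ran"))
print(aborted_run, retry_run, log)
```

The aborted attempt releases A on its way out, so a transaction waiting for A can make progress, and the retry succeeds once B is free.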