Concurre Concurr nt n Systems 8L for Part IB Handout 4 Dr. Steven Hand 1

Similar documents
Implementing Rollback: Copy. 2PL: Rollback. Implementing Rollback: Undo. 8L for Part IB Handout 4. Concurrent Systems. Dr.

Reminder from last time

Concurrent systems Lecture 7: Crash recovery, lock- free programming, and transac<onal memory. Dr Robert N. M. Watson

Concurrent & Distributed Systems Supervision Exercises

) Intel)(TX)memory):) Transac'onal) Synchroniza'on) Extensions)(TSX))) Transac'ons)

T ransaction Management 4/23/2018 1

Reminder from last time

) Intel)(TX)memory):) Transac'onal) Synchroniza'on) Extensions)(TSX))) Transac'ons)

Producer Consumer in Ada

) Intel)(TX)memory):) Transac'onal) Synchroniza'on) Extensions)(TSX))) Transac'ons)

DB2 Lecture 10 Concurrency Control

Concurrency Control & Recovery

CS122 Lecture 15 Winter Term,

Concurre Concurr nt n Systems 8L for Part IB Handout 3 Dr. Steven Hand 1

CS5412: TRANSACTIONS (I)

Atomic Transac1ons. Atomic Transactions. Q1: What if network fails before deposit? Q2: What if sequence is interrupted by another sequence?

The transaction. Defining properties of transactions. Failures in complex systems propagate. Concurrency Control, Locking, and Recovery

Defining properties of transactions

Problems Caused by Failures

Advances in Data Management Transaction Management A.Poulovassilis

Transaction Management & Concurrency Control. CS 377: Database Systems

) Intel)(TX)memory):) Transac'onal) Synchroniza'on) Extensions)(TSX))) Transac'ons)

Transaction Management: Concurrency Control, part 2

Locking for B+ Trees. Transaction Management: Concurrency Control, part 2. Locking for B+ Trees (contd.) Locking vs. Latching

Transactions and Recovery Study Question Solutions

Some Examples of Conflicts. Transactional Concurrency Control. Serializable Schedules. Transactions: ACID Properties. Isolation and Serializability

CS 4604: Introduc0on to Database Management Systems. B. Aditya Prakash Lecture #19: Logging and Recovery 1

Weak Levels of Consistency

In This Lecture. Transactions and Recovery. Transactions. Transactions. Isolation and Durability. Atomicity and Consistency. Transactions Recovery

Database System Concepts

Concurrency control 12/1/17

Unit 9 Transaction Processing: Recovery Zvi M. Kedem 1

RECOVERY CHAPTER 21,23 (6/E) CHAPTER 17,19 (5/E)

Transactions. Kathleen Durant PhD Northeastern University CS3200 Lesson 9

Concurrency Control & Recovery

ACID Properties. Transaction Management: Crash Recovery (Chap. 18), part 1. Motivation. Recovery Manager. Handling the Buffer Pool.

Database Management System Prof. D. Janakiram Department of Computer Science & Engineering Indian Institute of Technology, Madras Lecture No.

Slides Courtesy of R. Ramakrishnan and J. Gehrke 2. v Concurrent execution of queries for improved performance.

Part III Transactions

Carnegie Mellon Univ. Dept. of Computer Science Database Applications. General Overview NOTICE: Faloutsos CMU SCS

Chapter 6: Process Synchronization. Module 6: Process Synchronization

Introduction to Transaction Management

Topics in Reliable Distributed Systems

Operating Systems. Operating Systems Sina Meraji U of T

Transaction Management. Pearson Education Limited 1995, 2005

Transaction Processing. Introduction to Databases CompSci 316 Fall 2018

Topics. File Buffer Cache for Performance. What to Cache? COS 318: Operating Systems. File Performance and Reliability

Chapter 20 Introduction to Transaction Processing Concepts and Theory

CSC 261/461 Database Systems Lecture 21 and 22. Spring 2017 MW 3:25 pm 4:40 pm January 18 May 3 Dewey 1101

Transaction Management: Crash Recovery (Chap. 18), part 1

some sequential execution crash! Recovery Manager replacement MAIN MEMORY policy DISK

DATABASE TRANSACTIONS. CS121: Relational Databases Fall 2017 Lecture 25

Database Management Systems 2010/11

CSE 444: Database Internals. Lectures Transactions

Database Management System

Chapter 15: Transactions

NON-BLOCKING DATA STRUCTURES AND TRANSACTIONAL MEMORY. Tim Harris, 17 November 2017

Review: The ACID properties. Crash Recovery. Assumptions. Motivation. Preferred Policy: Steal/No-Force. Buffer Mgmt Plays a Key Role

Advanced Databases (SE487) Prince Sultan University College of Computer and Information Sciences. Dr. Anis Koubaa. Spring 2014

Multi-Core Computing with Transactional Memory. Johannes Schneider and Prof. Roger Wattenhofer

Database Management System Prof. D. Janakiram Department of Computer Science & Engineering Indian Institute of Technology, Madras Lecture No.

Overview of Transaction Management

Intro to Transactions

Linked Lists: The Role of Locking. Erez Petrank Technion


transaction - (another def) - the execution of a program that accesses or changes the contents of the database

Copyright 2016 Ramez Elmasri and Shamkant B. Navathe

Transactions. Lecture 8. Transactions. ACID Properties. Transaction Concept. Example of Fund Transfer. Example of Fund Transfer (Cont.

Introduction. Storage Failure Recovery Logging Undo Logging Redo Logging ARIES

6.830 Lecture Recovery 10/30/2017

6.830 Lecture Recovery 10/30/2017

DHANALAKSHMI COLLEGE OF ENGINEERING, CHENNAI

Concurrency Control. Transaction Management. Lost Update Problem. Need for Concurrency Control. Concurrency control

Transaction Processing: Concurrency Control ACID. Transaction in SQL. CPS 216 Advanced Database Systems. (Implicit beginning of transaction)

Transactional Memory. How to do multiple things at once. Benjamin Engel Transactional Memory 1 / 28

Module 6: Process Synchronization. Operating System Concepts with Java 8 th Edition

Mobile and Heterogeneous databases Distributed Database System Transaction Management. A.R. Hurson Computer Science Missouri Science & Technology

Atomicity: All actions in the Xact happen, or none happen. Consistency: If each Xact is consistent, and the DB starts consistent, it ends up

Introduction to Data Management CSE 344

Transaction Concept. Two main issues to deal with:

Foundation of Database Transaction Processing. Copyright 2012 Pearson Education, Inc.

Chapter 22. Transaction Management

CSE 444: Database Internals. Lectures 13 Transaction Schedules

Tackling Concurrency With STM. Mark Volkmann 10/22/09

Tackling Concurrency With STM

Outline. Purpose of this paper. Purpose of this paper. Transaction Review. Outline. Aries: A Transaction Recovery Method

Transaction Management Overview

CMSC 424 Database design Lecture 22 Concurrency/recovery. Mihai Pop

Transactions. Prepared By: Neeraj Mangla

Lecture 21: Transactional Memory. Topics: Hardware TM basics, different implementations

Transaction Management: Concurrency Control

Transactions. Transaction. Execution of a user program in a DBMS.

CHAPTER 3 RECOVERY & CONCURRENCY ADVANCED DATABASE SYSTEMS. Assist. Prof. Dr. Volkan TUNALI

Transactions. CS6450: Distributed Systems Lecture 16. Ryan Stutsman

Distributed Systems COMP 212. Revision 2 Othon Michail

What are Transactions? Transaction Management: Introduction (Chap. 16) Major Example: the web app. Concurrent Execution. Web app in execution (CS636)

6 Transactional Memory. Robert Mullins

Database Tuning and Physical Design: Execution of Transactions

NON-BLOCKING DATA STRUCTURES AND TRANSACTIONAL MEMORY. Tim Harris, 21 November 2014

Last Class Carnegie Mellon Univ. Dept. of Computer Science /615 - DB Applications

Transcription:

Concurrent Systems 8L for Part IB Handout 4 Dr. Steven Hand 1

2PL: Rollback Recall that transactions can abort Could be to run time conflicts (non strict 2PL), or could be programmed (e.g. on an exception) Using locking for isolation works, but means that updates are made in place ie i.e. once acquire write lock, can directly update If txaction aborts, need to make sure no effects visible Rollback is the process of returning the world to the state it in was before the start of the txaction 2

Implementing Rollback: Undo One strategy is to undo operations, e.g. Keep a log of all operations, in order: O 1, O 2,.. O n On abort, undo changes of O n, O (n 1),.. O 1 Must know how to undo an operation: Assume we log both operations and parameters Programmer can provide an explicit counter action UNDO(credit(A, x)) debit(a, x); May not be sufficient (e.g. setbalance(a, x)) Would need to record previous balance, which we may not have explicitly read within transaction 3

Implementing Rollback: Copy A more brute force approach is to take a copy of an object before [first] modification On abort, just revert to original copy Has some advantages: Doesn t require programmer effort Undo is simple, and can be efficient (e.g. if there are many operations, and/or they are complex) However can lead to high overhead if objects are large and may not be needed if don t abort! Can reduce overhead with partial copying 4

Timestamp Ordering (TSO) 2PL and Strict t 2PL are widely used in practice But can limit concurrency (certainly the latter) Andmustbeable to deal with deadlock TSO is an alternative approach: As a transaction begins, it is assigned a timestamp Timestamps are comparable, and unique (can think of as e.g. current time or as a ticket from a sequencer) Every object O records the timestamp of the last transaction to successfully access it: V(O) T can access object O iff V(T) >= V(O), where V(T) is the timestamp of T (otherwise rejected as too late ) 5

TSO Example 1 T1 transaction { s = getbalance(s); c = getbalance(c); return = s + c; } T2 transaction { debit(s, 100); credit(c, 100); return true; } Imagine S and C start off with version 10 1. T1 and T2 both start concurrently: T1 gets timestamp 27, T2 gets timestamp 29 2. T1 reads S => ok! (27 >= 10); S gets timestamp 27 3. T2 does debit S, 100 => ok! (29 >= 27); S gets timestamp 29 4. T1 reads C => ok! (27 => 10); C gets timestamp 27 5. T2 does credit C, 100 => ok! (29 >= 27); C gets timestamp 29 6. Bothtransactions transactions commit. 6

TSO Example 2 T1 transaction { s = getbalance(s); c = getbalance(c); return = s + c; } T2 transaction { debit(s, 100); credit(c, 100); return true; } As before, S and C start off with version 10 1. T1 and T2 both start concurrently: T1 gets timestamp 27, T2 gets timestamp 29 2. T1 reads S => ok! (27 >= 0); S gets timestamp 27 3. T2 does debit S, 100 => ok! (29 >= 27); S gets timestamp 29 4. T2 does credit C, 100 => ok! (29 >= 0); C gets timestamp 29 5. T1 reads C => FAIL! (27 < 29); T1 aborts 6. T2commits; T1restarts restarts, gets timestamp30 30 7

Advantages of TSO Deadlock kfree Can allow more concurrency than 2PC Can be implemented in a decentralized fashion Can be augmented to distinguish reads & writes objects have read timestamp R & write timestamp W READ(O, T) { if(v(t) < W(O)) abort; // do actual read R(O): = MAX(V(T), R(O)); } R(O) holds timestamp of latest transaction to read Unsafe to write if later txaction has read value Only safe to read if noone wrote after us WRITE(O, T) { if(v(t) < R(O)) abort; if(v(t) < W(O)) return; // do actual write W(O) := V(T); } But if later txaction wrote it, just skip write (he won!). Or? 8

However TSO needs a rollback mechanism (like 2PC) TSO does not provide strict isolation: hence subject to cascading aborts (can provide strict TSO by locking objects when access is granted still remains deadlock free) TSO decides a priori on one serialization even if others might have been possible And TSO does not perform well under contention will repeatedly have transactions aborting & retrying & In general TSO is a good choice for distributed systems [decentralized management] where conflicts are rare 9

Optimistic Concurrency Control OCC is an alternative ti to 2PC or TSO Optimistic since assume conflicts are rare Execute transaction on a shadow hd [copy] of the data On commit, check if all OK ; if so, apply updates; otherwise discard shadows & retry OK means: Allshadows read were mutually consistent, and No one else has committed changes to any object that we are hoping to update Advantages: no deadlock, no cascading aborts And rollback comes pretty much for free! 10

Implementing OCC Various efficient i schemes for shadowing e.g. write buffering, page based copy on write. Complexity arises in performing validation when a transaction T finishes & tries to commit Read Validation: Must ensure that all versions of data read by T (all shadows) were valid at some particular time t This becomes the tentative start time for T Serializability Validation: Must ensure that there are no conflicts with any transactions which have an earlier start time 11

OCC Example (1) All objects are tagged with a version Validation timestamp of the transaction which most recently wrote its updates to that object Many threads execute transactions When wish to read an object, take a shadow copy, and take note of the version number If wish to write: first take copy, then update that When a thread finishes a transaction, it submits the versions to a single threaded validator 12

OCC Example (2) Validator keeps track of last k validated transactions, their timestamps, and the objects they updated Transaction Validation Timestamp Objects Updated Writeback Done? T5 10 A, B, C Yes T6 11 D Yes T7 12 A, E No The versions of the objects are as follows: T7 has started, but not finished, writeback (A has been updated, but not E) Object Version A 12 B 10 C 10 D 11 E 9 13

OCC Example (3) Consider T8: { write(b), write(e) }; T8 executes and makes shadows of B & E Records timestamps: B@10, E@9 When done, T8 submits for validation Phase 1: read validation Check shadows are part of a consistent snapshot Latest committed start time is 11 = ok (10, 9 < 11) Phase 2: serializability validation Check T8 against all later transactions (here, T7) Conflict detected! (T7 updates E, but T8 read old E) 14

Issues with OCC Preceding example uses a simple validator Possible will abort even when don t need to (e.g. can search for a better start time) In general OCC can find more serializable schedules than TSO Timestamps assigned after the fact, and taking the actual data read and written into account However OCC is not suitable when high conflict Can perform lots of work with stale data => wasteful! Livelock possible if conflicting set continually retries 15

Isolation & Concurrency: Summary 2PL explicitly itl locks items as required, then releases Guarantees a serializable schedule Strict 2PCavoids cascading aborts Can limit concurrency; & prone to deadlock TSO assigns timestamps when transactions start Cannot deadlock, but may miss serializable schedules Suitable for distributed/decentralized systems OCC executes with shadow copies, then validates Validation assigns timestamps when transactions end Lots of concurrency, & admitsmany many serializable able schedules No deadlock but potential livelock when contention is high 16

Crash Recovery & Logging Transactions require ACID properties So far have focused on I (and implicitly C). How can we ensure Atomicity & Durability? Need to make sure that if a transaction always done entirely or not at all Need to make sure that a transaction reported as committed remains so, even after a crash Consider for now a fail stop model: If system crashes, all in memory contents are lost Data on disk, however, remains available after reboot 17

Using Persistent Storage Simplest solution : write all updated dobjects to disk on commit, read back on reboot Doesn t work, since crash could occur during write Can fail to provide Atomicity and/or Consistency Instead split update into two stages 1. Write proposed updates to a write ahead log 2. Write actual updates Crash hduring #1 => no actual updates dt done Crash during #2 => use log to redo, or undo 18

Write Ahead Logging Od Ordered dappend only file on disk Contains entries like <txid, obj, op, old, new> ID of transaction, object modified, (optionally) the operation performed, the old value and the new value This means we can both roll forward (redo operations) and rollback (undo operations) When persisting a transaction to disk: First log a special entry <txid, START> Next log a number of entries to describe operations Finally log another special entry <txid, COMMIT> 19

Using a Write Ahead Log When executing transactions, ti perform updates dt to objects in memory with lazy write back i.e. the OS can push changes to disk whenever it wants Initially can do the same with the log entries But when wish to commit a transaction, must first snhronosl synchronously flush a commit record to the log Assume there is a fsync operation or similar which allows us to force data out to disk Only report transaction as committed when fsync returns Can improve performance by delaying flush until we have a number oftransaction to commit Hence at any point in time we have some prefix of the writeahead log on disk, and the rest in memory 20

The Big Picture RAM acts as a cache of disk (e.g. no copy of z) Log conceptually infinite, and spans RAM & Disk Object Values RAM x = 3 y = 27 Log Entries T3, START T2, ABORT T2, y, 17, 27 T1, x, 2, 3 Disk Object Values s x = 1 y = 17 z = 42 On disk values may be older versions of objects Log Entries T2, z, 40, 42 T2, START T1, START T0, COMMIT T0, x, 1, 2 T0, START Newer Log Entries Older Log Entries 21

Checkpoints As described, d log will get very long And need to process every entry in log to recover Better to periodically write a checkpoint Flush all current in memory log records to disk Write a special checkpoint record to log which contains a list of active transactions Flush all dirty objects (i.e. ensure object values on disk are up to date) Flush hlocation of new checkpoint record to disk (Not fatal if crash during final write) 22

Checkpoints and Recovery Key benefit of a checkpoint is it lets us focus our attention on possibly affected txactions Time Checkpoint Time Failure Time T1 T1: no action required T2 T2: REDO Active at checkpoint. Has since committed; and record in log. T3 T4 T5 T3: UNDO T4: REDO T1:UNDO Ati Active at checkpoint; in progress at crash. Not active at checkpoint. But has since committed, and commit record in log. Not active at checkpoint, and still in progress. 23

Recovery Algorithm Initialize ii undo list U = { set of active txactions } Also have redo list R, initially empty Walk log forward from checkpoint record: If see a START record, add txaction to U If see a COMMIT record, move txaction from U >R Whenhitendoflog hit end log, perform undo: Walk backward and undo all records for all Tx in U Whenreach checkpointrecord again, Redo: Walk forward, and re do all records for all Tx in R 24

Transactions: Summary Standard dmutual exclusion techniques not great for dealing with >1 object intricate locking (& lock order) required, or single coarse grained lock, limiting concurrency Transactions allow us a better way: potentially many operations (reads and updates) on many objects, but should execute as if atomically underlying system deals with providing isolation, allowing safe concurrency, and even fault tolerance! Transactions widely used in database systems 25

Advanced Topics Will briefly look at two advanced topics lock free data structures, and transactional memory This is informational l& not examinable! but worth knowing at least something about (Those of you who are super keen are invited to attend Tim Harris s s ACS course: 4pm 6pm on Thu Nov 3, 10 and 17; in SW01) 26

Lock free Programming What s wrong with ihlocks? Difficult to get right (if locks are fine grained) Don t scale well (if locks too coarse grained) Don t compose well (deadlock!) Poor cache behavior (e.g. convoying) Priority inversion And can be expensive Lock free programming involves getting rid of locks... but not at the cost of safety! 27

Assumptions We have a shared memory system Low level (assembly instructions) include: val = read(addr); // atomic read from memory (void) write(addr, val); // atomic write to memory done = CAS(addr, old, new); // atomic compare-and-swap Compare and Swap (CAS) is atomic reads value of addr ( val ), compares with old, and updates memory to new iff old==val without interruption! somethinglikethis this instruction commononmostmodern on most modern processors (e.g. cmpxchg on x86) Typically used to build spinlocks (or mutexes, or semaphores, or sequencers, or whatever...) 28

Lock free Approach Directly use CAS to update shared date As an example consider a lock freelinked linked list of integer values list is singly linked, and sorted td Represents the set abstract data type, i.e. find(int) > bool insert(int) > bool delete(int) > bool 29

Searching a sorted list find(20): 20? H 10 30 T find(20) > > false fl Non blocking data structures and transactional memory 30

Inserting an item with CAS insert(20): 30 20 H 10 30 T 20 insert(20) > true Non blocking data structures and transactional memory 31

Inserting an item with CAS insert(20): insert(25): 30 20 30 25 H 10 30 T 25 20 Non blocking data structures and transactional memory 32

Searching and finding together find(20) > fl false insert(20) > true 20? H 10 30 T 20 Non blocking data structures and transactional memory 33

Searching and finding together find(20) > fl false insert(20) > true 20? This thread saw 20 was not in the set......but this thread succeeded in putting it in! H 10 30 T 20 Is this a correct implementation of a set? Should the programmer be surprised dif this happens? What about more complicated mixes of operations? Non blocking data structures and transactional memory 34

Linearizability As with ihtransactions, we return to a conceptual model to define correctness a lock free data structure is correct if all changes (and return values) consistent with some serial view: we call this a linearizable schedule Hence in the previous example, we were ok: can just tdeem the find() to have occurred first Gets a lot more complicated for more complicated data structures & operations! see Tim Harris s course for more gory details... 35

Transactional Memory (TM) Steal idea from dtb databases! Instead of: lock(&mylock); shared[i] *= shared[j] + 17; unlock(&mylock); Use: atomic { shared[i] *= shared[j] + 17; } Has obvious semantics, i.e. all operations within block occur as if atomically Transactional since under the hood it looks like: do { txid = tx_begin(&thd); { g ( ); shared[i] *= shared[j] + 17; } while!(tx_commit(txid));

TM Advantages Simplicity: programmer just puts atomic { } around anything h/h he/she wants to occur in isolation Composability: unlike locks, atomic { } blocks nest, e.g: credit(a, x) = atomic { setbal(a, readbal(a) + x); } debit(a, x) = atomic { setbal(a, readbal(a) - x); } transfer(a, b, x) = atomic { debit(a, x); credit(b, x); }

TM Advantages Cannot deadlock: No locks, so don t have to worry about locking order (Though may get livelock if not careful) No races (kinda): Cannot forget to take a lock (although you can forget to put atomic { } around your critical section ; ) Scalability: High performance possible via OCC No need to worry about complex fine grained locking

TM is very promising Essentially ill does ACI C but no D no need to worry about crash recovery can work entirely in memory some hardware support emerging (or promised) But not a panacea Contention management can get ugly Difficulties with irrevocable actions (e.g. IO) Still working out exact semantics (type (yp of atomicity, handling exceptions, signaling,...) For more details, see Tim Harris s course

Concurrent Systems: Summary Concurrency is essential in modern systems overlapping I/O with computation exploiting multi core building distributed systems Butthrows throws upa lot of challenges need to ensure safety, allow synchronization, and avoid issues of liveness (deadlock, livelock,...) Major risk of over engineering generally worth building sequential system first and worth using existing libraries, tools and design patterns rather than rolling your own! 40