Concurrent systems
Lecture 7: Crash recovery, lock-free programming, and transactional memory
Dr Robert N. M. Watson

Reminder from last time
- History graphs; good (and bad) schedules
- Isolation vs. strict isolation; enforcing isolation
- Two-phase locking; rollback
- Timestamp ordering (TSO)
- Optimistic concurrency control (OCC)
- Isolation and concurrency summary

This time
- Transactional durability: crash recovery and logging
  - Write-ahead logging
  - Checkpoints
  - Recovery
- Advanced topics
  - Lock-free programming
  - Transactional memory
- A few notes on supervision exercises

Crash Recovery & Logging
- Transactions require ACID properties
  - So far we have focused on I (and implicitly C)
- How can we ensure Atomicity & Durability?
  - Need to make sure that a transaction is always done entirely or not at all
  - Need to make sure that a transaction reported as committed remains so, even after a crash
- Consider for now a fail-stop model:
  - If the system crashes, all in-memory contents are lost
  - Data on disk, however, remains available after reboot
- The small print: we must keep in mind the limitations of fail-stop, even as we assume it. Failing hardware/software do weird stuff. Pay attention to hardware price differentiation.

Using persistent storage
- Simplest "solution": write all updated objects to disk on commit, read back on reboot
  - Doesn't work, since a crash could occur during the write
  - Can fail to provide Atomicity and/or Consistency
- Instead, split the update into two stages:
  1. Write proposed updates to a write-ahead log
  2. Write actual updates
- Crash during #1 => no actual updates done
- Crash during #2 => use log to redo, or undo

Write-ahead logging
- Log: an ordered, append-only file on disk
- Contains entries like <txid, obj, op, old, new>
  - ID of transaction, object modified, (optionally) the operation performed, the old value and the new value
  - This means we can both roll forward (redo operations) and roll back (undo operations)
- When persisting a transaction to disk:
  - First log a special entry <txid, START>
  - Next log a number of entries to describe operations
  - Finally log another special entry <txid, COMMIT>
- We build composite-operation atomicity from the fundamental atomic unit: a single-sector write
  - Much like building high-level primitives over LL/SC or CAS!
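The log entry format above can be sketched in a few lines of Python. This is only an illustration: the names LogRecord, log_transaction, redo and undo are hypothetical, the log is an in-memory list rather than a file on disk, and the optional "op" field is left out.

```python
from dataclasses import dataclass
from typing import Any, List, Optional, Tuple


@dataclass(frozen=True)
class LogRecord:
    txid: str
    kind: str                    # "START", "UPDATE" or "COMMIT"
    obj: Optional[str] = None    # object modified (UPDATE only)
    old: Any = None              # value before the update, used to undo
    new: Any = None              # value after the update, used to redo


def log_transaction(log: List[LogRecord], txid: str,
                    updates: List[Tuple[str, Any, Any]]) -> None:
    """Append START, one UPDATE per (obj, old, new) triple, then COMMIT."""
    log.append(LogRecord(txid, "START"))
    for obj, old, new in updates:
        log.append(LogRecord(txid, "UPDATE", obj, old, new))
    log.append(LogRecord(txid, "COMMIT"))


def redo(store: dict, rec: LogRecord) -> None:
    store[rec.obj] = rec.new     # roll forward


def undo(store: dict, rec: LogRecord) -> None:
    store[rec.obj] = rec.old     # roll back


log: List[LogRecord] = []
log_transaction(log, "T0", [("x", 1, 2)])

store = {"x": 1}
redo(store, log[1])
after_redo = dict(store)
undo(store, log[1])
after_undo = dict(store)
```

Keeping both the old and the new value in each UPDATE record is what lets the same record drive either direction of recovery.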

Using a write-ahead log
- When executing transactions, perform updates to objects in memory with lazy write back
  - I.e. the OS can delay disk writes to improve efficiency
- Invariant: write log records before corresponding data
- But when we wish to commit a transaction, we must first synchronously flush a commit record to the log
  - Assume there is an fsync() or fdatasync() operation or similar which allows us to force data out to disk
  - Only report the transaction as committed when fsync() returns
- Can improve performance by delaying the flush until we have a number of transactions to commit (batching)
- Hence at any point in time we have some prefix of the write-ahead log on disk, and the rest in memory

The Big Picture
- RAM acts as a cache of disk (e.g. no in-memory copy of z)
- Log is conceptually infinite, and spans RAM & disk
- On-disk values may be older versions of objects, or new uncommitted values as long as the on-disk log describes rollback (e.g., z)

[Figure: contents of RAM and disk; log entries listed newest first]
RAM
  Object values: x = 3, y = 27
  Log entries: T3, START; T2, ABORT; T2, y, 17, 27; T1, x, 2, 3
Disk
  Object values: x = 1, y = 17, z = 42
  Log entries: T2, z, 40, 42; T2, START; T1, START; T0, COMMIT; T0, x, 1, 2; T0, START
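The commit rule from "Using a write-ahead log" (flush the commit record synchronously, and only then report success) might look roughly like this in Python. The WriteAheadLog class, its comma-separated record format and the batch_size parameter are assumptions made for this sketch; os.fsync is the standard library call that forces file data to disk.

```python
import os
import tempfile


class WriteAheadLog:
    """Hypothetical append-only log with synchronous, optionally batched commit."""

    def __init__(self, path):
        self.f = open(path, "a", encoding="utf-8")
        self.pending = 0                  # commit records not yet forced to disk

    def append(self, *fields):
        # Buffered write: the record may still be only in memory.
        self.f.write(",".join(str(x) for x in fields) + "\n")

    def commit(self, txid, batch_size=1):
        """Return True only once the COMMIT record has been forced to disk."""
        self.append(txid, "COMMIT")
        self.pending += 1
        if self.pending >= batch_size:
            self.flush()
            return True
        return False                      # batched: must not be reported committed yet

    def flush(self):
        self.f.flush()                    # user-space buffer -> OS
        os.fsync(self.f.fileno())         # OS cache -> disk
        self.pending = 0


path = os.path.join(tempfile.mkdtemp(), "wal.log")
wal = WriteAheadLog(path)
wal.append("T0", "START")
wal.append("T0", "x", 1, 2)
durable = wal.commit("T0")

with open(path, encoding="utf-8") as fh:
    on_disk = fh.read().splitlines()
```

With batch_size greater than 1, commit() returns False until enough transactions have accumulated, which mirrors the batching trade-off: fewer flushes, but a longer wait before a transaction may be reported as committed.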

Checkpoints
- As described, the log will get very long
  - And we need to process every entry in the log to recover
- Better to periodically write a checkpoint:
  - Flush all current in-memory log records to disk
  - Write a special checkpoint record to the log, which contains a list of active transactions
  - Flush all dirty objects (i.e. ensure object values on disk are up to date)
  - Flush location of new checkpoint record to disk
- (Not fatal if crash during final write)

Checkpoints and recovery
- Key benefit of a checkpoint is that it lets us focus our attention on possibly affected transactions

[Figure: timeline of transactions T1 to T5 against the checkpoint time and the failure time]
- T1: no action required.
- T2: REDO. Active at checkpoint. Has since committed, and commit record in log.
- T3: UNDO. Active at checkpoint; in progress at crash.
- T4: REDO. Not active at checkpoint. But has since committed, and commit record in log.
- T5: UNDO. Not active at checkpoint, and still in progress.
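The T1 to T5 classification can be reproduced with a short scan over the part of the log written after the checkpoint. This is a sketch only: the classify function and the (txid, kind) tuple format are hypothetical, and the order of records in log_tail is one order consistent with the figure, not something the slides specify.

```python
def classify(active_at_checkpoint, log_after_checkpoint):
    """Map each possibly affected transaction to "UNDO" or "REDO"."""
    # Anything active at the checkpoint is assumed unfinished until proven otherwise.
    status = {txid: "UNDO" for txid in active_at_checkpoint}
    for txid, kind in log_after_checkpoint:
        if kind == "START":
            status[txid] = "UNDO"     # started after the checkpoint
        elif kind == "COMMIT":
            status[txid] = "REDO"     # commit record is in the log
    return status


# T1 finished before the checkpoint, so it appears nowhere below.
active = ["T2", "T3"]
log_tail = [("T4", "START"), ("T2", "COMMIT"),
            ("T5", "START"), ("T4", "COMMIT")]
actions = classify(active, log_tail)
```

A transaction that is absent from the result, such as T1, needs no action because the checkpoint already flushed its effects to disk.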

Recovery algorithm
- Initialize undo list U = { set of active transactions }
- Also have redo list R, initially empty
- Walk log forward from checkpoint record:
  - If see a START record, add transaction to U
  - If see a COMMIT record, move transaction from U to R
- When hit end of log, perform undo:
  - Walk backward and undo all records for all Tx in U
- When reach checkpoint record again, redo:
  - Walk forward, and redo all records for all Tx in R
- After recovery, we have effectively checkpointed
  - On-disk store is consistent, so can truncate the log
- The order in which we apply undo/redo records is important to properly handling cases where multiple transactions touch the same data

Write-ahead logging: assumptions
- What can go wrong writing commits to disk?
- Even if sector writes are atomic:
  - All affected objects may not fit in a single sector
  - Large objects may span multiple sectors
  - Trend towards copy-on-write, rather than journaled, FSes
- Many of the problems seen with in-memory commit (ordering and atomicity) apply to disks as well!
- Contemporary disks may not be entirely honest about sector size and atomicity
  - E.g., unstable write caches to improve efficiency
  - E.g., larger or smaller sector sizes than advertised
  - E.g., non-atomicity when writing to mirrored disks
- This assumes fail-stop, which is not true for some media
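As a rough illustration of the recovery algorithm (forward scan, backward undo, forward redo), the sketch below replays a small log against a dictionary standing in for the on-disk store. The tuple record format, the recover function and the checkpoint_index parameter are assumptions for this example. It also assumes there are no ABORT records, and that the undo walk covers the whole log, because a transaction active at the checkpoint may have records before it.

```python
def recover(disk, log, checkpoint_index, active_at_checkpoint):
    """Bring `disk` to a consistent state; return the final (U, R) sets."""
    undo_list = set(active_at_checkpoint)     # U
    redo_list = set()                         # R

    # 1. Walk forward from the checkpoint to build U and R.
    for rec in log[checkpoint_index:]:
        txid, kind = rec[0], rec[1]
        if kind == "START":
            undo_list.add(txid)
        elif kind == "COMMIT":
            undo_list.discard(txid)
            redo_list.add(txid)

    # 2. Walk backward, restoring old values written by transactions in U.
    for rec in reversed(log):
        if rec[1] == "UPDATE" and rec[0] in undo_list:
            _txid, _kind, obj, old, _new = rec
            disk[obj] = old

    # 3. Walk forward from the checkpoint, reapplying new values for R.
    for rec in log[checkpoint_index:]:
        if rec[1] == "UPDATE" and rec[0] in redo_list:
            _txid, _kind, obj, _old, new = rec
            disk[obj] = new

    return undo_list, redo_list


log = [
    ("T2", "START"), ("T2", "UPDATE", "x", 1, 2),
    ("T3", "START"), ("T3", "UPDATE", "y", 10, 11),
    # checkpoint taken here (index 4): T2 and T3 are active
    ("T2", "UPDATE", "z", 5, 6), ("T2", "COMMIT"),
    ("T4", "START"), ("T4", "UPDATE", "x", 2, 3), ("T4", "COMMIT"),
    ("T5", "START"), ("T5", "UPDATE", "w", 0, 7),
]
# State found on disk after the crash: T5's uncommitted w was written back,
# while the committed updates to z and x were not.
disk = {"x": 2, "y": 11, "z": 5, "w": 7}
undone, redone = recover(disk, log, 4, {"T2", "T3"})
```

Here x is touched by both T2 and T4; running undo backward before redo forward is what leaves it at T4's committed value.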

Transactions: summary
- Standard mutual exclusion techniques are not great for dealing with >1 object
  - intricate locking (& lock order) required, or
  - single coarse-grained lock, limiting concurrency
- Transactions allow us a better way:
  - potentially many operations (reads and updates) on many objects, but should execute as if atomically
  - underlying system deals with providing isolation, allowing safe concurrency, and even fault tolerance!
- Transactions used in databases + filesystems

Advanced topics
- Will briefly look at two advanced topics:
  - lock-free data structures, and
  - transactional memory
- Then, next time, on to a case study

Lock-free programming
- What's wrong with locks?
  - Difficult to get right (if locks are fine-grained)
  - Don't scale well (if locks too coarse-grained)
  - Don't compose well (deadlock!)
  - Poor cache behavior (e.g. convoying)
  - Priority inversion
  - And can be expensive
- Lock-free programming involves getting rid of locks... but not at the cost of safety!
- Recall TAS, CAS, LL/SC from our first lecture: what if we used them to implement something other than locks?

Assumptions
- We have a shared memory system
- Low-level (assembly instructions) include:
    val = read(addr);            // atomic read from memory
    (void) write(addr, val);     // atomic write to memory
    done = CAS(addr, old, new);  // atomic compare-and-swap
- Compare-and-Swap (CAS) is atomic
  - reads value of addr ('val'), compares with 'old', and updates memory to 'new' iff old==val -- without interruption!
  - something like this instruction is common on most modern processors (e.g. cmpxchg on x86 or LL/SC on RISC)
- Typically used to build spinlocks (or mutexes, or semaphores, or whatever...)
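To make the CAS semantics concrete, here is a small Python simulation. Python has no user-level CAS instruction, so the AtomicCell class is hypothetical and uses an internal lock purely to stand in for the hardware's atomicity; the part to look at is the caller's read/CAS retry loop, which uses CAS for something other than building a lock.

```python
import threading


class AtomicCell:
    """One word of shared memory. The private lock only simulates the
    atomicity that real hardware gives read, write and CAS."""

    def __init__(self, value):
        self._value = value
        self._hw = threading.Lock()

    def read(self):
        with self._hw:
            return self._value

    def cas(self, old, new):
        """Set the cell to `new` iff it currently holds `old`; report success."""
        with self._hw:
            if self._value == old:
                self._value = new
                return True
            return False


def increment(cell):
    """Lock-free style increment: retry until our CAS wins the race."""
    while True:
        seen = cell.read()
        if cell.cas(seen, seen + 1):
            return seen + 1
        # CAS failed: another thread updated the cell in between, so retry.


def worker(cell, n):
    for _ in range(n):
        increment(cell)


counter = AtomicCell(0)
threads = [threading.Thread(target=worker, args=(counter, 1000)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
final_count = counter.read()
```

No thread ever waits while holding anything: a failed CAS just means some other thread made progress, and the loser retries with the fresh value.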

Lock-free approach
- Directly use CAS to update shared data
- As an example consider a lock-free linked list of integer values
  - list is singly linked, and sorted
  - Use CAS to update pointers
  - Handle CAS failure cases (i.e., races)
- Represents the 'set' abstract data type, i.e.
    find(int) -> bool
    insert(int) -> bool
    delete(int) -> bool
- Assumption: hardware supports atomic operations on pointer-size types

Searching a sorted list
- find(20): is 20 in the list?
  [Figure: list H -> 10 -> 30 -> T; the search stops between 10 and 30]
- find(20) -> false

Non-blocking data structures and transactional memory

Inserting an item with CAS
- insert(20): CAS 30 -> 20 (succeeds)
  [Figure: list H -> 10 -> 30 -> T, with new node 20 linked in between 10 and 30]
- insert(20) -> true

Inserting an item with CAS (concurrent inserts)
- insert(20): CAS 30 -> 20 (succeeds)
- insert(25): CAS 30 -> 25 (fails)
  [Figure: list H -> 10 -> 30 -> T, with new nodes 20 and 25 both trying to link in after 10]
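The find and insert operations on the sorted list can be sketched as follows. As before, CAS is simulated: AtomicRef is a hypothetical class whose internal lock stands in for a hardware pointer-sized CAS, and SortedSet, Node and keys() are names invented for the example. Deletion is deliberately left out, since removing nodes safely alongside concurrent inserts needs more than a single CAS.

```python
import threading


class AtomicRef:
    """A pointer-sized cell; the private lock only simulates hardware CAS."""

    def __init__(self, value=None):
        self._value = value
        self._hw = threading.Lock()

    def get(self):
        with self._hw:
            return self._value

    def cas(self, old, new):
        with self._hw:
            if self._value is old:        # pointer comparison
                self._value = new
                return True
            return False


class Node:
    def __init__(self, key, nxt=None):
        self.key = key
        self.next = AtomicRef(nxt)


class SortedSet:
    def __init__(self):
        self.tail = Node(float("inf"))                 # sentinel T
        self.head = Node(float("-inf"), self.tail)     # sentinel H

    def _search(self, key):
        """Return (prev, curr) with prev.key < key <= curr.key."""
        prev = self.head
        curr = prev.next.get()
        while curr.key < key:
            prev, curr = curr, curr.next.get()
        return prev, curr

    def find(self, key):
        _, curr = self._search(key)
        return curr.key == key

    def insert(self, key):
        while True:
            prev, curr = self._search(key)
            if curr.key == key:
                return False                           # already present
            node = Node(key, curr)
            if prev.next.cas(curr, node):              # e.g. CAS 30 -> 20
                return True
            # CAS failed: prev.next changed under us, so search again.

    def keys(self):
        out, node = [], self.head.next.get()
        while node is not self.tail:
            out.append(node.key)
            node = node.next.get()
        return out


s = SortedSet()
s.insert(10)
s.insert(30)
found_before = s.find(20)
inserted = s.insert(20)
inserted_again = s.insert(20)
```

In the concurrent case from the slide, the thread inserting 25 loses the CAS on 10's next pointer, searches again, finds 20 -> 30 and links 25 in between on its retry.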

Concurrent find+insert
- find(20) -> false: this thread saw 20 was not in the set...
- insert(20) -> true: ...but this thread succeeded in putting it in!
  [Figure: list H -> 10 -> 30 -> T, with node 20 being linked in while find(20) runs]
- Is this a correct implementation of a set?
- Should the programmer be surprised if this happens?
- What about more complicated mixes of operations?

Linearisability
- As with transactions, we return to a conceptual model to define correctness
  - a lock-free data structure is 'correct' if all changes (and return values) are consistent with some serial view: we call this a linearisable schedule
- Hence in the previous example, we were ok:
  - can just deem the find() to have occurred first
- Gets a lot more complicated for more complicated data structures & operations!
- NB: On current hardware, synchronisation does more than just provide atomicity
  - Also provides ordering: 'happens-before'
  - Lock-free structures must take this into account as well

Transactional Memory (TM)
- Steal idea from databases!
- Instead of:
    lock(&mylock);
    shared[i] *= shared[j] + 17;
    unlock(&mylock);
- Use:
    atomic {
        shared[i] *= shared[j] + 17;
    }
- Has 'obvious' semantics, i.e. all operations within block occur as if atomically
- Transactional since under the hood it looks like:
    do {
        txid = tx_begin(&thd);
        shared[i] *= shared[j] + 17;
    } while (!tx_commit(txid));
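The do/while retry loop can be illustrated with a toy software TM in Python. Everything here (TVar, Tx, atomic) is hypothetical and far simpler than a real implementation: reads record a version number, writes are buffered, and commit validates the read set under a single commit lock in the style of OCC. A running transaction may briefly observe inconsistent values before it fails validation, which a production TM would need to prevent.

```python
import threading

_commit_lock = threading.Lock()   # serialises commits only; bodies run without it


class TVar:
    """A transactional variable with a version counter."""

    def __init__(self, value):
        self.value = value
        self.version = 0


class Tx:
    def __init__(self):
        self.reads = {}    # TVar -> version observed at first read
        self.writes = {}   # TVar -> buffered new value

    def read(self, var):
        if var in self.writes:
            return self.writes[var]
        self.reads.setdefault(var, var.version)   # version first, then value
        return var.value

    def write(self, var, value):
        self.writes[var] = value

    def commit(self):
        """Validate the read set, then publish the writes; False means retry."""
        with _commit_lock:
            if any(var.version != seen for var, seen in self.reads.items()):
                return False
            for var, value in self.writes.items():
                var.value = value
                var.version += 1
            return True


def atomic(body):
    """do { tx = begin; body(tx); } while (!commit(tx));"""
    while True:
        tx = Tx()
        result = body(tx)
        if tx.commit():
            return result


shared = [TVar(2), TVar(3)]
i, j = 0, 1


def body(tx):
    # shared[i] *= shared[j] + 17
    tx.write(shared[i], tx.read(shared[i]) * (tx.read(shared[j]) + 17))


atomic(body)
```

If another transaction commits a change to shared[i] or shared[j] between the reads and the commit, validation fails and atomic() simply runs the body again.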

TM advantages
- Simplicity:
  - Programmer just puts atomic { } around anything he/she wants to occur in isolation
- Composability:
  - Unlike locks, atomic { } blocks nest, e.g.:
      credit(a, x) = atomic { setbal(a, readbal(a) + x); }
      debit(a, x) = atomic { setbal(a, readbal(a) - x); }
      transfer(a, b, x) = atomic { debit(a, x); credit(b, x); }

TM advantages (continued)
- Cannot deadlock:
  - No locks, so don't have to worry about locking order
  - (Though may get livelock if not careful)
- No races (mostly):
  - Cannot forget to take a lock (although you can forget to put atomic { } around your critical section ;-))
- Scalability:
  - High performance possible via OCC
  - No need to worry about complex fine-grained locking
- There is still a simplicity vs. performance tradeoff
  - Too much atomic { } and the implementation can't find concurrency. Too little, and race conditions.
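The nesting in the credit/debit/transfer example can be tried out in Python with a stand-in for atomic { }. Note the substitution: this sketch emulates atomic blocks with one global reentrant lock, which gives the same as-if-atomic semantics and lets blocks nest, but none of the concurrency a real TM would aim for. The atomic context manager and the balances dictionary are assumptions made for the example.

```python
import threading
from contextlib import contextmanager

_global = threading.RLock()   # stand-in for a TM runtime, not how real TM works


@contextmanager
def atomic():
    # Reentrant, so an atomic block may call functions that open their own.
    with _global:
        yield


balances = {"a": 100, "b": 50}


def credit(acct, x):
    with atomic():
        balances[acct] = balances[acct] + x


def debit(acct, x):
    with atomic():
        balances[acct] = balances[acct] - x


def transfer(src, dst, x):
    with atomic():            # outer block: debit + credit appear as one step
        debit(src, x)
        credit(dst, x)


def total():
    with atomic():
        return sum(balances.values())


def worker(src, dst, n):
    for _ in range(n):
        transfer(src, dst, 1)


threads = [threading.Thread(target=worker, args=("a", "b", 100)) for _ in range(2)]
threads += [threading.Thread(target=worker, args=("b", "a", 100)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
final_total = total()
```

transfer() reuses debit() and credit() unchanged, and no observer using total() can see money that has left one account but not yet arrived in the other.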

TM is very promising
- Essentially does 'ACI' but no D
  - no need to worry about crash recovery
  - can work entirely in memory
  - some hardware support emerging (or promised)
- But not a panacea
  - Contention management can get ugly
  - Difficulties with irrevocable actions (e.g. IO)
  - Still working out exact semantics (type of atomicity, handling exceptions, signaling, ...)
- Recent x86 hardware has started to provide direct support for transactions; not widely used
  - And promptly withdrawn in errata
  - Now back on the street again, but very new

Supervision questions + exercises
- Supervision questions
  - S1: Threads and synchronisation
    - Semaphores, priorities, and work distribution
  - S2: Transactions
    - ACID properties, 2PL, TSO, and OCC
  - Other C&DS topics also important, of course!
- Optional Java practical exercises
  - Java concurrency primitives and fundamentals
  - Threads, synchronisation, guarded blocks, producer-consumer, and data races

Concurrent systems: summary
- Concurrency is essential in modern systems
  - overlapping I/O with computation
  - exploiting multi-core
  - building distributed systems
- But throws up a lot of challenges
  - need to ensure safety, allow synchronization, and avoid issues of liveness (deadlock, livelock, ...)
- Major risk of over-engineering
  - generally worth building sequential system first
  - and worth using existing libraries, tools and design patterns rather than rolling your own!

Summary + next time
- Transactional durability: crash recovery and logging
  - Write-ahead logging; checkpoints; recovery
- Advanced topics
  - Lock-free programming
  - Transactional memory
- Notes on supervision exercises
- Next time: Concurrent system case study: the FreeBSD kernel
  - Brief history of kernel concurrency
  - Primitives and debugging tools
  - Applications to the network stack