Linearizability of Persistent Memory Objects

Similar documents
Linearizability of Persistent Memory Objects

Dalí: A Periodically Persistent Hash Map

Overview: Memory Consistency

Foundations of the C++ Concurrency Memory Model

Lock-Free, Wait-Free and Multi-core Programming. Roger Deran boilerbay.com Fast, Efficient Concurrent Maps AirMap

RELAXED CONSISTENCY 1

Hardware Support for NVM Programming

Cache-Aware Lock-Free Queues for Multiple Producers/Consumers and Weak Memory Consistency

) Intel)(TX)memory):) Transac'onal) Synchroniza'on) Extensions)(TSX))) Transac'ons)

Relaxed Memory-Consistency Models

Shared Memory Consistency Models: A Tutorial

The Java Memory Model

Coherence and Consistency

The transaction. Defining properties of transactions. Failures in complex systems propagate. Concurrency Control, Locking, and Recovery

Topics. File Buffer Cache for Performance. What to Cache? COS 318: Operating Systems. File Performance and Reliability

Safety SPL/2010 SPL/20 1

Concurrent Objects and Linearizability

) Intel)(TX)memory):) Transac'onal) Synchroniza'on) Extensions)(TSX))) Transac'ons)

Conflict Detection and Validation Strategies for Software Transactional Memory

Symmetric Multiprocessors: Synchronization and Sequential Consistency

New Programming Abstractions for Concurrency. Torvald Riegel Red Hat 12/04/05

Relaxed Memory-Consistency Models

Reasoning About The Implementations Of Concurrency Abstractions On x86-tso. By Scott Owens, University of Cambridge.

) Intel)(TX)memory):) Transac'onal) Synchroniza'on) Extensions)(TSX))) Transac'ons)

AST: scalable synchronization Supervisors guide 2002

Failure-atomic Synchronization-free Regions

PROVING THINGS ABOUT PROGRAMS

Chapter 9 :: Subroutines and Control Abstraction

Remote Persistent Memory SNIA Nonvolatile Memory Programming TWG

Don t Start with Dekker s Algorithm: Top-Down

Using Relaxed Consistency Models

Concurrent Objects. Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit

Announcements. ECE4750/CS4420 Computer Architecture L17: Memory Model. Edward Suh Computer Systems Laboratory

New Abstractions for Fast Non-Volatile Storage

Advanced file systems: LFS and Soft Updates. Ken Birman (based on slides by Ben Atkin)

The Java Memory Model

Lecture 12. Lecture 12: The IO Model & External Sorting

The Wait-Free Hierarchy

ZooKeeper & Curator. CS 475, Spring 2018 Concurrent & Distributed Systems

Sequential Consistency for Heterogeneous-Race-Free

EECS 482 Introduction to Operating Systems

Lock Oscillation: Boosting the Performance of Concurrent Data Structures

Lecture 11: Relaxed Consistency Models. Topics: sequential consistency recap, relaxing various SC constraints, performance comparison

Defining properties of transactions

k-abortable Objects: Progress under High Contention

Lecture 13: Consistency Models. Topics: sequential consistency, requirements to implement sequential consistency, relaxed consistency models

Lecture 12: Relaxed Consistency Models. Topics: sequential consistency recap, relaxing various SC constraints, performance comparison

Building Efficient Concurrent Graph Object through Composition of List-based Set

Efficient Persist Barriers for Multicores

Non-blocking Array-based Algorithms for Stacks and Queues!

) Intel)(TX)memory):) Transac'onal) Synchroniza'on) Extensions)(TSX))) Transac'ons)

Big and Fast. Anti-Caching in OLTP Systems. Justin DeBrabant

Review of last lecture. Peer Quiz. DPHPC Overview. Goals of this lecture. Lock-based queue

NOW Handout Page 1. Memory Consistency Model. Background for Debate on Memory Consistency Models. Multiprogrammed Uniprocessor Mem.

Hardware Memory Models: x86-tso

Section 4 Concurrent Objects Correctness, Progress and Efficiency

SoftWrAP: A Lightweight Framework for Transactional Support of Storage Class Memory

Program logics for relaxed consistency

1 The comparison of QC, SC and LIN

Consistency: Strict & Sequential. SWE 622, Spring 2017 Distributed Software Engineering

Implementing Rollback: Copy. 2PL: Rollback. Implementing Rollback: Undo. 8L for Part IB Handout 4. Concurrent Systems. Dr.

Lock-free Serializable Transactions

Concurrent specifications beyond linearizability

Fork Sequential Consistency is Blocking

Thread Safety. Review. Today o Confinement o Threadsafe datatypes Required reading. Concurrency Wrapper Collections

C++ Memory Model. Don t believe everything you read (from shared memory)

Recall: Primary-Backup. State machine replication. Extend PB for high availability. Consensus 2. Mechanism: Replicate and separate servers

Chapter 17: Recovery System

Failure Classification. Chapter 17: Recovery System. Recovery Algorithms. Storage Structure

Taming release-acquire consistency

Parallel Computer Architecture Spring Memory Consistency. Nikos Bellas

3. Memory Persistency Goals. 2. Background

Designing Memory Consistency Models for. Shared-Memory Multiprocessors. Sarita V. Adve

Project Compiler. CS031 TA Help Session November 28, 2011

Lectures 8 & 9. Lectures 7 & 8: Transactions

Important Lessons. A Distributed Algorithm (2) Today's Lecture - Replication

TriCheck: Memory Model Verification at the Trisection of Software, Hardware, and ISA

Trek: Testable Replicated Key-Value Store

Parallel Programming in Distributed Systems Or Distributed Systems in Parallel Programming

The C/C++ Memory Model: Overview and Formalization

Log-Free Concurrent Data Structures

Chapter 17: Recovery System

ThyNVM. Enabling So1ware- Transparent Crash Consistency In Persistent Memory Systems

More Advanced OpenMP. Saturday, January 30, 16

HSA MEMORY MODEL HOT CHIPS TUTORIAL - AUGUST 2013 BENEDICT R GASTER

Review of last lecture. Goals of this lecture. DPHPC Overview. Lock-based queue. Lock-based queue

NON-BLOCKING DATA STRUCTURES AND TRANSACTIONAL MEMORY. Tim Harris, 17 November 2017

SPIN, PETERSON AND BAKERY LOCKS

40 Behaviour Compatibility

Unit 12: Memory Consistency Models. Includes slides originally developed by Prof. Amir Roth

Reuse, don t Recycle: Transforming Lock-free Algorithms that Throw Away Descriptors

Fork Sequential Consistency is Blocking

New Programming Abstractions for Concurrency in GCC 4.7. Torvald Riegel Red Hat 12/04/05

NON-BLOCKING DATA STRUCTURES AND TRANSACTIONAL MEMORY. Tim Harris, 31 October 2012

What You can Do with NVDIMMs. Rob Peglar President, Advanced Computation and Storage LLC

Consistency. CS 475, Spring 2018 Concurrent & Distributed Systems

Linked Lists: The Role of Locking. Erez Petrank Technion

CMSC Computer Architecture Lecture 15: Memory Consistency and Synchronization. Prof. Yanjing Li University of Chicago

Database Management System Prof. D. Janakiram Department of Computer Science & Engineering Indian Institute of Technology, Madras Lecture No.

Brewer's Conjecture and the Feasibility of Consistent, Available, Partition-Tolerant Web Services

Transcription:

Linearizability of Persistent Memory Objects Michael L. Scott Joint work with Joseph Izraelevitz & Hammurabi Mendes www.cs.rochester.edu/research/synchronization/ Compiler-Driven Performance Workshop, CASCON Markham, ON, Canada, November 2016 based on work presented at DISC 2016

Fast Nonvolatile Memory NVM is on its way» PCM, STT-MRAM, ReRAM,... Tempting to put some long-lived data directly in NVM, rather than the file system But registers and caches are likely to remain transient, at least on many machines Have do we make sure what we get in the wake of a crash (power failure) is consistent? Implications for algorithm design & for compilation MLS 2

Problem: Early Writebacks Could assume HW tracks dependences and forces out earlier stuff» [Condit et al., Pelley et al., Joshi et al.] But real HW not doing that any time soon writes-back can happen in any order» Danger that B will perform and persist updates based on actions taken but not yet persisted by A» Have to explicitly force things out in order (ARM, Intel ISAs) Further complications due to buffering» Can be done in SW now, with shadow memory» Likely to be supported in HW eventually MLS 3

Outline Concurrent object correctness durable linearizability Hardware memory model Explicit epoch persistency Automatic transform to convert a (correct) transient nonblocking object into a (correct) persistent one Methodology to prove safety for more general objects Future directions MLS 4

Linearizability [Herlihy & Wing 1987] Standard safety criterion for transient objects Concurrent execution H guaranteed to be equivalent (same invocations and responses, inc. args) to some sequential execution S that respects 1. object semantics (legal) 2. real-time order (res(a) < H inv(b) A < S B) (subsumes per-thread program order) Need an extension for persistence MLS 5

Durable Linearizability Execution history H is durably linearizable iff 1. It s well formed (no thread survives a crash) and 2. It s linearizable if you elide the crashes But that requires every op to persist before returning Want a buffered variant H is buffered durably linearizable iff for each inter-crash era Ei we can identify a consistent cut Pi of Ei s real-time order such that P0... Pi-1 Ei is linearizable 0 i c, where c is the number of crashes.» That is, we may lose something at each crash, but what's left makes sense. (Again, buffering may be in HW or in SW.) MLS 6

Proving Code Correct Need to show that all realizable instruction histories are equivalent to legal abstract (operation-level) histories. For this we need to understand the hardware memory model, which determines which writes may be seen by which reads. And that model needs extension for persistence. MLS 7

Memory Model Background Sequential consistency: memory acts as if there was a total order on all loads and stores across all threads» Conceptually appealing, but only IBM z still supports it Relaxed models: separate ordinary and synchronizing accesses» Latter determine cross-thread ordering arcs» Happens-before order derived from per-thread & cross-thread orders Release consistency: each store-release synchronizes with the following load-acquire of the same location» Each local access happens after each previous load-acquire and before each subsequent store-release in its thread» Straightforward extension to Power But none of this addresses persistence MLS 8

Persistence Instructions Explicit write back ( pwb ); persistence fence ( pfence ); persistence sync ( psync ) We assume E1 E2 if» they're in the same thread and E1 = pwb & E2 {pfence, psync} E1 {pfence, psync} and E2 {pwb, st, st_rel} E1, E2 {st, st_rel, pwb} and access the same location E1 {ld, ld_acq}, E2 = pwb, and access the same location E1 = ld_acq and E2 {pfence, psync}» they re in different threads and E1 = st_rel, E2 = ld_acq, and E1 synchronizes with E2 MLS 9

Instruction-level Histories Programs induce sets of possible histories possible thread interleavings. With persistence, the reads-see-writes relationship must be augmented to allow returning a value persisted prior to a recent crash. Key problem: you see a write, act on it, and persist what you did, but the original write doesn't persist before we crash. Absent explicit action, this can lead to inconsistency i.e., can break durable linearizability. MLS 10

Mechanical Transform st st; pwb st_rel pfence; st_rel; pwb ld_acq ld_acq; pwb; pfence cas pfence; cas; pwb; pfence ld ld Can prove: if the original code is DRF and linearizable, the transformed code is durably linearizable.» Key is the ld_acq rule If original code is nonblocking, recovery process is null But note: not all stores have to be persisted» elimination/combining, announce arrays for wait freedom How do we build a correctness argument for more general, hand-optimized code? MLS 11

Linearization Points Every operation appears to happen at some individual instruction, somewhere between its call and return. Proofs commonly leverage this formulation» In lock-based code, could be pretty much anywhere» In simple nonblocking operations, often at a distinguished CAS In general, linearization points» may be statically known» may be determined by each operation dynamically» may be reasoned in retrospect to have happened» (may be executed by another thread!) MLS 12

Persist Points Proof-writing strategy (again, challenge is make sure nothing new persists before something old on which it depends) Implementation is (buffered) durably linearizable if 1. somewhere between linearization point and response, all stores needed to "capture" the operation have been pwb-ed and pfence-d; 2. whenever M1 & M2 overlap, linearization points can be chosen s.t. either M1 s persist point precedes M2 s linearization point, or M2 s linearization point precedes M1 s linearization point. NB: nonblocking persistent objects need helping: if an op has linearized but not yet persisted, its successor in linearization order must be prepared to push it through to persistence. MLS 13

Ongoing Work More optimized, nonblocking persistent objects Integrity in the face of buggy (Byzantine) user threads» File system no longer protects metadata! Integration w/ transactions Suggestions/collaborations welcome! MLS 14

www.cs.rochester.edu/research/synchronization/ www.cs.rochester.edu/u/scott/