Language- Level Memory Models

Similar documents
Programming Language Memory Models: What do Shared Variables Mean?

Nondeterminism is Unavoidable, but Data Races are Pure Evil

The Java Memory Model

High-level languages

C++ Memory Model. Don t believe everything you read (from shared memory)

11/19/2013. Imperative programs

Java Memory Model. Jian Cao. Department of Electrical and Computer Engineering Rice University. Sep 22, 2016

C++ Memory Model. Martin Kempf December 26, Abstract. 1. Introduction What is a Memory Model

The Java Memory Model

Multicore Programming Java Memory Model

The New Java Technology Memory Model

C++ 11 Memory Consistency Model. Sebastian Gerstenberg NUMA Seminar

A unified machine-checked model for multithreaded Java

Chenyu Zheng. CSCI 5828 Spring 2010 Prof. Kenneth M. Anderson University of Colorado at Boulder

Part 1. Shared memory: an elusive abstraction

Using Weakly Ordered C++ Atomics Correctly. Hans-J. Boehm

Coherence and Consistency

Foundations of the C++ Concurrency Memory Model

EECS 570 Lecture 13. Directory & Optimizations. Winter 2018 Prof. Satish Narayanasamy

Memory Consistency Models: Convergence At Last!

Advanced MEIC. (Lesson #18)

Threads and Java Memory Model

Safe Optimisations for Shared-Memory Concurrent Programs. Tomer Raz

The C++ Memory Model. Rainer Grimm Training, Coaching and Technology Consulting

Review of last lecture. Peer Quiz. DPHPC Overview. Goals of this lecture. Lock-based queue

The Java Memory Model

Weak memory models. Mai Thuong Tran. PMA Group, University of Oslo, Norway. 31 Oct. 2014

CSE 374 Programming Concepts & Tools

Overhauling SC atomics in C11 and OpenCL

Threads and Locks. Chapter Introduction Locks

Toward Efficient Strong Memory Model Support for the Java Platform via Hybrid Synchronization

Program logics for relaxed consistency

HSA MEMORY MODEL HOT CHIPS TUTORIAL - AUGUST 2013 BENEDICT R GASTER

Lecture 32: Volatile variables, Java memory model

An introduction to weak memory consistency and the out-of-thin-air problem

The C/C++ Memory Model: Overview and Formalization

Atomicity CS 2110 Fall 2017

Memory model for multithreaded C++: Issues

CS510 Advanced Topics in Concurrency. Jonathan Walpole

Quis Custodiet Ipsos Custodes The Java memory model

Adding std::*_semaphore to the atomics clause.

Motivation & examples Threads, shared memory, & synchronization

Overview: Memory Consistency

CS5460: Operating Systems

Overview of Lecture 4. Memory Models, Atomicity & Performance. Ben-Ari Concurrency Model. Dekker s Algorithm 4

Multicore Programming: C++0x

Audience. Revising the Java Thread/Memory Model. Java Thread Specification. Revising the Thread Spec. Proposed Changes. When s the JSR?

7/6/2015. Motivation & examples Threads, shared memory, & synchronization. Imperative programs

Lecture 13 The C++ Memory model Synchronization variables Implementing synchronization

NOW Handout Page 1. Memory Consistency Model. Background for Debate on Memory Consistency Models. Multiprogrammed Uniprocessor Mem.

Advanced concurrent programming in Java Shared objects

Final Field Semantics

Implementing the C11 memory model for ARM processors. Will Deacon February 2015

Threads Cannot Be Implemented As a Library

New description of the Unified Memory Model Proposal for Java

Reasoning about the C/C++ weak memory model

SharedArrayBuffer and Atomics Stage 2.95 to Stage 3

Review of last lecture. Goals of this lecture. DPHPC Overview. Lock-based queue. Lock-based queue

Tom Ball Sebastian Burckhardt Madan Musuvathi Microsoft Research

Taming release-acquire consistency

Outlawing Ghosts: Avoiding Out-of-Thin-Air Results

D Programming Language

Memory Consistency Models. CSE 451 James Bornholt

Enhancing std::atomic_flag for waiting.

Memory Models for C/C++ Programmers

Formalising Java s Data Race Free Guarantee

Potential violations of Serializability: Example 1

SPIN, PETERSON AND BAKERY LOCKS

RELAXED CONSISTENCY 1

Safe Non-blocking Synchronization in Ada 202x

Program Transformations in Weak Memory Models

Managed runtimes & garbage collection. CSE 6341 Some slides by Kathryn McKinley

Analyzing the CRF Java Memory Model

Managed runtimes & garbage collection

New Programming Abstractions for Concurrency in GCC 4.7. Torvald Riegel Red Hat 12/04/05

Concurrent Programming Lecture 10

The Java Memory Model: a Formal Explanation 1

JSR-133: Java TM Memory Model and Thread Specification

CS 61C: Great Ideas in Computer Architecture Lecture 15: Caches, Part 2

Data-flow Analysis for Interruptdriven Microcontroller Software

What is uop Cracking?

Relaxed Memory Consistency

CS 261 Fall Mike Lam, Professor. Threads

Parallel Computer Architecture Spring Memory Consistency. Nikos Bellas

P1202R1: Asymmetric Fences

System Software Assignment 1 Runtime Support for Procedures

Implications of Memory Models (or Lack of Them) for Software Developers. J. L. Sloan

Rethinking Shared-Memory Languages and Hardware

ECSE 425 Lecture 15: More Dynamic Scheduling

Concurrent Programming in the D Programming Language. by Walter Bright Digital Mars

CS533 Concepts of Operating Systems. Jonathan Walpole

Other consistency models

The Java Memory Model

Data Races and Locks

Parallel Numerical Algorithms

Consistency & Coherence. 4/14/2016 Sec5on 12 Colin Schmidt

Synchronization in Java

GPU Concurrency: Weak Behaviours and Programming Assumptions

Learning from Bad Examples. CSCI 5828: Foundations of Software Engineering Lecture 25 11/18/2014

Java Memory Model for practitioners. by Vadym Kazulkin and Rodion Alukhanov, ip.labs GmbH

Transcription:

Language- Level Memory Models

A Bit of History Here is a new JMM [5]! 2000 Meyers & Alexandrescu DCL is not portable in C++ [3]. Manson et. al New shiny C++ memory model 2004 2008 2012 2002 2006 2010 2014 Bacon et. al Boehm Boehm & Adve Pugh Java Memory Model is broken [1]. Double- checked locking (DCL) is broken in Java [2]! Compiler oppmizapon break things [4]. We need a memory model for C++ [6].

The Need for a Language- Level Memory Model High- level languages did not provide adequate support for wripng portable and safe mulpthreaded code. TradiPonal compiler oppmizapons for serial code breaks mulpthreaded code. A memory model also informs the programmer how threads may interact through shared memory. A memory model informs the programmer what transformapon a compiler can / cannot do.

Compiler OpPmizaPon that Breaks MulPthreaded Code Ex: Redundant read eliminapon IniPally, p.x = 0 and p and q may be aliased. Redundant read eliminapon: Thread 1: int i = p.x; int j = q.x; int k = p.x; int i = p.x; int j = q.x; int k = i; Thread 2: //p == q... p.x = 42;... Thread 1's reads appear out of order.

Compiler OpPmizaPon that Breaks MulPthreaded Code Ex: HoisPng and register promopon hoispng and register promopon: for(p=x; p!=0; p=p->next) { if(p->data < 0) { count++; r1 = count; //r1 is a register for(p=x; p!=0; p=p->next) { if(p->data < 0) { r1++; count = r1; If another thread update count at the same Pme, the update will be lost.

Compiler TransformaPon that Breaks MulPthreaded Code Ex: Updates to fields in struct common compiler transformapon Thread 1: x.a = 1; struct tmp = x; tmp.a = 1; x = tmp; Thread 2: x.b = 42; Thread 2's update can be lost.

Common SynchronizaPon Idiom Thread 1: data =... flag = true; Thread 2: while(!flag){ res = data; res = data; while(!flag){ instrucpon reordering InstrucPon reordering causes thread 2 to read uninipalized data.

Common SynchronizaPon Idiom Thread 1: data =... SFENCE; flag = true; Thread 2: while(!flag){ MFENCE; res = data; In C and C++, we can get around around reordering with appropriate hardware fences, but this solupon is not portable (and there are no user- level fences in Java).

Lazy IniPalizaPon of Singleton (Single- Threaded Version) class Foo { private Helper helper = null; public Helper gethelper() { if (helper == null) helper = new Helper(); return helper; //... Not thread safe; mulpple helpers can get created.

Lazy IniPalizaPon of Singleton (Thread- Safe Version) class Foo { private Helper helper = null; public synchronized Helper gethelper() { if (helper == null) helper = new Helper(); return helper;... The code performs synchronizapon every Pme gethelper() is called. Can we avoid synchronizapon ager the helper is allocated?

Broken Double- Checked Locking Idiom class Foo { private Helper helper = null; public Helper gethelper() { if (helper == null) { synchronized(this) { if (helper == null) helper = new Helper(); return helper; //...

Broken Double- Checked Locking Idiom class Foo { private Helper helper = null; public Helper gethelper() { if (helper == null) { synchronized(this) { if (helper == null) helper = new Helper(); return helper; //... tmp =... tmp.field1 =...; tmp.field2 =...; helper = tmp;

Broken Double- Checked Locking Idiom class Foo { private Helper helper = null; public Helper gethelper() { if (helper == null) { synchronized(this) { if (helper == null) helper = new Helper(); return helper; //... tmp =... helper = tmp; tmp.field1 =...; tmp.field2 =...; Instruc/ons within a synchronized block can be reordered! Another thread can see the helper pointer being inipalized without the object construcpon completes.

Broken Double- Checked Locking Idiom class Foo { private Helper helper = null; public Helper gethelper() { if (helper == null) { synchronized(this) { Helper h = helper; if (h == null) synchronized (this) { h = new Helper(); // force a fence helper = h; return helper; //...

Broken Double- Checked Locking Idiom class Foo { private Helper helper = null; public Helper gethelper() { if (helper == null) { synchronized(this) { Helper h = helper; if (h == null) synchronized (this) { h = new Helper(); // force a fence helper = h; //... return helper; IntuiPon: Force the object construcpon to complete before sekng the helper pointer.

Broken Double- Checked Locking Idiom class Foo { private Helper helper = null; public Helper gethelper() { if (helper == null) { synchronized(this) { Helper h = helper; if (h == null) synchronized (this) { h = new Helper(); // force a fence helper = h; //... return helper; Broken intuipon: synchronized block (lock/ unlock) allows instrucpons to move inside the cripcal secpon (but never move out).

Broken Double- Checked Locking Idiom class Foo { private Helper helper = null; public Helper gethelper() { if (helper == null) { synchronized(this) { Helper h = helper; if (h == null) synchronized (this) { h = new Helper(); helper = h; In fact, there was NO easy way to fix this return helper; idiom with the old Java Memory Model. //... SPll broken: nothing prevents the sekng of the helper from moving into the synchronizapon block.

The New Java Memory Model Goal #1: weak enough to allow as much standard compiler oppmizapons as possible. Goal #2: simple enough to be usable. Goal #3: strong enough to preclude behaviors that can pose security hazard (e.g. out- of- thin air values), even for programs with data races. A data- race- free (DRF) model: if the code is data- race free, the execupon must be sequenpally consistent; otherwise, some safety guarantee, but no longer SC. Officially incorporated into Java 5.

What Is Data- Race Freedom? (DefiniPon I) Memory loca4on: each scalar value occupies a separate memory locapon (1/2/4/8 bytes). Types of acpons: data ac4ons: regular load (read) and store (write) synchroniza4on ac4ons: monitor enter (lock) and monitor exit (unlock) volaple load, volaple store, synchronized block thread create and the first acpon of the created thread, thread join and the last acpon of the terminapng thread. access to volatile fields

What Is Data- Race Freedom? (DefiniPon II) Intra- thread execupon: acpons within a thread are related by its program order Inter- thread execupon: acpons between threads are related by synchronizes- with edges (s.w.): unlock acpon s.w. subsequent lock acpon of the same lock a write to volaple var w.s. all subsequent reads to the same volaple var. thread create s.w. first acpon of the new thread last acpon of a terminapng thread s.w. thread join default inipalizapon of any variable s.w. first acpon of any thread Happens- Before relapon: transipve closure of the program order + order imposed by synchronizes- with edges

What Is Data- Race Freedom? (DefiniPon III) An execupon contains a data race if it contains two memory acpons from different threads that are: not ordered by happens- before conflic/ng (at least one of them is a write) at least one of them is a data opera4on (i.e., non volaple access) A program is data- race free if no sequen/ally consistent execupon contains a data races.

NOT A Data- Race- Free Program IniPally, x = 0 and y = 0 Thread 1: Thread 2: r1 = x; y = 42; r2 = y; x = 42; Data races on x and y, so the execupon can possibly result r1 = 42 and r2 = 42.

A Data- Race- Free Program IniPally, x = 0 and y = 0 Thread 1: r1 = x; if (r1!= 0) y = 42; Thread 2: r2 = y; if (r2!= 0) x = 42; Any sequenpally consistent execupon does not cause the write of x and y to occur, so there is no data race. So we should never see r1 = 42 and r2 = 42.

The New Java Memory Model Goal #1: weak enough to allow as much standard compiler oppmizapons as possible. Goal #2: simple enough to be usable. Goal #3: strong enough to preclude behaviors that can pose security hazard (e.g. out- of- thin air values), even for programs with data races. A data- race- free (DRF) model: if the code is data- race free, the execupon must be sequenpally consistent; otherwise, some safety guarantee, but no longer SC. Officially incorporated into Java 5. Much of the complica/on in JMM stems from Goal #3.

How to Prevent Compiler OpPmizaPon from Breaking Your Code IniPally, p.x = 0 and p and q may be aliased. Redundant read eliminapon: Thread 1: int i = p.x; int j = q.x; int k = p.x; int i = p.x; int j = q.x; int k = i; Thread 2:... p.x = 42;... Declare p.x to be volatile!

How to Prevent Compiler OpPmizaPon from Breaking Your Code hoispng and register promopon: for(p=x; p!=0; p=p->next) { if(p->data < 0) { count++; r1 = count; //r1 is a register for(p=x; p!=0; p=p->next) { if(p->data < 0) { r1++; count = r1; Problem: lost update if another thread update count at the same Pme. Use AtomicInteger if mulpple threads are updapng count.

How to Prevent Compiler OpPmizaPon from Breaking Your Code common compiler transformapon Thread 1: x.a = 1; struct tmp = x; tmp.a = 1; x = tmp; Thread 2: x.b = 42; Compilers don't do that unless the fields are bit fields.

How to Prevent Compiler OpPmizaPon from Breaking Your Code Thread 1: data =... flag = true; Thread 2: while(!flag){ res = data; res = data; while(!flag){ instrucpon reordering Prevent reordering by declaring flag to be volatile.

Double- Checked Locking that Works class Foo { private volatile Helper helper = null; public Helper gethelper() { if (helper == null) { synchronized(this) { if (helper == null) helper = new Helper(); return helper; //... tmp =... tmp.field1 =...; tmp.field2 =...; helper = tmp;

Double- Checked Locking that Works class Foo { private volatile Helper helper = null; public Helper gethelper() { if (helper == null) { synchronized(this) { if (helper == null) helper = new Helper(); return helper; //... tmp =... tmp.field1 =...; tmp.field2 =...; helper = tmp; volaple read volaple write A volaple read cannot be reordered with following instrucpons, and the volaple write cannot be reordered with previous instrucpons.

So Are We Done? Here is a new JMM [5]! 2000 Meyers & Alexandrescu DCL is not portable in C++ [3]. Manson et. al New shiny C++ memory model 2004 2008 2012 JMM is spll broken [7]. 2002 2006 2010 2014 Bacon et. al Boehm Boehm & Adve Sevcik & Aspinall Pugh Java Memory Model is broken [1]. Double- checked locking (DCL) is broken in Java [2]! Compiler oppmizapon break things [4]. We need a memory model for C++ [6].

The New Java Memory Model Goal #1: weak enough to allow as much standard compiler oppmizapons as possible. Goal #2: simple enough to be usable. Goal #3: strong enough to preclude behaviors that can pose security hazard (e.g. out- of- thin air values), even for programs with data races. A data- race- free (DRF) model: if the code is data- race free, the execupon must be sequenpally consistent; otherwise, some safety guarantee, but no longer SC. Officially incorporated into Java 5. If your program contains data races, the JMM may not accurately predict the program behaviors.

The New Java Memory Model C++ Goal #1: weak enough to allow as much standard compiler oppmizapons as possible. Goal #2: simple enough to be usable. Goal #3: strong enough to preclude behaviors that can pose security hazard (e.g. out- of- thin air values), even for programs with data races. A data- race- free (DRF) model: if the code is data- race free, the execupon must be sequenpally consistent; otherwise, some safety guarantee, but no longer SC. program behaviors undefined. Do NOT write programs with data races!

Data Race Can Bite IniPally, a = 0 and b = 1 Thread 1: r1 = a; r2 = a; if (r1==r2) b = 2; Thread 2: r3 = b; a = r3; Is r1 == r2 == r3 == 2 possible? Not in any SC execupon.

Data Race Can Bite IniPally, a = 0 and b = 1 Compiler trans- formapon: Thread 1: r1 = a; r2 = a; if (r1==r2) b = 2; r1 = a; r2 = r1; if (r1==r2) b = 2; Thread 2: r3 = b; a = r3;

Data Race Can Bite IniPally, a = 0 and b = 1 Thread 1: Thread 2: Compiler trans- formapon: r1 = a; r2 = a; if (r1==r2) b = 2; b = 2; r1 = a; r2 = r1; if (true); r3 = b; a = r3; Now it's possible to get r1 == r2 == r3 == 2! Claim: the reordering would not have been visible to the programmer if it weren't for the races on a and b. Or, declare them to be volaple.

Benign Race Can Bite IniPally, count is 17. Thread 1: Thread 2: count = 0; f(positive); f(positive); count = 0; f(p) { for(p=x; p!=0; p=p->next) { if(p->data < 0) { count++; The variable count is only ever wrisen with 0, because everything in the list is posipve. Both threads are uncondiponally wripng 0 to it, so the race is benign, right?

Benign Race Can Bite IniPally, count is 17. Thread 1: Thread 2: count = 0; f(positive); f(positive); count = 0; f(p) { for(p=x; p!=0; p=p->next) { if(p->data < 0) { count++; The variable count is only ever wrisen with 0, because everything in the list is posipve. Both threads are uncondiponally wripng 0 to it, so the race is benign, right?

Benign Race Can Bite IniPally, count is 17. Thread 1: Thread 2: count = 0; f(positive); f(positive); count = 0; f(p) { r1 = count; //r1 is a register for(p=x; p!=0; p=p->next) { if(p->data < 0) { r1++; count = r1; The write to count gives the compiler the license to hoist and do register promopon.

Benign Race Can Bite IniPally, count is 17. Thread 1: Thread 2: count = 0; r2 = count; r1 = count; count = r2; count = r1; count = 0; Interleaving that gets you in trouble: r2 = count; //17 count = 0; count = r2; //17 r1 = count; //17 count = 0; count = r1; //17 Final value of count is 17, as if write never occurred!

Summary The C++ Memory Model racy programs have undefined behaviors focus on performance atomic<t> with memory_order_seq_cst memory_order_acq_rel memory_order_acquire memory_order_release memory_order_consume memory_order_relaxed The Java Memory Model tried to define racy programs' behaviors focus on security volaple variables for synchronizapon idiom Like volaple in Java The others have looser ordering constraints Do NOT write programs with data races!

References [1] W. Pugh. Fixing the Java Memory Model, CommunicaPon of ACM, 1999. [2] Bacon et. al. The "Double- checked locking is broken" DeclaraPon, via hsp:// www.cs.umd.edu/~pugh/java/memorymodel/doublecheckedlocking.html [3] S. Meyers and Andrei Alexandrescu. C++ and the Perils of Double- Checked Locking, Sept. 2004. [4] H.- J. Boehm. Threads cannot be implemented as a library. Nov. 2004. [5] J. Manson, W. Pugh, and S. Adve. The Java Memory Model, POPL 2005. [6] H.- J. Boehm and S. Adve. FoundaPons of the C++ Concurrency Memory Model. PLDI 2008. [7] J. Sevcik and D. Aspinall. On Validity of Program TransformaPons in the Java Memory Model. ECOOP 2008.