C++ Concurrency - Formalised

Size: px
Start display at page:

Download "C++ Concurrency - Formalised"

Transcription

1 C++ Concurrency - Formalised Salomon Sickert Technische Universität München 26 th April 2013

2 Mutex Algorithms At most one thread is in the critical section at any time. 2 / 35

3 Dekker s Mutex Algorithm [2] Initialisation 1 x = 0; y = 0; Thread 1 1 x = 1; 2 if (y == 1) { 3... //Busy Wait 4 } 5 // Critical Section Thread 2 1 y = 1; 2 if (x == 1) { 3... //Busy Wait 4 } 5 // Critical section 3 / 35

4 Dekker s Mutex Algorithm [2] Initialisation 1 x = 0; y = 0; Thread 1 1 x = 1; 2 if (y == 1) { 3... //Busy Wait 4 } 5 // Critical Section Thread 2 1 y = 1; 2 if (x == 1) { 3... //Busy Wait 4 } 5 // Critical section 1 MOV [x] <- 1 2 MOV r1 <- [y] 1 MOV [y] <- 1 2 MOV r2 <- [x] 3 / 35

5 Modern x86 Multiprocessors - simplified (based on [4]) r1 r2 r1 r2 H/W Thread 1 H/W Thread 2 Shared Memory 4 / 35

6 Modern x86 Multiprocessors - simplified r1 r2 r1 r2 H/W Thread 1 H/W Thread 2 Write B. Write B. Shared Memory 5 / 35

7 Modern x86 Multiprocessors and Dekker s Algorithm - - H/W Thread H/W Thread x = 0, y = 0 6 / 35

8 Modern x86 Multiprocessors and Dekker s Algorithm - - H/W Thread H/W Thread 2 x = 1 - x = 0, y = 0 7 / 35

9 Modern x86 Multiprocessors and Dekker s Algorithm - - H/W Thread H/W Thread 2 x = 1 y = 1 x = 0, y = 0 8 / 35

10 Modern x86 Multiprocessors and Dekker s Algorithm r1 = 0 - H/W Thread H/W Thread 2 x = 1 y = 1 x = 0, y = 0 9 / 35

11 Modern x86 Multiprocessors and Dekker s Algorithm r1 = 0 - H/W Thread 1 - r2 = 0 H/W Thread 2 x = 1 y = 1 x = 0, y = 0 10 / 35

12 Modern x86 Multiprocessors Both threads may enter the critical section at the same time! 11 / 35

13 Modern x86 Multiprocessors Both threads may enter the critical section at the same time! Due to write buffers threads may see a subtly different memory. 11 / 35

14 Modern x86 Multiprocessors Both threads may enter the critical section at the same time! Due to write buffers threads may see a subtly different memory. The processor exhibits a relaxed memory model. 11 / 35

15 Modern x86 Multiprocessors Both threads may enter the critical section at the same time! Due to write buffers threads may see a subtly different memory. The processor exhibits a relaxed memory model. Solutions: Assembler: FENCE instructions. C++11: Special types. (Atomics) 11 / 35

16 Modern x86 Multiprocessors Both threads may enter the critical section at the same time! Due to write buffers threads may see a subtly different memory. The processor exhibits a relaxed memory model. Solutions: Assembler: FENCE instructions. C++11: Special types. (Atomics) Sequential consistency is restored by paying a performance penalty. 11 / 35

17 Modern x86 Multiprocessors Both threads may enter the critical section at the same time! Due to write buffers threads may see a subtly different memory. The processor exhibits a relaxed memory model. Solutions: Assembler: FENCE instructions. C++11: Special types. (Atomics) Sequential consistency is restored by paying a performance penalty. Some Non-x86 architectures exhibit even weaker models, e.g ARM. 11 / 35

18 Mathematizing C++ Concurrency ([3]) Given a program p, what are the possible ways to execute it? 12 / 35

19 Mathematizing C++ Concurrency ([3]) Given a program p, what are the possible ways to execute it? 1. Calculate X opsem from p. Loosely speaking: Structure of the program. 12 / 35

20 Mathematizing C++ Concurrency ([3]) Given a program p, what are the possible ways to execute it? 1. Calculate X opsem from p. Loosely speaking: Structure of the program. 2. Find all X witness consistent with X opsem. Loosely speaking: Different executions of the program. 12 / 35

21 Mathematizing C++ Concurrency ([3]) Given a program p, what are the possible ways to execute it? 1. Calculate X opsem from p. Loosely speaking: Structure of the program. 2. Find all X witness consistent with X opsem. Loosely speaking: Different executions of the program. 3. Check for undefined behaviour. Reading from uninitialized variables Unsequenced Races (e.g. x == (x = 2)) Data Races / 35

22 X opsem : An Example 1 int main() { 2 int x, y, z, a; 3 {{{ x = 1; 4 y = 2; 5 }}}; 6 a = 1; 7 z = (x == y); 8 return 0; 9 } 13 / 35

23 X opsem : An Example 1 int main() { 2 int x, y, z, a; 3 {{{ x = 1; 4 y = 2; 5 }}}; 6 a = 1; 7 z = (x == y); 8 return 0; 9 } e:w x=1 14 / 35

24 X opsem : An Example 1 int main() { 2 int x, y, z, a; 3 {{{ x = 1; 4 y = 2; 5 }}}; 6 a = 1; 7 z = (x == y); 8 return 0; 9 } f:w y=2 e:w x=1 15 / 35

25 X opsem : An Example 1 int main() { 2 int x, y, z, a; 3 {{{ x = 1; 4 y = 2; 5 }}}; 6 a = 1; 7 z = (x == y); 8 return 0; 9 } f:w y=2 e:w x=1 a:w a=1 c:r y=? b:r x=? d:w z=(x? == y?) 16 / 35

26 X opsem : An Example 1 int main() { 2 int x, y, z, a; 3 {{{ x = 1; 4 y = 2; 5 }}}; 6 a = 1; 7 z = (x == y); 8 return 0; 9 } f:w y=2 e:w x=1 a:w a=1 sb sb c:r y=? b:r x=? sb sb d:w z=(x? == y?) 17 / 35

27 X opsem : An Example 1 int main() { 2 int x, y, z, a; 3 {{{ x = 1; 4 y = 2; 5 }}}; 6 a = 1; 7 z = (x == y); 8 return 0; 9 } f:w y=2 e:w x=1 asw asw a:w a=1 sb sb c:r y=? b:r x=? sb sb d:w z=(x? == y?) 18 / 35

28 X opsem Independent from the architecture 19 / 35

29 X opsem Independent from the architecture Composed of (among other parts): Actions (simplified) aid : R l = v aid : W l = v 19 / 35

30 X opsem Independent from the architecture Composed of (among other parts): Actions (simplified) aid : R l = v aid : W l = v Binary relations: sequenced before (sb) additional synchronized with (asw) 19 / 35

31 X opsem Independent from the architecture Composed of (among other parts): Actions (simplified) aid : R l = v aid : W l = v Binary relations: sequenced before (sb) additional synchronized with (asw) In this special case: simple happens before = ( sequenced before ) additional synchronized with + 19 / 35

32 X witness : An Example 1 int main() { 2 int x, y, z, a; 3 {{{ x = 1; 4 y = 2; 5 }}}; 6 a = 1; 7 z = (x == y); 8 return 0; 9 } f:w y=2 e:w x=1 hb hb a:w a=1 hb hb c:r y=? b:r x=? hb hb d:w z=(x? == y?) 20 / 35

33 X witness : An Example 1 int main() { 2 int x, y, z, a; 3 {{{ x = 1; 4 y = 2; 5 }}}; 6 a = 1; 7 z = (x == y); 8 return 0; 9 } f:w y=2 e:w x=1 hb hb rf a:w a=1 hb hb c:r y=2 b:r x=? hb hb d:w z=(x? == 2) 21 / 35

34 X witness : An Example 1 int main() { 2 int x, y, z, a; 3 {{{ x = 1; 4 y = 2; 5 }}}; 6 a = 1; 7 z = (x == y); 8 return 0; 9 } f:w y=2 e:w x=1 hb hb rf a:w a=1 rf hb hb c:r y=2 b:r x=1 hb hb d:w z=0 22 / 35

35 X witness Dependent on the architecture 23 / 35

36 X witness Dependent on the architecture Composed of: Binary relations: reads from (rf) 23 / 35

37 X witness Dependent on the architecture Composed of: Binary relations: reads from (rf) sequentialconsistency (sc) (not applicable in the example) modificationorder (mo) (not applicable in the example) 23 / 35

38 X = (X opsem, X witness ) : An execution candidate 1 int main() { 2 int x, y, z, a; 3 {{{ x = 1; 4 y = 2; 5 }}}; 6 a = 1; 7 z = (x == y); 8 return 0; 9 } f:w y=2 e:w x=1 hb hb rf a:w a=1 rf hb hb c:r y=2 b:r x=1 hb hb d:w z=0 24 / 35

39 X = (X opsem, X witness ) : Undefined Behaviour? f:w y=2 e:w x=1 hb hb Uninitialised Reads? Unsequenced Races? Data Races? rf a:w a=1 rf hb hb c:r y=2 b:r x=1 hb hb d:w z=0 25 / 35

40 X = (X opsem, X witness ) : Undefined Behaviour? f:w y=2 e:w x=1 hb hb Uninitialised Reads? Unsequenced Races? Data Races? rf a:w a=1 rf hb hb c:r y=2 b:r x=1 hb hb d:w z=0 25 / 35

41 X = (X opsem, X witness ) : Undefined Behaviour? f:w y=2 e:w x=1 hb hb Uninitialised Reads? Unsequenced Races? Data Races? rf a:w a=1 rf hb hb c:r y=2 b:r x=1 hb hb d:w z=0 25 / 35

42 X = (X opsem, X witness ) : Undefined Behaviour? f:w y=2 e:w x=1 hb hb Uninitialised Reads? Unsequenced Races? Data Races? rf a:w a=1 rf hb hb c:r y=2 b:r x=1 hb hb d:w z=0 25 / 35

43 Undefined Behaviour: Data Races 1 int main() { 2 int x = 0; 3 {{{ x = 1; 4 x = 2; 5 }}}; 6 return x; 7 } c:w x=1 a:w x=0 asw asw b:r x=? asw d:w x=2 asw 26 / 35

44 Undefined Behaviour: Data Races 1 int main() { 2 int x = 0; 3 {{{ x = 1; 4 x = 2; 5 }}}; 6 return x; 7 } c:w x=1 a:w x=0 hb hb d:w x=2 hb hb b:r x=? 27 / 35

45 Undefined Behaviour: Data Races 1 int main() { 2 int x = 0; 3 {{{ x = 1; 4 x = 2; 5 }}}; 6 return x; 7 } c:w x=1 a:w x=0 hb hb dr d:w x=2 hb hb b:r x=? 28 / 35

46 The Formalised C++ Memory Model cpp_memory_model opsem (p : program) = let pre_executions = {(X opsem, X witness ) opsem p X opsem consistent_execution X opsem X witness } in if X pre_executions. undefined_behaviour X then NONE else SOME pre_executions 29 / 35

47 C++11 Concurrency / Language Features Concurrency Idioms (Atomics, Mutexes, Threads) 30 / 35

48 C++11 Concurrency / Language Features Concurrency Idioms (Atomics, Mutexes, Threads) Pre-C++11 No memory model for multi-threaded code Concurrency Idioms provided by a third party: pthreads OpenMP Drawbacks: No formalised standard, compiler may produce incorrect code. 30 / 35

49 C++11 Concurrency / Language Features Concurrency Idioms (Atomics, Mutexes, Threads) Pre-C++11 No memory model for multi-threaded code Concurrency Idioms provided by a third party: pthreads OpenMP Drawbacks: No formalised standard, compiler may produce incorrect code. C++11 A memory model for multi-threaded code Concurrency Idioms in are part of the language (std::atomic<t> [1], std::mutex, std::thread) Similar to Java Benefits: Compiler is able to produce correct code. 30 / 35

50 Applications of the Formal Memory Model Corrections to the C++0x standard. memory_order_seq_cst was in fact not sequentially consistent 31 / 35

51 Applications of the Formal Memory Model Corrections to the C++0x standard. memory_order_seq_cst was in fact not sequentially consistent Confidence in memory model and specification. 31 / 35

52 Applications of the Formal Memory Model Corrections to the C++0x standard. memory_order_seq_cst was in fact not sequentially consistent Confidence in memory model and specification. Verify correctness of prototype implementations. Figure: Figure taken from [3] 31 / 35

53 Applications of the Formal Memory Model Corrections to the C++0x standard. memory_order_seq_cst was in fact not sequentially consistent Confidence in memory model and specification. Verify correctness of prototype implementations. Developer support. 31 / 35 Figure: Figure taken from [3]

54 32 / 35 Questions?

55 Bonus: Atomics vs. Mutexes - short presentation 33 / 35

56 Bonus: Dekker s algorithm in CppMem 34 / 35

57 References C++ atomic types and operations, Dekker s algorithm, Mark Batty, Scott Owens, Susmit Sarkar, Peter Sewell, and Tjark Weber. Mathematizing c++ concurrency. SIGPLAN Not., 46(1):55 66, January Peter Sewell, Susmit Sarkar, Scott Owens, Francesco Zappa Nardelli, and Magnus O. Myreen. x86-tso: a rigorous and usable programmer s model for x86 multiprocessors. Commun. ACM, 53(7):89 97, July / 35

Multicore Programming: C++0x

Multicore Programming: C++0x p. 1 Multicore Programming: C++0x Mark Batty University of Cambridge in collaboration with Scott Owens, Susmit Sarkar, Peter Sewell, Tjark Weber November, 2010 p. 2 C++0x: the next C++ Specified by the

More information

The C/C++ Memory Model: Overview and Formalization

The C/C++ Memory Model: Overview and Formalization The C/C++ Memory Model: Overview and Formalization Mark Batty Jasmin Blanchette Scott Owens Susmit Sarkar Peter Sewell Tjark Weber Verification of Concurrent C Programs C11 / C++11 In 2011, new versions

More information

The C1x and C++11 concurrency model

The C1x and C++11 concurrency model The C1x and C++11 concurrency model Mark Batty University of Cambridge January 16, 2013 C11 and C++11 Memory Model A DRF model with the option to expose relaxed behaviour in exchange for high performance.

More information

Hardware Memory Models: x86-tso

Hardware Memory Models: x86-tso Hardware Memory Models: x86-tso John Mellor-Crummey Department of Computer Science Rice University johnmc@rice.edu COMP 522 Lecture 9 20 September 2016 Agenda So far hardware organization multithreading

More information

Relaxed Memory: The Specification Design Space

Relaxed Memory: The Specification Design Space Relaxed Memory: The Specification Design Space Mark Batty University of Cambridge Fortran meeting, Delft, 25 June 2013 1 An ideal specification Unambiguous Easy to understand Sound w.r.t. experimentally

More information

High-level languages

High-level languages High-level languages High-level languages are not immune to these problems. Actually, the situation is even worse: the source language typically operates over mixed-size values (multi-word and bitfield);

More information

Multicore Programming Java Memory Model

Multicore Programming Java Memory Model p. 1 Multicore Programming Java Memory Model Peter Sewell Jaroslav Ševčík Tim Harris University of Cambridge MSR with thanks to Francesco Zappa Nardelli, Susmit Sarkar, Tom Ridge, Scott Owens, Magnus O.

More information

Memory Models for C/C++ Programmers

Memory Models for C/C++ Programmers Memory Models for C/C++ Programmers arxiv:1803.04432v1 [cs.dc] 12 Mar 2018 Manuel Pöter Jesper Larsson Träff Research Group Parallel Computing Faculty of Informatics, Institute of Computer Engineering

More information

The Semantics of x86-cc Multiprocessor Machine Code

The Semantics of x86-cc Multiprocessor Machine Code The Semantics of x86-cc Multiprocessor Machine Code Susmit Sarkar Computer Laboratory University of Cambridge Joint work with: Peter Sewell, Scott Owens, Tom Ridge, Magnus Myreen (U.Cambridge) Francesco

More information

Programming Language Memory Models: What do Shared Variables Mean?

Programming Language Memory Models: What do Shared Variables Mean? Programming Language Memory Models: What do Shared Variables Mean? Hans-J. Boehm 10/25/2010 1 Disclaimers: This is an overview talk. Much of this work was done by others or jointly. I m relying particularly

More information

Weak memory models. Mai Thuong Tran. PMA Group, University of Oslo, Norway. 31 Oct. 2014

Weak memory models. Mai Thuong Tran. PMA Group, University of Oslo, Norway. 31 Oct. 2014 Weak memory models Mai Thuong Tran PMA Group, University of Oslo, Norway 31 Oct. 2014 Overview 1 Introduction Hardware architectures Compiler optimizations Sequential consistency 2 Weak memory models TSO

More information

Mechanised industrial concurrency specification: C/C++ and GPUs. Mark Batty University of Kent

Mechanised industrial concurrency specification: C/C++ and GPUs. Mark Batty University of Kent Mechanised industrial concurrency specification: C/C++ and GPUs Mark Batty University of Kent It is time for mechanised industrial standards Specifications are written in English prose: this is insufficient

More information

Understanding POWER multiprocessors

Understanding POWER multiprocessors Understanding POWER multiprocessors Susmit Sarkar 1 Peter Sewell 1 Jade Alglave 2,3 Luc Maranget 3 Derek Williams 4 1 University of Cambridge 2 Oxford University 3 INRIA 4 IBM June 2011 Programming shared-memory

More information

Load-reserve / Store-conditional on POWER and ARM

Load-reserve / Store-conditional on POWER and ARM Load-reserve / Store-conditional on POWER and ARM Peter Sewell (slides from Susmit Sarkar) 1 University of Cambridge June 2012 Correct implementations of C/C++ on hardware Can it be done?...on highly relaxed

More information

From POPL to the Jungle and back

From POPL to the Jungle and back From POPL to the Jungle and back Peter Sewell University of Cambridge PLMW: the SIGPLAN Programming Languages Mentoring Workshop San Diego, January 2014 p.1 POPL 2014 41th ACM SIGPLAN-SIGACT Symposium

More information

Lowering C11 Atomics for ARM in LLVM

Lowering C11 Atomics for ARM in LLVM 1 Lowering C11 Atomics for ARM in LLVM Reinoud Elhorst Abstract This report explores the way LLVM generates the memory barriers needed to support the C11/C++11 atomics for ARM. I measure the influence

More information

Topic C Memory Models

Topic C Memory Models Memory Memory Non- Topic C Memory CPEG852 Spring 2014 Guang R. Gao CPEG 852 Memory Advance 1 / 29 Memory 1 Memory Memory Non- 2 Non- CPEG 852 Memory Advance 2 / 29 Memory Memory Memory Non- Introduction:

More information

RELAXED CONSISTENCY 1

RELAXED CONSISTENCY 1 RELAXED CONSISTENCY 1 RELAXED CONSISTENCY Relaxed Consistency is a catch-all term for any MCM weaker than TSO GPUs have relaxed consistency (probably) 2 XC AXIOMS TABLE 5.5: XC Ordering Rules. An X Denotes

More information

Program logics for relaxed consistency

Program logics for relaxed consistency Program logics for relaxed consistency UPMARC Summer School 2014 Viktor Vafeiadis Max Planck Institute for Software Systems (MPI-SWS) 1st Lecture, 28 July 2014 Outline Part I. Weak memory models 1. Intro

More information

Taming release-acquire consistency

Taming release-acquire consistency Taming release-acquire consistency Ori Lahav Nick Giannarakis Viktor Vafeiadis Max Planck Institute for Software Systems (MPI-SWS) POPL 2016 Weak memory models Weak memory models provide formal sound semantics

More information

Implementing the C11 memory model for ARM processors. Will Deacon February 2015

Implementing the C11 memory model for ARM processors. Will Deacon February 2015 1 Implementing the C11 memory model for ARM processors Will Deacon February 2015 Introduction 2 ARM ships intellectual property, specialising in RISC microprocessors Over 50 billion

More information

C++ Memory Model. Martin Kempf December 26, Abstract. 1. Introduction What is a Memory Model

C++ Memory Model. Martin Kempf December 26, Abstract. 1. Introduction What is a Memory Model C++ Memory Model (mkempf@hsr.ch) December 26, 2012 Abstract Multi-threaded programming is increasingly important. We need parallel programs to take advantage of multi-core processors and those are likely

More information

Part 1. Shared memory: an elusive abstraction

Part 1. Shared memory: an elusive abstraction Part 1. Shared memory: an elusive abstraction Francesco Zappa Nardelli INRIA Paris-Rocquencourt http://moscova.inria.fr/~zappa/projects/weakmemory Based on work done by or with Peter Sewell, Jaroslav Ševčík,

More information

CMSC Computer Architecture Lecture 15: Memory Consistency and Synchronization. Prof. Yanjing Li University of Chicago

CMSC Computer Architecture Lecture 15: Memory Consistency and Synchronization. Prof. Yanjing Li University of Chicago CMSC 22200 Computer Architecture Lecture 15: Memory Consistency and Synchronization Prof. Yanjing Li University of Chicago Administrative Stuff! Lab 5 (multi-core) " Basic requirements: out later today

More information

Foundations of the C++ Concurrency Memory Model

Foundations of the C++ Concurrency Memory Model Foundations of the C++ Concurrency Memory Model John Mellor-Crummey and Karthik Murthy Department of Computer Science Rice University johnmc@rice.edu COMP 522 27 September 2016 Before C++ Memory Model

More information

Overhauling SC atomics in C11 and OpenCL

Overhauling SC atomics in C11 and OpenCL Overhauling SC atomics in C11 and OpenCL John Wickerson, Mark Batty, and Alastair F. Donaldson Imperial Concurrency Workshop July 2015 TL;DR The rules for sequentially-consistent atomic operations and

More information

CS510 Advanced Topics in Concurrency. Jonathan Walpole

CS510 Advanced Topics in Concurrency. Jonathan Walpole CS510 Advanced Topics in Concurrency Jonathan Walpole Threads Cannot Be Implemented as a Library Reasoning About Programs What are the valid outcomes for this program? Is it valid for both r1 and r2 to

More information

Nondeterminism is Unavoidable, but Data Races are Pure Evil

Nondeterminism is Unavoidable, but Data Races are Pure Evil Nondeterminism is Unavoidable, but Data Races are Pure Evil Hans-J. Boehm HP Labs 5 November 2012 1 Low-level nondeterminism is pervasive E.g. Atomically incrementing a global counter is nondeterministic.

More information

Parallel Computer Architecture Spring Memory Consistency. Nikos Bellas

Parallel Computer Architecture Spring Memory Consistency. Nikos Bellas Parallel Computer Architecture Spring 2018 Memory Consistency Nikos Bellas Computer and Communications Engineering Department University of Thessaly Parallel Computer Architecture 1 Coherence vs Consistency

More information

Shared Memory Programming with OpenMP. Lecture 8: Memory model, flush and atomics

Shared Memory Programming with OpenMP. Lecture 8: Memory model, flush and atomics Shared Memory Programming with OpenMP Lecture 8: Memory model, flush and atomics Why do we need a memory model? On modern computers code is rarely executed in the same order as it was specified in the

More information

Industrial concurrency specification for C/C++ Mark Batty University of Kent

Industrial concurrency specification for C/C++ Mark Batty University of Kent Industrial concurrency specification for C/C++ Mark Batty University of Kent It is time for mechanised industrial standards Specifications are written in English prose: this is insufficient Write mechanised

More information

An introduction to weak memory consistency and the out-of-thin-air problem

An introduction to weak memory consistency and the out-of-thin-air problem An introduction to weak memory consistency and the out-of-thin-air problem Viktor Vafeiadis Max Planck Institute for Software Systems (MPI-SWS) CONCUR, 7 September 2017 Sequential consistency 2 Sequential

More information

Example: The Dekker Algorithm on SMP Systems. Memory Consistency The Dekker Algorithm 43 / 54

Example: The Dekker Algorithm on SMP Systems. Memory Consistency The Dekker Algorithm 43 / 54 Example: The Dekker Algorithm on SMP Systems Memory Consistency The Dekker Algorithm 43 / 54 Using Memory Barriers: the Dekker Algorithm Mutual exclusion of two processes with busy waiting. //flag[] is

More information

Declarative semantics for concurrency. 28 August 2017

Declarative semantics for concurrency. 28 August 2017 Declarative semantics for concurrency Ori Lahav Viktor Vafeiadis 28 August 2017 An alternative way of defining the semantics 2 Declarative/axiomatic concurrency semantics Define the notion of a program

More information

Overhauling SC atomics in C11 and OpenCL

Overhauling SC atomics in C11 and OpenCL Overhauling SC atomics in C11 and OpenCL Mark Batty, Alastair F. Donaldson and John Wickerson INCITS/ISO/IEC 9899-2011[2012] (ISO/IEC 9899-2011, IDT) Provisional Specification Information technology Programming

More information

Overview: Memory Consistency

Overview: Memory Consistency Overview: Memory Consistency the ordering of memory operations basic definitions; sequential consistency comparison with cache coherency relaxing memory consistency write buffers the total store ordering

More information

THÈSE DE DOCTORAT de l Université de recherche Paris Sciences Lettres PSL Research University

THÈSE DE DOCTORAT de l Université de recherche Paris Sciences Lettres PSL Research University THÈSE DE DOCTORAT de l Université de recherche Paris Sciences Lettres PSL Research University Préparée à l École normale supérieure Compiler Optimisations and Relaxed Memory Consistency Models ED 386 sciences

More information

Memory Consistency Models

Memory Consistency Models Memory Consistency Models Contents of Lecture 3 The need for memory consistency models The uniprocessor model Sequential consistency Relaxed memory models Weak ordering Release consistency Jonas Skeppstedt

More information

New Programming Abstractions for Concurrency in GCC 4.7. Torvald Riegel Red Hat 12/04/05

New Programming Abstractions for Concurrency in GCC 4.7. Torvald Riegel Red Hat 12/04/05 New Programming Abstractions for Concurrency in GCC 4.7 Red Hat 12/04/05 1 Concurrency and atomicity C++11 atomic types Transactional Memory Provide atomicity for concurrent accesses by different threads

More information

New Programming Abstractions for Concurrency. Torvald Riegel Red Hat 12/04/05

New Programming Abstractions for Concurrency. Torvald Riegel Red Hat 12/04/05 New Programming Abstractions for Concurrency Red Hat 12/04/05 1 Concurrency and atomicity C++11 atomic types Transactional Memory Provide atomicity for concurrent accesses by different threads Both based

More information

Programming in Parallel COMP755

Programming in Parallel COMP755 Programming in Parallel COMP755 All games have morals; and the game of Snakes and Ladders captures, as no other activity can hope to do, the eternal truth that for every ladder you hope to climb, a snake

More information

Chenyu Zheng. CSCI 5828 Spring 2010 Prof. Kenneth M. Anderson University of Colorado at Boulder

Chenyu Zheng. CSCI 5828 Spring 2010 Prof. Kenneth M. Anderson University of Colorado at Boulder Chenyu Zheng CSCI 5828 Spring 2010 Prof. Kenneth M. Anderson University of Colorado at Boulder Actuality Introduction Concurrency framework in the 2010 new C++ standard History of multi-threading in C++

More information

Using Weakly Ordered C++ Atomics Correctly. Hans-J. Boehm

Using Weakly Ordered C++ Atomics Correctly. Hans-J. Boehm Using Weakly Ordered C++ Atomics Correctly Hans-J. Boehm 1 Why atomics? Programs usually ensure that memory locations cannot be accessed by one thread while being written by another. No data races. Typically

More information

CS5460: Operating Systems

CS5460: Operating Systems CS5460: Operating Systems Lecture 9: Implementing Synchronization (Chapter 6) Multiprocessor Memory Models Uniprocessor memory is simple Every load from a location retrieves the last value stored to that

More information

COMP Parallel Computing. CC-NUMA (2) Memory Consistency

COMP Parallel Computing. CC-NUMA (2) Memory Consistency COMP 633 - Parallel Computing Lecture 11 September 26, 2017 Memory Consistency Reading Patterson & Hennesey, Computer Architecture (2 nd Ed.) secn 8.6 a condensed treatment of consistency models Coherence

More information

Reasoning About The Implementations Of Concurrency Abstractions On x86-tso. By Scott Owens, University of Cambridge.

Reasoning About The Implementations Of Concurrency Abstractions On x86-tso. By Scott Owens, University of Cambridge. Reasoning About The Implementations Of Concurrency Abstractions On x86-tso By Scott Owens, University of Cambridge. Plan Intro Data Races And Triangular Races Examples 2 sequential consistency The result

More information

Verifying Fence Elimination Optimisations

Verifying Fence Elimination Optimisations Verifying Fence Elimination Optimisations Viktor Vafeiadis 1 and Francesco Zappa Nardelli 2 1 MPI-SWS 2 INRIA Abstract. We consider simple compiler optimisations for removing redundant memory fences in

More information

Reasoning between Programming Languages and Architectures

Reasoning between Programming Languages and Architectures École normale supérieure Mémoire d habilitation à diriger des recherches Specialité Informatique Reasoning between Programming Languages and Architectures Francesco Zappa Nardelli Présenté aux rapporteurs

More information

C++ Memory Model. Don t believe everything you read (from shared memory)

C++ Memory Model. Don t believe everything you read (from shared memory) C++ Memory Model Don t believe everything you read (from shared memory) The Plan Why multithreading is hard Warm-up example Sequential Consistency Races and fences The happens-before relation The DRF guarantee

More information

Multicore Semantics and Programming

Multicore Semantics and Programming Multicore Semantics and Programming Peter Sewell University of Cambridge Tim Harris MSR with thanks to Francesco Zappa Nardelli, Jaroslav Ševčík, Susmit Sarkar, Tom Ridge, Scott Owens, Magnus O. Myreen,

More information

Distributed Operating Systems Memory Consistency

Distributed Operating Systems Memory Consistency Faculty of Computer Science Institute for System Architecture, Operating Systems Group Distributed Operating Systems Memory Consistency Marcus Völp (slides Julian Stecklina, Marcus Völp) SS2014 Concurrent

More information

Reasoning about the C/C++ weak memory model

Reasoning about the C/C++ weak memory model Reasoning about the C/C++ weak memory model Viktor Vafeiadis Max Planck Institute for Software Systems (MPI-SWS) 13 October 2014 Talk outline I. Introduction Weak memory models The C11 concurrency model

More information

Pattern-based Synthesis of Synchronization for the C++ Memory Model

Pattern-based Synthesis of Synchronization for the C++ Memory Model Pattern-based Synthesis of Synchronization for the C++ Memory Model Yuri Meshman Technion Noam Rinetzky Tel Aviv University Eran Yahav Technion Abstract We address the problem of synthesizing efficient

More information

Programming Languages

Programming Languages TECHNISCHE UNIVERSITÄT MÜNCHEN FAKULTÄT FÜR INFORMATIK Programming Languages Concurrency: Atomic Executions, Locks and Monitors Dr. Michael Petter Winter term 2016 Atomic Executions, Locks and Monitors

More information

X-86 Memory Consistency

X-86 Memory Consistency X-86 Memory Consistency Andreas Betz University of Kaiserslautern a betz12@cs.uni-kl.de Abstract In recent years multiprocessors have become ubiquitous and with them the need for concurrent programming.

More information

Brookes is Relaxed, Almost!

Brookes is Relaxed, Almost! Brookes is Relaxed, Almost! Radha Jagadeesan, Gustavo Petri, and James Riely School of Computing, DePaul University Abstract. We revisit the Brookes [1996] semantics for a shared variable parallel programming

More information

x86-tso: A Rigorous and Usable Programmer s Model for x86 Multiprocessors

x86-tso: A Rigorous and Usable Programmer s Model for x86 Multiprocessors x86-tso: A Rigorous and Usable Programmer s Model for x86 Multiprocessors Peter Sewell University of Cambridge Francesco Zappa Nardelli INRIA Susmit Sarkar University of Cambridge Magnus O. Myreen University

More information

Moscova. Jean-Jacques Lévy. March 23, INRIA Paris Rocquencourt

Moscova. Jean-Jacques Lévy. March 23, INRIA Paris Rocquencourt Moscova Jean-Jacques Lévy INRIA Paris Rocquencourt March 23, 2011 Research team Stats Staff 2008-2011 Jean-Jacques Lévy, INRIA Karthikeyan Bhargavan, INRIA James Leifer, INRIA Luc Maranget, INRIA Francesco

More information

A Concurrency Semantics for Relaxed Atomics that Permits Optimisation and Avoids Thin-Air Executions

A Concurrency Semantics for Relaxed Atomics that Permits Optimisation and Avoids Thin-Air Executions A Concurrency Semantics for Relaxed Atomics that Permits Optimisation and Avoids Thin-Air Executions Jean Pichon-Pharabod Peter Sewell University of Cambridge, United Kingdom first.last@cl.cam.ac.uk Abstract

More information

Advanced OpenMP. Memory model, flush and atomics

Advanced OpenMP. Memory model, flush and atomics Advanced OpenMP Memory model, flush and atomics Why do we need a memory model? On modern computers code is rarely executed in the same order as it was specified in the source code. Compilers, processors

More information

Relaxed Memory Consistency

Relaxed Memory Consistency Relaxed Memory Consistency Jinkyu Jeong (jinkyu@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu SSE3054: Multicore Systems, Spring 2017, Jinkyu Jeong (jinkyu@skku.edu)

More information

Memory Consistency Models: Convergence At Last!

Memory Consistency Models: Convergence At Last! Memory Consistency Models: Convergence At Last! Sarita Adve Department of Computer Science University of Illinois at Urbana-Champaign sadve@cs.uiuc.edu Acks: Co-authors: Mark Hill, Kourosh Gharachorloo,

More information

C11 Compiler Mappings: Exploration, Verification, and Counterexamples

C11 Compiler Mappings: Exploration, Verification, and Counterexamples C11 Compiler Mappings: Exploration, Verification, and Counterexamples Yatin Manerkar Princeton University manerkar@princeton.edu http://check.cs.princeton.edu November 22 nd, 2016 1 Compilers Must Uphold

More information

Partially Redundant Fence Elimination for x86, ARM, and Power Processors

Partially Redundant Fence Elimination for x86, ARM, and Power Processors Partially Redundant Fence Elimination for x86, ARM, and Power Processors Robin Morisset Francesco Zappa Nardelli ENS & Inria, France robin.morisset@normalesup.org francesco.zappa nardelli@inria.fr Abstract

More information

Verification and Validation

Verification and Validation Verification and Validation Assuring that a software system meets a user's needs Ian Sommerville 2000 Software Engineering, 6th edition. Chapter 19 Slide 1 Objectives To introduce software verification

More information

Memory Consistency Models. CSE 451 James Bornholt

Memory Consistency Models. CSE 451 James Bornholt Memory Consistency Models CSE 451 James Bornholt Memory consistency models The short version: Multiprocessors reorder memory operations in unintuitive, scary ways This behavior is necessary for performance

More information

Threads Cannot Be Implemented As a Library

Threads Cannot Be Implemented As a Library Threads Cannot Be Implemented As a Library Authored by Hans J. Boehm Presented by Sarah Sharp February 18, 2008 Outline POSIX Thread Library Operation Vocab Problems with pthreads POSIX Thread Library

More information

C++ 11 Memory Consistency Model. Sebastian Gerstenberg NUMA Seminar

C++ 11 Memory Consistency Model. Sebastian Gerstenberg NUMA Seminar C++ 11 Memory Gerstenberg NUMA Seminar Agenda 1. Sequential Consistency 2. Violation of Sequential Consistency Non-Atomic Operations Instruction Reordering 3. C++ 11 Memory 4. Trade-Off - Examples 5. Conclusion

More information

Concept of a process

Concept of a process Concept of a process In the context of this course a process is a program whose execution is in progress States of a process: running, ready, blocked Submit Ready Running Completion Blocked Concurrent

More information

GPU Concurrency: Weak Behaviours and Programming Assumptions

GPU Concurrency: Weak Behaviours and Programming Assumptions GPU Concurrency: Weak Behaviours and Programming Assumptions Jyh-Jing Hwang, Yiren(Max) Lu 03/02/2017 Outline 1. Introduction 2. Weak behaviors examples 3. Test methodology 4. Proposed memory model 5.

More information

Don t sit on the fence: A static analysis approach to automatic fence insertion

Don t sit on the fence: A static analysis approach to automatic fence insertion Don t sit on the fence: A static analysis approach to automatic fence insertion Power, ARM! SC! \ / ~. ~ ) / ( / / \_/ \ / /\ / \ Jade Alglave, Daniel Kroening, Vincent Nimal, Daniel Poetzl University

More information

Review of last lecture. Peer Quiz. DPHPC Overview. Goals of this lecture. Is coherence everything?

Review of last lecture. Peer Quiz. DPHPC Overview. Goals of this lecture. Is coherence everything? Review of last lecture Design of Parallel and High-Performance Computing Fall Lecture: Memory Models Motivational video: https://www.youtube.com/watch?v=twhtg4ous Instructor: Torsten Hoefler & Markus Püschel

More information

CompCertTSO: A Verified Compiler for Relaxed-Memory Concurrency

CompCertTSO: A Verified Compiler for Relaxed-Memory Concurrency CompCertTSO: A Verified Compiler for Relaxed-Memory Concurrency JAROSLAV ŠEVČÍK, University of Cambridge VIKTOR VAFEIADIS, MPI-SWS FRANCESCO ZAPPA NARDELLI, INRIA SURESH JAGANNATHAN, Purdue University

More information

An operational semantics for C/C++11 concurrency

An operational semantics for C/C++11 concurrency Draft An operational semantics for C/C++11 concurrency Kyndylan Nienhuis Kayvan Memarian Peter Sewell University of Cambridge first.last@cl.cam.ac.uk Abstract The C/C++11 concurrency model balances two

More information

Applied Theorem Proving: Modelling Instruction Sets and Decompiling Machine Code. Anthony Fox University of Cambridge, Computer Laboratory

Applied Theorem Proving: Modelling Instruction Sets and Decompiling Machine Code. Anthony Fox University of Cambridge, Computer Laboratory Applied Theorem Proving: Modelling Instruction Sets and Decompiling Machine Code Anthony Fox University of Cambridge, Computer Laboratory Overview This talk will mainly focus on 1. Specifying instruction

More information

Beyond Sequential Consistency: Relaxed Memory Models

Beyond Sequential Consistency: Relaxed Memory Models 1 Beyond Sequential Consistency: Relaxed Memory Models Computer Science and Artificial Intelligence Lab M.I.T. Based on the material prepared by and Krste Asanovic 2 Beyond Sequential Consistency: Relaxed

More information

CS533 Concepts of Operating Systems. Jonathan Walpole

CS533 Concepts of Operating Systems. Jonathan Walpole CS533 Concepts of Operating Systems Jonathan Walpole Shared Memory Consistency Models: A Tutorial Outline Concurrent programming on a uniprocessor The effect of optimizations on a uniprocessor The effect

More information

OS Process Synchronization!

OS Process Synchronization! OS Process Synchronization! Race Conditions! The Critical Section Problem! Synchronization Hardware! Semaphores! Classical Problems of Synchronization! Synchronization HW Assignment! 3.1! Concurrent Access

More information

Unit 12: Memory Consistency Models. Includes slides originally developed by Prof. Amir Roth

Unit 12: Memory Consistency Models. Includes slides originally developed by Prof. Amir Roth Unit 12: Memory Consistency Models Includes slides originally developed by Prof. Amir Roth 1 Example #1 int x = 0;! int y = 0;! thread 1 y = 1;! thread 2 int t1 = x;! x = 1;! int t2 = y;! print(t1,t2)!

More information

Shared Memory Consistency Models: A Tutorial

Shared Memory Consistency Models: A Tutorial Shared Memory Consistency Models: A Tutorial By Sarita Adve & Kourosh Gharachorloo Slides by Jim Larson Outline Concurrent programming on a uniprocessor The effect of optimizations on a uniprocessor The

More information

Multiprocessor Synchronization

Multiprocessor Synchronization Multiprocessor Systems Memory Consistency In addition, read Doeppner, 5.1 and 5.2 (Much material in this section has been freely borrowed from Gernot Heiser at UNSW and from Kevin Elphinstone) MP Memory

More information

arxiv: v1 [cs.pl] 29 Aug 2018

arxiv: v1 [cs.pl] 29 Aug 2018 Memory Consistency Models using Constraints Özgür Akgün, Ruth Hoffmann, and Susmit Sarkar arxiv:1808.09870v1 [cs.pl] 29 Aug 2018 School of Computer Science, University of St Andrews, UK {ozgur.akgun, rh347,

More information

C++ Memory Model Tutorial

C++ Memory Model Tutorial C++ Memory Model Tutorial Wenzhu Man C++ Memory Model Tutorial 1 / 16 Outline 1 Motivation 2 Memory Ordering for Atomic Operations The synchronizes-with and happens-before relationship (not from lecture

More information

Mohamed M. Saad & Binoy Ravindran

Mohamed M. Saad & Binoy Ravindran Mohamed M. Saad & Binoy Ravindran VT-MENA Program Electrical & Computer Engineering Department Virginia Polytechnic Institute and State University TRANSACT 11 San Jose, CA An operation (or set of operations)

More information

Concurrency. Johan Montelius KTH

Concurrency. Johan Montelius KTH Concurrency Johan Montelius KTH 2017 1 / 32 What is concurrency? 2 / 32 What is concurrency? Concurrency: (the illusion of) happening at the same time. 2 / 32 What is concurrency? Concurrency: (the illusion

More information

Generating Code from Event-B Using an Intermediate Specification Notation

Generating Code from Event-B Using an Intermediate Specification Notation Rodin User and Developer Workshop Generating Code from Event-B Using an Intermediate Specification Notation Andy Edmunds - ae02@ecs.soton.ac.uk Michael Butler - mjb@ecs.soton.ac.uk Between Abstract Development

More information

Lecture 24: Multiprocessing Computer Architecture and Systems Programming ( )

Lecture 24: Multiprocessing Computer Architecture and Systems Programming ( ) Systems Group Department of Computer Science ETH Zürich Lecture 24: Multiprocessing Computer Architecture and Systems Programming (252-0061-00) Timothy Roscoe Herbstsemester 2012 Most of the rest of this

More information

<atomic.h> weapons. Paolo Bonzini Red Hat, Inc. KVM Forum 2016

<atomic.h> weapons. Paolo Bonzini Red Hat, Inc. KVM Forum 2016 weapons Paolo Bonzini Red Hat, Inc. KVM Forum 2016 The real things Herb Sutter s talks atomic Weapons: The C++ Memory Model and Modern Hardware Lock-Free Programming (or, Juggling Razor Blades)

More information

Typed Assembly Language for Implementing OS Kernels in SMP/Multi-Core Environments with Interrupts

Typed Assembly Language for Implementing OS Kernels in SMP/Multi-Core Environments with Interrupts Typed Assembly Language for Implementing OS Kernels in SMP/Multi-Core Environments with Interrupts Toshiyuki Maeda and Akinori Yonezawa University of Tokyo Quiz [Environment] CPU: Intel Xeon X5570 (2.93GHz)

More information

Compositional Verification of Compiler Optimisations on Relaxed Memory

Compositional Verification of Compiler Optimisations on Relaxed Memory Compositional Verification of Compiler Optimisations on Relaxed Memory Mike Dodds 1, Mark Batty 2, and Alexey Gotsman 3 1 Galois Inc. 2 University of Kent 3 IMDEA Software Institute Abstract. A valid compiler

More information

Announcements. ECE4750/CS4420 Computer Architecture L17: Memory Model. Edward Suh Computer Systems Laboratory

Announcements. ECE4750/CS4420 Computer Architecture L17: Memory Model. Edward Suh Computer Systems Laboratory ECE4750/CS4420 Computer Architecture L17: Memory Model Edward Suh Computer Systems Laboratory suh@csl.cornell.edu Announcements HW4 / Lab4 1 Overview Symmetric Multi-Processors (SMPs) MIMD processing cores

More information

What is concurrency? Concurrency. What is parallelism? concurrency vs parallelism. Concurrency: (the illusion of) happening at the same time.

What is concurrency? Concurrency. What is parallelism? concurrency vs parallelism. Concurrency: (the illusion of) happening at the same time. What is concurrency? Concurrency Johan Montelius KTH 2017 Concurrency: (the illusion of) happening at the same time. A property of the programing model. Why would we want to do things concurrently? What

More information

A CompCertTSO: A Verified Compiler for Relaxed-Memory Concurrency

A CompCertTSO: A Verified Compiler for Relaxed-Memory Concurrency A CompCertTSO: A Verified Compiler for Relaxed-Memory Concurrency JAROSLAV ŠEVČÍK, Microsoft VIKTOR VAFEIADIS, MPI-SWS FRANCESCO ZAPPA NARDELLI, INRIA SURESH JAGANNATHAN, Purdue University PETER SEWELL,

More information

Concurrent Processes Rab Nawaz Jadoon

Concurrent Processes Rab Nawaz Jadoon Concurrent Processes Rab Nawaz Jadoon DCS COMSATS Institute of Information Technology Assistant Professor COMSATS Lahore Pakistan Operating System Concepts Concurrent Processes If more than one threads

More information

Symmetric Multiprocessors: Synchronization and Sequential Consistency

Symmetric Multiprocessors: Synchronization and Sequential Consistency Constructive Computer Architecture Symmetric Multiprocessors: Synchronization and Sequential Consistency Arvind Computer Science & Artificial Intelligence Lab. Massachusetts Institute of Technology November

More information

Static and dynamic Testing

Static and dynamic Testing Static and dynamic Testing Static testing Requirements specification High-level design Formal specification Detailed design Program Prototype Dynamic testing Ian Sommerville 1995 Software Engineering,

More information

CS 252 Graduate Computer Architecture. Lecture 11: Multiprocessors-II

CS 252 Graduate Computer Architecture. Lecture 11: Multiprocessors-II CS 252 Graduate Computer Architecture Lecture 11: Multiprocessors-II Krste Asanovic Electrical Engineering and Computer Sciences University of California, Berkeley http://www.eecs.berkeley.edu/~krste http://inst.eecs.berkeley.edu/~cs252

More information

Threads and Memory Model for C++

Threads and Memory Model for C++ Threads and Memory Model for C++ Hochschule für Technik Rapperswil JORGE MIGUEL MACHADO Supervisor: Prof. Peter Sommerlad January 11, 2012 Abstract With the introduction of the new C++11 standard, additional

More information

Repairing Sequential Consistency in C/C++11

Repairing Sequential Consistency in C/C++11 Repairing Sequential Consistency in C/C++11 Ori Lahav MPI-SWS, Germany orilahav@mpi-sws.org Viktor Vafeiadis MPI-SWS, Germany viktor@mpi-sws.org Jeehoon Kang Seoul National University, Korea jeehoon.kang@sf.snu.ac.kr

More information

Checking Concurrent Java Programs

Checking Concurrent Java Programs Checking Concurrent Java Programs On Event Spaces Néstor CATAÑO Nestor.Catano@sophia.inria.fr INRIA Sophia-Antipolis, France ModoCop,1st July 2002 p.1/22 Outline Concurrent Java programs Event Spaces Constructing

More information