Shared Memory Programming with OpenMP. Lecture 8: Memory model, flush and atomics

Similar documents
Advanced OpenMP. Memory model, flush and atomics

1 of 6 Lecture 7: March 4. CISC 879 Software Support for Multicore Architectures Spring Lecture 7: March 4, 2008

CS510 Advanced Topics in Concurrency. Jonathan Walpole

Lecture 24: Multiprocessing Computer Architecture and Systems Programming ( )

Parallel Numerical Algorithms

C++ Memory Model. Don t believe everything you read (from shared memory)

OpenMP Fundamentals Fork-join model and data environment

CS 5220: Shared memory programming. David Bindel

Module 10: Open Multi-Processing Lecture 19: What is Parallelization? The Lecture Contains: What is Parallelization? Perfectly Load-Balanced Program

The OpenMP Memory Model

CME 213 S PRING Eric Darve

OpenMP Tutorial. Dirk Schmidl. IT Center, RWTH Aachen University. Member of the HPC Group Christian Terboven

Parallel Computing. Lecture 16: OpenMP - IV

Concurrent Programming in the D Programming Language. by Walter Bright Digital Mars

OpenMP. OpenMP. Portable programming of shared memory systems. It is a quasi-standard. OpenMP-Forum API for Fortran and C/C++

Introduction to OpenMP

OPENMP TIPS, TRICKS AND GOTCHAS

OpenMP C and C++ Application Program Interface Version 1.0 October Document Number

Concurrency, Thread. Dongkun Shin, SKKU

15-418, Spring 2008 OpenMP: A Short Introduction

Lecture 16: Recapitulations. Lecture 16: Recapitulations p. 1

Parallel Programming with OpenMP. CS240A, T. Yang

Questions from last time

Parallel Programming with OpenMP

Parallel Programming with OpenMP. CS240A, T. Yang, 2013 Modified from Demmel/Yelick s and Mary Hall s Slides

CMSC 714 Lecture 4 OpenMP and UPC. Chau-Wen Tseng (from A. Sussman)

Allows program to be incrementally parallelized

Shared Memory Consistency Models: A Tutorial

OPENMP TIPS, TRICKS AND GOTCHAS

Overview: Memory Consistency

Concurrent Programming Lecture 10

Using Weakly Ordered C++ Atomics Correctly. Hans-J. Boehm

The OpenMP Memory Model

OpenMP Application Program Interface

Introduction to. Slides prepared by : Farzana Rahman 1

CSL 860: Modern Parallel

Introduction to OpenMP

Review. Tasking. 34a.cpp. Lecture 14. Work Tasking 5/31/2011. Structured block. Parallel construct. Working-Sharing contructs.

Shared Memory Programming Model

CS 470 Spring Mike Lam, Professor. Advanced OpenMP

Scientific Programming in C XIV. Parallel programming

Barbara Chapman, Gabriele Jost, Ruud van der Pas

More Advanced OpenMP. Saturday, January 30, 16

Relaxed Memory-Consistency Models

Overview: The OpenMP Programming Model

Java Memory Model. Jian Cao. Department of Electrical and Computer Engineering Rice University. Sep 22, 2016

Implementation of Parallelization

CMSC 330: Organization of Programming Languages

A brief introduction to OpenMP

Programming with Shared Memory PART II. HPC Fall 2007 Prof. Robert van Engelen

Review. Lecture 12 5/22/2012. Compiler Directives. Library Functions Environment Variables. Compiler directives for construct, collapse clause

CS4961 Parallel Programming. Lecture 12: Advanced Synchronization (Pthreads) 10/4/11. Administrative. Mary Hall October 4, 2011

Parallel Computer Architecture Spring Memory Consistency. Nikos Bellas

Multithreading in C with OpenMP

High-level languages

OpenMP. António Abreu. Instituto Politécnico de Setúbal. 1 de Março de 2013

Introduction to Standard OpenMP 3.1

OpenMP Tasking Model Unstructured parallelism

An OpenMP Validation Suite

Comparing OpenACC 2.5 and OpenMP 4.1 James C Beyer PhD, Sept 29 th 2015

Shared Memory Programming Paradigm!

[Potentially] Your first parallel application

Using Relaxed Consistency Models

Foundations of the C++ Concurrency Memory Model

Hybrid MPI/OpenMP parallelization. Recall: MPI uses processes for parallelism. Each process has its own, separate address space.

ECE 574 Cluster Computing Lecture 10

CS5460: Operating Systems

Threads and Java Memory Model

Introduction to OpenMP. Tasks. N.M. Maclaren September 2017

Parallel Computing. Prof. Marco Bertini

The New Java Technology Memory Model

Tasking in OpenMP 4. Mirko Cestari - Marco Rorro -

Memory Consistency Models

Static Data Race Detection for SPMD Programs via an Extended Polyhedral Representation

Parallelising Scientific Codes Using OpenMP. Wadud Miah Research Computing Group

CS 470 Spring Mike Lam, Professor. OpenMP

DPHPC: Introduction to OpenMP Recitation session

G52CON: Concepts of Concurrency

Programming with Shared Memory PART II. HPC Fall 2012 Prof. Robert van Engelen

CS 470 Spring Mike Lam, Professor. OpenMP

Jukka Julku Multicore programming: Low-level libraries. Outline. Processes and threads TBB MPI UPC. Examples

Alfio Lazzaro: Introduction to OpenMP

Lecture 36: MPI, Hybrid Programming, and Shared Memory. William Gropp

C++ 11 Memory Consistency Model. Sebastian Gerstenberg NUMA Seminar

Introduction to Parallel Programming Part 4 Confronting Race Conditions

<atomic.h> weapons. Paolo Bonzini Red Hat, Inc. KVM Forum 2016

Data Environment: Default storage attributes

Open Multi-Processing: Basic Course

GPU Concurrency: Weak Behaviours and Programming Assumptions

Introduction to OpenMP. OpenMP basics OpenMP directives, clauses, and library routines

CSE 506: Opera.ng Systems Memory Consistency

Advanced C Programming Winter Term 2008/09. Guest Lecture by Markus Thiele

Load-reserve / Store-conditional on POWER and ARM

Introduction to OpenMP

Shared Memory Programming Models I

Shared memory programming model OpenMP TMA4280 Introduction to Supercomputing

!OMP #pragma opm _OPENMP

Potential violations of Serializability: Example 1

Lecture: Consistency Models, TM. Topics: consistency models, TM intro (Section 5.6)

Introduction to OpenMP

Transcription:

Shared Memory Programming with OpenMP Lecture 8: Memory model, flush and atomics

Why do we need a memory model? On modern computers code is rarely executed in the same order as it was specified in the source code. Compilers, processors and memory systems reorder code to achieve maximum performance. Individual threads, when considered in isolation, exhibit as-ifserial semantics. Programmer s assumptions based on the memory model hold even in the face of code reordering performed by the compiler, the processors and the memory. 2

Example Reasoning about multithreaded execution is not that simple. T1 T2 x=1; int r1=y; y=1; int r2=x; If there is no reordering and T2 sees value of y on read to be 1 then the following read of x should also return the value 1. If code in T1 is reordered we can no longer make this assumption. 3

OpenMP Memory Model OpenMP supports a relaxed-consistency shared memory model. Threads can maintain a temporary view of shared memory which is not consistent with that of other threads. These temporary views are made consistent only at certain points in the program. The operation which enforces consistency is called the flush operation 4

Flush operation Defines a sequence point at which a thread is guaranteed to see a consistent view of memory All previous read/writes by this thread have completed and are visible to other threads No subsequent read/writes by this thread have occurred A flush operation is analogous to a fence in other shared memory API s 5

Flush and synchronization A flush operation is implied by OpenMP synchronizations, e.g. at entry/exit of parallel regions at implicit and explicit barriers at entry/exit of critical regions whenever a lock is set or unset. (but not at entry to worksharing regions or entry/exit of master regions) Note: using the volatile qualifier in C/C++ does not give sufficient guarantees about multithreaded execution. 6

Example: producer-consumer pattern Thread 0 Thread 1 a = foo(); flag = 1; while (!flag); b = a; This is incorrect code The compiler and/or hardware may re-order the reads/writes to a and flag, or flag may be held in a register. OpenMP has a flush directive which specifies an explicit flush operation can be used to make the above example work!$omp flush 7

Using flush In order for a write of a variable on one thread to be guaranteed visible and valid on a second thread, the following operations must occur in the following order: 1. Thread A writes the variable 2. Thread A executes a flush operation 3. Thread B executes a flush operation 4. Thread B reads the variable 8

Example: producer-consumer pattern Thread 0 a = foo(); flag = 1; First flush ensures flag is written after a Second flush ensures flag is written to memory Thread 1 while (!flag){ } b = a; First and second flushes ensure flag is read from memory Third flush ensures correct ordering of flushes 9

Using flush Using flush correctly is difficult and prone to subtle bugs extremely hard to test whether code is correct may execute correctly on one platform/compiler but not on another bugs can be triggered by changing the optimisation level on the compiler Don t use it unless you are 100% confident you know what you are doing! and even then 10

Other atomic forms Sometimes we may wish to enforce atomic behaviour for operations other than updates #pragma omp atomic read v = x; #pragma omp atomic write x = expr; #pragma omp atomic capture {v = x; x binop= expr;}!$omp atomic read v = x!$omp atomic write x = expr!$omp atomic capture v = x x = x op expr!$omp end atomic 11

Example: producer-consumer pattern Thread 0 Thread 1 a = foo(); #pragma omp atomic write flag = 1; while (!myflag){ #pragma omp atomic read myflag = flag; } b = a; To be strictly correct we should use atomics to avoid the race condition on flag. 12