Primitive Task-Parallel Constructs
  The begin statement
  The sync types
Structured Task-Parallel Constructs
Atomic Transactions and Memory Consistency

Steve Deitz, Cray Inc.

Task: a unit of parallel work in a Chapel program; all Chapel parallelism is implemented using tasks.


Primitive Task-Parallel Constructs
  The begin statement
  The sync types
Structured Task-Parallel Constructs
Atomic Transactions and Memory Consistency

Chapel: Task Parallelism 2

Syntax:
  begin-stmt:
    begin stmt

Semantics:
  Creates a concurrent task to execute stmt
  Control continues immediately (no join)

Example:
  begin writeln("hello world");
  writeln("good bye");

Possible outputs:
  hello world      or      good bye
  good bye                 hello world
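The fork-without-join behavior of begin can be mimicked in other thread libraries; here is a rough Python analogue (my own sketch, not Chapel's implementation):

```python
import threading

out = []
task = threading.Thread(target=lambda: out.append("hello world"))
task.start()            # like `begin`: spawn the task and continue immediately
out.append("good bye")  # may interleave either way with the spawned task
task.join()             # Chapel instead joins outstanding tasks at program exit
# out holds both strings, in either order
```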

Combine begin and on:
  begin on Locale(1) do
    writeln("Hi from ", here.id);
  writeln("Bye from ", here.id);

Possible outputs:
  Hi from 1      or      Bye from 0
  Bye from 0             Hi from 1

Syntax:
  sync-type:
    sync type

Semantics:
  Variable has a value and a state (full or empty)
  Default read blocks until written (until full)
  Default write blocks until read (until empty)

Example: critical sections
  var lock$: sync bool;  // state is empty
  lock$ = true;          // state is full
  critical();
  lock$;                 // state is empty

readFE():t     wait until full, leave empty, return value
readFF():t     wait until full, leave full, return value
readXX():t     non-blocking, return value
writeEF(v:t)   wait until empty, leave full, set value to v
writeFF(v:t)   wait until full, leave full, set value to v
writeXF(v:t)   non-blocking, leave full, set value to v
reset()        non-blocking, leave empty, reset value
isFull: bool   non-blocking, return true if full else false

Defaults: read is readFE, write is writeEF
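Chapel's full/empty semantics can be approximated in other languages; the following Python sketch (the FEVar class is hypothetical, with method names modeled on readFE/writeEF) uses a condition variable:

```python
import threading

class FEVar:
    """A full/empty cell approximating a Chapel sync variable
    (hypothetical class; method names mirror readFE/writeEF)."""
    def __init__(self):
        self._cond = threading.Condition()
        self._full = False
        self._value = None

    def writeEF(self, v):
        # wait until empty, set value, leave full (Chapel's default write)
        with self._cond:
            self._cond.wait_for(lambda: not self._full)
            self._value, self._full = v, True
            self._cond.notify_all()

    def readFE(self):
        # wait until full, leave empty, return value (Chapel's default read)
        with self._cond:
            self._cond.wait_for(lambda: self._full)
            self._full = False
            self._cond.notify_all()
            return self._value

lock = FEVar()
lock.writeEF(True)   # state is full -- analogous to lock$ = true;
# ... critical section ...
lock.readFE()        # state is empty -- analogous to reading lock$
```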

examples/primers/taskparallel.chpl
examples/programs/prodcons.chpl

Primitive Task-Parallel Constructs
Structured Task-Parallel Constructs
  The cobegin statement
  The coforall loop
  The sync statement
Atomic Transactions and Memory Consistency
Implementation Notes and Examples

Syntax:
  cobegin-stmt:
    cobegin { stmt-list }

Semantics:
  Invokes a concurrent task for each listed stmt
  Control waits to continue (implicit join)

Example:
  cobegin {
    consumer(1);
    consumer(2);
    producer();
  }
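In thread-library terms, cobegin is fork/join: spawn one task per statement, then wait for all of them. A rough Python analogue (the cobegin helper is my own, not a real API):

```python
import threading

def cobegin(*stmts):
    """Run each callable in its own thread and join them all,
    mirroring cobegin's implicit join (hypothetical helper)."""
    threads = [threading.Thread(target=stmt) for stmt in stmts]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

results = []
cobegin(lambda: results.append("consumer 1"),
        lambda: results.append("consumer 2"),
        lambda: results.append("producer"))
# here, all three statements have completed (in some interleaved order)
```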

Any cobegin statement
  cobegin {
    stmt1();
    stmt2();
    stmt3();
  }
can be rewritten in terms of begin statements
  var s1$, s2$, s3$: sync bool;
  begin { stmt1(); s1$ = true; }
  begin { stmt2(); s2$ = true; }
  begin { stmt3(); s3$ = true; }
  s1$; s2$; s3$;
but the compiler may miss out on optimizations.

Syntax:
  coforall-loop:
    coforall index-expr in iteratable-expr { stmt }

Semantics:
  Loops over iteratable-expr, invoking a concurrent task for each iteration
  Control waits to continue (implicit join)

Example:
  begin producer();
  coforall i in 1..numConsumers {
    consumer(i);
  }
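The begin-producer/coforall-consumers pattern above can be approximated with Python threads and a queue (this sketch is my own analogue; the sentinel protocol is an assumption, not part of the slide's code):

```python
import queue
import threading

numConsumers = 3
buffer = queue.Queue()
consumed = []
consumed_lock = threading.Lock()

def producer():
    for item in range(10):
        buffer.put(item)
    for _ in range(numConsumers):
        buffer.put(None)          # one stop sentinel per consumer (assumption)

def consumer(i):
    while True:
        item = buffer.get()
        if item is None:
            break
        with consumed_lock:
            consumed.append(item)

threading.Thread(target=producer).start()        # begin producer(); -- no join
workers = [threading.Thread(target=consumer, args=(i,))
           for i in range(1, numConsumers + 1)]  # coforall i in 1..numConsumers
for w in workers:
    w.start()
for w in workers:
    w.join()                                     # coforall's implicit join
```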

Use begin when
  Creating tasks with unbounded lifetimes
  Load balancing requires dynamic task creation
  cobegin and coforall are insufficient for task structuring

Use cobegin when
  Invoking a fixed number of tasks (potentially heterogeneous)
  The tasks have bounded lifetimes

Use coforall when
  Invoking a fixed or dynamic number of homogeneous tasks
  The tasks have bounded lifetimes

Use for when
  A loop must be executed serially
  One task is sufficient for performance

Use forall when
  The loop can be executed in parallel
  The loop can be executed serially
  Degree of concurrency << number of iterations

Use coforall when
  The loop must be executed in parallel (and not just for performance reasons!)
  Each iteration has substantial work

Syntax:
  sync-statement:
    sync stmt

Semantics:
  Executes stmt
  Waits on all dynamically-encountered begins

Example:
  sync {
    for i in 1..numConsumers {
      begin consumer(i);
    }
    producer();
  }
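The dynamic join of the sync statement can be approximated with a context manager that records every task spawned through it (a simplified sketch of my own; unlike Chapel's sync, it only sees begins issued through the same object, not begins hidden inside called functions):

```python
import threading

class SyncBlock:
    """Joins every task spawned through it when the block exits
    (hypothetical helper approximating Chapel's sync statement)."""
    def __enter__(self):
        self._threads = []
        return self

    def begin(self, fn, *args):
        # spawn a task and remember it for the implicit join
        t = threading.Thread(target=fn, args=args)
        self._threads.append(t)
        t.start()

    def __exit__(self, *exc):
        for t in self._threads:
            t.join()
        return False

done = []
with SyncBlock() as s:
    for i in range(1, 4):
        s.begin(done.append, i)
# all tasks begun inside the block have finished here
```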

Where the cobegin statement is static,
  cobegin {
    functionWithBegin();
    functionWithoutBegin();
  }
the sync statement is dynamic.
  sync {
    begin functionWithBegin();
    begin functionWithoutBegin();
  }
Program termination is defined by an implicit sync:
  sync main();

examples/primers/taskparallel.chpl

Primitive Task-Parallel Constructs
Structured Task-Parallel Constructs
Atomic Transactions and Memory Consistency
  The atomic statement
  Races and memory consistency
Implementation Notes and Examples

Syntax:
  atomic-statement:
    atomic stmt

Semantics:
  Executes stmt so that it appears as a single operation
  No other task sees a partial result

Examples:
  atomic A(i) = A(i) + 1;

  atomic {
    newNode.next = node;
    newNode.prev = node.prev;
    node.prev.next = newNode;
    node.prev = newNode;
  }
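For a simple update like A(i) = A(i) + 1, the observable all-or-nothing effect of atomic can be reproduced with a mutual-exclusion lock; a Python sketch (my own analogue, not how Chapel implements atomic):

```python
import threading

counter = 0
counter_lock = threading.Lock()

def increment(n):
    global counter
    for _ in range(n):
        with counter_lock:   # no other task ever sees a partial update
            counter += 1

tasks = [threading.Thread(target=increment, args=(10_000,)) for _ in range(4)]
for t in tasks:
    t.start()
for t in tasks:
    t.join()
# counter == 40000: every increment took effect as a single operation
```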

Example:
  var x = 0, y = 0;
  cobegin {
    { x = 1; y = 1; }
    { write(y); write(x); }
  }

Three representative interleavings of Task 1 and Task 2:
  write(y); write(x); x = 1; y = 1;   // output 00
  x = 1; write(y); write(x); y = 1;   // output 01
  x = 1; y = 1; write(y); write(x);   // output 11

Could the output be 10? Or 42?
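The claim that 10 is impossible can be checked mechanically by enumerating every sequentially consistent interleaving of the two tasks; a small Python sketch:

```python
from itertools import combinations

def interleavings():
    """Enumerate every sequentially consistent interleaving of
    Task 1 (x = 1; y = 1) with Task 2 (write(y); write(x))."""
    writes = [("x", 1), ("y", 1)]   # Task 1, in program order
    reads = ["y", "x"]              # Task 2, in program order
    for slots in combinations(range(4), 2):   # slots taken by Task 1
        state = {"x": 0, "y": 0}
        printed = []
        i1 = i2 = 0
        for pos in range(4):
            if pos in slots:
                var, val = writes[i1]
                state[var] = val
                i1 += 1
            else:
                printed.append(str(state[reads[i2]]))
                i2 += 1
        yield "".join(printed)

outputs = set(interleavings())
# outputs == {"00", "01", "11"}: 10 (and certainly 42) never occurs
```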

A program without races is sequentially consistent.

  "A multiprocessing system has sequential consistency if the result of any
  execution is the same as if the operations of all the processors were
  executed in some sequential order, and the operations of each individual
  processor appear in this sequence in the order specified by its program."
  -- Leslie Lamport

The behavior of a program with races is undefined.

Synchronization is achieved in two ways:
  By reading or writing sync (or single) variables
  By executing atomic statements

Task teams
Suspendable tasks
Work stealing, load balancing
Eurekas
Task-private variables

Primitive Task-Parallel Constructs
  The begin statement
  The sync types
Structured Task-Parallel Constructs
  The cobegin statement
  The coforall loop
  The sync statement
Atomic Transactions and Memory Consistency
  The atomic statement
  Races and memory consistency