Enhancements in OpenMP 2.0

mueller@hlrs.de
University of Stuttgart
High-Performance Computing-Center Stuttgart (HLRS)
www.hlrs.de

Outline
- Timeline
- Clarifications/Modifications
- New Features

Timeline
- OpenMP 1.0 for Fortran released in October 1997
- OpenMP 1.0 for C/C++ released in October 1998
- OpenMP 1.1 for Fortran released in November 1999
- OpenMP 2.0 for Fortran released in November 2000
- OpenMP 2.0 for C/C++ in preparation

Clarifications
- Interface definitions for the OpenMP runtime routines have been added
- An OpenMP-compliant implementation must document its implementation-defined behavior
- Clarification of the implied FLUSH directive
- Recycling of thread numbers is clarified (if dynamic threads are disabled, the threads keep the same number in subsequent parallel regions)
- Initialized data must have the SAVE attribute, as in Fortran 95

Clarifications: Implementation-defined behavior
See Appendix E of the standard.
- The size of the first chunk in SCHEDULE(GUIDED)
- Default schedule for SCHEDULE(RUNTIME)
- Default schedule
- Default number of threads
- Default for dynamic thread adjustment
- Number of threads used to execute nested parallel regions
- ATOMIC directives might be replaced by critical sections
- Behavior in case of thread exhaustion
- Use of parameters other than OMP_*_KIND in generic interfaces
- The allocation status of allocatable arrays that are not affected by a COPYIN clause is undefined if the dynamic thread mechanism is enabled

Clarifications: Implied flush directive
A FLUSH directive identifies a sequence point at which a consistent view of the shared memory is guaranteed.
It is implied at the following constructs:
- BARRIER
- CRITICAL and END CRITICAL
- END {DO, SECTIONS}
- END {SINGLE, WORKSHARE}
- ORDERED and END ORDERED
- PARALLEL and END PARALLEL, with their combined variants
It is NOT implied at the following constructs:
- DO
- MASTER and END MASTER
- SECTIONS
- SINGLE
- WORKSHARE
[Figure: threads number n and m; memory holds the new value, the cache holds the old value, next to the CPU registers. A cache flush is needed to load the new value into the CPU.]
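The list above says that MASTER and END MASTER imply no flush. A minimal sketch of what that means in practice, assuming a compiler with OpenMP enabled and the `omp_lib` module available: the master thread writes a shared variable, and an explicit BARRIER (which does imply a flush) makes the new value visible before the other threads read it. The module name `flush_demo` and the function `all_threads_see_update` are hypothetical names chosen for this illustration.

```fortran
module flush_demo
  use omp_lib
  implicit none
contains
  ! Returns .true. if every thread of the team read the value
  ! that the master thread wrote before the barrier.
  function all_threads_see_update() result(ok)
    logical :: ok
    integer :: shared_val, nseen, nthreads

    shared_val = 0
    nseen = 0
    nthreads = 1

    !$omp parallel shared(shared_val, nseen, nthreads)
    !$omp master
    nthreads = omp_get_num_threads()
    shared_val = 42
    !$omp end master
    ! END MASTER implies neither a barrier nor a flush,
    ! so synchronize explicitly; BARRIER implies a flush.
    !$omp barrier
    if (shared_val == 42) then
      !$omp atomic
      nseen = nseen + 1
    end if
    !$omp end parallel

    ok = (nseen == nthreads)
  end function all_threads_see_update
end module flush_demo
```

Without the BARRIER, a thread could reach the read before the master has written, or could still see a stale copy of `shared_val`.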

New Features
- Wallclock timers
- WORKSHARE directive
- REDUCTION on array names
- NUM_THREADS clause
- _OPENMP preprocessor macro
- Nested lock routines like in C/C++
- Reprivatization of variables is allowed
- Directives may contain comments
- COPYPRIVATE is a new modifier on END SINGLE
- THREADPRIVATE may be applied to common blocks
- COPYIN on common blocks

New Feature: Wall clock timers
Portable wall clock timers similar to MPI_WTIME
- DOUBLE PRECISION FUNCTION OMP_GET_WTIME()
  - provides elapsed time:
        START = OMP_GET_WTIME()
        ! Work to be measured
        END = OMP_GET_WTIME()
        PRINT *, 'Work took', END-START, 'seconds'
  - provides per-thread time, i.e. it need not be globally consistent
- DOUBLE PRECISION FUNCTION OMP_GET_WTICK()
  - returns the number of seconds between two successive clock ticks
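As a sketch of the timer usage above, the following module times a small parallel loop with OMP_GET_WTIME. It assumes compilation with OpenMP enabled so that the `omp_lib` module is available. The names `wtime_demo` and `timed_sum` are hypothetical and chosen only for this example. Both timer calls are made by the same thread outside the parallel region, so the per-thread nature of the clock does not matter here.

```fortran
module wtime_demo
  use omp_lib
  implicit none
contains
  ! Sums the integers 1..n in parallel and reports the elapsed wall clock time.
  subroutine timed_sum(n, total, elapsed)
    integer, intent(in) :: n
    double precision, intent(out) :: total, elapsed
    double precision :: t_start, t_end
    integer :: i

    total = 0.0d0
    t_start = omp_get_wtime()

    ! Work to be measured
    !$omp parallel do reduction(+:total)
    do i = 1, n
      total = total + dble(i)
    end do
    !$omp end parallel do

    t_end = omp_get_wtime()
    elapsed = t_end - t_start
  end subroutine timed_sum
end module wtime_demo
```

Differences smaller than the value returned by OMP_GET_WTICK cannot be resolved, so very short regions may report an elapsed time of zero.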

New Feature: WORKSHARE directive
The WORKSHARE directive allows parallelization of array expressions and FORALL statements.
Usage:
    !$OMP WORKSHARE
    A = B
    ! Rest of block
    !$OMP END WORKSHARE
Semantics:
- Work inside the block is divided into separate units of work.
- Each unit of work is executed only once.
- The units of work are assigned to threads in any manner.
- The compiler must ensure sequential semantics.
- Similar to PARALLEL DO without explicit loops.

New Feature: Reduction on Arrays
The REDUCTION clause may be applied to arrays.
Example:
    !$OMP PARALLEL DO REDUCTION(MAX:M)
    DO I = 1, 100
       M = MAX(M, N(I))
    END DO
Deferred shape and assumed size arrays are not allowed!
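The two features above can be sketched together in one small module. It assumes a compiler with OpenMP enabled. The names `workshare_reduction_demo`, `scaled_add` and `histogram` are hypothetical and serve only as illustration. The first routine puts an array expression inside WORKSHARE within a parallel region. The second uses an explicit-shape array in a REDUCTION clause, which respects the restriction on deferred shape and assumed size arrays.

```fortran
module workshare_reduction_demo
  implicit none
contains
  ! Array expression parallelized with WORKSHARE; no explicit loop is written.
  subroutine scaled_add(a, b, c)
    double precision, intent(out) :: a(:)
    double precision, intent(in) :: b(:), c(:)

    !$omp parallel
    !$omp workshare
    a = 2.0d0 * b + c
    !$omp end workshare
    !$omp end parallel
  end subroutine scaled_add

  ! Counts how often each value 1..nbins occurs, using a reduction on an array.
  ! counts is an explicit-shape array, so it is allowed in the REDUCTION clause.
  subroutine histogram(values, nbins, counts)
    integer, intent(in) :: values(:)
    integer, intent(in) :: nbins
    integer, intent(out) :: counts(nbins)
    integer :: i, b

    counts = 0
    !$omp parallel do reduction(+:counts) private(b)
    do i = 1, size(values)
      b = values(i)
      if (b >= 1 .and. b <= nbins) counts(b) = counts(b) + 1
    end do
    !$omp end parallel do
  end subroutine histogram
end module workshare_reduction_demo
```

Each thread accumulates into its own private copy of `counts`, and the copies are combined at the end of the loop, so no explicit synchronization is needed inside the loop body.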

New Feature: NUM_THREADS clause
The NUM_THREADS clause on parallel regions defines the number of threads to be used to execute that region.
Example:
    !$OMP PARALLEL NUM_THREADS(scalar integer expression)
    block
    !$OMP END PARALLEL
It can also be used on combined parallel work-sharing constructs.

New Feature: _OPENMP
If an OpenMP-compliant compiler supports a macro preprocessor, it has to define the symbol _OPENMP.
The symbol has the value YYYYMM, where YYYY is the year and MM the month of the version of OpenMP that the implementation supports.
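A minimal sketch of both features follows. It assumes that the source file is run through the preprocessor, which many compilers do for a `.F90` suffix or on request via a flag. The names `version_demo`, `openmp_yyyymm` and `team_size` are hypothetical. The code also compiles without OpenMP: the `#ifdef` guards and the `!$` conditional compilation sentinel then hide the OpenMP-specific lines.

```fortran
module version_demo
#ifdef _OPENMP
  use omp_lib
#endif
  implicit none
contains
  ! Returns the YYYYMM value of _OPENMP, or 0 when compiled without OpenMP.
  function openmp_yyyymm() result(v)
    integer :: v
#ifdef _OPENMP
    v = _OPENMP
#else
    v = 0
#endif
  end function openmp_yyyymm

  ! Returns the team size actually used by a parallel region
  ! that asks for `requested` threads via NUM_THREADS.
  function team_size(requested) result(nthreads)
    integer, intent(in) :: requested
    integer :: nthreads

    nthreads = 1
    !$omp parallel num_threads(requested) shared(nthreads)
    !$omp master
    !$ nthreads = omp_get_num_threads()
    !$omp end master
    !$omp end parallel
  end function team_size
end module version_demo
```

The value returned by `team_size` may be smaller than the request, for example when dynamic thread adjustment is enabled or the implementation runs out of threads.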

Summary
- OpenMP 2.0 contains many clarifications that reflect the experience gained in recent years
- Some improvements/extensions:
  - Interface definitions
  - Wallclock timers
  - WORKSHARE directive
  - REDUCTION on array names
- Some features have direct expressions in C/C++ and can be expected in OpenMP 2.0 for C/C++