Practical in Numerical Astronomy, SS 2012 LECTURE 12


1 Practical in Numerical Astronomy, SS 2012, LECTURE 12. Parallelization II: Open Multiprocessing (OpenMP). Lecturer: Eduard Vorobyov.

2 OpenMP provides shared-memory parallelism. It is designed for SMP (symmetric multiprocessing) machines. Wikipedia: symmetric multiprocessing involves a multi-processor computer hardware architecture where two or more identical processors are connected to a single shared main memory and are controlled by a single OS instance. MPI provides distributed-memory parallelism. It is designed for computer clusters with distributed memory. Wikipedia: distributed memory refers to multiple-processor computer systems in which each processor has its own private memory.

3 Basic idea: the fork-join programming model. [Diagram: a master thread (0) alternates between serial regions and a parallel region in which additional threads 1, 2, 3, 4 are forked and later joined.]
1) The code starts as serial (non-parallel) and has only one master thread.
2) The master thread is forked into N threads when a parallel region is encountered (in this example four additional threads 1, 2, 3, 4 are created). Thread 0 remains the master of all five threads.
3) Each thread executes its part of the code in parallel with the other threads.
4) Upon completion of the parallel region, the threads are joined into one master thread, which continues execution in the serial region.
5) Calculations continue in serial mode until a new parallel region is reached.
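
A minimal, self-contained sketch of the fork-join model (not from the slides; the program and variable names are illustrative):

program fork_join_demo
  use omp_lib                              ! OpenMP runtime library module
  implicit none
  integer :: my_id

  print *, 'Serial region: only the master thread runs here.'

!$omp parallel private(my_id)
  my_id = omp_get_thread_num()             ! each thread gets its own ID
  print *, 'Parallel region: hello from thread', my_id
!$omp end parallel

  print *, 'Serial region again: the threads have been joined.'
end program fork_join_demo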

4 Parallelizing a serial code using OpenMP directives. The OpenMP standard offers the possibility of using the same source code with and without OpenMP parallelization (the MPI standard does not offer this!). This is achieved by hiding the OpenMP directives and commands in such a way that a normal compiler is unable to see them. For that purpose the following directive sentinel is introduced:

!$OMP

Since the first character is an exclamation mark (!), a normal compiler will interpret the line as a comment and will ignore its content. But an OpenMP-compliant compiler will identify the complete sequence and will execute the commands that follow, e.g.:

!$OMP PARALLEL DEFAULT(shared) PRIVATE(C, D) REDUCTION(+:a)
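
A hedged illustration (not from the slides; subroutine and variable names are assumptions): the same source compiles both serially and in parallel, because the directive is an ordinary comment to a compiler that does not process OpenMP.

subroutine scale_array(a, n)
  implicit none
  integer, intent(in) :: n
  real, intent(inout) :: a(n)
  integer :: i

  ! Without the OpenMP flag the next line is a plain Fortran comment;
  ! with the flag it parallelizes the loop over the available threads.
!$OMP PARALLEL DO PRIVATE(i) SHARED(a, n)
  do i = 1, n
     a(i) = 2.0 * a(i)
  end do
!$OMP END PARALLEL DO
end subroutine scale_array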

5 Making the Fortran compiler recognize OpenMP directives. In order for the Fortran compiler to recognize OpenMP directives, one needs to compile the source code with a specific flag, which is compiler-dependent and tells the compiler to link the OpenMP libraries.
GNU Fortran compiler: gfortran -fopenmp
Intel Fortran compiler: ifort -openmp
PGI Fortran compiler: pgf90 -mp
Note that when using OpenMP all local arrays will be allocated on the stack. When porting existing code to OpenMP, this may lead to surprising results, especially segmentation faults if the stack size is limited.

6 Setting the number of threads in a parallel region. The number of threads can be set by environment variables.
In the BASH shell: export OMP_NUM_THREADS=8
In the TCSH shell: setenv OMP_NUM_THREADS 8
Environment variables affect all OpenMP codes that are run from a given terminal.

7 The number of threads can also be set by OpenMP library calls.

subroutine OMPsetup
  integer omp_get_num_threads, omp_get_max_threads, omp_get_num_procs
  call OMP_SET_NUM_THREADS(8)      ! Sets the number of threads to 8
!$OMP parallel                     ! Parallel region starts here
!$OMP master                       ! The following commands are executed only by the master thread
  write(*,*) 'num threads=', omp_get_num_threads()   ! Number of executing threads
  write(*,*) 'max threads=', omp_get_max_threads()   ! Maximum possible number of threads
  write(*,*) 'max cpus=', omp_get_num_procs()        ! Available number of processors
!$OMP end master
!$OMP end parallel                 ! End of parallel region
end subroutine OMPsetup

Note that OMP_SET_NUM_THREADS is called from a serial part of the code. The library call to OMP_SET_NUM_THREADS supersedes the environment variable OMP_NUM_THREADS.

8 The PARALLEL construct. The most important directive in OpenMP is the one in charge of defining the so-called parallel regions. Such a region is a block of code that is going to be executed by multiple threads running in parallel. Since a parallel region needs to be created/opened and destroyed/closed, two directives are necessary, forming a so-called directive pair: !$OMP parallel -- !$OMP end parallel.

  ... serial code ...
!$OMP parallel
  write(*,*) "Hello"               ! parallel code
!$OMP end parallel
  ... serial code ...

Since the code enclosed between the two directives is executed by each thread, the message Hello appears on the screen as many times as there are threads in the parallel region. Before and after the parallel region, the code is executed by only one thread, which is the normal behavior of serial programs.

9 Parallelizing a DO loop. The PRIVATE clause.

Serial DO loop:
  integer k
  do k = 1, 1000
     ...
  end do

Parallel DO loop:
  integer k
  call OMP_SET_NUM_THREADS(2)
!$OMP parallel do private(k)
  do k = 1, 1000
     ...
  end do
!$OMP end parallel do

In the serial version, the master thread (thread 0) does all the work: k = 1, 1000. In the parallel version, each thread computes part of the global DO loop: thread 0 takes k = 1, 500 and thread 1 takes k = 501, 1000.

Note that the same counter variable k has different values in each thread in the parallelized DO loop! To avoid memory conflicts, two copies of the variable k need to be created in memory. The clause PRIVATE(k) tells the compiler that each thread needs its own copy of the variable k. The PRIVATE clause can be very resource consuming: variables should be declared private only if they are modified inside the DO loop. Upon entering and after leaving the parallel DO loop, the variable k is undefined (in the serial DO loop, k has the well-defined value 1001 after leaving the loop).
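
A minimal runnable sketch (not from the slides; names are illustrative) that makes the private counter visible: each thread prints the iterations of its own portion of the iteration space.

program private_demo
  use omp_lib
  implicit none
  integer :: k

  call omp_set_num_threads(2)
!$omp parallel do private(k)
  do k = 1, 10
     print *, 'thread', omp_get_thread_num(), 'computes iteration k =', k
  end do
!$omp end parallel do
end program private_demo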

10 Shared variables. The SHARED clause. In contrast to the previous situation, sometimes there are variables which should be available to all threads inside the DO loop, either because their values are needed by all threads or because all threads have to update their values.

Program example:

program example
  implicit none
  integer :: n
  integer :: i
  real :: b
  real, dimension(10) :: a

  n = 10
  call omp_set_num_threads(4)                  ! Setting the number of threads to 4
!$omp parallel do shared(a,n) private(i,b)     ! Parallel DO loop begins here
  do i = 1, n
     b = i + 1
     a(i) = b
  end do
!$omp end parallel do                          ! Parallel DO loop ends here
end program example

In this example, the array a, the variable b, and the counter i are modified inside the DO loop. However, each iteration of the loop accesses a different element of the array a. Therefore, one does not need to create separate copies of the array a. Such variables are declared as SHARED. Use SHARED when:
- a variable is not modified in the loop (as, e.g., n);
- a variable is an array in which each iteration of the loop accesses a different element.

11 Other DO-loop clauses: FIRSTPRIVATE(list), LASTPRIVATE(list), REDUCTION(operator:list), SCHEDULE(type, chunk), ORDERED, DEFAULT.

The FIRSTPRIVATE clause. Private variables have an undefined value on entering the parallel do construct. But sometimes it is of interest that these local variables have the value of the original variable in the serial part of the code. This is achieved by including the variable in a FIRSTPRIVATE clause as follows:

  integer a, b
  a = 2
  b = 1
!$OMP parallel do private(a) firstprivate(b)
  ...
!$OMP end parallel do

In this example, the variable a has an undefined value at the beginning of the parallel region, while b has the value specified in the preceding serial region, namely b = 1.
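
A hedged, self-contained sketch (not from the slides; names are illustrative) showing the effect of FIRSTPRIVATE: each thread's private copy of b starts from the serial value 1.

program firstprivate_demo
  use omp_lib
  implicit none
  integer :: i, a, b

  a = 2
  b = 1
  call omp_set_num_threads(2)
!$omp parallel do private(i, a) firstprivate(b)
  do i = 1, 4
     a = i + b                   ! b is initialized to 1 in every thread
     print *, 'thread', omp_get_thread_num(), ': i =', i, ' a =', a, ' b =', b
  end do
!$omp end parallel do
end program firstprivate_demo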

12 The LASTPRIVATE clause. Private variables have an undefined value after leaving the parallel do construct. This is sometimes not convenient. By including a variable in a LASTPRIVATE clause, the original variable is updated with the last value its private copy would get if the DO loop were executed in serial mode. For example:

  integer i, a
!$OMP parallel do private(i) lastprivate(a)
  do i = 1, 1000
     a = i
  end do
!$OMP end parallel do

After the parallel DO loop finishes, the variable a is equal to 1000, which is the value it would have if the OpenMP directives did not exist.

13 The REDUCTION clause.

Serial code:
  integer i, a
  a = 0
  do i = 1, 1000
     a = a + i
  end do

Wrong OpenMP parallelization!
!$omp parallel do private(i) shared(a)
  do i = 1, 1000
     a = a + i
  end do
!$omp end parallel do

When a variable has been declared as SHARED because all threads need to modify its value, it is necessary to ensure that only one thread at a time writes/updates the memory location of that variable; otherwise unpredictable results will occur. By using the REDUCTION clause it is possible to solve this problem, since only one thread at a time is allowed to update the value of a, ensuring that the final result is correct.

!$omp parallel do reduction(+:a)
  do i = 1, 1000
     a = a + i
  end do
!$omp end parallel do
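
A minimal runnable sketch (not from the slides; names are illustrative): the reduced sum of 1..1000 must equal 1000*1001/2 = 500500 regardless of the number of threads.

program reduction_demo
  use omp_lib
  implicit none
  integer :: i, a

  a = 0
  call omp_set_num_threads(4)
!$omp parallel do reduction(+:a)
  do i = 1, 1000
     a = a + i
  end do
!$omp end parallel do
  print *, 'sum =', a, ' (expected 500500)'
end program reduction_demo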

14 General syntax of the REDUCTION clause: REDUCTION(operator or intrinsic function : variable list).

Initialization rules for variables in the variable list: a private copy of each variable in the variable list is created for each thread, as if the PRIVATE clause had been used. The resulting private copies are initialized following the rules shown in the table below. At the end of the REDUCTION, the shared variable is updated to reflect the result of combining the final value of each of the private copies using the specified operator.
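
The table itself did not survive the transcription; for reference, the OpenMP standard defines the following initial values for the private copies (Fortran operators and intrinsics):

Operator / intrinsic     Initial value of private copy
+                        0
*                        1
-                        0
.AND.                    .TRUE.
.OR.                     .FALSE.
.EQV.                    .TRUE.
.NEQV.                   .FALSE.
MAX                      smallest representable number of the type
MIN                      largest representable number of the type
IAND                     all bits set
IOR                      0
IEOR                     0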

15 The SCHEDULE clause. Load balancing.

  call omp_set_num_threads(4)
!$omp parallel do private(k) shared(n)
  do k = 1, n
     ...
  end do
!$omp end parallel do

When a DO loop is parallelized and its iterations are distributed over the different threads, the simplest way of doing this is to give each thread the same number of iterations: n/4. But this is not always the best choice, since the computational cost of the iterations may not be the same for all of them. Therefore, different ways of distributing the iterations exist. The SCHEDULE clause allows the programmer to specify the scheduling for each DO loop using the following syntax:

  call omp_set_num_threads(4)
!$omp parallel do private(k) shared(n) schedule(type, chunk)
  do k = 1, n
     ...
  end do
!$omp end parallel do

16 The SCHEDULE clause accepts two parameters. The first one, type, specifies the way in which the work is distributed over the threads. The second one, chunk, is an optional parameter specifying the size of the piece of work given to each thread.

STATIC: when this option is specified, the pieces of work created from the iteration space of the DO loop are distributed over the threads in the team following the order of their thread identification numbers. This assignment of work is done at the beginning of the DO loop and stays fixed during its execution.
[Figure: static scheduling with 3 threads and a DO-loop iteration space k = 1, 600; no value of chunk is specified, so each thread receives one contiguous block of 200 iterations.]
STATIC without a chunk value is the best choice in most cases.

17 When SCHEDULE(DYNAMIC, chunk) is specified, the iteration space is divided into pieces of work with a size equal to chunk. If this optional parameter is not given, a size equal to one iteration is assumed. Each thread then gets one of these pieces of work; when a thread has finished its piece, it is assigned a new one, until no pieces of work are left.
[Figure: example of dynamic scheduling.]
See also the GUIDED and RUNTIME schedule types.
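
A hedged sketch (not from the slides; names are illustrative): with schedule(dynamic, 10) the threads grab chunks of 10 iterations as they become free, which helps when the cost per iteration varies strongly.

program dynamic_demo
  use omp_lib
  implicit none
  integer :: k
  real :: s

  s = 0.0
  call omp_set_num_threads(4)
!$omp parallel do private(k) schedule(dynamic, 10) reduction(+:s)
  do k = 1, 600
     s = s + sqrt(real(k))       ! stand-in for work of varying cost
  end do
!$omp end parallel do
  print *, 'result =', s
end program dynamic_demo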

18 The ORDERED clause. Eliminating the race condition.

program race_condition
  integer i
  integer, dimension(5) :: a, b
  a = 1
  b = 2
  call omp_set_num_threads(2)
!$omp parallel do private(i) shared(a,b)
  do i = 1, 4
     a(i+1) = a(i) + b(i)
  end do
!$omp end parallel do
end

Here thread 0 computes a(2) = a(1) + b(1) and a(3) = a(2) + b(2), while thread 1 computes a(4) = a(3) + b(3) and a(5) = a(4) + b(4). Thread 1 may read a(3) before thread 0 has updated it: we have a data dependency between iterations, causing a so-called race condition. This is the PROBLEM. A solution is to use the ORDERED clause, which tells the compiler that some statements in the DO loop need to be executed sequentially.

19
program no_race_condition
  integer i
  integer, dimension(5) :: a, b
  a = 1
  b = 2
  call omp_set_num_threads(2)
!$omp parallel do private(i) shared(a,b) ordered
  do i = 1, 4
!$omp ordered
     a(i+1) = a(i) + b(i)
!$omp end ordered
  end do
!$omp end parallel do
end

In this case, the threads do not run in parallel.

The DEFAULT(PRIVATE | SHARED | NONE) clause. When most of the variables used inside the DO loop are going to be private/shared, it would be cumbersome to include all of them in one of the previous clauses. To avoid this, it is possible to specify what OpenMP has to do when nothing is said about a specific variable, i.e. to specify a default setting. For example:

!$omp parallel do default(private) shared(a)

20 Parallelization of implicit DO loops. The WORKSHARE construct. FORTRAN 90 array operations contain implicit DO loops and can be parallelized with the WORKSHARE construct.

Serial code:
  real, dimension(10) :: a, b, c
  ...
  a = 5.0 * cos(a) * sin(a)
  ...

Parallelized code:
  real, dimension(10) :: a, b, c
  ...
!$omp parallel workshare
  a = 5.0 * cos(a) * sin(a)
!$omp end parallel workshare
  ...

Not all compilers support parallelization of FORTRAN 90 array operations!

21 Parallelization of nested DO loops. When several nested DO loops are present, it is always convenient to parallelize the outermost one, since then the amount of work distributed over the different threads is maximal. Also, the number of times the !$OMP parallel do -- !$OMP end parallel do directive pair effectively acts is minimal, which implies a minimal overhead due to the OpenMP directives.

Innermost loop parallelized:
  do i = 1, 10
     do j = 1, 10
!$OMP parallel do private(k) shared(a,j,i)
        do k = 1, 10
           A(k,j,i) = i * j * k
        end do
!$OMP end parallel do
     end do
  end do

Outermost loop parallelized:
!$OMP parallel do private(i,j,k) shared(a)
  do i = 1, 10
     do j = 1, 10
        do k = 1, 10
           A(k,j,i) = i * j * k
        end do
     end do
  end do
!$OMP end parallel do

In the first case, the work to be computed in parallel is distributed i*j = 100 times and each thread gets fewer than 10 iterations to compute, since only the innermost DO loop is parallelized. In the second case, the work is distributed only once and the work given to each thread consists of at least j*k = 100 iterations. Therefore, better performance of the parallelization is to be expected in the second case.

22 The SECTIONS construct. The SECTIONS construct allows one to assign to each thread a completely different task, leading to an MPMD(1) model of execution. Each section of code is executed once and only once by a thread in the team. The syntax of this construct is the following:

!$omp parallel sections clause1 clause2 ...
!$omp section
  ... code executed by one thread
!$omp section
  ... code executed by another thread
!$omp end parallel sections

Each block of code, to be executed by one of the threads, starts with an !$OMP SECTION directive and extends until the same directive is found again or until the closing directive of the construct is found. Any number of sections can be defined inside the present directive pair, but only the existing number of threads is used to distribute the different blocks of code. This means that if the number of sections is larger than the number of available threads, some threads will execute more than one section of code in a serial fashion.
Allowed clauses: PRIVATE, FIRSTPRIVATE, LASTPRIVATE, REDUCTION.
(1) MPMD stands for Multiple Programs Multiple Data and refers to the case of having completely different programs/tasks which share or interchange information and which are running simultaneously on different processors.
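
A hedged, self-contained sketch (not from the slides; names are illustrative): two independent tasks are executed concurrently, each by one thread of the team.

program sections_demo
  use omp_lib
  implicit none
  integer :: i
  real, dimension(100) :: x, y

  call omp_set_num_threads(2)
!$omp parallel sections private(i) shared(x, y)
!$omp section
  do i = 1, 100                   ! task 1: executed by one thread
     x(i) = sin(real(i))
  end do
!$omp section
  do i = 1, 100                   ! task 2: executed by another thread
     y(i) = cos(real(i))
  end do
!$omp end parallel sections
  print *, 'x(1) =', x(1), ' y(1) =', y(1)
end program sections_demo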

23 Calling serial subroutines inside a parallel region. The SINGLE construct.

  integer, dimension(0:3) :: a = 99
  integer :: i_am
  call omp_set_num_threads(4)
!$omp parallel private(i_am) shared(a)
  i_am = omp_get_thread_num()
  call work(a, i_am)
!$omp single
  print*, 'a = ', a
!$omp end single
!$omp end parallel

subroutine work(a, i_am)
  integer, dimension(0:3) :: a    ! becomes shared
  integer :: i_am                 ! becomes private
  print*, 'work', i_am
  a(i_am) = i_am
end subroutine work

Dummy arguments inherit the data-sharing attributes of the associated actual arguments. The code enclosed in the SINGLE construct is executed by only one of the threads in the team, namely the one that first arrives at the opening directive !$OMP SINGLE. All the remaining threads wait at the implied synchronization at the closing directive !$OMP END SINGLE.

Result of execution:
work 1
work 3
a = 99, 1, 99, 3
work 2
work 0

What went wrong? The SINGLE construct was executed by one of the threads (1 or 3) before threads 2 and 0 had completed execution of subroutine work.

24
  integer, dimension(0:3) :: a = 99
  integer :: i_am
  call omp_set_num_threads(4)
!$omp parallel private(i_am) shared(a)
  i_am = omp_get_thread_num()
  call work(a, i_am)
!$omp barrier                     ! All threads wait at the barrier
!$omp single
  print*, 'a = ', a
!$omp end single
!$omp end parallel

subroutine work(a, i_am)
  integer, dimension(0:3) :: a    ! becomes shared
  integer :: i_am                 ! becomes private
  print*, 'work', i_am
  a(i_am) = i_am
end subroutine work

Result of execution:
work 1
work 3
work 2
work 0
a = 0, 1, 2, 3

The BARRIER directive represents an explicit synchronization between the different threads in the team. When it is encountered, each thread waits until all the other threads have reached this point.

25 Calling parallel subroutines inside a parallel region.

  call omp_set_num_threads(2)
!$omp parallel shared(s) private(p)
!$omp do private(j)
  do j = 1, ...
     ...
  end do
!$omp end do
  call sub(s, p)
!$omp end parallel
  ...
end

subroutine sub(s, p)
  integer :: s                    ! shared
  integer :: p                    ! private
  integer :: var, k               ! local variables are private
!$omp do private(k)
  do k = 1, 10
     ...                          ! Thread 0 will do the first 5 iterations
     ...                          ! Thread 1 will do the last 5 iterations
  end do
!$omp end do
  do k = 1, 10
     ...                          ! All threads will do the full 10 iterations
  end do
!$omp parallel do private(k)
  do k = 1, 10
     ...                          ! A PARALLEL directive inside another PARALLEL directive
  end do
!$omp end parallel do
end

A PARALLEL directive encountered dynamically inside another PARALLEL directive logically establishes a new team, which is composed of only the current thread, unless nested parallelism is enabled. We say that the loop is serialized: each thread executes the full set of iterations of the nested parallel loop by itself.
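
A hedged sketch (not from the slides; names are illustrative) of an orphaned work-sharing directive: the !$omp do inside the subroutine distributes its iterations over the team of the parallel region from which the subroutine is called.

program orphaned_demo
  use omp_lib
  implicit none

  call omp_set_num_threads(2)
!$omp parallel
  call sub()
!$omp end parallel

contains

  subroutine sub()
    integer :: k
!$omp do private(k)
    do k = 1, 10
       print *, 'thread', omp_get_thread_num(), 'handles k =', k
    end do
!$omp end do
  end subroutine sub

end program orphaned_demo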

26 The MASTER and CRITICAL constructs. The code enclosed inside the MASTER construct is executed only by the master thread of the team. Meanwhile, all the other threads continue with their work. The syntax is as follows:

!$omp master
  ...
!$omp end master

In essence, this construct is similar to the !$omp single -- !$OMP end single construct presented before, except that the thread that executes the block of code is forced to be the master one instead of the first one to arrive.

The CRITICAL construct restricts the access to the enclosed code to only one thread at a time. Examples of application of this directive pair are reading input from the keyboard or a file, or updating the value of a shared variable. The syntax is the following:

!$omp critical
  ...
!$omp end critical

When a thread reaches the beginning of a critical section, it waits there until no other thread is executing the code in the critical section.
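
A hedged sketch (not from the slides; names are illustrative): CRITICAL serializes the update of the shared variable nfound so that increments are not lost.

program critical_demo
  use omp_lib
  implicit none
  integer :: i, nfound
  real, dimension(1000) :: x

  call random_number(x)
  nfound = 0
  call omp_set_num_threads(4)
!$omp parallel do private(i) shared(x, nfound)
  do i = 1, 1000
     if (x(i) > 0.9) then
!$omp critical
        nfound = nfound + 1       ! only one thread at a time updates nfound
!$omp end critical
     end if
  end do
!$omp end parallel do
  print *, 'values above 0.9:', nfound
end program critical_demo

For a simple counter like this, a REDUCTION(+:nfound) clause would be cheaper; CRITICAL is the general tool when the protected update cannot be expressed as a reduction.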

27 The THREADPRIVATE construct. Sometimes it is of interest to have global variables, but with values which are specific to each thread. An example could be a variable called my_id, which stores the thread identification number of each thread: this number is different for each thread, but it is useful that its value is accessible from everywhere inside each thread and that it does not change from one parallel region to the next. When the program enters the first parallel region, a private copy of each variable marked as THREADPRIVATE is created for each thread.

  integer, save :: my_id          ! Variable must have the SAVE attribute
!$omp threadprivate(my_id)

!$omp parallel
  my_id = OMP_get_thread_num()    ! Thread number is assigned to my_id
!$omp end parallel
  ...
!$omp parallel
  ...
!$omp end parallel

In this example, the variable my_id gets assigned the thread identification number of each thread during the first parallel region. In the second parallel region, the variable my_id keeps the value assigned to it in the first parallel region, since it is THREADPRIVATE.

28 OpenMP runtime library overview. OpenMP Fortran library routines are external functions. Their names start with OMP_ and they usually have an integer or logical return type. These functions must be declared explicitly.

Name                      Functionality
omp_set_num_threads       Set the number of threads
omp_get_num_threads       Return the number of threads in the team
omp_get_max_threads       Return the maximum number of threads
omp_get_thread_num        Get the thread ID
omp_get_num_procs         Return the number of available processors
omp_in_parallel           Check whether in a parallel region
omp_set_dynamic           Activate dynamic thread adjustment
omp_get_dynamic           Check for dynamic thread adjustment
omp_set_nested            Activate nested parallelism
omp_get_nested            Check for nested parallelism

29 References: OpenMP Application Program Interface, Version 3.0, May 2008; also various web resources and books.

30 Assignment 9 (five extra points). Parallelize your version of the Sedov test problem (or the Sod shock tube problem) using OpenMP directives (see Nigel's lecture on hyperbolic equations, Assignment 6). Use a sufficiently high resolution so that the serial code runs for at least 1 minute. Use different numbers of threads (2, 4, max). Calculate the speedup for the variable number of threads (2, 4, max) relative to the purely serial code. (Use time ./your_code in Linux to measure the run time of your code.) The report is due on
