Parallel Numerical Algorithms

1 Parallel Numerical Algorithms http://sudalab.is.s.u-tokyo.ac.jp/~reiji/pna16/ [ 8 ] OpenMP

2 PNA16 Lecture Plan
General Topics: 1. Architecture and Performance / 2. Dependency / 3. Locality / 4. Scheduling
MIMD / Distributed Memory: 5. MPI: Message Passing Interface / 6. Collective Communication / 7. Distributed Data Structure
MIMD / Shared Memory: 8. OpenMP / 9. Cache Performance
SIMD / Shared Memory: 10. GPU and CUDA / 11. SIMD Performance
Special Lectures: 5/30 How to use FX10 (Prof. Ohshima), 6/6 Dynamic Parallelism (Prof. Peri)

3 Memory models
Distributed memory: processors connected by a network, each with its own memory.
Shared memory, Uniform Memory Access (UMA): all processors share one memory.
Shared memory, Non-Uniform Memory Access (NUMA): memory is split into pieces local to each processor, but every piece is accessible by all processors.
[diagrams of the three memory organizations]

4 Parallel Computer Nowadays
System = network of Nodes: distributed memory, MIMD.
Node = Cores + Memory: shared memory, MIMD.
Core = PUs + Registers: shared memory, SIMD.
Terminology:
Processor: any computing part (PU, Core, or Node)
Computer: may be equivalent to system
Socket: set of cores on the same die / module
CPU: can be a socket or a core
Sequential or Serial: antonym of Parallel
[diagram of the system / node / core hierarchy]

5 OpenMP
A frequently used API for shared-memory parallel computing in high performance computing. FX10 supports OpenMP version 3.0.
Shared memory, global view: you describe the whole data structure and the whole computation.
It is not automatic parallelization! It parallelizes only where you explicitly parallelize.
It does not guarantee correctness! It runs just as your code is written (not as you intended).

6 OpenMP Summary: available on the OpenMP web site.

7 A tiny code with OpenMP

    #include <stdio.h>
    #include <omp.h>

    int main(void) {
        omp_set_num_threads(8);
        #pragma omp parallel
        {
            printf("I am %d out of %d threads\n",
                   omp_get_thread_num(), omp_get_num_threads());
        }
        return 0;
    }
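
To run this example, the compiler needs its OpenMP flag; a typical invocation, assuming GCC and that the code is saved as tiny.c (the file name is arbitrary):

    gcc -fopenmp tiny.c -o tiny
    ./tiny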

8 A tiny code with OpenMP (same code) — annotation on #include <omp.h>: include this header file.

9 A tiny code with OpenMP (same code) — annotation on omp_set_num_threads(8): the number of threads is set.

10 A tiny code with OpenMP (same code) — annotation on #pragma omp parallel: run the next code in parallel (duplicated).

11 A tiny code with OpenMP (same code) — annotation on omp_get_thread_num(): my thread ID.

12 A tiny code with OpenMP (same code) — annotation on omp_get_num_threads(): the number of threads (must be 8).

13 A tiny code with OpenMP (same code) — sample output:

    I am 1 out of 8 threads
    I am 7 out of 8 threads
    I am 0 out of 8 threads
    I am 2 out of 8 threads
    I am 3 out of 8 threads
    I am 4 out of 8 threads
    I am 5 out of 8 threads
    I am 6 out of 8 threads

14 Another tiny code

    #include <stdio.h>
    #include <omp.h>

    int main(void) {
        omp_set_num_threads(8);
        int i;
        #pragma omp parallel for
        for (i = 0; i < 10; i++)
            printf("I am %d executed by %d\n", i, omp_get_thread_num());
        return 0;
    }

15 Another tiny code (same code) — annotation on #pragma omp parallel for: parallel for-loop.

16 Another tiny code (same code) — sample output:

    I am 2 executed by 1
    I am 3 executed by 1
    I am 0 executed by 0
    I am 1 executed by 0
    I am 4 executed by 2
    I am 5 executed by 2
    I am 8 executed by 4
    I am 9 executed by 4
    I am 6 executed by 3
    I am 7 executed by 3

17 Disclosing the trick

    #include <stdio.h>
    #include <omp.h>

    int main(void) {
        omp_set_num_threads(8);
        int i;
        #pragma omp parallel          /* do the following in parallel (duplicated) */
        {
            printf("I am thread %d\n", omp_get_thread_num());
            #pragma omp for           /* assign one thread per iteration */
            for (i = 0; i < 10; i++)
                printf("I am %d executed by %d\n", i, omp_get_thread_num());
        }
        return 0;
    }

18 Disclosing the trick (same code) — sample output:

    I am thread 0
    I am 0 executed by 0
    I am 1 executed by 0
    I am thread 1
    I am 2 executed by 1
    I am 3 executed by 1
    I am thread 2
    I am 4 executed by 2
    I am 5 executed by 2
    I am thread 5
    I am thread 6
    I am thread 7
    I am thread 4
    I am 8 executed by 4
    I am 9 executed by 4
    I am thread 3
    I am 6 executed by 3
    I am 7 executed by 3

19 Disclosing the trick (same code and output) — annotations: the printf directly inside the parallel region is executed by all threads; the for-loop under #pragma omp for is shared out, one thread per iteration.

20 Another tiny code (again) — annotation on #pragma omp parallel for: this is actually a combination of parallel and for.

21 Start parallel computations
#pragma omp parallel: execute the following computation in parallel. The following computation can be a statement or a block. A team of threads is created.

    A;
    #pragma omp parallel
    B;
    C;

A executes once, B is executed by every thread of the team, and C executes once after all threads finish B.

22 Setting number of threads
There are three ways, from weakest to strongest:
1. Environment variable OMP_NUM_THREADS (weak)
2. Function: void omp_set_num_threads(int)
3. Clause: #pragma omp parallel num_threads(8) (strong)
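
A small sketch of this precedence (my own example, not from the slides): the num_threads clause on a specific parallel region overrides the preceding omp_set_num_threads() call, which in turn overrides OMP_NUM_THREADS.

    #include <stdio.h>
    #include <omp.h>

    int main(void) {
        omp_set_num_threads(4);             /* overrides OMP_NUM_THREADS */
        #pragma omp parallel num_threads(2) /* clause wins: team of 2 */
        {
            #pragma omp single
            printf("team size: %d\n", omp_get_num_threads()); /* prints 2 */
        }
        #pragma omp parallel                /* no clause: team of 4 */
        {
            #pragma omp single
            printf("team size: %d\n", omp_get_num_threads()); /* prints 4 */
        }
        return 0;
    }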

23 Work-sharing
#pragma omp for: assign one thread per iteration of the following for-loop. The for-loop must be in a simple form like for (i = 0; i < n; i++).
#pragma omp single: only one of the threads executes the following computation.
[diagram: all threads execute B, a single thread executes C while the others wait, then all threads execute D]
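
A minimal sketch of my own (not from the slides) combining both work-sharing constructs inside one parallel region; note the implicit barrier at the end of omp for:

    #include <stdio.h>
    #include <omp.h>

    int main(void) {
        int i, a[100];
        #pragma omp parallel
        {
            #pragma omp for           /* iterations divided among the threads */
            for (i = 0; i < 100; i++)
                a[i] = i * i;
            /* implicit barrier at the end of omp for */
            #pragma omp single        /* exactly one thread runs this */
            printf("a[99] = %d\n", a[99]);
        }
        return 0;
    }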

24 Some functions
void omp_set_num_threads(int); — sets the number of threads (for the next parallel execution)
int omp_get_num_threads(void); — returns the number of threads (of the current parallel execution)
int omp_get_thread_num(void); — returns my thread ID
double omp_get_wtime(void); — returns wall-clock time (in seconds)

25 Synchronization
#pragma omp barrier: wait until all the threads reach the barrier.
Timing a code:

    #pragma omp barrier
    t0 = omp_get_wtime();
    do_computations();
    #pragma omp barrier
    time = omp_get_wtime() - t0;
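
A complete, compilable version of this timing pattern, as a sketch of my own (the loop body is arbitrary); here the whole parallel region is timed from the sequential part, so no explicit barriers are needed:

    #include <stdio.h>
    #include <omp.h>

    #define N 10000000
    double a[N];

    int main(void) {
        double t0, t1;
        int i;
        t0 = omp_get_wtime();        /* wall-clock time before */
        #pragma omp parallel for
        for (i = 0; i < N; i++)
            a[i] = 2.0 * i;
        t1 = omp_get_wtime();        /* wall-clock time after the implicit join */
        printf("elapsed: %f seconds\n", t1 - t0);
        return 0;
    }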

26 BREAK

27 Three Pitfalls
1. Shared and Private Variables
2. Race Condition
3. Weak Consistency

28 Disclosing the trick (again) — the same code and output as slide 18, repeated to set up the next question.

29 What happens if?
Comment out the work-sharing directive:

    //#pragma omp for
    for (i = 0; i < 10; i++)
        printf("I am %d executed by %d\n", i, omp_get_thread_num());

Do all threads now loop for 10 iterations each? No — the behavior is completely different!

30 What happens if? (same code) — annotation: i is a shared variable; it was declared before the parallel region, so all threads update the same i.

31 Thread-private variable

    #include <stdio.h>
    #include <omp.h>

    int main(void) {
        omp_set_num_threads(8);
        #pragma omp parallel
        {
            int i;    /* private variable: each thread has its own i */
            printf("I am thread %d\n", omp_get_thread_num());
            //#pragma omp for
            for (i = 0; i < 10; i++)
                printf("I am %d executed by %d\n", i, omp_get_thread_num());
        }
        return 0;
    }

32 Shared and private variables
Shared variable: the storage is accessible from all threads; you must be careful when updating it.
Private variable: different storage is allocated for each thread; it is allocated when the thread starts and destroyed when the thread stops.

33 Shared or private
Shared by default: global variables; static variables; variables declared before omp parallel.
Private by default: variables declared within omp parallel; the loop induction variable of omp for.

    int func(int k, int *m) {   /* k and m are private, but *m is n and thus shared */
        int x;                  /* private */
        static int c = 0;       /* shared */
        ...
    }

    int q = 1024;               /* global: shared */

    int main(void) {
        int n = 32;             /* declared before omp parallel: shared */
        #pragma omp parallel
        {
            int z = func(q, &n);   /* z: private */
        }
    }

34 Clauses for the parallel construct
#pragma omp parallel [clause[[,] clause] ...]
private(variable, ...) — declares the listed variables as private
shared(variable, ...) — declares the listed variables as shared
firstprivate(variable, ...) — declares them as private and initializes each with the value just before omp parallel
...and more clauses.
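
A short sketch of my own showing private and firstprivate side by side (the variable names are arbitrary):

    #include <stdio.h>
    #include <omp.h>

    int main(void) {
        int offset = 100;   /* value captured by firstprivate */
        int id = -1;        /* private: each thread gets its own, uninitialized copy */
        #pragma omp parallel num_threads(4) private(id) firstprivate(offset)
        {
            id = omp_get_thread_num();   /* private copy: no race */
            offset += id;                /* starts from 100 in every thread */
            printf("thread %d: offset = %d\n", id, offset);
        }
        printf("after the region: offset = %d\n", offset);  /* still 100 */
        return 0;
    }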

35 My recommendation
Extract the parallel part as a function, and depend on the default shared/private settings:

    void do_comp(arg0, arg1) { ... }

    #pragma omp parallel
    do_comp(arg0, arg1);

Necessary and sufficient information is passed as arguments, which reduces accidental side effects: assignments to the arguments do not affect the caller's variables, and side effects (updates of shared variables) are possible only via pointers, global variables, etc.
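
A concrete instance of this style, as a sketch of my own (do_axpy and the array sizes are invented for illustration): the work-sharing directive is orphaned inside the function, and everything the threads share arrives explicitly through the argument list.

    #include <stdio.h>
    #include <omp.h>

    /* parallel part extracted as a function; locals are private per thread */
    void do_axpy(int n, double alpha, const double *x, double *y) {
        int i;                    /* on each thread's stack: private */
        #pragma omp for           /* orphaned work-sharing: splits the loop */
        for (i = 0; i < n; i++)
            y[i] += alpha * x[i]; /* side effect only through the pointer y */
    }

    int main(void) {
        double x[8], y[8];
        int i;
        for (i = 0; i < 8; i++) { x[i] = i; y[i] = 1.0; }
        #pragma omp parallel
        do_axpy(8, 2.0, x, y);
        printf("y[7] = %f\n", y[7]);   /* 1.0 + 2.0*7 = 15.0 */
        return 0;
    }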

36 Three Pitfalls
1. Shared and Private Variables
2. Race Condition
3. Weak Consistency

37 Race condition
Count up solutions for each type:

    int counter;
    #pragma omp parallel
    {
        ...
        if (found) {
            type = get_type();
            counter++;      /* all threads update the shared counter */
        }
    }

Race condition: multiple threads access the same variable concurrently.
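
A minimal program of my own that makes the lost updates visible; counter is declared before the parallel region and is therefore shared:

    #include <stdio.h>
    #include <omp.h>

    int main(void) {
        int counter = 0;
        int i;
        #pragma omp parallel for
        for (i = 0; i < 1000000; i++)
            counter++;    /* unprotected read-modify-write on a shared variable */
        /* expected 1000000; with several threads it typically prints less */
        printf("counter = %d\n", counter);
        return 0;
    }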

38 Reduction clause
reduction(operation: variable) generates code for the reduction operation:

    int counter;
    #pragma omp parallel reduction(+: counter)
    {
        ...
        if (found) {
            type = get_type();
            counter++;    /* each thread updates a private copy; copies are combined at the end */
        }
    }

Applicable only to scalar variables; there is no vector reduction.
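
The counting sketch from the race-condition slide above, repaired with a reduction clause on a combined parallel for (my own variant):

    #include <stdio.h>
    #include <omp.h>

    int main(void) {
        int counter = 0;
        int i;
        #pragma omp parallel for reduction(+: counter)
        for (i = 0; i < 1000000; i++)
            counter++;    /* each thread counts in a private copy */
        printf("counter = %d\n", counter);   /* reliably 1000000 */
        return 0;
    }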

39 Vector updates
Count up solutions for each type:

    int counter[n];
    #pragma omp parallel
    {
        ...
        if (found) {
            type = get_type();
            counter[type]++;    /* concurrent updates of the same element can race */
        }
    }

40 Atomic Operation
#pragma omp atomic: execute the next operation as an inseparable single operation.
Allowed operations: x binop= expr; x++; ++x; x--; --x;

    int counter[n];
    #pragma omp parallel
    {
        ...
        if (found) {
            type = get_type();
            #pragma omp atomic
            counter[type]++;
        }
    }
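
A self-contained variant of my own (get_type() is replaced by a simple modulo to keep it runnable):

    #include <stdio.h>
    #include <omp.h>

    #define NTYPES 4

    int main(void) {
        int counter[NTYPES] = {0};
        int i;
        #pragma omp parallel for
        for (i = 0; i < 1000000; i++) {
            int type = i % NTYPES;    /* stand-in for get_type() */
            #pragma omp atomic
            counter[type]++;          /* inseparable update of one element */
        }
        for (i = 0; i < NTYPES; i++)
            printf("counter[%d] = %d\n", i, counter[i]);
        return 0;
    }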

41 Three Pitfalls
1. Shared and Private Variables
2. Race Condition
3. Weak Consistency

42 Producer-Consumer Signal
This is not provided by OpenMP. Don't do the following!

    int data, flag = 0;
    #pragma omp parallel num_threads(2)
    {
        if (producer) {
            data = generate_data();
            flag = 1;
        } else {  /* consumer */
            while (flag == 0);   /* wait until flag is set */
            consume_data(data);
        }
    }

43 Freedom of execution order
The compiler can reorder operations, as long as the reordering does not change the meaning of a sequential execution. The compiler can keep data in registers, without writing it to main memory, for as long as it wants.
The hardware can likewise reorder operations, as long as the reordering does not change the meaning of a sequential execution, and can keep data in cache, without writing it to main memory, for as long as it wants.
In short, the program does not run as it is written!

44 Weak consistency
Consistency: a set of restrictions on the execution of concurrent programs, so that the concurrent execution behaves similarly to a sequential one. But every attempt at enforcing such consistency strictly resulted in severe performance degradation, and yet we need some control over the execution order.
Weak consistency: the order of operations is guaranteed only at special commands.

45 Memory synchronization
#pragma omp flush: every memory read and write issued before the flush is made complete; no memory read or write after the flush has started yet. Rarely used by itself.
Flushes are inserted automatically at barrier, atomic, and lock operations, and at entry to and exit from parallel, critical, and ordered.

46 The solution

    int data;
    #pragma omp parallel
    {
        if (producer) {
            data = produce_data();
            #pragma omp barrier
        } else {  /* consumer */
            #pragma omp barrier
            consume_data(data);
        }
    }

A flush alone is not enough: the flush of the producer must happen earlier than the flush of the consumer, and the barrier enforces that ordering.
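
A runnable sketch of my own with literal data: placing one barrier on the common path keeps every thread of the team at the same barrier, and the implicit flush there makes the producer's write visible to the consumer.

    #include <stdio.h>
    #include <omp.h>

    int main(void) {
        int data = 0;
        #pragma omp parallel num_threads(2)
        {
            if (omp_get_thread_num() == 0)       /* producer */
                data = 42;
            #pragma omp barrier                  /* implicit flush: the write completes here */
            if (omp_get_thread_num() == 1)       /* consumer */
                printf("consumed: %d\n", data);  /* prints 42 */
        }
        return 0;
    }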

47 Barrier should be inserted
Before writing data: to wait until the threads that need the old data have actually read it.
After writing data: to make the threads that will read the new data wait until the write is complete.
Before reading data: to wait until the thread that produces the new data has actually produced it.
After reading data: to keep the other threads from updating the data too early.

48 Self check questions
Explain private and shared variables. Which variables are private/shared by default? What is Suda's recommended style?
What is a race condition? Show a few methods to resolve race conditions.
What is weak consistency? What does flush do? Where are implicit flushes inserted, and where not?
Explain why a barrier is needed before and after reading and writing shared data.

49 PNA16 Lecture Plan (same as slide 2).
