OpenMP at Twenty Past, Present, and Future. Michael Klemm Chief Executive Officer OpenMP Architecture Review Board

1 OpenMP at Twenty Past, Present, and Future Michael Klemm Chief Executive Officer OpenMP Architecture Review Board

2 Outline
  20 Years OpenMP Redux
  The Past
  The Present
  The Future

3 Definitions
  The Past: OpenMP < 3.0
  The Present: OpenMP 3.0 up to, but not including, OpenMP 5.0
  The Future: OpenMP 5.0

4 OpenMP Roadmap
OpenMP has a well-defined roadmap:
  5-year cadence for major releases
  One minor release in between
  (At least) one Technical Report (TR) with feature previews in every year
Timeline:
  Nov 17: TR6
  Nov 18: OpenMP 5.0
  Nov 19: TR7*
  Nov 20: OpenMP 5.x
  Nov 21: TR8*
  Nov 22: TR9*
  Nov 23: OpenMP 6.0
* Numbers assigned to TRs may change if additional TRs are released.

5 Levels of Parallelism in OpenMP 4.5
  Cluster: Group of computers communicating through fast interconnect
  Coprocessors/Accelerators: Special compute devices attached to the local node through special interconnect
  Node: Group of processors communicating through shared memory
  Socket: Group of cores communicating through shared cache
  Core: Group of functional units communicating through registers
  Hyper-Threads: Group of thread contexts sharing functional units
  Superscalar: Group of instructions sharing functional units
  Pipeline: Sequence of instructions sharing functional units
  Vector: Single instruction using multiple functional units

6 The Past (or: Stuff you shouldn't be doing no more!)

7 OpenMP Worksharing

    #pragma omp parallel
    {
        #pragma omp for
        for (i = 0; i < N; i++) { ... }

        #pragma omp for
        for (i = 0; i < N; i++) { ... }
    }

Diagram: fork, distribute work, barrier, distribute work, barrier, join

8 OpenMP Worksharing/2

    double a[N];
    double l, s = 0;
    #pragma omp parallel for reduction(+:s) \
                private(l) schedule(static,4)
    for (i = 0; i < N; i++) {
        l = log(a[i]);
        s += l;
    }

Diagram: each thread starts with a private copy s' = 0; the work is distributed; each thread accumulates s' += l; at the barrier the private copies are combined into s (s = s + s').

9 The Present (or: Modern OpenMP) Single Instruction Multiple Data

10 In a Time before OpenMP 4.0
Programmers had to rely on auto-vectorization or use vendor-specific extensions:
  Programming models (e.g., Intel Cilk Plus)
  Compiler pragmas (e.g., #pragma vector)
  Low-level constructs (e.g., _mm_add_pd())

    #pragma omp parallel for
    #pragma vector always
    #pragma ivdep
    for (int i = 0; i < N; i++) {
        a[i] = b[i] + ...;
    }

You need to trust your compiler to do the right thing.

11 Example: SIMD Version of Scalar Product

    float sprod(float *a, float *b, int n) {
        float sum = 0.0f;
        #pragma omp for simd reduction(+:sum)
        for (int k = 0; k < n; k++)
            sum += a[k] * b[k];
        return sum;
    }

Diagram: the "for" part parallelizes the loop across Thread 0, Thread 1, and Thread 2; the "simd" part vectorizes each thread's chunk.

12 SIMD Function Vectorization

    #pragma omp declare simd
    float min(float a, float b) {
        return a < b ? a : b;
    }

    #pragma omp declare simd
    float distsq(float x, float y) {
        return (x - y) * (x - y);
    }

    void example() {
        #pragma omp parallel for simd
        for (i = 0; i < N; i++) {
            d[i] = min(distsq(a[i], b[i]), c[i]);
        }
    }

Conceptual vector versions created for the declare simd functions:

    vec8 min_v(vec8 a, vec8 b) {
        return a < b ? a : b;
    }

    vec8 distsq_v(vec8 x, vec8 y) {
        return (x - y) * (x - y);
    }

    vd = min_v(distsq_v(va, vb), vc)

13 The Present (or: Modern OpenMP) Task-based Programming

14 Ragged Fork/Join
Traditional worksharing can lead to ragged fork/join patterns

    void example() {
        compute_in_parallel(a);
        compute_in_parallel_too(b);
        cblas_dgemm(..., A, B, ...);
    }

15 Traditional Worksharing
Worksharing constructs do not compose well
Pathological example: parallel dgemm in MKL

    void example() {
        #pragma omp parallel
        {
            compute_in_parallel(a);
            compute_in_parallel_too(b);
            // dgemm is either parallel or sequential,
            // but has no orphaned worksharing
            cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
                        m, n, k, alpha, A, k, B, n, beta, C, n);
        }
    }

Writing such code either oversubscribes the system, yields bad performance due to OpenMP overheads, or needs a lot of glue code to use sequential dgemm only for sub-matrices.

16 (Modern) Task-based Execution Model
Supports unstructured parallelism:
  unbounded loops: while ( <expr> ) { ... }
  recursive functions: void myfunc( <args> ) { ...; myfunc( <newargs> ); ...; }
Several scenarios are possible: single creator, multiple creators, nested tasks (tasks & worksharing)
All threads in the team are candidates to execute tasks
Example:

    #pragma omp parallel
    #pragma omp master
    while (elem != NULL) {
        #pragma omp task
        compute(elem);
        elem = elem->next;
    }

Diagram: Parallel Team, Task pool

17 Task Synchronization w/ Dependencies

OpenMP 3.1:

    int x = 0;
    #pragma omp parallel
    #pragma omp single
    {
        #pragma omp task
        std::cout << x << std::endl;
        #pragma omp taskwait
        #pragma omp task
        x++;
    }

OpenMP 4.0:

    int x = 0;
    #pragma omp parallel
    #pragma omp single
    {
        #pragma omp task depend(in: x)
        std::cout << x << std::endl;
        #pragma omp task depend(inout: x)
        x++;
    }

Diagram: timelines for OpenMP 3.1 and OpenMP 4.x showing each task's creation time and each task's execution time for tasks t1 and t2.

18 Example: Cholesky Factorization
Figure: matrix of nt x nt blocks, each block of size ts x ts

OpenMP 3.1:

    void cholesky(int ts, int nt, double* a[nt][nt]) {
        for (int k = 0; k < nt; k++) {
            // Diagonal Block factorization
            potrf(a[k][k], ts, ts);
            // Triangular systems
            for (int i = k + 1; i < nt; i++) {
                #pragma omp task
                trsm(a[k][k], a[k][i], ts, ts);
            }
            #pragma omp taskwait
            // Update trailing matrix
            for (int i = k + 1; i < nt; i++) {
                for (int j = k + 1; j < i; j++) {
                    #pragma omp task
                    dgemm(a[k][i], a[k][j], a[j][i], ts, ts);
                }
                #pragma omp task
                syrk(a[k][i], a[i][i], ts, ts);
            }
            #pragma omp taskwait
        }
    }

OpenMP (with task dependencies):

    void cholesky(int ts, int nt, double* a[nt][nt]) {
        for (int k = 0; k < nt; k++) {
            // Diagonal Block factorization
            #pragma omp task depend(inout: a[k][k])
            potrf(a[k][k], ts, ts);
            // Triangular systems
            for (int i = k + 1; i < nt; i++) {
                #pragma omp task depend(in: a[k][k]) depend(inout: a[k][i])
                trsm(a[k][k], a[k][i], ts, ts);
            }
            // Update trailing matrix
            for (int i = k + 1; i < nt; i++) {
                for (int j = k + 1; j < i; j++) {
                    #pragma omp task depend(inout: a[j][i]) depend(in: a[k][i], a[k][j])
                    dgemm(a[k][i], a[k][j], a[j][i], ts, ts);
                }
                #pragma omp task depend(inout: a[i][i]) depend(in: a[k][i])
                syrk(a[k][i], a[i][i], ts, ts);
            }
        }
    }

19 Example: saxpy Operation

Original loop:

    for (i = 0; i < SIZE; i += 1) {
        A[i] = A[i] * B[i] * S;
    }

Blocking:

    for (i = 0; i < SIZE; i += TS) {
        UB = SIZE < (i + TS) ? SIZE : i + TS;
        for (ii = i; ii < UB; ii++) {
            A[ii] = A[ii] * B[ii] * S;
        }
    }

Blocking with tasks:

    for (i = 0; i < SIZE; i += TS) {
        UB = SIZE < (i + TS) ? SIZE : i + TS;
        #pragma omp task private(ii) \
                    firstprivate(i,UB) shared(S,A,B)
        for (ii = i; ii < UB; ii++) {
            A[ii] = A[ii] * B[ii] * S;
        }
    }

taskloop:

    #pragma omp taskloop grainsize(TS)
    for (i = 0; i < SIZE; i += 1) {
        A[i] = A[i] * B[i] * S;
    }

With a manual transformation it is difficult to determine the grain:
  1 single iteration: too fine
  whole loop: no parallelism
  Apply blocking techniques
taskloop: improved programmability

20 Example: Sparse CG w/ taskloop

    #pragma omp parallel
    #pragma omp single
    for (iter = 0; iter < sc->maxiter; iter++) {
        precon(A, r, z);
        vectordot(r, z, n, &rho);
        beta = rho / rho_old;
        xpay(z, beta, n, p);
        matvec(A, p, q);
        vectordot(p, q, n, &dot_pq);
        alpha = rho / dot_pq;
        axpy(alpha, p, n, x);
        axpy(-alpha, q, n, r);
        sc->residual = sqrt(rho) * bnrm2;
        if (sc->residual <= sc->tolerance)
            break;
        rho_old = rho;
    }

    void matvec(matrix *A, double *x, double *y) {
        // ...
        #pragma omp taskloop private(j,is,ie,j0,y0) \
                    grainsize(grainsz)
        for (i = 0; i < A->n; i++) {
            y0 = 0;
            is = A->ptr[i];
            ie = A->ptr[i + 1];
            for (j = is; j < ie; j++) {
                j0 = index[j];
                y0 += value[j] * x[j0];
            }
            y[i] = y0;
        }
        // ...
    }

21 The Present (or: Modern OpenMP) Heterogeneous Programming for Coprocessors

22 Device Model
OpenMP 4.0 supports accelerators/coprocessors
Device model:
  One host
  Multiple accelerators/coprocessors of the same kind
Diagram: Host, Coprocessors

23 Execution Model
The target construct transfers the control flow to the target device
  Transfer of control is sequential and synchronous
  The transfer clauses control direction of data flow
  Array notation is used to describe array length
The target data construct creates a scoped device data environment
  Does not include a transfer of control
  The transfer clauses control direction of data flow
  The device data environment is valid through the lifetime of the target data region
Use target update to request data transfers from within a target data region

24 Example

    #pragma omp target data device(0) map(alloc:tmp[:n]) map(to:input[:n]) map(from:res)
    {
        #pragma omp target device(0)
        #pragma omp parallel for
        for (i = 0; i < n; i++)
            tmp[i] = some_computation(input[i], i);

        update_input_array_on_the_host(input);

        #pragma omp target update device(0) to(input[:n])

        #pragma omp target device(0)
        #pragma omp parallel for reduction(+:res)
        for (i = 0; i < n; i++)
            res += final_computation(input[i], tmp[i], i);
    }

Execution alternates between host, target, host, target, and host.

25 Multi-level Device Parallelism

    int main(int argc, const char* argv[]) {
        float *x = (float*) malloc(n * sizeof(float));
        float *y = (float*) malloc(n * sizeof(float));
        // Define scalars n, a, b & initialize x, y
        #pragma omp target data map(to:x[0:n])
        {
            #pragma omp target map(tofrom:y[0:n])
            #pragma omp teams num_teams(num_blocks)              // all do the same
            #pragma omp distribute                               // workshare (w/o barrier)
            for (int i = 0; i < n; i += num_blocks) {
                #pragma omp parallel for num_threads(bsize)      // workshare (w/ barrier)
                for (int j = i; j < i + num_blocks; j++) {
                    y[j] = a*x[j] + y[j];
                }
            }
        }
    }

26 Multi-level Device Parallelism/2

    int main(int argc, const char* argv[]) {
        float *x = (float*) malloc(n * sizeof(float));
        float *y = (float*) malloc(n * sizeof(float));
        // Define scalars n, a, b & initialize x, y
        #pragma omp target map(to:x[0:n]) map(tofrom:y[0:n])
        {
            #pragma omp teams distribute parallel for \
                        num_teams(num_blocks) num_threads(bsize)
            for (int i = 0; i < n; ++i) {
                y[i] = a*x[i] + y[i];
            }
        }
    }

27 The Future

28 New OpenMP Features
  Task reductions
  New task dependencies
  OpenMP tools interface
  OpenMP debugging interface
  concurrent construct

29 Task Reductions
Task reductions extend traditional reductions to arbitrary task graphs
Extend the existing task and taskgroup constructs
Also work with the taskloop construct

    int res = 0;
    node_t* node = NULL;
    ...
    #pragma omp parallel
    {
        #pragma omp single
        {
            #pragma omp taskgroup task_reduction(+: res)
            {
                while (node) {
                    #pragma omp task in_reduction(+: res) \
                                firstprivate(node)
                    {
                        res += node->value;
                    }
                    node = node->next;
                }
            }
        }
    }

30 New Task Dependencies

    int x = 0, y = 0, res = 0;
    #pragma omp parallel
    #pragma omp single
    {
        #pragma omp task depend(out: res)  //T0
        res = 0;

        #pragma omp task depend(out: x)  //T1
        long_computation(x);

        #pragma omp task depend(out: y)  //T2
        short_computation(y);

        #pragma omp task depend(in: x) depend(mutexinoutset: res)  //T3
        res += x;

        #pragma omp task depend(in: y) depend(mutexinoutset: res)  //T4
        res += y;

        #pragma omp task depend(in: res)  //T5
        std::cout << res << std::endl;
    }

The slide first shows T3 and T4 with depend(inout: res) and then replaces it with depend(mutexinoutset: res).
Task graph: T0, T1, and T2 come first; T3 depends on T0 and T1; T4 depends on T0 and T2; T5 depends on T3 and T4. With inout, T4 must wait for T3; with mutexinoutset, T3 and T4 exclude each other but may run in either order.

31 OpenMP Tools Interface (OMPT)
Provide interfaces for first-party tools to attach to the OpenMP implementation for performance analysis and correctness checking
Diagram components: OpenMP Program, 1st Party Tool (same process), OpenMP Runtime Library (RTL), Application Address Space and Threads
Diagram steps:
  1 ompt_start_tool, ompt_initialize
  2 Register callbacks
  3 Callbacks on OpenMP events
  4 Querying further runtime information

32 OpenMP Tools Interface (OMPT)
Figures: Developer view, Implementation view

33 OpenMP Debugging Interface (OMPD)
Provide interfaces for third-party tools to inspect the current state of OpenMP execution.
Diagram components: Debugger with OMPD DLL; OpenMP Program with OpenMP Runtime Library and Application Threads
Diagram labels:
  Attach
  OMPD DLL loaded into debugger and callbacks registered
  Request OpenMP state, Result
  Callback: request symbol addresses and read memory

34 OpenMP Debugging Interface (OMPD)
Segmentation fault occurs in thread 100; is this an OpenMP thread?
Flowchart:
  Load OMPD DLL
  Initialize OMPD
  Initialize process (debugger gets: address space)
  Is thread in a device? If yes: initialize device (debugger gets: device address space)
  Request thread handle for thread identifier (debugger gets: thread handle)
  Returned OK? If yes: thread 100 is an OpenMP thread

35 concurrent Construct
Assert to the compiler that all iterations of a loop are free of loop-carried dependencies and may execute in any order.
Syntax (C/C++):
    #pragma omp concurrent [clause [ [,] clause] ]
    for-loops
Syntax (Fortran):
    !$omp concurrent [clause [ [,] clause] ]
    do-loops
    [!$omp end concurrent]

36 concurrent Construct
Existing loop constructs are tightly bound to the execution model:

    #pragma omp for
    for (i = 0; i < N; ++i) { ... }
  Execution: fork, distribute work, barrier, join

    #pragma omp simd
    for (i = 0; i < N; ++i) { ... }

    #pragma omp taskloop
    for (i = 0; i < N; ++i) { ... }
  Execution: generate tasks, taskwait

The concurrent construct is meant to let the OpenMP implementation choose the right parallelization scheme.

37 Simplifying Multi-level Device Parallelism

    int main(int argc, const char* argv[]) {
        float *x = (float*) malloc(n * sizeof(float));
        float *y = (float*) malloc(n * sizeof(float));
        // Define scalars n, a, b & initialize x, y
        #pragma omp target map(to:x[0:n]) map(tofrom:y[0:n])
        {
            #pragma omp concurrent
            for (int i = 0; i < n; ++i) {
                y[i] = a*x[i] + y[i];
            }
        }
    }

The slide overlays the two versions: the concurrent directive takes the place of "teams distribute parallel for num_teams(num_blocks) num_threads(bsize)".

38 OpenMP 5.0 Miscellaneous (Planned) Features
  Release/acquire semantics for the memory model
  Support for memory allocators
  Task affinity support
  Improved multiple device support
  Improved base language support
    Fortran 2003
    C11, C++11, C++14
  Lots of corrections and clarifications

39 Adverts: Engage with the Community

40 UK OpenMP Users Conference
May 2018, St Catherine's College, Oxford, UK
Call for Submissions is now open: tech presentations, research, tutorials, posters
Additional Information:
Organised by:
  Conference Chair: Simon McIntosh-Smith, Prof. in High Performance Computing, University of Bristol
  Program Co-Chair: Dr Mark Bull, Architect (Research and Training), EPCC
  Program Co-Chair: Jim Cownie, Principal Engineer, Intel Corporation (UK) Ltd.

41 OpenMPCon & IWOMP 2018
Tentative dates:
  OpenMPCon: Sep
  Tutorials: Sep 26
  IWOMP: Sep
Location: Barcelona, Spain (?)

42 OpenMP Book

43 The Last Slide
OpenMP 5.0 will be a major leap forward
  Well-defined interfaces for tools
  New ways to express parallelism
OpenMP is a modern directive-based programming model
  Multi-level parallelism supported (coprocessors, threads, SIMD)
  Task-based programming model is the modern approach to parallelism

44 Visit for more information


More information

OpenMP. Dr. William McDoniel and Prof. Paolo Bientinesi WS17/18. HPAC, RWTH Aachen

OpenMP. Dr. William McDoniel and Prof. Paolo Bientinesi WS17/18. HPAC, RWTH Aachen OpenMP Dr. William McDoniel and Prof. Paolo Bientinesi HPAC, RWTH Aachen mcdoniel@aices.rwth-aachen.de WS17/18 Loop construct - Clauses #pragma omp for [clause [, clause]...] The following clauses apply:

More information

Shared memory parallel computing

Shared memory parallel computing Shared memory parallel computing OpenMP Sean Stijven Przemyslaw Klosiewicz Shared-mem. programming API for SMP machines Introduced in 1997 by the OpenMP Architecture Review Board! More high-level than

More information

Parallel Computing Parallel Programming Languages Hwansoo Han

Parallel Computing Parallel Programming Languages Hwansoo Han Parallel Computing Parallel Programming Languages Hwansoo Han Parallel Programming Practice Current Start with a parallel algorithm Implement, keeping in mind Data races Synchronization Threading syntax

More information

OPENMP TIPS, TRICKS AND GOTCHAS

OPENMP TIPS, TRICKS AND GOTCHAS OPENMP TIPS, TRICKS AND GOTCHAS Mark Bull EPCC, University of Edinburgh (and OpenMP ARB) markb@epcc.ed.ac.uk OpenMPCon 2015 OpenMPCon 2015 2 A bit of background I ve been teaching OpenMP for over 15 years

More information

Parallel Programming

Parallel Programming Parallel Programming OpenMP Nils Moschüring PhD Student (LMU) Nils Moschüring PhD Student (LMU), OpenMP 1 1 Overview What is parallel software development Why do we need parallel computation? Problems

More information

OmpSs Specification. BSC Programming Models

OmpSs Specification. BSC Programming Models OmpSs Specification BSC Programming Models March 30, 2017 CONTENTS 1 Introduction to OmpSs 3 1.1 Reference implementation........................................ 3 1.2 A bit of history..............................................

More information

OpenMP Tasking Model Unstructured parallelism

OpenMP Tasking Model Unstructured parallelism www.bsc.es OpenMP Tasking Model Unstructured parallelism Xavier Teruel and Xavier Martorell What is a task in OpenMP? Tasks are work units whose execution may be deferred or it can be executed immediately!!!

More information

OpenMP 4.0 (and now 5.0)

OpenMP 4.0 (and now 5.0) OpenMP 4.0 (and now 5.0) John Urbanic Parallel Computing Scientist Pittsburgh Supercomputing Center Copyright 2018 Classic OpenMP OpenMP was designed to replace low-level and tedious solutions like POSIX

More information

CS 470 Spring Mike Lam, Professor. OpenMP

CS 470 Spring Mike Lam, Professor. OpenMP CS 470 Spring 2017 Mike Lam, Professor OpenMP OpenMP Programming language extension Compiler support required "Open Multi-Processing" (open standard; latest version is 4.5) Automatic thread-level parallelism

More information

Module 10: Open Multi-Processing Lecture 19: What is Parallelization? The Lecture Contains: What is Parallelization? Perfectly Load-Balanced Program

Module 10: Open Multi-Processing Lecture 19: What is Parallelization? The Lecture Contains: What is Parallelization? Perfectly Load-Balanced Program The Lecture Contains: What is Parallelization? Perfectly Load-Balanced Program Amdahl's Law About Data What is Data Race? Overview to OpenMP Components of OpenMP OpenMP Programming Model OpenMP Directives

More information

Shared Memory Programming with OpenMP

Shared Memory Programming with OpenMP Shared Memory Programming with OpenMP (An UHeM Training) Süha Tuna Informatics Institute, Istanbul Technical University February 12th, 2016 2 Outline - I Shared Memory Systems Threaded Programming Model

More information

Shared memory programming model OpenMP TMA4280 Introduction to Supercomputing

Shared memory programming model OpenMP TMA4280 Introduction to Supercomputing Shared memory programming model OpenMP TMA4280 Introduction to Supercomputing NTNU, IMF February 16. 2018 1 Recap: Distributed memory programming model Parallelism with MPI. An MPI execution is started

More information

Introduction to OpenMP. Martin Čuma Center for High Performance Computing University of Utah

Introduction to OpenMP. Martin Čuma Center for High Performance Computing University of Utah Introduction to OpenMP Martin Čuma Center for High Performance Computing University of Utah mcuma@chpc.utah.edu Overview Quick introduction. Parallel loops. Parallel loop directives. Parallel sections.

More information

Open Multi-Processing: Basic Course

Open Multi-Processing: Basic Course HPC2N, UmeåUniversity, 901 87, Sweden. May 26, 2015 Table of contents Overview of Paralellism 1 Overview of Paralellism Parallelism Importance Partitioning Data Distributed Memory Working on Abisko 2 Pragmas/Sentinels

More information

Review. Lecture 12 5/22/2012. Compiler Directives. Library Functions Environment Variables. Compiler directives for construct, collapse clause

Review. Lecture 12 5/22/2012. Compiler Directives. Library Functions Environment Variables. Compiler directives for construct, collapse clause Review Lecture 12 Compiler Directives Conditional compilation Parallel construct Work-sharing constructs for, section, single Synchronization Work-tasking Library Functions Environment Variables 1 2 13b.cpp

More information

OpenMP I. Diego Fabregat-Traver and Prof. Paolo Bientinesi WS16/17. HPAC, RWTH Aachen

OpenMP I. Diego Fabregat-Traver and Prof. Paolo Bientinesi WS16/17. HPAC, RWTH Aachen OpenMP I Diego Fabregat-Traver and Prof. Paolo Bientinesi HPAC, RWTH Aachen fabregat@aices.rwth-aachen.de WS16/17 OpenMP References Using OpenMP: Portable Shared Memory Parallel Programming. The MIT Press,

More information

Alfio Lazzaro: Introduction to OpenMP

Alfio Lazzaro: Introduction to OpenMP First INFN International School on Architectures, tools and methodologies for developing efficient large scale scientific computing applications Ce.U.B. Bertinoro Italy, 12 17 October 2009 Alfio Lazzaro:

More information

CS420: Operating Systems

CS420: Operating Systems Threads James Moscola Department of Physical Sciences York College of Pennsylvania Based on Operating System Concepts, 9th Edition by Silberschatz, Galvin, Gagne Threads A thread is a basic unit of processing

More information

OpenMP Code Offloading: Splitting GPU Kernels, Pipelining Communication and Computation, and selecting Better Grid Geometries

OpenMP Code Offloading: Splitting GPU Kernels, Pipelining Communication and Computation, and selecting Better Grid Geometries OpenMP Code Offloading: Splitting GPU Kernels, Pipelining Communication and Computation, and selecting Better Grid Geometries Artem Chikin, Tyler Gobran, José Nelson Amaral Tyler Gobran University of Alberta

More information

Introduction to OpenMP.

Introduction to OpenMP. Introduction to OpenMP www.openmp.org Motivation Parallelize the following code using threads: for (i=0; i

More information

OpenMP 4.5 target. Presenters: Tom Scogland Oscar Hernandez. Wednesday, June 28 th, Credits for some of the material

OpenMP 4.5 target. Presenters: Tom Scogland Oscar Hernandez. Wednesday, June 28 th, Credits for some of the material OpenMP 4.5 target Wednesday, June 28 th, 2017 Presenters: Tom Scogland Oscar Hernandez Credits for some of the material IWOMP 2016 tutorial James Beyer, Bronis de Supinski OpenMP 4.5 Relevant Accelerator

More information

Parallel Computing. Prof. Marco Bertini

Parallel Computing. Prof. Marco Bertini Parallel Computing Prof. Marco Bertini Shared memory: OpenMP Implicit threads: motivations Implicit threading frameworks and libraries take care of much of the minutiae needed to create, manage, and (to

More information

COMP Parallel Computing. SMM (2) OpenMP Programming Model

COMP Parallel Computing. SMM (2) OpenMP Programming Model COMP 633 - Parallel Computing Lecture 7 September 12, 2017 SMM (2) OpenMP Programming Model Reading for next time look through sections 7-9 of the Open MP tutorial Topics OpenMP shared-memory parallel

More information

[Potentially] Your first parallel application

[Potentially] Your first parallel application [Potentially] Your first parallel application Compute the smallest element in an array as fast as possible small = array[0]; for( i = 0; i < N; i++) if( array[i] < small ) ) small = array[i] 64-bit Intel

More information

MPI and OpenMP (Lecture 25, cs262a) Ion Stoica, UC Berkeley November 19, 2016

MPI and OpenMP (Lecture 25, cs262a) Ion Stoica, UC Berkeley November 19, 2016 MPI and OpenMP (Lecture 25, cs262a) Ion Stoica, UC Berkeley November 19, 2016 Message passing vs. Shared memory Client Client Client Client send(msg) recv(msg) send(msg) recv(msg) MSG MSG MSG IPC Shared

More information

High Performance Computing: Tools and Applications

High Performance Computing: Tools and Applications High Performance Computing: Tools and Applications Edmond Chow School of Computational Science and Engineering Georgia Institute of Technology Lecture 2 OpenMP Shared address space programming High-level

More information

15-418, Spring 2008 OpenMP: A Short Introduction

15-418, Spring 2008 OpenMP: A Short Introduction 15-418, Spring 2008 OpenMP: A Short Introduction This is a short introduction to OpenMP, an API (Application Program Interface) that supports multithreaded, shared address space (aka shared memory) parallelism.

More information

EPL372 Lab Exercise 5: Introduction to OpenMP

EPL372 Lab Exercise 5: Introduction to OpenMP EPL372 Lab Exercise 5: Introduction to OpenMP References: https://computing.llnl.gov/tutorials/openmp/ http://openmp.org/wp/openmp-specifications/ http://openmp.org/mp-documents/openmp-4.0-c.pdf http://openmp.org/mp-documents/openmp4.0.0.examples.pdf

More information

PARALLEL PROGRAMMING MULTI- CORE COMPUTERS

PARALLEL PROGRAMMING MULTI- CORE COMPUTERS 2016 HAWAII UNIVERSITY INTERNATIONAL CONFERENCES SCIENCE, TECHNOLOGY, ENGINEERING, ART, MATH & EDUCATION JUNE 10-12, 2016 HAWAII PRINCE HOTEL WAIKIKI, HONOLULU PARALLEL PROGRAMMING MULTI- CORE COMPUTERS

More information

An Introduction to OpenMP

An Introduction to OpenMP An Introduction to OpenMP U N C L A S S I F I E D Slide 1 What Is OpenMP? OpenMP Is: An Application Program Interface (API) that may be used to explicitly direct multi-threaded, shared memory parallelism

More information

OpenMP programming Part II. Shaohao Chen High performance Louisiana State University

OpenMP programming Part II. Shaohao Chen High performance Louisiana State University OpenMP programming Part II Shaohao Chen High performance computing @ Louisiana State University Part II Optimization for performance Trouble shooting and debug Common Misunderstandings and Frequent Errors

More information

New Features after OpenMP 2.5

New Features after OpenMP 2.5 New Features after OpenMP 2.5 2 Outline OpenMP Specifications Version 3.0 Task Parallelism Improvements to nested and loop parallelism Additional new Features Version 3.1 - New Features Version 4.0 simd

More information

Advanced OpenMP Tutorial

Advanced OpenMP Tutorial Advanced OpenMP Tutorial OpenMP Overview Michael Klemm Eric Stotzer Bronis R. de Supinski 1 Advanced OpenMP Tutorial OpenMP Overview Topics Core Concepts Synchronization Tasking Cancellation Misc. OpenMP

More information

Programming Shared-memory Platforms with OpenMP

Programming Shared-memory Platforms with OpenMP Programming Shared-memory Platforms with OpenMP John Mellor-Crummey Department of Computer Science Rice University johnmc@rice.edu COMP 422/534 Lecture 7 31 February 2017 Introduction to OpenMP OpenMP

More information

Jukka Julku Multicore programming: Low-level libraries. Outline. Processes and threads TBB MPI UPC. Examples

Jukka Julku Multicore programming: Low-level libraries. Outline. Processes and threads TBB MPI UPC. Examples Multicore Jukka Julku 19.2.2009 1 2 3 4 5 6 Disclaimer There are several low-level, languages and directive based approaches But no silver bullets This presentation only covers some examples of them is

More information

Cray XE6 Performance Workshop

Cray XE6 Performance Workshop Cray XE6 Performance Workshop Multicore Programming Overview Shared memory systems Basic Concepts in OpenMP Brief history of OpenMP Compiling and running OpenMP programs 2 1 Shared memory systems OpenMP

More information

Tasking in OpenMP. Paolo Burgio.

Tasking in OpenMP. Paolo Burgio. asking in OpenMP Paolo Burgio paolo.burgio@unimore.it Outline Expressing parallelism Understanding parallel threads Memory Data management Data clauses Synchronization Barriers, locks, critical sections

More information

Programming Shared Memory Systems with OpenMP Part I. Book

Programming Shared Memory Systems with OpenMP Part I. Book Programming Shared Memory Systems with OpenMP Part I Instructor Dr. Taufer Book Parallel Programming in OpenMP by Rohit Chandra, Leo Dagum, Dave Kohr, Dror Maydan, Jeff McDonald, Ramesh Menon 2 1 Machine

More information

COSC 6374 Parallel Computation. Introduction to OpenMP(I) Some slides based on material by Barbara Chapman (UH) and Tim Mattson (Intel)

COSC 6374 Parallel Computation. Introduction to OpenMP(I) Some slides based on material by Barbara Chapman (UH) and Tim Mattson (Intel) COSC 6374 Parallel Computation Introduction to OpenMP(I) Some slides based on material by Barbara Chapman (UH) and Tim Mattson (Intel) Edgar Gabriel Fall 2014 Introduction Threads vs. processes Recap of

More information

OPENMP TIPS, TRICKS AND GOTCHAS

OPENMP TIPS, TRICKS AND GOTCHAS OPENMP TIPS, TRICKS AND GOTCHAS OpenMPCon 2015 2 Directives Mistyping the sentinel (e.g.!omp or #pragma opm ) typically raises no error message. Be careful! Extra nasty if it is e.g. #pragma opm atomic

More information

Introduction to OpenMP. Lecture 2: OpenMP fundamentals

Introduction to OpenMP. Lecture 2: OpenMP fundamentals Introduction to OpenMP Lecture 2: OpenMP fundamentals Overview 2 Basic Concepts in OpenMP History of OpenMP Compiling and running OpenMP programs What is OpenMP? 3 OpenMP is an API designed for programming

More information