OpenMP (based on material provided by Cristiana Amza)


1 OpenMP (based on material provided by Cristiana Amza)

2 What is OpenMP?
- A standard for shared-memory programming for scientific applications.
- Has specific support for scientific application needs (unlike Pthreads).
- Rapidly gaining acceptance among vendors and application writers.
- See the OpenMP website for more info.
- Supported by gcc from version 4.3.2 onward.

3 OpenMP API Overview
- The API is a set of compiler directives inserted in the source program (in addition to some library functions).
- Ideally, compiler directives do not affect sequential code:
  - pragmas in C/C++.
  - (special) comments in Fortran code.

4 OpenMP API Example (1 of 2)
- Sequential code:
    statement1;
    statement2;
    statement3;
- Assume we want to execute statement2 in parallel, and statement1 and statement3 sequentially.

5 OpenMP API Example (2 of 2)
- OpenMP parallel code:
    statement1;
    #pragma omp <specific OpenMP directive>
    statement2;
    statement3;
- statement2 may be executed in parallel.
- statement1 and statement3 are executed sequentially.

6 Important Note
- By giving a parallel directive, the user asserts that the program will remain correct if the statement is executed in parallel.
- The OpenMP compiler does not check correctness.

7 API Semantics
- The master thread executes sequential code.
- Master and slaves execute parallel code.
- Note: very similar to the fork-join semantics of the Pthreads create/join primitives.
- Allows incremental parallelization (!!)

8 OpenMP Implementation Overview
- An OpenMP implementation consists of
  - a compiler, and
  - a library.
- Unlike Pthreads (purely a library).

9 OpenMP Example Usage (1 of 2)
- (Figure: annotated source -> OpenMP compiler, selected by a compiler switch -> sequential or parallel program.)

10 OpenMP Example Usage (2 of 2)
- If you give the sequential switch,
  - comments and pragmas are ignored.
- If you give the parallel switch,
  - comments and/or pragmas are read, and
  - cause translation into a parallel program.
- Ideally, one source for both the sequential and the parallel program (a big maintenance plus).

11 OpenMP Directives
- Parallelization directives:
  - parallel region
  - parallel for
- Data environment directives:
  - shared, private, threadprivate, reduction, etc.
- Synchronization directives:
  - barrier, critical

12 General Rules about Directives
- They always apply to the next statement, which must be a structured block.
- Examples:
    #pragma omp ...
    statement
    #pragma omp ...
    { statement1; statement2; statement3; }

13 OpenMP Parallel Region
    #pragma omp parallel
- A number of threads are spawned at entry.
- Each thread executes the same code.
- Each thread waits at the end.
- Very similar to a number of create/joins with the same function in Pthreads.

14 Getting Threads to do Different Things
- Through explicit thread identification (as in Pthreads).
- Through work-sharing directives.

15 Thread Identification
- int omp_get_thread_num() -- gets the thread id.
- int omp_get_num_threads() -- gets the total number of threads.

16 Example
    #pragma omp parallel
    {
        if( !omp_get_thread_num() )
            master();
        else
            slave();
    }

17 Work Sharing Directives
- Always occur within a parallel region directive.
- The two principal ones are
  - parallel for
  - parallel sections

18 OpenMP Parallel For
    #pragma omp parallel
    #pragma omp for
    for( ... ) { ... }
- Each thread executes a subset of the iterations.
- All threads wait at the end of the parallel for.
- Iterations must be independent.
- The for must have a restricted format:
    for( index = start; index oprel end; index op= expr )
- break, return, exit, and goto are not allowed.
- The loop control variable is the only variable that is private by default.

19 Multiple Work Sharing Directives
- May occur within a single parallel region:
    #pragma omp parallel
    {
        #pragma omp for
        for( ; ; ) { ... }
        #pragma omp for
        for( ; ; ) { ... }
    }
- All threads wait at the end of the first for.

20 The NoWait Qualifier
    #pragma omp parallel
    {
        #pragma omp for nowait
        for( ; ; ) { ... }
        #pragma omp for
        for( ; ; ) { ... }
    }
- Threads proceed to the second for without waiting.

21 Parallel Sections Directive
    #pragma omp parallel
    {
        #pragma omp sections
        {
            { ... }
            #pragma omp section   /* this is a delimiter */
            { ... }
            #pragma omp section
            { ... }
        }
    }

22 A Useful Shorthand
    #pragma omp parallel
    #pragma omp for
    for ( ; ; ) { ... }
- is equivalent to
    #pragma omp parallel for
    for ( ; ; ) { ... }
- (The same holds for parallel sections.)

23 Note the Difference between...
    #pragma omp parallel
    {
        #pragma omp for
        for( ; ; ) { ... }
        f();
        #pragma omp for
        for( ; ; ) { ... }
    }
- Here f() is executed by every thread in the team.

24 and...
    #pragma omp parallel for
    for( ; ; ) { ... }
    f();
    #pragma omp parallel for
    for( ; ; ) { ... }
- Here f() is executed only once, between the two parallel regions.

25 Sequential Matrix Multiply
    for( i=0; i<n; i++ )
        for( j=0; j<n; j++ ) {
            c[i][j] = 0.0;
            for( k=0; k<n; k++ )
                c[i][j] += a[i][k]*b[k][j];
        }

26 OpenMP Matrix Multiply
    #pragma omp parallel for
    for( i=0; i<n; i++ )
        for( j=0; j<n; j++ ) {
            c[i][j] = 0.0;
            for( k=0; k<n; k++ )
                c[i][j] += a[i][k]*b[k][j];
        }

27 Sequential SOR
    for some number of timesteps/iterations {
        for( i=1; i<n; i++ )
            for( j=1; j<n; j++ )
                temp[i][j] = 0.25 *
                    ( grid[i-1][j] + grid[i+1][j] +
                      grid[i][j-1] + grid[i][j+1] );
        for( i=1; i<n; i++ )
            for( j=1; j<n; j++ )
                grid[i][j] = temp[i][j];
    }

28 OpenMP SOR
    for some number of timesteps/iterations {
        #pragma omp parallel for
        for( i=1; i<n; i++ )
            for( j=1; j<n; j++ )
                temp[i][j] = 0.25 *
                    ( grid[i-1][j] + grid[i+1][j] +
                      grid[i][j-1] + grid[i][j+1] );
        #pragma omp parallel for
        for( i=1; i<n; i++ )
            for( j=1; j<n; j++ )
                grid[i][j] = temp[i][j];
    }

29 Some Advanced Features
- Conditional parallelism.
- Scheduling options.

30 Conditional Parallelism: Issue
- Oftentimes, parallelism is only useful if the problem size is sufficiently big.
- For smaller sizes, the overhead of parallelization exceeds the benefit.

31 Conditional Parallelism: Specification
    #pragma omp parallel if( expression )
    #pragma omp parallel for if( expression )
- Execute in parallel if the expression is true, otherwise execute sequentially.
- (The if clause belongs to the parallel directive; a bare omp for does not accept it.)

32 Conditional Parallelism: Example
    for( i=0; i<n; i++ ) {
        #pragma omp parallel for if( n-i > 100 )
        for( j=i+1; j<n; j++ )
            for( k=i+1; k<n; k++ )
                a[j][k] = a[j][k] - a[i][k]*a[i][j] / a[j][j];
    }

33 Scheduling of Iterations: Issue
- Scheduling: assigning iterations to threads.
- So far, we have assumed the default, which is block scheduling.
- OpenMP allows other scheduling strategies as well, for instance cyclic, gss (guided self-scheduling), etc.

34 Scheduling of Iterations: Specification
    #pragma omp parallel for schedule( <sched> )
- <sched> can be one of
  - block (the default; schedule(static) in standard OpenMP syntax)
  - cyclic (schedule(static,1) in standard OpenMP syntax)
  - gss (schedule(guided) in standard OpenMP syntax)

35 Example
- Multiplication of two matrices C = A x B, where the A matrix is upper-triangular (all elements below the diagonal are 0).

36 Sequential Matrix Multiply
- With upper-triangular A, the inner loop starts at k=i:
    for( i=0; i<n; i++ )
        for( j=0; j<n; j++ ) {
            c[i][j] = 0.0;
            for( k=i; k<n; k++ )
                c[i][j] += a[i][k]*b[k][j];
        }
- Load imbalance with a block distribution.

37 OpenMP Matrix Multiply
    #pragma omp parallel for schedule( static, 1 )   /* cyclic */
    for( i=0; i<n; i++ )
        for( j=0; j<n; j++ ) {
            c[i][j] = 0.0;
            for( k=i; k<n; k++ )
                c[i][j] += a[i][k]*b[k][j];
        }

38 Data Environment Directives
- All variables are shared by default.
- One exception: the loop variable of a parallel for is private.
- By using data directives, some variables can be made private or given other special characteristics.

39 Reminder: Matrix Multiply
    #pragma omp parallel for
    for( i=0; i<n; i++ )
        for( j=0; j<n; j++ ) {
            c[i][j] = 0.0;
            for( k=0; k<n; k++ )
                c[i][j] += a[i][k]*b[k][j];
        }
- a, b, c are shared.
- i is private (the parallel loop variable); j and k must also be made private, e.g. by declaring them inside the loop or listing them in a private clause.

40 Data Environment Directives
- private
- threadprivate
- reduction

41 Private Variables
    #pragma omp parallel for private( list )
- Makes a private copy for each thread of each variable in the list.
- This and all further examples use parallel for, but the same applies to other region and work-sharing directives.

42 Private Variables: Example
    for( i=0; i<n; i++ ) {
        tmp = a[i];
        a[i] = b[i];
        b[i] = tmp;
    }
- Swaps the values in a and b.
- Loop-carried dependence on tmp.
- Easily fixed by privatizing tmp.

43 Private Variables: Example (2 of 2)
    #pragma omp parallel for private( tmp )
    for( i=0; i<n; i++ ) {
        tmp = a[i];
        a[i] = b[i];
        b[i] = tmp;
    }
- Removes the dependence on tmp.
- Would be more difficult to do in Pthreads.

44 Private Variables: Alternative 1
    for( i=0; i<n; i++ ) {
        tmp[i] = a[i];
        a[i] = b[i];
        b[i] = tmp[i];
    }
- Requires a sequential program change.
- Wasteful in space, O(n) vs. O(p).

45 Private Variables: Alternative 2
    f()
    {
        int tmp; /* local allocation on stack */
        for( i=from; i<to; i++ ) {
            tmp = a[i];
            a[i] = b[i];
            b[i] = tmp;
        }
    }

46 Threadprivate
- Private variables are private on a parallel region basis.
- Threadprivate variables are global variables that are private throughout the execution of the program.

47 Reduction Variables
    #pragma omp parallel for reduction( op:list )
- op is one of +, *, -, &, ^, |, && or ||.
- The variables in list must be used with this operator in the loop.
- The variables are automatically initialized to sensible values.

48 Reduction Variables: Example
    #pragma omp parallel for reduction( +:sum )
    for( i=0; i<n; i++ )
        sum += a[i];
- sum is automatically initialized to zero.

49 SOR Sequential Code with Convergence
    for( ; diff > delta; ) {
        for( i=0; i<n; i++ )
            for( j=0; j<n; j++ ) { ... }   /* compute temp[i][j] */
        diff = 0;
        for( i=0; i<n; i++ )
            for( j=0; j<n; j++ ) {
                diff = max(diff, fabs(grid[i][j] - temp[i][j]));
                grid[i][j] = temp[i][j];
            }
    }

50 OpenMP SOR Code with Convergence
    for( ; diff > delta; ) {
        #pragma omp parallel for
        for( i=0; i<n; i++ )
            for( j=0; j<n; j++ ) { ... }   /* compute temp[i][j] */
        diff = 0;
        #pragma omp parallel for reduction( max:diff )
        for( i=0; i<n; i++ )
            for( j=0; j<n; j++ ) {
                diff = max(diff, fabs(grid[i][j] - temp[i][j]));
                grid[i][j] = temp[i][j];
            }
    }
- (The max reduction requires OpenMP 3.1 or later.)

51 Synchronization Primitives
- Critical
    #pragma omp critical (name)
  - Implements critical sections, by name.
  - Similar to Pthreads mutex locks (name ~ lock).
- Barrier
    #pragma omp barrier
  - Implements a global barrier.

52 OpenMP SOR with Convergence
    #pragma omp parallel private( mydiff )
    for( ; diff > delta; ) {
        #pragma omp for nowait
        for( i=0; i<n; i++ )
            for( j=0; j<n; j++ ) { ... }   /* compute temp[i][j] */
        diff = 0.0;
        mydiff = 0.0;
        #pragma omp barrier
        #pragma omp for nowait
        for( i=0; i<n; i++ )
            for( j=0; j<n; j++ ) {
                mydiff = max(mydiff, fabs(grid[i][j] - temp[i][j]));
                grid[i][j] = temp[i][j];
            }
        #pragma omp critical
        diff = max( diff, mydiff );
        #pragma omp barrier
    }

53 Synchronization Primitives
- Big bummer: no condition variables.
- Result: must busy-wait for condition synchronization.
- Clumsy.
- Very inefficient on some architectures.

54 PIPE: Sequential Program
    for( i=0; i<num_pics, read(in_pic); i++ ) {
        int_pic_1 = trans1( in_pic );
        int_pic_2 = trans2( int_pic_1 );
        int_pic_3 = trans3( int_pic_2 );
        out_pic = trans4( int_pic_3 );
    }

55 Sequential vs. Parallel Execution
- (Figure: color -- picture; horizontal line -- processor.)

56 PIPE: Parallel Program
    P0: for( i=0; i<num_pics, read(in_pic); i++ ) {
            int_pic_1[i] = trans1( in_pic );
            signal( event_1_2[i] );
        }
    P1: for( i=0; i<num_pics; i++ ) {
            wait( event_1_2[i] );
            int_pic_2[i] = trans2( int_pic_1[i] );
            signal( event_2_3[i] );
        }

57 PIPE: Main Program
    #pragma omp parallel sections
    {
        #pragma omp section
        stage1();
        #pragma omp section
        stage2();
        #pragma omp section
        stage3();
        #pragma omp section
        stage4();
    }

58 PIPE: Stage 1
    void stage1()
    {
        num1 = 0;
        for( i=0; i<num_pics, read(in_pic); i++ ) {
            int_pic_1[i] = trans1( in_pic );
            #pragma omp critical (c1)
            num1++;
        }
    }

59 PIPE: Stage 2
    void stage2()
    {
        for( i=0; i<num_pics; i++ ) {
            do {
                #pragma omp critical (c1)
                cond = (num1 <= i);
            } while( cond );
            int_pic_2[i] = trans2( int_pic_1[i] );
            #pragma omp critical (c2)
            num2++;
        }
    }

60 OpenMP PIPE
- Note the need to exit the critical section during the wait.
- Otherwise no access by the other thread.
- Never busy-wait inside a critical section!

61 Example 1
    for( i=0; i<n; i++ ) {
        tmp = a[i];
        a[i] = b[i];
        b[i] = tmp;
    }
- Dependence on tmp.

62 Scalar Privatization
    #pragma omp parallel for private( tmp )
    for( i=0; i<n; i++ ) {
        tmp = a[i];
        a[i] = b[i];
        b[i] = tmp;
    }
- Dependence on tmp is removed.

63 Example 2
    for( i=0, sum=0; i<n; i++ )
        sum += a[i];
- Dependence on sum.

64 Reduction
    sum = 0;
    #pragma omp parallel for reduction( +:sum )
    for( i=0; i<n; i++ )
        sum += a[i];
- Dependence on sum is removed.

65 Example 3
    for( i=0, index=0; i<n; i++ ) {
        index += i;
        a[i] = b[index];
    }
- Dependence on index.
- Induction variable: can be computed from the loop variable.

66 Induction Variable Elimination
    #pragma omp parallel for
    for( i=0; i<n; i++ )
        a[i] = b[i*(i+1)/2];
- Dependence removed by computing the induction variable from i (index = 0+1+...+i = i*(i+1)/2).

67 Example 4
    for( i=0, index=0; i<n; i++ ) {
        index += f(i);
        b[i] = g( a[index] );
    }
- Dependence on induction variable index, but no closed formula for its value.

68 Loop Splitting
    index[0] = f(0);
    for( i=1; i<n; i++ )
        index[i] = index[i-1] + f(i);
    #pragma omp parallel for
    for( i=0; i<n; i++ )
        b[i] = g( a[index[i]] );
- Loop splitting has removed the dependence: the index values are computed sequentially first, then the main loop is parallelized.

69 Example 5
    for( k=0; k<n; k++ )
        for( i=0; i<n; i++ )
            for( j=0; j<n; j++ )
                a[i][j] += b[i][k] + c[k][j];
- Dependence on a[i][j] prevents k-loop parallelization.
- No dependences carried by the i- and j-loops.

70 Example 5 Parallelization
    for( k=0; k<n; k++ ) {
        #pragma omp parallel for
        for( i=0; i<n; i++ )
            for( j=0; j<n; j++ )
                a[i][j] += b[i][k] + c[k][j];
    }
- We can do better by reordering the loops.

71 Loop Reordering
    #pragma omp parallel for
    for( i=0; i<n; i++ )
        for( j=0; j<n; j++ )
            for( k=0; k<n; k++ )
                a[i][j] += b[i][k] + c[k][j];
- Larger parallel pieces of work.

72 Example 6
    #pragma omp parallel for
    for( i=0; i<n; i++ )
        a[i] = b[i];
    #pragma omp parallel for
    for( i=0; i<n; i++ )
        c[i] = b[i]*b[i];
- Make two parallel loops into one.

73 Loop Fusion
    #pragma omp parallel for
    for( i=0; i<n; i++ ) {
        a[i] = b[i];
        c[i] = b[i]*b[i];
    }
- Reduces loop startup overhead.

74 Example 7: While Loops
    while( *a ) {
        process(a);
        a++;
    }
- The number of loop iterations is unknown.

75 Special Case of Loop Splitting
    for( count=0, p=a; *p; count++, p++ );
    #pragma omp parallel for
    for( i=0; i<count; i++ )
        process( &a[i] );
- Count the number of loop iterations first.
- Then parallelize the loop.

76 Example 8: Linked Lists
    for( p=head; p; p=p->next )
        /* Assume process does not affect the linked list */
        process(p);
- Dependence on p.

77 Another Special Case of Loop Splitting
    /* Count the list, divide it into per-thread chunks,
       and record the start of each chunk. */
    for( p=head, count=0; p!=NULL; p=p->next, count++ );
    for( i=0; i<nthreads; i++ )
        cnt[i] = f( count, nthreads );   /* iterations per thread */
    start[0] = head;
    for( i=0; i<nthreads-1; i++ ) {
        for( p=start[i], c=0; c<cnt[i]-1; p=p->next, c++ );
        start[i+1] = p->next;
    }

78 Another Special Case of Loop Splitting (2)
    #pragma omp parallel sections
    {
        #pragma omp section
        for( p=start[0]; p!=start[1]; p=p->next )
            process(p);
        #pragma omp section
        for( p=start[1]; p!=start[2]; p=p->next )
            process(p);
        /* ... one section per thread ... */
    }
- (The sections must be written out statically; directives cannot be generated by a runtime loop.)

79 Example 9
    for( i=0, wrap=n; i<n; i++ ) {
        b[i] = a[i] + a[wrap];
        wrap = i;
    }
- Dependence on wrap.
- Only the first iteration causes the dependence.

80 Loop Peeling
    b[0] = a[0] + a[n];
    #pragma omp parallel for
    for( i=1; i<n; i++ )
        b[i] = a[i] + a[i-1];

81 Example 10
    for( i=0; i<n; i++ )
        a[i+m] = a[i] + b[i];
- Dependence if m<n.

82 Another Case of Loop Peeling
    if( m >= n ) {
        #pragma omp parallel for
        for( i=0; i<n; i++ )
            a[i+m] = a[i] + b[i];
    }
    else {
        /* cannot be parallelized this way */
    }

83 Summary
- Reorganize code such that
  - dependences are removed or reduced,
  - large pieces of parallel work emerge,
  - loop bounds become known.
- Code can become messy; there is a point of diminishing returns.


OpenMP I. Diego Fabregat-Traver and Prof. Paolo Bientinesi WS16/17. HPAC, RWTH Aachen OpenMP I Diego Fabregat-Traver and Prof. Paolo Bientinesi HPAC, RWTH Aachen fabregat@aices.rwth-aachen.de WS16/17 OpenMP References Using OpenMP: Portable Shared Memory Parallel Programming. The MIT Press,

More information

OpenMP. Today s lecture. Scott B. Baden / CSE 160 / Wi '16

OpenMP. Today s lecture. Scott B. Baden / CSE 160 / Wi '16 Lecture 8 OpenMP Today s lecture 7 OpenMP A higher level interface for threads programming http://www.openmp.org Parallelization via source code annotations All major compilers support it, including gnu

More information

OpenMP Programming. Prof. Thomas Sterling. High Performance Computing: Concepts, Methods & Means

OpenMP Programming. Prof. Thomas Sterling. High Performance Computing: Concepts, Methods & Means High Performance Computing: Concepts, Methods & Means OpenMP Programming Prof. Thomas Sterling Department of Computer Science Louisiana State University February 8 th, 2007 Topics Introduction Overview

More information

A Short Introduction to OpenMP. Mark Bull, EPCC, University of Edinburgh

A Short Introduction to OpenMP. Mark Bull, EPCC, University of Edinburgh A Short Introduction to OpenMP Mark Bull, EPCC, University of Edinburgh Overview Shared memory systems Basic Concepts in Threaded Programming Basics of OpenMP Parallel regions Parallel loops 2 Shared memory

More information

CS 470 Spring Mike Lam, Professor. OpenMP

CS 470 Spring Mike Lam, Professor. OpenMP CS 470 Spring 2017 Mike Lam, Professor OpenMP OpenMP Programming language extension Compiler support required "Open Multi-Processing" (open standard; latest version is 4.5) Automatic thread-level parallelism

More information

https://www.youtube.com/playlist?list=pllx- Q6B8xqZ8n8bwjGdzBJ25X2utwnoEG

https://www.youtube.com/playlist?list=pllx- Q6B8xqZ8n8bwjGdzBJ25X2utwnoEG https://www.youtube.com/playlist?list=pllx- Q6B8xqZ8n8bwjGdzBJ25X2utwnoEG OpenMP Basic Defs: Solution Stack HW System layer Prog. User layer Layer Directives, Compiler End User Application OpenMP library

More information

OpenMP Algoritmi e Calcolo Parallelo. Daniele Loiacono

OpenMP Algoritmi e Calcolo Parallelo. Daniele Loiacono OpenMP Algoritmi e Calcolo Parallelo References Useful references Using OpenMP: Portable Shared Memory Parallel Programming, Barbara Chapman, Gabriele Jost and Ruud van der Pas OpenMP.org http://openmp.org/

More information

Chapter 3. OpenMP David A. Padua

Chapter 3. OpenMP David A. Padua Chapter 3. OpenMP 1 of 61 3.1 Introduction OpenMP is a collection of compiler directives, library routines, and environment variables that can be used to specify shared memory parallelism. This collection

More information

CME 213 S PRING Eric Darve

CME 213 S PRING Eric Darve CME 213 S PRING 2017 Eric Darve PTHREADS pthread_create, pthread_exit, pthread_join Mutex: locked/unlocked; used to protect access to shared variables (read/write) Condition variables: used to allow threads

More information

CS 470 Spring Mike Lam, Professor. Advanced OpenMP

CS 470 Spring Mike Lam, Professor. Advanced OpenMP CS 470 Spring 2017 Mike Lam, Professor Advanced OpenMP Atomics OpenMP provides access to highly-efficient hardware synchronization mechanisms Use the atomic pragma to annotate a single statement Statement

More information

1 of 6 Lecture 7: March 4. CISC 879 Software Support for Multicore Architectures Spring Lecture 7: March 4, 2008

1 of 6 Lecture 7: March 4. CISC 879 Software Support for Multicore Architectures Spring Lecture 7: March 4, 2008 1 of 6 Lecture 7: March 4 CISC 879 Software Support for Multicore Architectures Spring 2008 Lecture 7: March 4, 2008 Lecturer: Lori Pollock Scribe: Navreet Virk Open MP Programming Topics covered 1. Introduction

More information

Synchronous Shared Memory Parallel Examples. HPC Fall 2012 Prof. Robert van Engelen

Synchronous Shared Memory Parallel Examples. HPC Fall 2012 Prof. Robert van Engelen Synchronous Shared Memory Parallel Examples HPC Fall 2012 Prof. Robert van Engelen Examples Data parallel prefix sum and OpenMP example Task parallel prefix sum and OpenMP example Simple heat distribution

More information

Programming with Shared Memory PART II. HPC Fall 2012 Prof. Robert van Engelen

Programming with Shared Memory PART II. HPC Fall 2012 Prof. Robert van Engelen Programming with Shared Memory PART II HPC Fall 2012 Prof. Robert van Engelen Overview Sequential consistency Parallel programming constructs Dependence analysis OpenMP Autoparallelization Further reading

More information

CS 61C: Great Ideas in Computer Architecture. OpenMP, Transistors

CS 61C: Great Ideas in Computer Architecture. OpenMP, Transistors CS 61C: Great Ideas in Computer Architecture OpenMP, Transistors Instructor: Justin Hsia 7/17/2012 Summer 2012 Lecture #17 1 Review of Last Lecture Amdahl s Law limits benefits of parallelization Multiprocessor

More information

CS691/SC791: Parallel & Distributed Computing

CS691/SC791: Parallel & Distributed Computing CS691/SC791: Parallel & Distributed Computing Introduction to OpenMP 1 Contents Introduction OpenMP Programming Model and Examples OpenMP programming examples Task parallelism. Explicit thread synchronization.

More information

Programming Shared Memory Systems with OpenMP Part I. Book

Programming Shared Memory Systems with OpenMP Part I. Book Programming Shared Memory Systems with OpenMP Part I Instructor Dr. Taufer Book Parallel Programming in OpenMP by Rohit Chandra, Leo Dagum, Dave Kohr, Dror Maydan, Jeff McDonald, Ramesh Menon 2 1 Machine

More information

CS 61C: Great Ideas in Computer Architecture. OpenMP, Transistors

CS 61C: Great Ideas in Computer Architecture. OpenMP, Transistors CS 61C: Great Ideas in Computer Architecture OpenMP, Transistors Instructor: Justin Hsia 7/23/2013 Summer 2013 Lecture #17 1 Review of Last Lecture Amdahl s Law limits benefits of parallelization Multiprocessor

More information

Parallelisation. Michael O Boyle. March 2014

Parallelisation. Michael O Boyle. March 2014 Parallelisation Michael O Boyle March 2014 1 Lecture Overview Parallelisation for fork/join Mapping parallelism to shared memory multi-processors Loop distribution and fusion Data Partitioning and SPMD

More information

Parallel programming using OpenMP

Parallel programming using OpenMP Parallel programming using OpenMP Computer Architecture J. Daniel García Sánchez (coordinator) David Expósito Singh Francisco Javier García Blas ARCOS Group Computer Science and Engineering Department

More information

COMP Parallel Computing. SMM (2) OpenMP Programming Model

COMP Parallel Computing. SMM (2) OpenMP Programming Model COMP 633 - Parallel Computing Lecture 7 September 12, 2017 SMM (2) OpenMP Programming Model Reading for next time look through sections 7-9 of the Open MP tutorial Topics OpenMP shared-memory parallel

More information

MPI and OpenMP (Lecture 25, cs262a) Ion Stoica, UC Berkeley November 19, 2016

MPI and OpenMP (Lecture 25, cs262a) Ion Stoica, UC Berkeley November 19, 2016 MPI and OpenMP (Lecture 25, cs262a) Ion Stoica, UC Berkeley November 19, 2016 Message passing vs. Shared memory Client Client Client Client send(msg) recv(msg) send(msg) recv(msg) MSG MSG MSG IPC Shared

More information

Programming Shared Address Space Platforms using OpenMP

Programming Shared Address Space Platforms using OpenMP Programming Shared Address Space Platforms using OpenMP Jinkyu Jeong (jinkyu@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu Topic Overview Introduction to OpenMP OpenMP

More information

15-418, Spring 2008 OpenMP: A Short Introduction

15-418, Spring 2008 OpenMP: A Short Introduction 15-418, Spring 2008 OpenMP: A Short Introduction This is a short introduction to OpenMP, an API (Application Program Interface) that supports multithreaded, shared address space (aka shared memory) parallelism.

More information

OpenMP. António Abreu. Instituto Politécnico de Setúbal. 1 de Março de 2013

OpenMP. António Abreu. Instituto Politécnico de Setúbal. 1 de Março de 2013 OpenMP António Abreu Instituto Politécnico de Setúbal 1 de Março de 2013 António Abreu (Instituto Politécnico de Setúbal) OpenMP 1 de Março de 2013 1 / 37 openmp what? It s an Application Program Interface

More information

Parallel Programming using OpenMP

Parallel Programming using OpenMP 1 Parallel Programming using OpenMP Mike Bailey mjb@cs.oregonstate.edu openmp.pptx OpenMP Multithreaded Programming 2 OpenMP stands for Open Multi-Processing OpenMP is a multi-vendor (see next page) standard

More information

Parallel Programming using OpenMP

Parallel Programming using OpenMP 1 OpenMP Multithreaded Programming 2 Parallel Programming using OpenMP OpenMP stands for Open Multi-Processing OpenMP is a multi-vendor (see next page) standard to perform shared-memory multithreading

More information

Shared Memory Programming Models I

Shared Memory Programming Models I Shared Memory Programming Models I Peter Bastian / Stefan Lang Interdisciplinary Center for Scientific Computing (IWR) University of Heidelberg INF 368, Room 532 D-69120 Heidelberg phone: 06221/54-8264

More information

Announcements. Scott B. Baden / CSE 160 / Wi '16 2

Announcements. Scott B. Baden / CSE 160 / Wi '16 2 Lecture 8 Announcements Scott B. Baden / CSE 160 / Wi '16 2 Recapping from last time: Minimal barrier synchronization in odd/even sort Global bool AllDone; for (s = 0; s < MaxIter; s++) { barr.sync();

More information

Programming with OpenMP*

Programming with OpenMP* Objectives At the completion of this module you will be able to Thread serial code with basic OpenMP pragmas Use OpenMP synchronization pragmas to coordinate thread execution and memory access 2 Agenda

More information

CS4961 Parallel Programming. Lecture 5: More OpenMP, Introduction to Data Parallel Algorithms 9/5/12. Administrative. Mary Hall September 4, 2012

CS4961 Parallel Programming. Lecture 5: More OpenMP, Introduction to Data Parallel Algorithms 9/5/12. Administrative. Mary Hall September 4, 2012 CS4961 Parallel Programming Lecture 5: More OpenMP, Introduction to Data Parallel Algorithms Administrative Mailing list set up, everyone should be on it - You should have received a test mail last night

More information

Topics. Introduction. Shared Memory Parallelization. Example. Lecture 11. OpenMP Execution Model Fork-Join model 5/15/2012. Introduction OpenMP

Topics. Introduction. Shared Memory Parallelization. Example. Lecture 11. OpenMP Execution Model Fork-Join model 5/15/2012. Introduction OpenMP Topics Lecture 11 Introduction OpenMP Some Examples Library functions Environment variables 1 2 Introduction Shared Memory Parallelization OpenMP is: a standard for parallel programming in C, C++, and

More information

Synchronous Shared Memory Parallel Examples. HPC Fall 2010 Prof. Robert van Engelen

Synchronous Shared Memory Parallel Examples. HPC Fall 2010 Prof. Robert van Engelen Synchronous Shared Memory Parallel Examples HPC Fall 2010 Prof. Robert van Engelen Examples Data parallel prefix sum and OpenMP example Task parallel prefix sum and OpenMP example Simple heat distribution

More information

Introduction to OpenMP

Introduction to OpenMP Introduction to OpenMP Le Yan Scientific computing consultant User services group High Performance Computing @ LSU Goals Acquaint users with the concept of shared memory parallelism Acquaint users with

More information

OpenMP Fundamentals Fork-join model and data environment

OpenMP Fundamentals Fork-join model and data environment www.bsc.es OpenMP Fundamentals Fork-join model and data environment Xavier Teruel and Xavier Martorell Agenda: OpenMP Fundamentals OpenMP brief introduction The fork-join model Data environment OpenMP

More information

Shared Memory programming paradigm: openmp

Shared Memory programming paradigm: openmp IPM School of Physics Workshop on High Performance Computing - HPC08 Shared Memory programming paradigm: openmp Luca Heltai Stefano Cozzini SISSA - Democritos/INFM

More information

Little Motivation Outline Introduction OpenMP Architecture Working with OpenMP Future of OpenMP End. OpenMP. Amasis Brauch German University in Cairo

Little Motivation Outline Introduction OpenMP Architecture Working with OpenMP Future of OpenMP End. OpenMP. Amasis Brauch German University in Cairo OpenMP Amasis Brauch German University in Cairo May 4, 2010 Simple Algorithm 1 void i n c r e m e n t e r ( short a r r a y ) 2 { 3 long i ; 4 5 for ( i = 0 ; i < 1000000; i ++) 6 { 7 a r r a y [ i ]++;

More information

COMP4300/8300: The OpenMP Programming Model. Alistair Rendell. Specifications maintained by OpenMP Architecture Review Board (ARB)

COMP4300/8300: The OpenMP Programming Model. Alistair Rendell. Specifications maintained by OpenMP Architecture Review Board (ARB) COMP4300/8300: The OpenMP Programming Model Alistair Rendell See: www.openmp.org Introduction to High Performance Computing for Scientists and Engineers, Hager and Wellein, Chapter 6 & 7 High Performance

More information

COMP4300/8300: The OpenMP Programming Model. Alistair Rendell

COMP4300/8300: The OpenMP Programming Model. Alistair Rendell COMP4300/8300: The OpenMP Programming Model Alistair Rendell See: www.openmp.org Introduction to High Performance Computing for Scientists and Engineers, Hager and Wellein, Chapter 6 & 7 High Performance

More information

Simone Campanoni Loop transformations

Simone Campanoni Loop transformations Simone Campanoni simonec@eecs.northwestern.edu Loop transformations Outline Simple loop transformations Loop invariants Induction variables Complex loop transformations Simple loop transformations Simple

More information

OpenMP examples. Sergeev Efim. Singularis Lab, Ltd. Senior software engineer

OpenMP examples. Sergeev Efim. Singularis Lab, Ltd. Senior software engineer OpenMP examples Sergeev Efim Senior software engineer Singularis Lab, Ltd. OpenMP Is: An Application Program Interface (API) that may be used to explicitly direct multi-threaded, shared memory parallelism.

More information

19.1. Unit 19. OpenMP Library for Parallelism

19.1. Unit 19. OpenMP Library for Parallelism 19.1 Unit 19 OpenMP Library for Parallelism 19.2 Overview of OpenMP A library or API (Application Programming Interface) for parallelism Requires compiler support (make sure the compiler you use supports

More information

5.12 EXERCISES Exercises 263

5.12 EXERCISES Exercises 263 5.12 Exercises 263 5.12 EXERCISES 5.1. If it s defined, the OPENMP macro is a decimal int. Write a program that prints its value. What is the significance of the value? 5.2. Download omp trap 1.c from

More information

Synchronous Computation Examples. HPC Fall 2008 Prof. Robert van Engelen

Synchronous Computation Examples. HPC Fall 2008 Prof. Robert van Engelen Synchronous Computation Examples HPC Fall 2008 Prof. Robert van Engelen Overview Data parallel prefix sum with OpenMP Simple heat distribution problem with OpenMP Iterative solver with OpenMP Simple heat

More information

Parallel Programming with OpenMP. CS240A, T. Yang, 2013 Modified from Demmel/Yelick s and Mary Hall s Slides

Parallel Programming with OpenMP. CS240A, T. Yang, 2013 Modified from Demmel/Yelick s and Mary Hall s Slides Parallel Programming with OpenMP CS240A, T. Yang, 203 Modified from Demmel/Yelick s and Mary Hall s Slides Introduction to OpenMP What is OpenMP? Open specification for Multi-Processing Standard API for

More information

Session 4: Parallel Programming with OpenMP

Session 4: Parallel Programming with OpenMP Session 4: Parallel Programming with OpenMP Xavier Martorell Barcelona Supercomputing Center Agenda Agenda 10:00-11:00 OpenMP fundamentals, parallel regions 11:00-11:30 Worksharing constructs 11:30-12:00

More information

OpenMP Overview. in 30 Minutes. Christian Terboven / Aachen, Germany Stand: Version 2.

OpenMP Overview. in 30 Minutes. Christian Terboven / Aachen, Germany Stand: Version 2. OpenMP Overview in 30 Minutes Christian Terboven 06.12.2010 / Aachen, Germany Stand: 03.12.2010 Version 2.3 Rechen- und Kommunikationszentrum (RZ) Agenda OpenMP: Parallel Regions,

More information

OpenMP, Part 2. EAS 520 High Performance Scientific Computing. University of Massachusetts Dartmouth. Spring 2015

OpenMP, Part 2. EAS 520 High Performance Scientific Computing. University of Massachusetts Dartmouth. Spring 2015 OpenMP, Part 2 EAS 520 High Performance Scientific Computing University of Massachusetts Dartmouth Spring 2015 References This presentation is almost an exact copy of Dartmouth College's openmp tutorial.

More information