OpenMP (based on material provided by Cristiana Amza)
1 OpenMP (based on material provided by Cristiana Amza)
2 What is OpenMP? A standard for shared-memory programming of scientific applications. Has specific support for scientific application needs (unlike Pthreads). Rapidly gaining acceptance among vendors and application writers. See openmp.org for more info. Supported by gcc from version 4.3.2 onwards.
3 OpenMP API Overview. The API is a set of compiler directives inserted in the source program (in addition to some library functions). Ideally, the compiler directives do not affect the sequential code. They take the form of pragmas in C/C++ and (special) comments in Fortran code.
4 OpenMP API Example. Sequential code:

  statement1;
  statement2;
  statement3;

Assume we want to execute statement2 in parallel, and statement1 and statement3 sequentially.
5 OpenMP API Example (2 of 2). OpenMP parallel code:

  statement1;
  #pragma <specific OpenMP directive>
  statement2;
  statement3;

Statement2 is (possibly) executed in parallel; statement1 and statement3 are executed sequentially.
6 Important Note. By giving a parallel directive, the user asserts that the program will remain correct if the statement is executed in parallel. The OpenMP compiler does not check correctness.
7 API Semantics. The master thread executes the sequential code. Master and slaves execute the parallel code. Note: very similar to the fork-join semantics of the Pthreads create/join primitives. This allows incremental parallelization.
8 OpenMP Implementation Overview. An OpenMP implementation consists of a compiler and a library, unlike Pthreads (purely a library).
9 OpenMP Example Usage (1 of 2). (Diagram: the sequential program is annotated with directives; the annotated source goes through the OpenMP compiler, which, depending on the compiler switch, produces either the sequential or the parallel program.)
10 OpenMP Example Usage (2 of 2). If you give the sequential switch, comments and pragmas are ignored. If you give the parallel switch, comments and/or pragmas are read and cause translation into a parallel program. Ideally, one source serves for both the sequential and the parallel program (a big maintenance plus).
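For example, with gcc (which, as noted above, supports OpenMP from version 4.3.2), the parallel switch is -fopenmp; a minimal sketch of the one-source idea, with a hypothetical file name:

  /* one_source.c (hypothetical name): the same file builds both ways */
  #include <stdio.h>

  int main(void)
  {
      double a[1000];
      int i;

      #pragma omp parallel for      /* ignored by a sequential build */
      for (i = 0; i < 1000; i++)
          a[i] = i * 0.5;

      printf("a[999] = %f\n", a[999]);
      return 0;
  }

  /* parallel build:   gcc -fopenmp one_source.c -o one_source */
  /* sequential build: gcc one_source.c -o one_source          */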
11 OpenMP Directives. Parallelization directives: parallel region, parallel for. Data environment directives: shared, private, threadprivate, reduction, etc. Synchronization directives: barrier, critical.
12 General Rules about Directives. They always apply to the next statement, which must be a structured block. Examples:

  #pragma omp ...
  statement

  #pragma omp ...
  { statement1; statement2; statement3; }
13 OpenMP Parallel Region.

  #pragma omp parallel

A number of threads are spawned at entry. Each thread executes the same code. Each thread waits at the end. Very similar to a number of create/joins with the same function in Pthreads.
14 Getting Threads to do Different Things. Through explicit thread identification (as in Pthreads). Through work-sharing directives.
15 Thread Identification.

  int omp_get_thread_num()    gets the thread id.
  int omp_get_num_threads()   gets the total number of threads.
16 Example.

  #pragma omp parallel
  {
    if( !omp_get_thread_num() )
      master();
    else
      slave();
  }
17 Work Sharing Directives. They always occur within a parallel region directive. The two principal ones are parallel for and parallel sections.
18 OpenMP Parallel For.

  #pragma omp parallel
  #pragma omp for
  for( ... ) { ... }

Each thread executes a subset of the iterations. All threads wait at the end of the parallel for. The iterations must be independent, and the for loop must have a restricted (canonical) form:

  for( index = start; index relop end; index op= expr )

break, return, exit and goto are not allowed inside the loop. The loop control variable is the only variable that is private by default.
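A minimal sketch of a loop that satisfies these restrictions (the array, size and scale factor are illustrative names, not from the slides): the loop variable i is automatically private, the iterations are independent, and the loop has the canonical init/test/increment form.

  void scale(double *a, double s, int n)
  {
      int i;
      #pragma omp parallel for
      for (i = 0; i < n; i++)     /* canonical form: init; test; increment */
          a[i] = s * a[i];        /* no iteration reads another's result */
      /* a while loop, or a for containing break/return, could not be
         parallelized this way */
  }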
19 Multiple Work Sharing Directives. They may occur within a single parallel region:

  #pragma omp parallel
  {
    #pragma omp for
    for( ; ; ) { ... }
    #pragma omp for
    for( ; ; ) { ... }
  }

All threads wait at the end of the first for.
20 The NoWait Qualifier.

  #pragma omp parallel
  {
    #pragma omp for nowait
    for( ; ; ) { ... }
    #pragma omp for
    for( ; ; ) { ... }
  }

Threads proceed to the second for without waiting.
21 Parallel Sections Directive.

  #pragma omp parallel
  {
    #pragma omp sections
    {
      { ... }
      #pragma omp section   /* this is a delimiter */
      { ... }
      #pragma omp section
      { ... }
    }
  }
22 A Useful Shorthand.

  #pragma omp parallel
  #pragma omp for
  for ( ; ; ) { ... }

is equivalent to

  #pragma omp parallel for
  for ( ; ; ) { ... }

(Same for parallel sections.)
23 Note the Difference between...

  #pragma omp parallel
  {
    #pragma omp for
    for( ; ; ) { ... }
    f();
    #pragma omp for
    for( ; ; ) { ... }
  }
24 and...

  #pragma omp parallel for
  for( ; ; ) { ... }
  f();
  #pragma omp parallel for
  for( ; ; ) { ... }

In the first version there is a single parallel region and f() is executed by every thread; in the second there are two fork/join points and f() is executed only once, by the master thread between the two parallel loops.
25 Sequential Matrix Multiply.

  for( i=0; i<n; i++ )
    for( j=0; j<n; j++ ) {
      c[i][j] = 0.0;
      for( k=0; k<n; k++ )
        c[i][j] += a[i][k]*b[k][j];
    }
26 OpenMP Matrix Multiply.

  #pragma omp parallel for
  for( i=0; i<n; i++ )
    for( j=0; j<n; j++ ) {
      c[i][j] = 0.0;
      for( k=0; k<n; k++ )
        c[i][j] += a[i][k]*b[k][j];
    }
27 Sequential SOR.

  for some number of timesteps/iterations {
    for( i=0; i<n; i++ )
      for( j=1; j<n; j++ )
        temp[i][j] = 0.25 *
          ( grid[i-1][j] + grid[i+1][j] +
            grid[i][j-1] + grid[i][j+1] );
    for( i=0; i<n; i++ )
      for( j=1; j<n; j++ )
        grid[i][j] = temp[i][j];
  }
28 OpenMP SOR.

  for some number of timesteps/iterations {
    #pragma omp parallel for
    for( i=0; i<n; i++ )
      for( j=1; j<n; j++ )
        temp[i][j] = 0.25 *
          ( grid[i-1][j] + grid[i+1][j] +
            grid[i][j-1] + grid[i][j+1] );
    #pragma omp parallel for
    for( i=0; i<n; i++ )
      for( j=1; j<n; j++ )
        grid[i][j] = temp[i][j];
  }
29 Some Advanced Features. Conditional parallelism. Scheduling options.
30 Conditional Parallelism: Issue. Oftentimes, parallelism is only useful if the problem size is sufficiently big. For smaller sizes, the overhead of parallelization exceeds the benefit.
31 Conditional Parallelism: Specification.

  #pragma omp parallel if( expression )
  #pragma omp for if( expression )
  #pragma omp parallel for if( expression )

Execute in parallel if the expression is true, otherwise execute sequentially.
32 Conditional Parallelism: Example.

  for( i=0; i<n; i++ )
    #pragma omp parallel for if( n-i > 100 )
    for( j=i+1; j<n; j++ )
      for( k=i+1; k<n; k++ )
        a[j][k] = a[j][k] - a[i][k]*a[i][j] / a[j][j];
33 Scheduling of Iterations: Issue. Scheduling: assigning iterations to threads. So far, we have assumed the default, which is block scheduling. OpenMP allows other scheduling strategies as well, for instance cyclic, gss (guided self-scheduling), etc.
34 Scheduling of Iterations: Specification.

  #pragma omp parallel for schedule(<sched>)

<sched> can be one of block (the default), cyclic, or gss.
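In the OpenMP specification itself the schedule arguments are spelled differently; a hedged sketch of the corresponding standard schedule kinds (loop body and names are illustrative):

  void demo(double *a, int n)
  {
      int i;

      /* block (the default): contiguous chunks of about n/p iterations */
      #pragma omp parallel for schedule(static)
      for (i = 0; i < n; i++) a[i] += 1.0;

      /* cyclic: iterations dealt out round-robin, one at a time */
      #pragma omp parallel for schedule(static, 1)
      for (i = 0; i < n; i++) a[i] += 1.0;

      /* gss (guided self-scheduling): threads grab shrinking chunks */
      #pragma omp parallel for schedule(guided)
      for (i = 0; i < n; i++) a[i] += 1.0;
  }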
35 Example. Multiplication of two matrices C = A x B, where the A matrix is upper-triangular (all elements below the diagonal are 0). (Diagram: matrix A with zeros below the diagonal.)
36 Sequential Matrix Multiply.

  for( i=0; i<n; i++ )
    for( j=0; j<n; j++ ) {
      c[i][j] = 0.0;
      for( k=i; k<n; k++ )        /* the inner loop becomes k = i..n-1 */
        c[i][j] += a[i][k]*b[k][j];
    }

Load imbalance with a block distribution.
37 OpenMP Matrix Multiply.

  #pragma omp parallel for schedule( cyclic )
  for( i=0; i<n; i++ )
    for( j=0; j<n; j++ ) {
      c[i][j] = 0.0;
      for( k=i; k<n; k++ )
        c[i][j] += a[i][k]*b[k][j];
    }
38 Data Environment Directives. All variables are shared by default. One exception: the loop variable of a parallel for is private. By using data directives, some variables can be made private or given other special characteristics.
39 Reminder: Matrix Multiply.

  #pragma omp parallel for
  for( i=0; i<n; i++ )
    for( j=0; j<n; j++ ) {
      c[i][j] = 0.0;
      for( k=0; k<n; k++ )
        c[i][j] += a[i][k]*b[k][j];
    }

a, b, c are shared; i, j, k are private.
40 Data Environment Directives. Private. Threadprivate. Reduction.
41 Private Variables.

  #pragma omp parallel for private( list )

Makes a private copy, for each thread, of each variable in the list. This and all further examples use parallel for, but the same applies to other region and work-sharing directives.
42 Private Variables: Example.

  for( i=0; i<n; i++ ) {
    tmp = a[i];
    a[i] = b[i];
    b[i] = tmp;
  }

Swaps the values in a and b. Loop-carried dependence on tmp. Easily fixed by privatizing tmp.
43 Private Variables: Example (2 of 2).

  #pragma omp parallel for private( tmp )
  for( i=0; i<n; i++ ) {
    tmp = a[i];
    a[i] = b[i];
    b[i] = tmp;
  }

Removes the dependence on tmp. Would be more difficult to do in Pthreads.
44 Private Variables: Alternative 1.

  for( i=0; i<n; i++ ) {
    tmp[i] = a[i];
    a[i] = b[i];
    b[i] = tmp[i];
  }

Requires a sequential program change. Wasteful in space: O(n) vs. O(p).
45 Private Variables: Alternative 2.

  f()
  {
    int tmp; /* local allocation on stack */
    for( i=from; i<to; i++ ) {
      tmp = a[i];
      a[i] = b[i];
      b[i] = tmp;
    }
  }
46 Threadprivate. Private variables are private on a parallel region basis. Threadprivate variables are global variables that are private throughout the execution of the program.
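No example is given on the slide; a minimal sketch of threadprivate (variable and function names are illustrative). Each thread keeps its own copy of the global counter across parallel regions, as long as dynamic thread adjustment is off and the thread count does not change between regions:

  #include <stdio.h>
  #include <omp.h>

  int counter = 0;                     /* one copy per thread */
  #pragma omp threadprivate(counter)

  void count_iterations(int n)
  {
      #pragma omp parallel for
      for (int i = 0; i < n; i++)
          counter++;                   /* each thread updates its own copy */

      #pragma omp parallel             /* copies persist into this region if
                                          the number of threads is unchanged */
      printf("thread %d executed %d iterations\n",
             omp_get_thread_num(), counter);
  }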
47 Reduction Variables.

  #pragma omp parallel for reduction( op:list )

op is one of +, *, -, &, ^, |, && or ||. The variables in the list must be used with this operator in the loop. The variables are automatically initialized to sensible values.
48 Reduction Variables: Example.

  #pragma omp parallel for reduction( +:sum )
  for( i=0; i<n; i++ )
    sum += a[i];

sum is automatically initialized to zero.
49 SOR Sequential Code with Convergence.

  for( ; diff > delta; ) {
    for( i=0; i<n; i++ )
      for( j=0; j<n; j++ ) { ... }
    diff = 0;
    for( i=0; i<n; i++ )
      for( j=0; j<n; j++ ) {
        diff = max(diff, fabs(grid[i][j] - temp[i][j]));
        grid[i][j] = temp[i][j];
      }
  }
50 SOR Parallel Code with Convergence.

  for( ; diff > delta; ) {
    #pragma omp parallel for
    for( i=0; i<n; i++ )
      for( j=0; j<n; j++ ) { ... }
    diff = 0;
    #pragma omp parallel for reduction( max:diff )
    for( i=0; i<n; i++ )
      for( j=0; j<n; j++ ) {
        diff = max(diff, fabs(grid[i][j] - temp[i][j]));
        grid[i][j] = temp[i][j];
      }
  }
51 Synchronization Primitives. Critical:

  #pragma omp critical (name)

Implements critical sections, by name. Similar to Pthreads mutex locks (name ~ lock). Barrier:

  #pragma omp barrier

Implements a global barrier.
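A small self-contained sketch combining both primitives (the per-element test is arbitrary): a named critical section protects the shared counter, and an explicit barrier makes the final count visible to every thread before it is read (needed here because the first loop uses nowait):

  #include <stdio.h>
  #include <omp.h>

  int main(void)
  {
      int n = 1000, hits = 0;

      #pragma omp parallel
      {
          #pragma omp for nowait
          for (int i = 0; i < n; i++) {
              if (i % 7 == 0) {                       /* arbitrary test */
                  #pragma omp critical (hits_update)  /* named critical */
                  hits++;
              }
          }
          #pragma omp barrier      /* hits is complete before it is read */
          if (omp_get_thread_num() == 0)
              printf("hits = %d\n", hits);
      }
      return 0;
  }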
52 OpenMP SOR with Convergence.

  #pragma omp parallel private( mydiff )
  for( ; diff > delta; ) {
    #pragma omp for nowait
    for( i=from; i<to; i++ )
      for( j=0; j<n; j++ ) { ... }
    diff = 0.0;
    mydiff = 0.0;
    #pragma omp barrier
    #pragma omp for nowait
    for( i=from; i<to; i++ )
      for( j=0; j<n; j++ ) {
        mydiff = max(mydiff, fabs(grid[i][j]-temp[i][j]));
        grid[i][j] = temp[i][j];
      }
    #pragma omp critical
    diff = max( diff, mydiff );
    #pragma omp barrier
  }
53 Synchronization Primitives. Big bummer: no condition variables. Result: must busy-wait for condition synchronization. Clumsy. Very inefficient on some architectures.
54 PIPE: Sequential Program.

  for( i=0; i<num_pic && read(in_pic); i++ ) {
    int_pic_1 = trans1( in_pic );
    int_pic_2 = trans2( int_pic_1 );
    int_pic_3 = trans3( int_pic_2 );
    out_pic = trans4( int_pic_3 );
  }
55 Sequential vs. Parallel Execution. (Diagram comparing sequential and pipelined-parallel execution; color = picture, horizontal line = processor.)
56 PIPE: Parallel Program.

  P0: for( i=0; i<num_pics && read(in_pic); i++ ) {
        int_pic_1[i] = trans1( in_pic );
        signal( event_1_2[i] );
      }

  P1: for( i=0; i<num_pics; i++ ) {
        wait( event_1_2[i] );
        int_pic_2[i] = trans2( int_pic_1[i] );
        signal( event_2_3[i] );
      }
57 PIPE: Main Program.

  #pragma omp parallel sections
  {
    #pragma omp section
    stage1();
    #pragma omp section
    stage2();
    #pragma omp section
    stage3();
    #pragma omp section
    stage4();
  }
58 PIPE: Stage 1.

  void stage1()
  {
    num1 = 0;
    for( i=0; i<num_pics && read(in_pic); i++ ) {
      int_pic_1[i] = trans1( in_pic );
      #pragma omp critical 1
      num1++;
    }
  }
59 PIPE: Stage 2.

  void stage2()
  {
    for( i=0; i<num_pic; i++ ) {
      do {
        #pragma omp critical 1
        cond = (num1 <= i);
      } while( cond );
      int_pic_2[i] = trans2( int_pic_1[i] );
      #pragma omp critical 2
      num2++;
    }
  }
60 OpenMP PIPE. Note the need to exit the critical section during the wait. Otherwise no access by the other thread. Never busy-wait inside a critical section!
61 Example 1.

  for( i=0; i<n; i++ ) {
    tmp = a[i];
    a[i] = b[i];
    b[i] = tmp;
  }

Dependence on tmp.
62 Scalar Privatization.

  #pragma omp parallel for private( tmp )
  for( i=0; i<n; i++ ) {
    tmp = a[i];
    a[i] = b[i];
    b[i] = tmp;
  }

The dependence on tmp is removed.
63 Example 2.

  for( i=0, sum=0; i<n; i++ )
    sum += a[i];

Dependence on sum.
64 Reduction.

  #pragma omp parallel for reduction( +:sum )
  for( i=0, sum=0; i<n; i++ )
    sum += a[i];

The dependence on sum is removed.
65 Example 3.

  for( i=0, index=0; i<n; i++ ) {
    index += i;
    a[i] = b[index];
  }

Dependence on index. Induction variable: can be computed from the loop variable.
66 Induction Variable Elimination.

  #pragma omp parallel for
  for( i=0; i<n; i++ ) {
    a[i] = b[i*(i+1)/2];
  }

The dependence is removed by computing the induction variable directly from i.
67 Example 4.

  for( i=0, index=0; i<n; i++ ) {
    index += f(i);
    b[i] = g(a[index]);
  }

Dependence on the induction variable index, but no closed formula for its value.
68 Loop Splitting.

  index[0] = f(0);
  for( i=1; i<n; i++ )
    index[i] = index[i-1] + f(i);
  #pragma omp parallel for
  for( i=0; i<n; i++ ) {
    b[i] = g(a[index[i]]);
  }

Loop splitting has removed the dependence: the sequential first loop precomputes all values of the induction variable, and the second loop becomes parallel.
69 Example 5.

  for( k=0; k<n; k++ )
    for( i=0; i<n; i++ )
      for( j=0; j<n; j++ )
        a[i][j] += b[i][k] + c[k][j];

The dependence on a[i][j] prevents k-loop parallelization. No dependences are carried by the i- and j-loops.
70 Example 5 Parallelization.

  for( k=0; k<n; k++ )
    #pragma omp parallel for
    for( i=0; i<n; i++ )
      for( j=0; j<n; j++ )
        a[i][j] += b[i][k] + c[k][j];

We can do better by reordering the loops.
71 Loop Reordering.

  #pragma omp parallel for
  for( i=0; i<n; i++ )
    for( j=0; j<n; j++ )
      for( k=0; k<n; k++ )
        a[i][j] += b[i][k] + c[k][j];

Larger parallel pieces of work.
72 Example 6.

  #pragma omp parallel for
  for( i=0; i<n; i++ )
    a[i] = b[i];
  #pragma omp parallel for
  for( i=0; i<n; i++ )
    c[i] = b[i]*b[i];

Make the two parallel loops into one.
73 Loop Fusion.

  #pragma omp parallel for
  for( i=0; i<n; i++ ) {
    a[i] = b[i];
    c[i] = b[i]*b[i];
  }

Reduces loop startup overhead.
74 Example 7: While Loops.

  while( *a ) {
    process(a);
    a++;
  }

The number of loop iterations is unknown.
75 Special Case of Loop Splitting.

  for( count=0, p=a; *p; count++, p++ );
  #pragma omp parallel for
  for( i=0; i<count; i++ )
    process( &a[i] );

Count the number of loop iterations. Then parallelize the loop.
76 Example 8: Linked Lists.

  for( p=head; p; p=p->next )
    /* assume process does not affect the linked list */
    process(p);

Dependence on p.
77 Another Special Case of Loop Splitting.

  for( p=head, count=0; p!=NULL; p=p->next, count++ );
  /* divide the count elements over the threads */
  count[i] = f(count, omp_get_num_threads());
  start[0] = head;
  for( i=0; i<omp_get_num_threads(); i++ ) {
    for( p=start[i], cnt=0; cnt<count[i]; p=p->next, cnt++ );
    start[i+1] = p;
  }
78 Another Special Case of Loop Splitting.

  #pragma omp parallel sections
  for( i=0; i<omp_get_num_threads(); i++ )
    #pragma omp section
    for( p=start[i]; p!=start[i+1]; p=p->next )
      process(p);
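The sections version above is only schematic (in standard OpenMP the number of section blocks is fixed at compile time, so they cannot be generated by a runtime loop). A self-contained sketch of the same idea, assuming a hypothetical node type and a process() that does not modify the links, using a plain parallel for over the per-thread chunks:

  #include <stdlib.h>
  #include <omp.h>

  typedef struct node { struct node *next; /* payload ... */ } node;

  void process(node *p) { (void)p; /* assumed not to touch the links */ }

  void process_list(node *head)
  {
      int count = 0, nt, i;
      node *p, **start;

      for (p = head; p != NULL; p = p->next)    /* 1. count the nodes */
          count++;

      nt = omp_get_max_threads();
      start = malloc((nt + 1) * sizeof *start); /* 2. chunk boundaries */
      p = head;
      for (i = 0; i < nt; i++) {
          start[i] = p;
          int chunk = count / nt + (i < count % nt);  /* spread remainder */
          while (chunk-- > 0)
              p = p->next;
      }
      start[nt] = NULL;                         /* end of the last chunk */

      #pragma omp parallel for private(p)       /* 3. walk chunks in parallel */
      for (i = 0; i < nt; i++)
          for (p = start[i]; p != start[i+1]; p = p->next)
              process(p);

      free(start);
  }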
79 Example 9.

  for( i=0, wrap=n; i<n; i++ ) {
    b[i] = a[i] + a[wrap];
    wrap = i;
  }

Dependence on wrap. Only the first iteration causes the dependence.
80 Loop Peeling.

  b[0] = a[0] + a[n];
  #pragma omp parallel for
  for( i=1; i<n; i++ ) {
    b[i] = a[i] + a[i-1];
  }
81 Example 10.

  for( i=0; i<n; i++ )
    a[i+m] = a[i] + b[i];

Dependence if m<n.
82 Another Case of Loop Peeling.

  if( m>n ) {
    #pragma omp parallel for
    for( i=0; i<n; i++ )
      a[i+m] = a[i] + b[i];
  }
  else {
    /* cannot be parallelized */
  }
83 Summary. Reorganize code such that dependences are removed or reduced, large pieces of parallel work emerge, and loop bounds become known. Code can become messy; there is a point of diminishing returns.