COMP4510 Introduction to Parallel Computation: Shared Memory and OpenMP
Thanks to Jon Aronsson (UofM HPC consultant) for some of the material in these notes.

Outline (cont'd)
- Shared Memory and OpenMP
  - Including Distributed Shared Memory (N.B. reordered)
- Message-based Parallel Computing with MPI
- Hybrid parallel programming with MPI and OpenMP
- Introduction to Multithreading
  - Pre-fetching, simultaneous multithreading (SMT), chip multiprocessing (CMP)

10/2/07 COMP4510 Introduction to Parallel Computation
Shared Memory Machines
- Recall the UMA shared memory architecture: processors P0 through P7, each with its own cache, all connected to a single shared memory
- Traditional parallel computer: one big box
- Quite expensive
- Relatively familiar environment
- Big memory
- Limited scalability

Shared Memory Machines (cont'd)
- In a shared memory machine, all processors share a single memory
  - More precisely, all processors address the same memory
  - We may have local or shared variables, but everything is addressable
- Because memory is shared, data that is written by one process/thread can be read by another
- Familiar programming model, but we must provide synchronization (e.g. mutexes, barriers, etc.)
Shared Memory Machines (cont'd)
- This means that we will tend to write parallel programs that share data structures
  - This is convenient
- You have probably all written a parallel program already, using pthreads (or fork under Unix/Linux) in your OS class
  - e.g. the producer-consumer problem
- In parallel programming, the scale of parallelism will be larger
  - i.e. many threads, not just a few

Shared Memory Programming
- Writing parallel programs for shared memory systems can be done in many ways (e.g. pthreads, process forking, etc.)
- Parallel programming is hard, so we look for tools/techniques to simplify things
- OpenMP is a parallel programming system for shared memory machines
  - Much simpler to use than pthreads, for example
What is OpenMP?
- OpenMP is an Application Programming Interface for shared memory parallel programming
- The OpenMP API offers the programmer full control of parallelization
- OpenMP is an industry standard
  - Hence, OpenMP code is portable across different hardware architectures and operating systems
- OpenMP is available for Fortran and C/C++
- OpenMP is ONLY compatible with shared memory machines (SMPs)

What's OpenMP? (cont'd)
- A common way to use OpenMP is to take non-parallel code and annotate it with special directives to enable parallel execution
- Of course, new parallel code can also be developed from scratch, but parallelizing a serial program might be easier
  - Again, the 80:20 rule applies: get most of the parallelism in a parallelization effort at much lower cost than a complete redesign
What's OpenMP? (cont'd)
- The OpenMP API consists of 3 primary components:
  - Compiler Directives
    - The parallelism is defined by directives that are embedded in the source code
    - These directives are ignored unless you tell the compiler to parallelize the code with OpenMP
  - Library Routines
    - OpenMP includes a set of library routines for dynamically querying and altering the parallelism at runtime
    - For example, you can dynamically change the number of threads being used
  - Environment Variables
    - The environment variables are set before the program is executed and they control the default parallelism
    - The defaults may be dynamically altered by library calls

Parallel Regions
- OpenMP uses parallel regions to specify the blocks of code to be executed in parallel; code outside the parallel regions runs serially
- When the program reaches a parallel region, it creates a number of threads and each thread executes the same block of code in the region separately
  - Operating on different data (SPMD parallelism)
Parallel Regions (cont'd)
- OpenMP uses the fork-join model of parallel execution: the master thread forks a team of parallel threads at the start of each parallel region and joins them back into a single thread at the end
- So, how do you code a parallel region? You use an OpenMP directive to indicate that a block of code is a parallel region

In (free-form) Fortran:

    !$OMP PARALLEL
    [block of code]
    !$OMP END PARALLEL

In C/C++:

    #pragma omp parallel
    {
        [block of code]
    }

- Note: In Fortran, all OpenMP directives begin with !$OMP
- Note: In C/C++, all OpenMP directives begin with #pragma omp
Parallel Regions (cont'd)

Simple example:

    #include <stdio.h>

    int main (void) {
        printf("Before Parallel Region\n");
        #pragma omp parallel
        {
            printf("Inside Parallel Region\n");
        }
        printf("After Parallel Region\n");
        return 0;
    }

With 3 parallel threads, the master thread prints "Before Parallel Region", forks three threads that each print "Inside Parallel Region", and, after the join, prints "After Parallel Region".
Parallel Regions (cont'd)
- OpenMP compilers typically generate pthreads code
  - This makes it easy to add OpenMP support to existing compilers
  - E.g. newer versions of gcc support OpenMP, enabled by a compiler switch: gcc -fopenmp
- Advanced, parallelizing compilers may also support OpenMP
  - E.g. the Portland Group compilers: pgcc -mp

Parallel Regions (cont'd)
- Running an OpenMP program
  - An OpenMP program is executed just like any other program (since it's shared memory: one box)
  - The number of threads to be used can be set using the OMP_NUM_THREADS environment variable (recommended!) or by using a library routine
    - In BASH: export OMP_NUM_THREADS=value
    - In CSH: setenv OMP_NUM_THREADS value
Parallel Regions (cont'd)

Compiling and running a program:

    $ gcc -fopenmp simple_example.c        # compile
    $ export OMP_NUM_THREADS=2             # set number of threads to 2
    $ ./a.out                              # execute
    Before Parallel Region
    Inside Parallel Region
    Inside Parallel Region
    After Parallel Region
    $ export OMP_NUM_THREADS=3             # set number of threads to 3
    $ ./a.out                              # execute again
    Before Parallel Region
    Inside Parallel Region
    Inside Parallel Region
    Inside Parallel Region
    After Parallel Region

Parallel Regions (cont'd)
- There are two OpenMP library routines that are particularly useful:
  - omp_get_thread_num returns the rank of the current thread inside a parallel region
    - The rank ranges from 0 to N-1, where N is the number of threads; each thread has a unique rank
  - omp_get_num_threads returns the total number of parallel threads
    - Not necessarily equal to the number of processors
Parallel Regions (cont'd)

Another example:

    #include <stdio.h>
    #include <omp.h>

    int main (void) {
        printf("Before Parallel Region\n");
        #pragma omp parallel
        {
            printf("Rank %d\n", omp_get_thread_num());
        }
        printf("After Parallel Region\n");
        return 0;
    }

With 3 parallel threads, each of the forked threads prints its own rank (0, 1 and 2, in no particular order) before the join.
Parallel Regions (cont'd)

Another example:

    $ gcc -fopenmp example.c               # compile
    $ export OMP_NUM_THREADS=3             # set number of threads to 3
    $ ./a.out                              # execute
    Before Parallel Region
    Inside Rank no. 2
    Inside Rank no. 0
    Inside Rank no. 1
    After Parallel Region
    $

Note the order of the output! (Parallel execution is non-deterministic.)

Parallel Regions (cont'd)
- The calls omp_get_thread_num and omp_get_num_threads provide the information necessary to write an SPMD (Single Program, Multiple Data) parallel program using the parallel region construct
- The idea is to subdivide the data to be processed into omp_get_num_threads pieces and let each thread work on its own part
  - Hence, SPMD
Parallel Regions (cont'd)
- For example, if we want to search for a value, V, in a vector, VEC, we can have each thread search in part of the vector: the first thread searches the first piece, the second thread the next piece, and so on, with the last thread searching the final piece

Parallel Regions (cont'd)
- Assuming the size of VEC is k, and that k is evenly divisible by the number of threads, we can write an OpenMP code segment to do the searching
  - First we compute the size of each piece of the vector (to be searched by each thread)
  - Then we call a function to search our part of the vector, passing the start point and size of our piece
Parallel Regions (cont'd)

    #pragma omp parallel
    {
        int size = k / omp_get_num_threads();
        SrchSubVec(omp_get_thread_num() * size, size);
    }

Variables

Consider the following modification to our previous program:

    int main (int argc, char *argv[]) {
        int rank;
        printf("Before Parallel Region\n");
        #pragma omp parallel
        {
            rank = omp_get_thread_num();
            printf("Inside Rank %d\n", rank);
        }
        printf("After Parallel Region\n");
    }

What happens if two threads attempt to assign a value to rank at the same time?
Variables (cont'd)
- This leads to shared and private variables
  - All threads share the same address space, so all threads can modify the same variables, which might result in undesirable behaviour
  - Variables in a parallel region are shared by default!
- A private variable is only accessed by a single thread
  - Variables are declared private with the private(list-of-variables) clause
  - The default(private) clause can be used to change the default to private (Fortran only; in C/C++ only default(shared) and default(none) are available). In this case, shared variables must be declared using the shared(list-of-variables) clause.

Variables (cont'd)

The proper way of writing the previous example is:

    int main (int argc, char *argv[]) {
        int rank;
        printf("Before Parallel Region\n");
        #pragma omp parallel private(rank)
        {
            rank = omp_get_thread_num();
            printf("Inside Rank %d\n", rank);
        }
        printf("After Parallel Region\n");
    }

Note: the private clause was added to the omp parallel directive.
Loops
- OpenMP also provides a mechanism for parallelizing loop-based computations
- There is a parallel for directive that can be used in C/C++ to indicate that a loop can be executed in parallel
  - The loop index is automatically declared as a private variable
- Consider the following simple sequential loop:

    for (i = 0; i < 100; i++)
        array[i] = i*i;

Loops (cont'd)
- Because there are no dependences within the loop, it may be freely parallelized in OpenMP
- This is accomplished using the #pragma omp parallel for construct
- Note that the determination of freedom from dependences is up to the programmer, not the compiler
- The revised example code would be:

    #pragma omp parallel for
    for (i = 0; i < 100; i++)
        array[i] = i*i;
Loops (cont'd)
- Using 4 threads, OpenMP parallelizes the previous loop by giving each thread a quarter of the iterations: after the fork, one thread runs i=0..24, the next i=25..49, the next i=50..74, and the last i=75..99, followed by a join
- What if we had some dependences though?

    sum = 0;
    #pragma omp parallel for
    for (i = 0; i < 100; i++)
        sum = sum + vec[i];

- One solution to the problem would be to only let one thread update sum at any time
- OpenMP provides a synchronization directive, critical, to define a critical region that only one thread can execute at a time:

    #pragma omp critical
    {
        code
    }
Loops (cont'd)
- If more than one thread arrives at a critical region, they will wait and execute the section sequentially
- But synchronization is expensive, so...
  - We want to minimize the number of synchronization constructs while maintaining correctness
  - Also, we want to do as much parallel computation as possible between synchronizations
    - i.e. between critical regions, the beginning/end of a parallel region, and the start of parallel loops
  - E.g. SGI recommends at least 1,000,000 floating point operations per synchronization for their SMP machines (typically 100s of processors)
- What overhead is involved in using a critical region around sum = sum + ...? After the fork, the four threads (starting at i=0, i=25, i=50 and i=75) each reach the sum update, but only one can execute it at a time; the others wait
Loops (cont'd)
- Synchronization overhead: on every iteration, each thread must wait its turn at the critical region before it can add its element to sum, so the updates are completely serialized and the threads spend most of their time waiting
Loops (cont'd)
- In our examples of parallel loops so far, the data accessed in each loop iteration has been distinct from that accessed in other iterations
  - This makes life very easy!
- In common parallel programming, this is not always the case. Consider:

    for (i = 1; i < 100000; i++)
        a[i] = a[i-1] + b[i];

- Tracing the iterations shows the dependence:
  - Iter 1: a[1] = a[0] + b[1]
  - Iter 2: a[2] = a[1] + b[2]
  - Iter 3: a[3] = a[2] + b[3]
  - Iter 4: a[4] = a[3] + b[4]
- Each iteration reads the value written by the previous iteration
Loops (cont'd)
- In cases where there are such dependences between loop iterations, we cannot use OpenMP parallel loop structures
- Such situations are either entirely non-parallelizable or require code restructuring
  - E.g. manual division of the loop into parallel iterations of size 8, assuming a dependence distance of 8
Reduction Variables
- The reduction(operator:variable) clause performs a reduction on the listed variable
  - i.e. some sort of aggregation of partial results
- Each thread is assigned a private copy of the variable. At the end of the parallel region, the reduction operator is applied to each private copy and the result is copied to the shared variable.

C code for a vector sum using reduction:

    #pragma omp parallel for reduction(+:sum)
    for (i = 1; i <= 100; i++)
        sum = sum + vec[i];

Reduction Variables (cont'd)
- With this clause, the fork gives each thread its own private copy of sum: the first thread computes sum0 over i=1..25, the second sum1 over i=26..50, and so on; at the join, the reduction computes sum = sum0 + sum1 + ...
Reduction Variables (cont'd)
- Each thread computes its partial sum independently, so there is far less overhead than using critical regions!

Timing Runs
- We've seen code execute in parallel, and when we included appropriate printf/write statements the parallel execution was obvious
- But how do we know if we are executing in parallel without the printf/writes?
  - The code should run faster!
- But how do we know it's running faster?
- And, more importantly, how do we determine how much speedup we are getting from running in parallel?
Timing Runs (cont'd)
- A useful Unix/Linux command for measuring the runtime of OpenMP programs is the time command:

    time your_program [arguments]

- The time command will run your program and collect some timing statistics:
  - Elapsed real time
  - CPU time used by the program. This is split between user and system time, representing time spent executing your program code and time spent executing O/S functions (e.g. I/O calls), respectively.

Timing Runs (cont'd)
- The output format of time varies between different versions of the command. Under the BASH shell:

    $ export OMP_NUM_THREADS=1
    $ time ./my_program
    real    0m5.282s
    user    0m4.980s
    sys     0m0.010s
    $ export OMP_NUM_THREADS=4
    $ time ./my_program
    real    0m1.456s
    user    0m5.140s
    sys     0m0.020s

- Use the elapsed (real) time to determine a program's parallel efficiency. In this example, the program runs 5.282/1.456 = 3.6 times faster on four threads compared to one.
- Note how real and user time change on 4 threads: real time drops while total CPU (user) time stays roughly the same, since the same work is spread across four processors.
Timing Runs (cont'd)
- You must remember that the Unix/Linux time command times everything; sometimes this is good and sometimes not
  - E.g. what if you want to time only part of your program?
- Also, this command is not applicable when you are running parallel code on clusters, since there is more than one machine being used and more than one command being run
- More on timing parallel programs later!
More informationParallel Programming in C with MPI and OpenMP
Parallel Programming in C with MPI and OpenMP Michael J. Quinn Chapter 17 Shared-memory Programming Outline OpenMP Shared-memory model Parallel for loops Declaring private variables Critical sections Reductions
More informationJANUARY 2004 LINUX MAGAZINE Linux in Europe User Mode Linux PHP 5 Reflection Volume 6 / Issue 1 OPEN SOURCE. OPEN STANDARDS.
0104 Cover (Curtis) 11/19/03 9:52 AM Page 1 JANUARY 2004 LINUX MAGAZINE Linux in Europe User Mode Linux PHP 5 Reflection Volume 6 / Issue 1 LINUX M A G A Z I N E OPEN SOURCE. OPEN STANDARDS. THE STATE
More informationCS4961 Parallel Programming. Lecture 5: More OpenMP, Introduction to Data Parallel Algorithms 9/5/12. Administrative. Mary Hall September 4, 2012
CS4961 Parallel Programming Lecture 5: More OpenMP, Introduction to Data Parallel Algorithms Administrative Mailing list set up, everyone should be on it - You should have received a test mail last night
More informationHPC Workshop University of Kentucky May 9, 2007 May 10, 2007
HPC Workshop University of Kentucky May 9, 2007 May 10, 2007 Part 3 Parallel Programming Parallel Programming Concepts Amdahl s Law Parallel Programming Models Tools Compiler (Intel) Math Libraries (Intel)
More informationOpenMP - Introduction
OpenMP - Introduction Süha TUNA Bilişim Enstitüsü UHeM Yaz Çalıştayı - 21.06.2012 Outline What is OpenMP? Introduction (Code Structure, Directives, Threads etc.) Limitations Data Scope Clauses Shared,
More informationSession 4: Parallel Programming with OpenMP
Session 4: Parallel Programming with OpenMP Xavier Martorell Barcelona Supercomputing Center Agenda Agenda 10:00-11:00 OpenMP fundamentals, parallel regions 11:00-11:30 Worksharing constructs 11:30-12:00
More informationIntroduction to OpenMP.
Introduction to OpenMP www.openmp.org Motivation Parallelize the following code using threads: for (i=0; i
More informationParallel Programming
Parallel Programming Lecture delivered by: Venkatanatha Sarma Y Assistant Professor MSRSAS-Bangalore 1 Session Objectives To understand the parallelization in terms of computational solutions. To understand
More informationINTRODUCTION TO OPENMP
INTRODUCTION TO OPENMP Hossein Pourreza hossein.pourreza@umanitoba.ca February 25, 2016 Acknowledgement: Examples used in this presentation are courtesy of SciNet. What is High Performance Computing (HPC)
More informationCOSC 6374 Parallel Computation. Introduction to OpenMP. Some slides based on material by Barbara Chapman (UH) and Tim Mattson (Intel)
COSC 6374 Parallel Computation Introduction to OpenMP Some slides based on material by Barbara Chapman (UH) and Tim Mattson (Intel) Edgar Gabriel Fall 2015 OpenMP Provides thread programming model at a
More informationHigh Performance Computing: Tools and Applications
High Performance Computing: Tools and Applications Edmond Chow School of Computational Science and Engineering Georgia Institute of Technology Lecture 2 OpenMP Shared address space programming High-level
More informationIntroduction to OpenMP
1 Introduction to OpenMP NTNU-IT HPC Section John Floan Notur: NTNU HPC http://www.notur.no/ www.hpc.ntnu.no/ Name, title of the presentation 2 Plan for the day Introduction to OpenMP and parallel programming
More informationIntroduction to OpenMP
Introduction to OpenMP Xiaoxu Guan High Performance Computing, LSU April 6, 2016 LSU HPC Training Series, Spring 2016 p. 1/44 Overview Overview of Parallel Computing LSU HPC Training Series, Spring 2016
More informationOpenMP. A parallel language standard that support both data and functional Parallelism on a shared memory system
OpenMP A parallel language standard that support both data and functional Parallelism on a shared memory system Use by system programmers more than application programmers Considered a low level primitives
More informationIntroduction to OpenMP. Martin Čuma Center for High Performance Computing University of Utah
Introduction to OpenMP Martin Čuma Center for High Performance Computing University of Utah mcuma@chpc.utah.edu Overview Quick introduction. Parallel loops. Parallel loop directives. Parallel sections.
More informationIntroduction to OpenMP
Introduction to OpenMP Lecture 2: OpenMP fundamentals Overview Basic Concepts in OpenMP History of OpenMP Compiling and running OpenMP programs 2 1 What is OpenMP? OpenMP is an API designed for programming
More informationParallel Programming
Parallel Programming OpenMP Nils Moschüring PhD Student (LMU) Nils Moschüring PhD Student (LMU), OpenMP 1 1 Overview What is parallel software development Why do we need parallel computation? Problems
More informationScientific Programming in C XIV. Parallel programming
Scientific Programming in C XIV. Parallel programming Susi Lehtola 11 December 2012 Introduction The development of microchips will soon reach the fundamental physical limits of operation quantum coherence
More informationAlfio Lazzaro: Introduction to OpenMP
First INFN International School on Architectures, tools and methodologies for developing efficient large scale scientific computing applications Ce.U.B. Bertinoro Italy, 12 17 October 2009 Alfio Lazzaro:
More informationIntroduction to OpenMP. Martin Čuma Center for High Performance Computing University of Utah
Introduction to OpenMP Martin Čuma Center for High Performance Computing University of Utah mcuma@chpc.utah.edu Overview Quick introduction. Parallel loops. Parallel loop directives. Parallel sections.
More informationParallel Programming. OpenMP Parallel programming for multiprocessors for loops
Parallel Programming OpenMP Parallel programming for multiprocessors for loops OpenMP OpenMP An application programming interface (API) for parallel programming on multiprocessors Assumes shared memory
More informationCS420: Operating Systems
Threads James Moscola Department of Physical Sciences York College of Pennsylvania Based on Operating System Concepts, 9th Edition by Silberschatz, Galvin, Gagne Threads A thread is a basic unit of processing
More informationIntroduction to OpenMP
Presentation Introduction to OpenMP Martin Cuma Center for High Performance Computing University of Utah mcuma@chpc.utah.edu September 9, 2004 http://www.chpc.utah.edu 4/13/2006 http://www.chpc.utah.edu
More informationDistributed Systems + Middleware Concurrent Programming with OpenMP
Distributed Systems + Middleware Concurrent Programming with OpenMP Gianpaolo Cugola Dipartimento di Elettronica e Informazione Politecnico, Italy cugola@elet.polimi.it http://home.dei.polimi.it/cugola
More informationShared Memory Programming Model
Shared Memory Programming Model Ahmed El-Mahdy and Waleed Lotfy What is a shared memory system? Activity! Consider the board as a shared memory Consider a sheet of paper in front of you as a local cache
More informationIntroduction to Standard OpenMP 3.1
Introduction to Standard OpenMP 3.1 Massimiliano Culpo - m.culpo@cineca.it Gian Franco Marras - g.marras@cineca.it CINECA - SuperComputing Applications and Innovation Department 1 / 59 Outline 1 Introduction
More informationCS 470 Spring Mike Lam, Professor. OpenMP
CS 470 Spring 2017 Mike Lam, Professor OpenMP OpenMP Programming language extension Compiler support required "Open Multi-Processing" (open standard; latest version is 4.5) Automatic thread-level parallelism
More informationGLOSSARY. OpenMP. OpenMP brings the power of multiprocessing to your C, C++, and. Fortran programs. BY WOLFGANG DAUTERMANN
OpenMP OpenMP brings the power of multiprocessing to your C, C++, and Fortran programs. BY WOLFGANG DAUTERMANN f you bought a new computer recently, or if you are wading through advertising material because
More informationOpenACC. Part I. Ned Nedialkov. McMaster University Canada. October 2016
OpenACC. Part I Ned Nedialkov McMaster University Canada October 2016 Outline Introduction Execution model Memory model Compiling pgaccelinfo Example Speedups Profiling c 2016 Ned Nedialkov 2/23 Why accelerators
More informationOpenMP Introduction. CS 590: High Performance Computing. OpenMP. A standard for shared-memory parallel programming. MP = multiprocessing
CS 590: High Performance Computing OpenMP Introduction Fengguang Song Department of Computer Science IUPUI OpenMP A standard for shared-memory parallel programming. MP = multiprocessing Designed for systems
More informationRaspberry Pi Basics. CSInParallel Project
Raspberry Pi Basics CSInParallel Project Sep 11, 2016 CONTENTS 1 Getting started with the Raspberry Pi 1 2 A simple parallel program 3 3 Running Loops in parallel 7 4 When loops have dependencies 11 5
More informationOpenMP Tutorial. Seung-Jai Min. School of Electrical and Computer Engineering Purdue University, West Lafayette, IN
OpenMP Tutorial Seung-Jai Min (smin@purdue.edu) School of Electrical and Computer Engineering Purdue University, West Lafayette, IN 1 Parallel Programming Standards Thread Libraries - Win32 API / Posix
More informationIntroduction to OpenMP
Christian Terboven, Dirk Schmidl IT Center, RWTH Aachen University Member of the HPC Group terboven,schmidl@itc.rwth-aachen.de IT Center der RWTH Aachen University History De-facto standard for Shared-Memory
More informationParallel Computing. Prof. Marco Bertini
Parallel Computing Prof. Marco Bertini Shared memory: OpenMP Implicit threads: motivations Implicit threading frameworks and libraries take care of much of the minutiae needed to create, manage, and (to
More informationSHARCNET Workshop on Parallel Computing. Hugh Merz Laurentian University May 2008
SHARCNET Workshop on Parallel Computing Hugh Merz Laurentian University May 2008 What is Parallel Computing? A computational method that utilizes multiple processing elements to solve a problem in tandem
More informationThreaded Programming. Lecture 9: Alternatives to OpenMP
Threaded Programming Lecture 9: Alternatives to OpenMP What s wrong with OpenMP? OpenMP is designed for programs where you want a fixed number of threads, and you always want the threads to be consuming
More informationAmdahl s Law. AMath 483/583 Lecture 13 April 25, Amdahl s Law. Amdahl s Law. Today: Amdahl s law Speed up, strong and weak scaling OpenMP
AMath 483/583 Lecture 13 April 25, 2011 Amdahl s Law Today: Amdahl s law Speed up, strong and weak scaling OpenMP Typically only part of a computation can be parallelized. Suppose 50% of the computation
More informationOpenMP Overview. in 30 Minutes. Christian Terboven / Aachen, Germany Stand: Version 2.
OpenMP Overview in 30 Minutes Christian Terboven 06.12.2010 / Aachen, Germany Stand: 03.12.2010 Version 2.3 Rechen- und Kommunikationszentrum (RZ) Agenda OpenMP: Parallel Regions,
More informationIntroduction to. Slides prepared by : Farzana Rahman 1
Introduction to OpenMP Slides prepared by : Farzana Rahman 1 Definition of OpenMP Application Program Interface (API) for Shared Memory Parallel Programming Directive based approach with library support
More informationAdvanced C Programming Winter Term 2008/09. Guest Lecture by Markus Thiele
Advanced C Programming Winter Term 2008/09 Guest Lecture by Markus Thiele Lecture 14: Parallel Programming with OpenMP Motivation: Why parallelize? The free lunch is over. Herb
More informationCS 5220: Shared memory programming. David Bindel
CS 5220: Shared memory programming David Bindel 2017-09-26 1 Message passing pain Common message passing pattern Logical global structure Local representation per processor Local data may have redundancy
More informationIntroduction to OpenMP. Martin Čuma Center for High Performance Computing University of Utah
Introduction to OpenMP Martin Čuma Center for High Performance Computing University of Utah m.cuma@utah.edu Overview Quick introduction. Parallel loops. Parallel loop directives. Parallel sections. Some
More informationOpenMP on Ranger and Stampede (with Labs)
OpenMP on Ranger and Stampede (with Labs) Steve Lantz Senior Research Associate Cornell CAC Parallel Computing at TACC: Ranger to Stampede Transition November 6, 2012 Based on materials developed by Kent
More informationOpenMP Fundamentals Fork-join model and data environment
www.bsc.es OpenMP Fundamentals Fork-join model and data environment Xavier Teruel and Xavier Martorell Agenda: OpenMP Fundamentals OpenMP brief introduction The fork-join model Data environment OpenMP
More information