Parallel Execution with OpenMP

Slide 1: Parallel Execution with OpenMP
Advanced Statistical Programming Camp
Jonathan Olmsted (Q-APS)
Day 4: May 30th, 2014, PM Session

Slide 2: Outline
1. OpenMP
2. Use and Mis-Use
3. Examples

Slide 3: What is OpenMP?
- OpenMP is an API for one of several C++ parallelization techniques.
- Restricted to a single machine, like our use of a socket cluster at the R level.
- Unlike the socket cluster, though, it will not run everywhere. Requires special compile-time instructions.
- Users annotate the section of their code that should be parallelized.
- We will only focus on parallel for loops (the simplest).
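As a point of reference, the annotation model looks like this in standalone C++ (a minimal sketch, not from the deck; the function name square_all is made up):

    // Minimal OpenMP annotation example in standalone C++ (not from the deck).
    // Compile with something like: g++ -fopenmp square_all.cpp
    #include <cstdio>
    #include <vector>

    void square_all(std::vector<double>& x) {
        long n = static_cast<long>(x.size());
        // The pragma asks the compiler to split the iterations across threads;
        // without -fopenmp it is ignored and the loop runs sequentially, which
        // is the "will not run everywhere" caveat above.
        #pragma omp parallel for
        for (long i = 0; i < n; i++) {
            x[i] = x[i] * x[i];
        }
    }

    int main() {
        std::vector<double> v = {1.0, 2.0, 3.0, 4.0};
        square_all(v);
        std::printf("%g %g %g %g\n", v[0], v[1], v[2], v[3]);
        return 0;
    }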

Slide 4: Outline
1. OpenMP
2. Use and Mis-Use
3. Examples

Slide 5: Header File

    #include <RcppArmadillo.h>
    // [[Rcpp::depends(RcppArmadillo)]]
    #include <omp.h>

    using namespace Rcpp;

We now have an additional dependency to build against.
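Because OpenMP support is compiler-dependent (see Slide 3), a common defensive pattern, not shown in the deck, is to guard the header and any runtime calls so the file still compiles without OpenMP; available_threads below is a hypothetical helper for illustration:

    // Sketch: compile-time guard so the same source builds with or without OpenMP.
    #ifdef _OPENMP
    #include <omp.h>
    #endif

    static int available_threads() {
    #ifdef _OPENMP
        return omp_get_max_threads();   // OpenMP runtime is present
    #else
        return 1;                       // fall back to sequential execution
    #endif
    }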

Slide 6: Compiler/Linking Options

    library("Rcpp")
    Sys.setenv("PKG_CXXFLAGS" = "-fopenmp")
    Sys.setenv("PKG_LIBS" = "-fopenmp")

We set environment variables so that the annotations are read at compile time and the OpenMP runtime is linked in.
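As an aside (not in the deck, and assuming the built-in "openmp" plugin is available in your version of Rcpp), the same flags can be requested from inside the C++ source, near the Rcpp::depends attribute of Slide 5, instead of via environment variables in R:

    // Alternative to the Sys.setenv() calls above; relies on Rcpp's
    // built-in "openmp" plugin being available.
    // [[Rcpp::plugins(openmp)]]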

Slide 7: Printing Output Sequentially

    // [[Rcpp::export()]]
    void omp1() {
      for (int i = 0; i < 10; i++) {
        Rcout << " " << i << " ";
      }
      Rcout << std::endl;
    }

Printing with Rcout sequentially is helpful.

Slide 8:

    sourceCpp("omp_functions.cpp")
    omp1()
    ##  0  1  2  3  4  5  6  7  8  9
    omp1()
    ##  0  1  2  3  4  5  6  7  8  9

Slide 9: Printing Output in Parallel

    // [[Rcpp::export()]]
    void omp2(int t = 1) {
      omp_set_num_threads(t);
      #pragma omp parallel for
      for (int i = 0; i < 10; i++) {
        Rcout << " " << i << " ";
      }
      Rcout << std::endl;
    }

Printing with Rcout in parallel is less helpful and will cause problems. Do not do this in your code!

Slide 10:

    omp2(4)
    ##
    omp2(4)
    ##

There is no guaranteed order in which iterations are processed.
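A safer way to see which thread handled which iteration (my sketch, not from the deck; it assumes the Slide 5 headers, and omp2_safe is a made-up name) is to record the thread number into a pre-allocated vector inside the loop and print only after the parallel region has finished, following the same pattern as omp3 on the next slide:

    // Sketch: gather thread assignments in parallel, print sequentially afterwards.
    // [[Rcpp::export()]]
    IntegerVector omp2_safe(int t = 1) {
      omp_set_num_threads(t);
      IntegerVector who(10);
      #pragma omp parallel for
      for (int i = 0; i < 10; i++) {
        who(i) = omp_get_thread_num();   // each iteration writes only its own slot
      }
      for (int i = 0; i < 10; i++) {     // printing happens on the main thread
        Rcout << "iteration " << i << " ran on thread " << who(i) << std::endl;
      }
      return(who);
    }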

Slide 11: Iterations Must be Independent

    // [[Rcpp::export()]]
    NumericVector omp3(NumericVector x, int t = 1) {
      omp_set_num_threads(t);
      NumericVector y(x.length());
      #pragma omp parallel for
      for (int n = 0; n < x.length(); n++) {
        y(n) = pow(x(n), 2);
      }
      return(y);
    }

Typical usage: each iteration's task can depend on preliminary data, but iterations do not interact with each other.
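When iterations do interact through a single accumulator, the loop is not independent in the sense above; for simple accumulations OpenMP's reduction clause makes the pattern safe. A hedged sketch (my example, not from the deck; assumes the Slide 5 headers, and omp_sum_sq is a made-up name):

    // Sketch: a cross-iteration sum handled with a reduction clause.
    // Without "reduction(+:total)" the concurrent updates to total would race.
    // [[Rcpp::export()]]
    double omp_sum_sq(NumericVector x, int t = 1) {
      omp_set_num_threads(t);
      double total = 0.0;
      #pragma omp parallel for reduction(+:total)
      for (int n = 0; n < x.length(); n++) {
        total += x(n) * x(n);   // each thread keeps a private partial sum
      }
      return total;
    }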

Slide 12:

    omp3(9)
    ## [1] 81

    vec <- rnorm(1e6)
    library(microbenchmark)

Slide 13:

    microbenchmark(c0 = vec ^ 2,
                   c1 = omp3(vec, 1),
                   c2 = omp3(vec, 2),
                   c3 = omp3(vec, 3),
                   c4 = omp3(vec, 4),
                   c8 = omp3(vec, 8),
                   c12 = omp3(vec, 12),
                   c24 = omp3(vec, 24),
                   c96 = omp3(vec, 96),
                   c192 = omp3(vec, 192),
                   times = 20)
    ## Unit: milliseconds
    ## expr min lq median uq max neval
    ## (timing values for c0 through c192 were not preserved in the transcription)

The best number of threads will be problem-specific. The sweet spot here is around 8.
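Since the sweet spot depends on the hardware as well as the problem, it can help to ask the OpenMP runtime what is available before picking a thread count. A small sketch (not from the deck; assumes the Slide 5 headers, and omp_info is a made-up name):

    // Sketch: report what the OpenMP runtime sees on this machine.
    // [[Rcpp::export()]]
    List omp_info() {
      List ret;
      ret["procs"]       = omp_get_num_procs();    // logical processors visible
      ret["max_threads"] = omp_get_max_threads();  // default team size
      return(ret);
    }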

Slide 14: RcppArmadillo and OpenMP

    // [[Rcpp::export()]]
    bool omp4(arma::mat x, int iter, int t = 1) {
      omp_set_num_threads(t);
      #pragma omp parallel for
      for (int i = 0; i < iter; i++) {
        arma::mat y = x.row(i).t() * x.row(i);
      }
      return(true);
    }

As long as there is no cross-iteration dependence, you can do most operations in parallel. However, the ideal degree of parallelism always depends on the problem.
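As written, omp4 discards each y because the slide only times the work. A more typical pattern (my sketch, not from the deck; omp4_keep is a made-up name) stores a per-iteration summary, with each iteration writing only its own slot:

    // Sketch: keep one number per iteration instead of discarding the result.
    // [[Rcpp::export()]]
    arma::vec omp4_keep(arma::mat x, int iter, int t = 1) {
      omp_set_num_threads(t);
      arma::vec out(iter, arma::fill::zeros);
      #pragma omp parallel for
      for (int i = 0; i < iter; i++) {
        arma::mat y = x.row(i).t() * x.row(i);
        out(i) = arma::trace(y);   // iteration i touches only out(i)
      }
      return(out);
    }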

Slide 15:

    mx <- matrix(rnorm(5e2 * 5e2), nrow = 5e2)
    microbenchmark(omp4(mx, 96, 1),
                   omp4(mx, 96, 2),
                   omp4(mx, 96, 4),
                   omp4(mx, 96, 12),
                   omp4(mx, 96, 24),
                   times = 10)
    ## Unit: milliseconds
    ## expr min lq median uq max neval
    ## (timing values were not preserved in the transcription)

Slide 16: Scalability Depends on Demands

    // [[Rcpp::export()]]
    bool omp5(int d, int iter, int t = 1) {
      omp_set_num_threads(t);
      #pragma omp parallel for
      for (int i = 0; i < iter; i++) {
        arma::mat x(d, d);
        x.randn();
        chol(x.t() * x);
      }
      return(true);
    }
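Scalability also depends on how evenly the work divides across iterations. When per-iteration cost varies, OpenMP's schedule clause controls how iterations are handed out; a hedged sketch of the same loop with dynamic scheduling (my addition, not from the deck; omp5_dynamic is a made-up name):

    // Sketch: dynamic scheduling hands iterations to threads as they finish,
    // which can help when iterations take unequal time.
    // [[Rcpp::export()]]
    bool omp5_dynamic(int d, int iter, int t = 1) {
      omp_set_num_threads(t);
      #pragma omp parallel for schedule(dynamic)
      for (int i = 0; i < iter; i++) {
        arma::mat x(d, d);
        x.randn();
        chol(x.t() * x);
      }
      return(true);
    }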

Slide 17:

    microbenchmark(omp5(10, 20, 1),
                   omp5(10, 20, 4),
                   omp5(600, 20, 1),
                   omp5(600, 20, 4),
                   times = 20)
    ## Unit: microseconds
    ## expr min lq median uq max neval
    ## (timing values were not preserved in the transcription)

Slide 18: R's RNGs are Not Reproducible in Parallel

    // [[Rcpp::export()]]
    NumericVector omp6() {
      int N = 100;
      NumericVector x(N);
      omp_set_num_threads(4);
      #pragma omp parallel for
      for (int t = 0; t < N; t++) {
        x(t) = R::rnorm(0, 1);
      }
      return(x);
    }

Never use R's RNG functions in parallel.

Slide 19:

    set.seed(1)
    mean(omp6())
    ## [1]
    set.seed(1)
    mean(omp6())
    ## [1]

Despite setting the same seed, the two calls return different means (the numeric output was not preserved in the transcription).
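If random draws are needed inside a parallel loop, one workable pattern (my sketch, not from the deck; a dedicated parallel RNG library is another option, and the function name, seed scheme, and defaults here are my own) is to give each thread its own C++11 generator with a deterministic per-thread seed:

    // Sketch: per-thread C++11 generators, seeded by thread number, so draws
    // never touch R's single global RNG state.
    // Caveats: simple "seed + thread" seeding is not statistically ideal, and
    // the stream depends on how iterations are split, so fix the thread count
    // if you want reproducibility.
    #include <random>

    // [[Rcpp::export()]]
    NumericVector omp6_threadsafe(int n, int t = 1, int seed = 42) {
      omp_set_num_threads(t);
      NumericVector x(n);
      #pragma omp parallel
      {
        std::mt19937_64 eng(seed + omp_get_thread_num());
        std::normal_distribution<double> rnorm01(0.0, 1.0);
        #pragma omp for
        for (int i = 0; i < n; i++) {
          x(i) = rnorm01(eng);   // each iteration writes only its own slot
        }
      }
      return(x);
    }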

Slide 20: Outline
1. OpenMP
2. Use and Mis-Use
3. Examples

Slide 21: Parallel Pairwise Distance

    sourceCpp("pwdomp.cpp")
    dfcounties <- read.csv("counties.csv")
    mcoord <- cbind(dfcounties$latitude, dfcounties$longitude)

Slide 22:

    Rcpp::NumericMatrix calcpwdomp(Rcpp::NumericMatrix x, int nthread = 1) {
      omp_set_num_threads(nthread);
      int nrows = x.nrow();
      int ncols = x.nrow();      // the distance matrix is n-by-n
      Rcpp::NumericMatrix out(nrows, ncols);
      double rad = ;             // Earth's radius (value not preserved in the transcription)
      const double pi = ;        // value of pi (not preserved in the transcription)
      #pragma omp parallel for
      for (int arow = 0; arow < nrows; arow++) {
        for (int acol = 0; acol < ncols; acol++) {
          double phi1 = x(arow, 0) * pi / 180;
          double phi2 = x(acol, 0) * pi / 180;
          double lambda1 = x(arow, 1) * pi / 180;
          double lambda2 = x(acol, 1) * pi / 180;
          double q1 = 2 * rad;
          double q2 = pow(sin((phi1 - phi2) / 2), 2);
          double q3 = pow(sin((lambda1 - lambda2) / 2), 2);
          double q4 = cos(phi1) * cos(phi2);
          out(arow, acol) = q1 * asin(sqrt(q2 + q4 * q3));
        }
      }
      return(out);
    }
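The great-circle distance is symmetric, so roughly half of the work above is redundant. A hedged sketch of the usual refinement (my addition, not from the deck; the function name and the constants, Earth radius in kilometres and M_PI, are my own choices, not values taken from the slides) computes only the upper triangle and mirrors it, using dynamic scheduling because the triangular rows have unequal lengths:

    // Sketch: exploit the symmetry of the distance matrix.
    #include <cmath>

    // [[Rcpp::export()]]
    Rcpp::NumericMatrix calcpwdomp_sym(Rcpp::NumericMatrix x, int nthread = 1) {
      omp_set_num_threads(nthread);
      int n = x.nrow();
      Rcpp::NumericMatrix out(n, n);
      const double rad = 6371.0;          // assumed: Earth radius in km
      const double pi  = M_PI;            // assumed value of pi
      #pragma omp parallel for schedule(dynamic)
      for (int arow = 0; arow < n; arow++) {
        for (int acol = arow; acol < n; acol++) {
          double phi1 = x(arow, 0) * pi / 180, phi2 = x(acol, 0) * pi / 180;
          double lam1 = x(arow, 1) * pi / 180, lam2 = x(acol, 1) * pi / 180;
          double q2 = pow(sin((phi1 - phi2) / 2), 2);
          double q3 = pow(sin((lam1 - lam2) / 2), 2);
          double d  = 2 * rad * asin(sqrt(q2 + cos(phi1) * cos(phi2) * q3));
          out(arow, acol) = d;
          out(acol, arow) = d;            // mirror across the diagonal
        }
      }
      return(out);
    }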

Slide 23:

    dftimes <- microbenchmark(n1 = calcpwdomp(mcoord, 1),
                              n2 = calcpwdomp(mcoord, 2),
                              n3 = calcpwdomp(mcoord, 3),
                              n4 = calcpwdomp(mcoord, 4),
                              n6 = calcpwdomp(mcoord, 6),
                              n8 = calcpwdomp(mcoord, 8),
                              n12 = calcpwdomp(mcoord, 12),
                              times = 10)

Slide 24: (figure not preserved in the transcription)

Slide 25: Instead of

    #pragma omp parallel for
    for (int arow = 0; arow < nrows; arow++) {
      for (int acol = 0; acol < ncols; acol++) {

create a function with

    for (int arow = 0; arow < nrows; arow++) {
      #pragma omp parallel for
      for (int acol = 0; acol < ncols; acol++) {

Time them. Which is faster?
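A third variant worth timing (my suggestion, not part of the exercise; fill_products is a made-up self-contained stand-in for the distance function) uses the collapse clause so OpenMP treats both loops as a single iteration space:

    // Sketch: collapse(2) flattens two perfectly nested loops into one
    // nrows * ncols iteration space before splitting it across threads.
    // [[Rcpp::export()]]
    Rcpp::NumericMatrix fill_products(int nrows, int ncols) {
      Rcpp::NumericMatrix out(nrows, ncols);
      #pragma omp parallel for collapse(2)
      for (int arow = 0; arow < nrows; arow++) {
        for (int acol = 0; acol < ncols; acol++) {
          out(arow, acol) = arow * acol;   // stand-in for the distance arithmetic
        }
      }
      return(out);
    }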

Slide 26: Parallel E-Step in EM Probit

    sourceCpp("em_probit_omp.cpp")
    library(Zelig)
    data(turnout)
    fit0 <- glm(vote ~ income + educate + age,
                data = turnout,
                family = binomial(link = "probit"))
    y <- matrix(turnout$vote)
    X <- model.matrix(fit0)

Slide 27:

    // [[Rcpp::export()]]
    List em_probitomp(arma::mat y, arma::mat X, int maxit = 10, int nthr = 1) {
      omp_set_num_threads(nthr);
      int N = y.n_rows;
      int K = X.n_cols;
      arma::mat beta(K, 1);
      beta.fill(0.0);
      arma::mat eystar(N, 1);
      eystar.fill(0);
      for (int it = 0; it < maxit; it++) {
        arma::mat mu = X * beta;
        // f() and g() are helper functions defined elsewhere in em_probit_omp.cpp
        #pragma omp parallel for
        for (int n = 0; n < N; n++) {
          if (y(n, 0) == 1) {
            eystar(n, 0) = mu(n, 0) + f(mu(n, 0));
          }
          if (y(n, 0) == 0) {
            eystar(n, 0) = mu(n, 0) - g(mu(n, 0));
          }
        }
        beta = (X.t() * X).i() * X.t() * eystar;
      }
      List ret;
      ret["n"] = N;
      ret["k"] = K;
      ret["beta"] = beta;
      return(ret);
    }
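The deck never shows f() and g(). In the standard EM treatment of the probit model they are the inverse Mills ratio terms of the truncated-normal means, E[y*|y=1] = mu + phi(mu)/Phi(mu) and E[y*|y=0] = mu - phi(mu)/(1 - Phi(mu)). A hedged sketch of such helpers using only <cmath> (my reconstruction, not the instructor's code):

    // Sketch: plausible definitions of f() and g() under the standard
    // probit EM update; phi = standard normal pdf, Phi = standard normal cdf.
    #include <cmath>

    inline double norm_pdf(double z) {
      return std::exp(-0.5 * z * z) / std::sqrt(2.0 * M_PI);
    }
    inline double norm_cdf(double z) {
      return 0.5 * std::erfc(-z / std::sqrt(2.0));
    }
    // E[y* | y = 1] = mu + f(mu),  E[y* | y = 0] = mu - g(mu)
    inline double f(double mu) { return norm_pdf(mu) / norm_cdf(mu); }
    inline double g(double mu) { return norm_pdf(mu) / (1.0 - norm_cdf(mu)); }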

Slide 28:

    microbenchmark(a = em_probitomp(y = y, X = X, maxit = 100, nthr = 1),
                   b = em_probitomp(y = y, X = X, maxit = 100, nthr = 4),
                   c = em_probitomp(y = y, X = X, maxit = 100, nthr = 16),
                   times = 20)
    ## Unit: milliseconds
    ## expr min lq median uq max neval
    ## (timing values for a, b, and c were not preserved in the transcription)
