Introduction to Parallel Processing. Lecture #10 May 2002 Guy Tel-Zur


1 Introduction to Parallel Processing Lecture #10 May 2002 Guy Tel-Zur

2 Topics: Parallel Numerical Algorithms (Wilkinson & Allen's book, chapter 10); More MPI Commands; Home assignment #5

3 Wilkinson&Allen PDF

4 Direct, Recursive and Mesh; Gauss Elimination; Jacobi; Gauss-Seidel; Red-Black ordering; Over-relaxation

5 Matrix Addition; Matrix Multiplication; Matrix-Vector Multiplication; Linear Equations; Matrix Multiplication: Recursive Implementation, Mesh Implementation, 2D pipeline systolic array; Gauss Elimination; Jacobi Iteration

6 Gauss-Seidel Relaxation; Red-Black Ordering; Over-relaxation; Multi-Grid

7 Intermediate MPI Parts from: Using MPI book by Gropp, Lusk and Skjellum, Chapter 4. The source codes can be downloaded from:

8 Topics The Poisson Problem Topologies Jacobi Iterations

9 Terminology. The general form of a second-order linear PDE: $a\,\partial^2 u/\partial x^2 + b\,\partial^2 u/\partial x\,\partial y + c\,\partial^2 u/\partial y^2 + d\,\partial u/\partial x + e\,\partial u/\partial y + f = 0$ (y denotes time for hyperbolic and parabolic equations). This is analogous to the solutions of the general quadratic equation $a x^2 + b x y + c y^2 + d x + e y + f = 0$. Ellipse: $4ac - b^2 > 0$. Hyperbola: $4ac - b^2 < 0$. Parabola: $4ac - b^2 = 0$ (Heat Equation).
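The classification rule on this slide can be sketched as a small C function. This is only an illustration: the names `pde_class_t` and `classify_pde` are hypothetical and do not come from the lecture, and the test uses exact comparison with zero, which is reasonable only for coefficients that are exactly representable.

```c
/* Hypothetical helper (not from the lecture): classify the PDE
 *   a*u_xx + b*u_xy + c*u_yy + d*u_x + e*u_y + f = 0
 * by the sign of 4*a*c - b^2, as on slide 9. */
typedef enum { PDE_ELLIPTIC, PDE_HYPERBOLIC, PDE_PARABOLIC } pde_class_t;

pde_class_t classify_pde(double a, double b, double c)
{
    double disc = 4.0 * a * c - b * b;

    if (disc > 0.0) {
        return PDE_ELLIPTIC;    /* ellipse:   4ac - b^2 > 0 */
    }
    if (disc < 0.0) {
        return PDE_HYPERBOLIC;  /* hyperbola: 4ac - b^2 < 0 */
    }
    return PDE_PARABOLIC;       /* parabola:  4ac - b^2 = 0 */
}
```

For example, the Laplace operator (a = 1, b = 0, c = 1) is classified as elliptic, and the heat equation written with y as time (a = 1, b = 0, c = 0) as parabolic.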

10 Poisson's equation arises in many models. 1D: $\partial^2 u/\partial x^2 = f(x)$. 2D: $\partial^2 u/\partial x^2 + \partial^2 u/\partial y^2 = f(x,y)$. 3D: $\partial^2 u/\partial x^2 + \partial^2 u/\partial y^2 + \partial^2 u/\partial z^2 = f(x,y,z)$. Heat flow: Temperature(position, time). Diffusion: Concentration(position, time). Electrostatic or Gravitational Potential: Potential(position). Fluid flow: Velocity, Pressure, Density(position, time). Quantum mechanics: Wave-function(position, time). Elasticity: Stress, Strain(position, time).

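11 Poisson's equation in 1D: $\partial^2 u/\partial x^2 = f(x)$. Discretize on a regular mesh $u_i = u(i h)$ to get $[u_{i+1} - 2u_i + u_{i-1}]/h^2 = f(x_i)$. Write this as solving $Tu = -h^2 f$ for $u$, where $T$ is the tridiagonal matrix with 2 on the diagonal and $-1$ on the sub- and superdiagonals. (Figure: graph and stencil.)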
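To make the matrix form concrete, multiply the difference equation by $-h^2$ to get $-u_{i-1} + 2u_i - u_{i+1} = -h^2 f(x_i)$. Assuming zero boundary values $u_0 = u_{n+1} = 0$ (an assumption made here for illustration, not stated on the slide), the system for $n = 4$ unknowns reads:

```latex
\underbrace{\begin{pmatrix}
 2 & -1 &  0 &  0 \\
-1 &  2 & -1 &  0 \\
 0 & -1 &  2 & -1 \\
 0 &  0 & -1 &  2
\end{pmatrix}}_{T}
\begin{pmatrix} u_1 \\ u_2 \\ u_3 \\ u_4 \end{pmatrix}
= -h^2
\begin{pmatrix} f(x_1) \\ f(x_2) \\ f(x_3) \\ f(x_4) \end{pmatrix}
```

Each row of $T$ is one copy of the three-point stencil $(-1,\ 2,\ -1)$ centered on the diagonal.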

12 2D Poisson's equation: $\partial^2 u/\partial x^2 + \partial^2 u/\partial y^2 = f(x,y)$. Similar to the 1D case, but the matrix T now couples each grid point to its four neighbors. Grid points are numbered left to right, top row to bottom row. (Figure: graph and stencil.) 3D is analogous. A similar adjacency matrix arises for an arbitrary graph.

13 Composite mesh from a mechanical structure

14 Converting the mesh to a matrix

15 Irregular mesh: NASA Airfoil in 2D (direct solution)

16 Adaptive Mesh Refinement (AMR) Adaptive mesh around an explosion John Bell and Phil Colella at LBL

17 Problem Definition: $\nabla^2 u = f(x,y)$ in the interior, $u(x,y) = g(x,y)$ on the boundary. Define a square mesh (grid): $x_i = i/(n+1)$, $i = 0,\dots,n+1$; $y_j = j/(n+1)$, $j = 0,\dots,n+1$.

18 Discretization. Poisson equation: $-4u(i,j) + u(i-1,j) + u(i+1,j) + u(i,j-1) + u(i,j+1) = h^2 f(i,j)$. Jacobi iterations: $u^{k+1}(i,j) = \frac{1}{4}\,[\,u^k(i-1,j) + u^k(i+1,j) + u^k(i,j-1) + u^k(i,j+1) - h^2 f(i,j)\,]$.

19 5 point stencil approx. for 2-D Poisson problem

20 Jacobi Iteration Serial Version (Fortran)

21 Jacobi Iteration Serial Version (C)
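The code for this slide did not survive transcription. As a stand-in, here is a minimal sketch of one serial Jacobi sweep in C that follows the update formula on slide 18. It is not the lecture's code: the function name `jacobi_sweep`, the `IDX` macro, and the row-major layout of an (n+2) x (n+2) array with boundary values in rows and columns 0 and n+1 are all assumptions made for illustration.

```c
#include <stddef.h>

/* Row-major index into an (n+2) x (n+2) grid (layout is an assumption). */
#define IDX(i, j, n) ((size_t)(i) * ((size_t)(n) + 2) + (size_t)(j))

/* One Jacobi sweep from u into unew over the interior points 1..n.
 * Returns the sum of squared changes, usable as a convergence measure
 * in the same spirit as the diff() function shown later in the lecture. */
double jacobi_sweep(const double *u, double *unew, const double *f,
                    int n, double h)
{
    double change = 0.0;

    for (int i = 1; i <= n; i++) {
        for (int j = 1; j <= n; j++) {
            double v = 0.25 * (u[IDX(i - 1, j, n)] + u[IDX(i + 1, j, n)]
                             + u[IDX(i, j - 1, n)] + u[IDX(i, j + 1, n)]
                             - h * h * f[IDX(i, j, n)]);
            double d = v - u[IDX(i, j, n)];

            change += d * d;
            unew[IDX(i, j, n)] = v;
        }
    }
    return change;
}
```

A driver would typically call this repeatedly, swapping the roles of `u` and `unew`, until the returned value drops below a tolerance or an iteration limit is reached. The boundary entries of `unew` are never written, so they must be initialized once by the caller.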

22 Finite Difference Algorithm

23 Jacobi Iteration for a Slice (s = start, e = end)

24 1-D Decomposition of the Domain

25

26 Ghost Points double precision u(0:n+1,s-1:e+1)

27 Topology Virtual Topology Cartesian Topology In the next slides: 2-D Cartesian Topology

28 2D Cartesian Decomposition 4 x 3 domain

29 Defining Cartesian Topologies
Our next task is to define how to assign processes to each part of the decomposed domain. MPI lets the user specify various application topologies. The routine MPI_Cart_create() creates a Cartesian decomposition of the processes, with the number of dimensions given by the ndim argument. This creates a new communicator with the same processes as the input communicator, but with the specified topology.
    dims[0] = 4; dims[1] = 3;
    periods[0] = 0; periods[1] = 0; /* specify if connection is with wrap round */
    ndim = 2;
    MPI_Cart_create(MPI_COMM_WORLD, ndim, dims, periods, reorder, &comm2d);

30 Domain Decomposition C bindings: MPI_Cart_create(MPI_Comm comm_old, int ndims, int *dims, int *isperiodic, int reorder, MPI_Comm *new_comm) MPI_Cart_get -

31 MPI_CART_CREATE
      integer dims(2)
      logical isperiodic(2), reorder
      dims(1) = 4
      dims(2) = 3
      isperiodic(1) = .false.
      isperiodic(2) = .false.
      reorder = .true.
      ndim = 2
      call MPI_CART_CREATE(MPI_COMM_WORLD, ndim, dims, isperiodic,
     $                     reorder, comm2d, ierr)

32 To determine the coordinates of a calling process
FORTRAN examples:
      call MPI_CART_GET(comm2d, 2, dims, periods, coords, ierr)
      print *, '(', coords(1), ',', coords(2), ')'
      call MPI_COMM_RANK(comm2d, myrank, ierr)
      call MPI_CART_COORDS(comm2d, myrank, 2, coords, ierr)

33 2-Step Process to Transfer Data

34 More Exotic MPI Functions MPI_Cart_shift(MPI_Comm comm, int direction, int displ, int *src, int *dest)

35 MPI_Cart_shift
    /* create cartesian topology for processes */
    dims[0] = nrow;   /* number of rows */
    dims[1] = mcol;   /* number of columns */
    period[0] = 1;    /* cyclic in this direction */
    period[1] = 0;    /* no cyclic in this direction */
    MPI_Cart_create(MPI_COMM_WORLD, ndim, dims, period, reorder, &comm2D);
    MPI_Comm_rank(comm2D, &me);
    MPI_Cart_coords(comm2D, me, ndim, coords);
    source = me;      /* calling process rank in 2D communicator; overwritten by MPI_Cart_shift */
    index = 0;        /* shift along the 1st index (out of 2) */
    displ = 1;        /* shift by 1 */
    MPI_Cart_shift(comm2D, index, displ, &source, &dest1);
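To see what the shift returns, the sketch below models the neighbor lookup in plain C without calling MPI. It is a teaching model, not MPI's implementation. It assumes row-major rank numbering on a 2-D grid, and it uses a hypothetical sentinel `NO_NEIGHBOR` in place of `MPI_PROC_NULL`, whose real value is implementation-defined. The names `cart_shift_2d`, `grid_rank`, and `shift_coord` are made up for this example.

```c
#define NO_NEIGHBOR (-1)  /* stand-in for MPI_PROC_NULL in this model */

/* Row-major rank of grid position (row, col). */
static int grid_rank(int row, int col, int cols)
{
    return row * cols + col;
}

/* Shift one coordinate; wrap if periodic, otherwise return -1 when the
 * result falls off the grid. */
static int shift_coord(int coord, int disp, int extent, int periodic)
{
    int c = coord + disp;

    if (periodic) {
        return ((c % extent) + extent) % extent;
    }
    return (c < 0 || c >= extent) ? -1 : c;
}

/* Model of a Cartesian shift on a dims[0] x dims[1] grid:
 * *dest is the rank displaced by +disp along `direction`,
 * *source is the rank displaced by -disp. */
void cart_shift_2d(const int dims[2], const int periods[2], int rank,
                   int direction, int disp, int *source, int *dest)
{
    int from[2] = { rank / dims[1], rank % dims[1] };
    int to[2]   = { from[0], from[1] };
    int here    = from[direction];

    from[direction] = shift_coord(here, -disp, dims[direction],
                                  periods[direction]);
    to[direction]   = shift_coord(here,  disp, dims[direction],
                                  periods[direction]);

    *source = (from[direction] < 0) ? NO_NEIGHBOR
                                    : grid_rank(from[0], from[1], dims[1]);
    *dest   = (to[direction] < 0)   ? NO_NEIGHBOR
                                    : grid_rank(to[0], to[1], dims[1]);
}
```

With a 4 x 3 grid that is cyclic only in the first direction, rank 0 shifted by 1 along direction 0 gets destination 3 and source 9 (wrap-around), while along direction 1 it gets destination 1 and no source.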

36 MPI_PROC_NULL
! Compute neighbors
      IF (myrank .eq. 0) THEN
         left = MPI_PROC_NULL
      ELSE
         left = myrank - 1
      END IF
      IF (myrank .eq. p-1) THEN
         right = MPI_PROC_NULL
      ELSE
         right = myrank + 1
      END IF

37 MPE_DECOMP1D Determine the array limits (s and e in our code): call MPE_DECOMP1D(n, nprocs, myrank, s, e) Where: nprocs = # of processes in the Cartesian coordinates, myrank = cart. coord. of the calling process n = size of the array (1..n)

38 MPE_DECOMP1D Similar to:
      s = 1 + myrank*(n/nprocs)
      e = s + (n/nprocs) - 1

39 MPE_DECOMP1D
C  This file contains a routine for producing a decomposition of
C  a 1-d array when given a number of processors.
C  It may be used in "direct" product decomposition.
C  The values returned assume a "global" domain in [1:n]
      subroutine MPE_DECOMP1D(n, numprocs, myid, s, e)
      integer n, numprocs, myid, s, e
      integer nlocal
      integer deficit

40 MPE_DECOMP1D
      nlocal  = n / numprocs
      s       = myid * nlocal + 1
      deficit = mod(n, numprocs)
      s       = s + min(myid, deficit)
      if (myid .lt. deficit) then
         nlocal = nlocal + 1
      endif
      e = s + nlocal - 1
      if (e .gt. n .or. myid .eq. numprocs-1) e = n
      return
      end
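For readers following the C side of the lecture, the same decomposition logic can be transcribed into C as below. This is a sketch that mirrors the Fortran routine statement by statement. The name `decomp1d` and the pointer-based outputs are choices made here, not an existing MPE C interface.

```c
#include <assert.h>

/* Split the global index range [1..n] into numprocs contiguous pieces.
 * The first (n % numprocs) processes each get one extra element.
 * On return, *s and *e are the inclusive start and end for myid. */
void decomp1d(int n, int numprocs, int myid, int *s, int *e)
{
    assert(numprocs > 0 && myid >= 0 && myid < numprocs);

    int nlocal  = n / numprocs;
    int deficit = n % numprocs;

    *s = myid * nlocal + 1 + (myid < deficit ? myid : deficit);
    if (myid < deficit) {
        nlocal = nlocal + 1;
    }
    *e = *s + nlocal - 1;
    if (*e > n || myid == numprocs - 1) {
        *e = n;
    }
}
```

For n = 10 and 3 processes this yields the ranges 1..4, 5..7 and 8..10, so the single leftover element goes to process 0. When numprocs divides n evenly, the result matches the simpler formula s = 1 + myrank*(n/nprocs).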

41 A code to exchange data for ghost points using blocking send/recv
      subroutine exchng1(a, nx, s, e, comm1d, nbrbottom, nbrtop)
      include "mpif.h"
      integer nx, s, e
      double precision a(0:nx+1,s-1:e+1)
      integer comm1d, nbrbottom, nbrtop
      integer status(MPI_STATUS_SIZE), ierr
c
      call MPI_SEND(a(1,e), nx, MPI_DOUBLE_PRECISION, nbrtop, 0,
     $              comm1d, ierr)
      call MPI_RECV(a(1,s-1), nx, MPI_DOUBLE_PRECISION, nbrbottom, 0,
     $              comm1d, status, ierr)
      call MPI_SEND(a(1,s), nx, MPI_DOUBLE_PRECISION, nbrbottom, 1,
     $              comm1d, ierr)
      call MPI_RECV(a(1,e+1), nx, MPI_DOUBLE_PRECISION, nbrtop, 1,
     $              comm1d, status, ierr)
      return
      end

42 The previous example was simple, but it is not necessarily the best way to implement the exchange of ghost points.

43 sendrecv (exchange data ver. 2)
      subroutine exchng1(a, nx, s, e, comm1d, nbrbottom, nbrtop)
      include "mpif.h"
      integer nx, s, e
      double precision a(0:nx+1,s-1:e+1)
      integer comm1d, nbrbottom, nbrtop
      integer status(MPI_STATUS_SIZE), ierr
c
      call MPI_SENDRECV(
     $     a(1,e), nx, MPI_DOUBLE_PRECISION, nbrtop, 0,
     $     a(1,s-1), nx, MPI_DOUBLE_PRECISION, nbrbottom, 0,
     $     comm1d, status, ierr)
      call MPI_SENDRECV(
     $     a(1,s), nx, MPI_DOUBLE_PRECISION, nbrbottom, 1,
     $     a(1,e+1), nx, MPI_DOUBLE_PRECISION, nbrtop, 1,
     $     comm1d, status, ierr)
      return
      end

44 Implementation of the Jacobi Iteration-1
      program main
      include "mpif.h"
      integer maxn
      parameter (maxn = 128)
      double precision a(maxn,maxn), b(maxn,maxn), f(maxn,maxn)
      integer nx, ny
      integer myid, numprocs, ierr
      integer comm1d, nbrbottom, nbrtop, s, e, it
      double precision diff, diffnorm, dwork
      double precision t1, t2
      double precision MPI_WTIME
      external MPI_WTIME
      external diff
c
      call MPI_INIT(ierr)
      call MPI_COMM_RANK(MPI_COMM_WORLD, myid, ierr)
      call MPI_COMM_SIZE(MPI_COMM_WORLD, numprocs, ierr)

45 Implementation of the Jacobi Iteration-2
      if (myid .eq. 0) then
c
c        Get the size of the problem
c
c        print *, 'Enter nx'
c        read *, nx
         nx = 110
      endif
      call MPI_BCAST(nx, 1, MPI_INTEGER, 0, MPI_COMM_WORLD, ierr)
      ny = nx
c Get a new communicator for a decomposition of the domain
      call MPI_CART_CREATE(MPI_COMM_WORLD, 1, numprocs, .false.,
     $                     .true., comm1d, ierr)

46 Implementation of the Jacobi Iteration-3
c
c Get my position in this communicator, and my neighbors
c
      call MPI_COMM_RANK(comm1d, myid, ierr)
      call MPI_Cart_shift(comm1d, 0, 1, nbrbottom, nbrtop, ierr)
c
c Compute the actual decomposition
c
      call MPE_DECOMP1D(ny, numprocs, myid, s, e)
c
c Initialize the right-hand-side (f) and the initial solution guess (a)
c
      call onedinit(a, b, f, nx, s, e)

47 Implementation of the Jacobi Iteration-4
C Actually do the computation. Note the use of a collective
C operation to check for convergence, and a do-loop to bound the
C number of iterations.
      call MPI_BARRIER(MPI_COMM_WORLD, ierr)
      t1 = MPI_WTIME()
      do 10 it=1, 100
         call exchng1(a, nx, s, e, comm1d, nbrbottom, nbrtop)
         call sweep1d(a, f, nx, s, e, b)
         call exchng1(b, nx, s, e, comm1d, nbrbottom, nbrtop)
         call sweep1d(b, f, nx, s, e, a)
         dwork = diff(a, b, nx, s, e)
         call MPI_Allreduce(dwork, diffnorm, 1,
     $        MPI_DOUBLE_PRECISION, MPI_SUM, comm1d, ierr)
         if (diffnorm .lt. 1.0e-5) goto 20
         if (myid .eq. 0) print *, 2*it, ' Difference is ', diffnorm
10    continue

48 Implementation of the Jacobi Iteration-5
      if (myid .eq. 0) print *, 'Failed to converge'
20    continue
      t2 = MPI_WTIME()
      if (myid .eq. 0) then
         print *, 'Converged after ', 2*it, ' Iterations in ',
     $        t2 - t1, ' secs '
      endif
c
      call MPI_FINALIZE(ierr)
      end

49 Implementation of the Jacobi Iteration-6
c Perform a Jacobi sweep for a 1-d decomposition.
c Sweep from a into b
      subroutine sweep1d(a, f, nx, s, e, b)
      integer nx, s, e
      double precision a(0:nx+1,s-1:e+1), f(0:nx+1,s-1:e+1),
     +                 b(0:nx+1,s-1:e+1)
      integer i, j
      double precision h
c
      h = 1.0d0 / dble(nx+1)
      do 10 j=s, e
         do 10 i=1, nx
            b(i,j) = 0.25 * (a(i-1,j)+a(i,j+1)+a(i,j-1)+a(i+1,j)
     +                       - h * h * f(i,j))
10    continue
      return
      end

50 Implementation of the Jacobi Iteration-7
c
c The rest of the 1-d program
c
      double precision function diff(a, b, nx, s, e)
      integer nx, s, e
      double precision a(0:nx+1,s-1:e+1), b(0:nx+1,s-1:e+1)
      double precision sum
      integer i, j
c
      sum = 0.0d0
      do 10 j=s, e
         do 10 i=1, nx
            sum = sum + (a(i,j) - b(i,j)) ** 2
10    continue
      diff = sum
      return
      end

51 Timing for variants of the 1-D decomposition of the Poisson problem. Table columns: P | Blocking Send | Ordered Send | Sendrecv | Buffered Send | Nonblocking Isend (the timing values were not captured in this transcription). Annotation: e~20% (1/14 faster than 1proc)

53 Grid computing


PARALLEL METHODS FOR SOLVING PARTIAL DIFFERENTIAL EQUATIONS. Ioana Chiorean 5 Kragujevac J. Math. 25 (2003) 5 18. PARALLEL METHODS FOR SOLVING PARTIAL DIFFERENTIAL EQUATIONS Ioana Chiorean Babeş-Bolyai University, Department of Mathematics, Cluj-Napoca, Romania (Received May 28,

More information

Introduction to Multigrid and its Parallelization

Introduction to Multigrid and its Parallelization Introduction to Multigrid and its Parallelization! Thomas D. Economon Lecture 14a May 28, 2014 Announcements 2 HW 1 & 2 have been returned. Any questions? Final projects are due June 11, 5 pm. If you are

More information

MPI introduction - exercises -

MPI introduction - exercises - MPI introduction - exercises - Paolo Ramieri, Maurizio Cremonesi May 2016 Startup notes Access the server and go on scratch partition: ssh a08tra49@login.galileo.cineca.it cd $CINECA_SCRATCH Create a job

More information

Introduction to MPI. May 20, Daniel J. Bodony Department of Aerospace Engineering University of Illinois at Urbana-Champaign

Introduction to MPI. May 20, Daniel J. Bodony Department of Aerospace Engineering University of Illinois at Urbana-Champaign Introduction to MPI May 20, 2013 Daniel J. Bodony Department of Aerospace Engineering University of Illinois at Urbana-Champaign Top500.org PERFORMANCE DEVELOPMENT 1 Eflop/s 162 Pflop/s PROJECTED 100 Pflop/s

More information

CS 426. Building and Running a Parallel Application

CS 426. Building and Running a Parallel Application CS 426 Building and Running a Parallel Application 1 Task/Channel Model Design Efficient Parallel Programs (or Algorithms) Mainly for distributed memory systems (e.g. Clusters) Break Parallel Computations

More information

Comparison of different solvers for two-dimensional steady heat conduction equation ME 412 Project 2

Comparison of different solvers for two-dimensional steady heat conduction equation ME 412 Project 2 Comparison of different solvers for two-dimensional steady heat conduction equation ME 412 Project 2 Jingwei Zhu March 19, 2014 Instructor: Surya Pratap Vanka 1 Project Description The purpose of this

More information

Acknowledgments. Programming with MPI Basic send and receive. A Minimal MPI Program (C) Contents. Type to enter text

Acknowledgments. Programming with MPI Basic send and receive. A Minimal MPI Program (C) Contents. Type to enter text Acknowledgments Programming with MPI Basic send and receive Jan Thorbecke Type to enter text This course is partly based on the MPI course developed by Rolf Rabenseifner at the High-Performance Computing-Center

More information

Programming with MPI Basic send and receive

Programming with MPI Basic send and receive Programming with MPI Basic send and receive Jan Thorbecke Type to enter text Delft University of Technology Challenge the future Acknowledgments This course is partly based on the MPI course developed

More information

High Performance Computing Course Notes Message Passing Programming III

High Performance Computing Course Notes Message Passing Programming III High Performance Computing Course Notes 2008-2009 2009 Message Passing Programming III Communication modes Synchronous mode The communication is considered complete when the sender receives the acknowledgement

More information

Advanced Parallel Programming

Advanced Parallel Programming Sebastian von Alfthan Jussi Enkovaara Pekka Manninen Advanced Parallel Programming February 15-17, 2016 PRACE Advanced Training Center CSC IT Center for Science Ltd, Finland All material (C) 2011-2016

More information

Parallelization of an Example Program

Parallelization of an Example Program Parallelization of an Example Program [ 2.3] In this lecture, we will consider a parallelization of the kernel of the Ocean application. Goals: Illustrate parallel programming in a low-level parallel language.

More information

Simulating ocean currents

Simulating ocean currents Simulating ocean currents We will study a parallel application that simulates ocean currents. Goal: Simulate the motion of water currents in the ocean. Important to climate modeling. Motion depends on

More information

CSCE 5160 Parallel Processing. CSCE 5160 Parallel Processing

CSCE 5160 Parallel Processing. CSCE 5160 Parallel Processing HW #9 10., 10.3, 10.7 Due April 17 { } Review Completing Graph Algorithms Maximal Independent Set Johnson s shortest path algorithm using adjacency lists Q= V; for all v in Q l[v] = infinity; l[s] = 0;

More information

High Performance Computing Course Notes Message Passing Programming III

High Performance Computing Course Notes Message Passing Programming III High Performance Computing Course Notes 2009-2010 2010 Message Passing Programming III Blocking synchronous send the sender doesn t return until it receives the acknowledgement from the receiver that the

More information

CSE 590: Special Topics Course ( Supercomputing ) Lecture 6 ( Analyzing Distributed Memory Algorithms )

CSE 590: Special Topics Course ( Supercomputing ) Lecture 6 ( Analyzing Distributed Memory Algorithms ) CSE 590: Special Topics Course ( Supercomputing ) Lecture 6 ( Analyzing Distributed Memory Algorithms ) Rezaul A. Chowdhury Department of Computer Science SUNY Stony Brook Spring 2012 2D Heat Diffusion

More information

CEE 618 Scientific Parallel Computing (Lecture 5): Message-Passing Interface (MPI) advanced

CEE 618 Scientific Parallel Computing (Lecture 5): Message-Passing Interface (MPI) advanced 1 / 32 CEE 618 Scientific Parallel Computing (Lecture 5): Message-Passing Interface (MPI) advanced Albert S. Kim Department of Civil and Environmental Engineering University of Hawai i at Manoa 2540 Dole

More information

Numerical Modelling in Fortran: day 6. Paul Tackley, 2017

Numerical Modelling in Fortran: day 6. Paul Tackley, 2017 Numerical Modelling in Fortran: day 6 Paul Tackley, 2017 Today s Goals 1. Learn about pointers, generic procedures and operators 2. Learn about iterative solvers for boundary value problems, including

More information

Slides prepared by : Farzana Rahman 1

Slides prepared by : Farzana Rahman 1 Introduction to MPI 1 Background on MPI MPI - Message Passing Interface Library standard defined by a committee of vendors, implementers, and parallel programmers Used to create parallel programs based

More information

Recap of Parallelism & MPI

Recap of Parallelism & MPI Recap of Parallelism & MPI Chris Brady Heather Ratcliffe The Angry Penguin, used under creative commons licence from Swantje Hess and Jannis Pohlmann. Warwick RSE 13/12/2017 Parallel programming Break

More information

An Investigation into Iterative Methods for Solving Elliptic PDE s Andrew M Brown Computer Science/Maths Session (2000/2001)

An Investigation into Iterative Methods for Solving Elliptic PDE s Andrew M Brown Computer Science/Maths Session (2000/2001) An Investigation into Iterative Methods for Solving Elliptic PDE s Andrew M Brown Computer Science/Maths Session (000/001) Summary The objectives of this project were as follows: 1) Investigate iterative

More information

Lecture 6. Floating Point Arithmetic Stencil Methods Introduction to OpenMP

Lecture 6. Floating Point Arithmetic Stencil Methods Introduction to OpenMP Lecture 6 Floating Point Arithmetic Stencil Methods Introduction to OpenMP Announcements Section and Lecture will be switched next week Thursday: section and Q2 Friday: Lecture 2 Today s lecture Floating

More information

Buffering in MPI communications

Buffering in MPI communications Buffering in MPI communications Application buffer: specified by the first parameter in MPI_Send/Recv functions System buffer: Hidden from the programmer and managed by the MPI library Is limitted and

More information

Introduction to TDDC78 Lab Series. Lu Li Linköping University Parts of Slides developed by Usman Dastgeer

Introduction to TDDC78 Lab Series. Lu Li Linköping University Parts of Slides developed by Usman Dastgeer Introduction to TDDC78 Lab Series Lu Li Linköping University Parts of Slides developed by Usman Dastgeer Goals Shared- and Distributed-memory systems Programming parallelism (typical problems) Goals Shared-

More information

MA471. Lecture 5. Collective MPI Communication

MA471. Lecture 5. Collective MPI Communication MA471 Lecture 5 Collective MPI Communication Today: When all the processes want to send, receive or both Excellent website for MPI command syntax available at: http://www-unix.mcs.anl.gov/mpi/www/ 9/10/2003

More information

Parallel I/O. and split communicators. David Henty, Fiona Ried, Gavin J. Pringle

Parallel I/O. and split communicators. David Henty, Fiona Ried, Gavin J. Pringle Parallel I/O and split communicators David Henty, Fiona Ried, Gavin J. Pringle Dr Gavin J. Pringle Applications Consultant gavin@epcc.ed.ac.uk +44 131 650 6709 4x4 array on 2x2 Process Grid Parallel IO

More information

Parallel Poisson Solver in Fortran

Parallel Poisson Solver in Fortran Parallel Poisson Solver in Fortran Nilas Mandrup Hansen, Ask Hjorth Larsen January 19, 1 1 Introduction In this assignment the D Poisson problem (Eq.1) is to be solved in either C/C++ or FORTRAN, first

More information

Practical Introduction to Message-Passing Interface (MPI)

Practical Introduction to Message-Passing Interface (MPI) 1 Outline of the workshop 2 Practical Introduction to Message-Passing Interface (MPI) Bart Oldeman, Calcul Québec McGill HPC Bart.Oldeman@mcgill.ca Theoretical / practical introduction Parallelizing your

More information

MPI introduction - exercises -

MPI introduction - exercises - MPI introduction - exercises - Introduction to Parallel Computing with MPI and OpenMP P. Ramieri May 2015 Hello world! (Fortran) As an ice breaking activity try to compile and run the Helloprogram, either

More information

MPI. Communicating non-contiguous data or mixed datatypes Collective communications Virtual topologies

MPI. Communicating non-contiguous data or mixed datatypes Collective communications Virtual topologies MPI 1 Communicating non-contiguous data or mixed datatypes Collective communications Virtual topologies 2 Communicating Non-contiguous Data or Mixed Datatypes The following strategies and features in MPI

More information

Parallel Programming using MPI. Supercomputing group CINECA

Parallel Programming using MPI. Supercomputing group CINECA Parallel Programming using MPI Supercomputing group CINECA Contents Programming with message passing Introduction to message passing and MPI Basic MPI programs MPI Communicators Send and Receive function

More information

Message Passing with MPI

Message Passing with MPI Message Passing with MPI PPCES 2016 Hristo Iliev IT Center / JARA-HPC IT Center der RWTH Aachen University Agenda Motivation Part 1 Concepts Point-to-point communication Non-blocking operations Part 2

More information

Parallel Programming Using MPI

Parallel Programming Using MPI Parallel Programming Using MPI Short Course on HPC 15th February 2019 Aditya Krishna Swamy adityaks@iisc.ac.in SERC, Indian Institute of Science When Parallel Computing Helps? Want to speed up your calculation

More information

Multigrid Pattern. I. Problem. II. Driving Forces. III. Solution

Multigrid Pattern. I. Problem. II. Driving Forces. III. Solution Multigrid Pattern I. Problem Problem domain is decomposed into a set of geometric grids, where each element participates in a local computation followed by data exchanges with adjacent neighbors. The grids

More information