Introduction to Parallel Processing. Lecture #10 May 2002 Guy Tel-Zur
1 Introduction to Parallel Processing Lecture #10 May 2002 Guy Tel-Zur
2 Topics: Parallel Numerical Algorithms (Wilkinson & Allen's book, chapter 10); More MPI Commands; Home assignment #5
3 Wilkinson&Allen PDF
4 Direct, Recursive and Mesh; Gauss Elimination; Jacobi; Gauss-Seidel; Red-Black ordering; Over-relaxation
5 Matrix Addition; Matrix Multiplication; Matrix-Vector Multiplication; Linear Equations; Matrix Multiplication; Recursive Implementation; Mesh Implementation; 2D pipeline; Systolic Array; Gauss Elimination; Jacobi Iteration
6 Gauss-Seidel Relaxation; Red-Black Ordering; Over-relaxation; Multi-Grid
7 Intermediate MPI Parts from: Using MPI book by Gropp, Lusk and Skjellum, Chapter 4. The source codes can be downloaded from:
8 Topics: The Poisson Problem; Topologies; Jacobi Iterations
9 Terminology
The general form of a second-order linear PDE:
$a \frac{\partial^2 u}{\partial x^2} + b \frac{\partial^2 u}{\partial x \partial y} + c \frac{\partial^2 u}{\partial y^2} + d \frac{\partial u}{\partial x} + e \frac{\partial u}{\partial y} + f = 0$
(y denotes time for hyperbolic and parabolic equations)
Analogous to the solutions of the general quadratic equation $a x^2 + b x y + c y^2 + d x + e y + f = 0$:
Ellipse: $4ac - b^2 > 0$
Hyperbola: $4ac - b^2 < 0$
Parabola: $4ac - b^2 = 0$
Heat Equation
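The discriminant test on this slide can be sketched in a few lines of C. This is only an illustration: the names `pde_class` and `classify_pde` are hypothetical and do not come from the lecture, and the comparison with zero is exact, which is only reasonable for coefficients that are known exactly.

```c
/* Hypothetical helper (not from the lecture): classify a second-order
 * linear PDE  a*u_xx + b*u_xy + c*u_yy + ... = 0  by the sign of 4ac - b^2. */
typedef enum { PDE_ELLIPTIC, PDE_HYPERBOLIC, PDE_PARABOLIC } pde_class;

pde_class classify_pde(double a, double b, double c)
{
    double disc = 4.0 * a * c - b * b;   /* same quantity as on the slide */

    if (disc > 0.0) return PDE_ELLIPTIC;    /* ellipse-like   */
    if (disc < 0.0) return PDE_HYPERBOLIC;  /* hyperbola-like */
    return PDE_PARABOLIC;                   /* exact zero: parabola-like */
}
```

For example, the Laplace operator (a = 1, b = 0, c = 1) is classified as elliptic, and the heat equation with y as time (a = 1, b = 0, c = 0) as parabolic.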
10 Poisson's equation arises in many models
1D: $\partial^2 u/\partial x^2 = f(x)$
2D: $\partial^2 u/\partial x^2 + \partial^2 u/\partial y^2 = f(x,y)$
3D: $\partial^2 u/\partial x^2 + \partial^2 u/\partial y^2 + \partial^2 u/\partial z^2 = f(x,y,z)$
Heat flow: Temperature(position, time)
Diffusion: Concentration(position, time)
Electrostatic or Gravitational Potential: Potential(position)
Fluid flow: Velocity, Pressure, Density(position, time)
Quantum mechanics: Wave-function(position, time)
Elasticity: Stress, Strain(position, time)
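11 Poisson's equation in 1D: $\partial^2 u/\partial x^2 = f(x)$
Discretize $\partial^2 u/\partial x^2 = f(x)$ on a regular mesh $u_i = u(i h)$ to get
$[u_{i+1} - 2u_i + u_{i-1}] / h^2 = f(x)$
Write as solving $T u = -h^2 f$ for $u$, where $T$ is the tridiagonal matrix with 2 on the diagonal and -1 on the sub- and superdiagonals.
(Figure: graph and stencil)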
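As a minimal sketch of the 1D system above, the following C routine solves $T u = -h^2 f$ directly with the Thomas algorithm (tridiagonal elimination). It assumes zero Dirichlet boundary values $u_0 = u_{n+1} = 0$, which the slide does not state, and the name `poisson1d_solve` is hypothetical. The lecture itself goes on to iterative methods; a direct solve is used here only because it is the shortest way to check the discretization.

```c
#include <stdlib.h>

/* Hypothetical sketch: solve T u = -h^2 f for the n interior unknowns,
 * where T = tridiag(-1, 2, -1), assuming u = 0 at both boundary points.
 * Returns 0 on success, -1 if memory allocation fails. */
int poisson1d_solve(int n, double h, const double *f, double *u)
{
    double *cp = malloc((size_t)n * sizeof *cp);  /* modified superdiagonal */
    double *dp = malloc((size_t)n * sizeof *dp);  /* modified right-hand side */
    if (cp == NULL || dp == NULL) { free(cp); free(dp); return -1; }

    /* Forward elimination; the right-hand side is -h^2 * f. */
    cp[0] = -1.0 / 2.0;
    dp[0] = (-h * h * f[0]) / 2.0;
    for (int i = 1; i < n; i++) {
        double denom = 2.0 + cp[i - 1];           /* 2 - (-1) * cp[i-1] */
        cp[i] = -1.0 / denom;
        dp[i] = (-h * h * f[i] + dp[i - 1]) / denom;
    }

    /* Back substitution. */
    u[n - 1] = dp[n - 1];
    for (int i = n - 2; i >= 0; i--)
        u[i] = dp[i] - cp[i] * u[i + 1];

    free(cp);
    free(dp);
    return 0;
}
```

With f = -2 everywhere, n = 3 and h = 0.25, the exact solution of u'' = -2 with zero boundary values is u(x) = x(1 - x), and the discrete solution matches it at the mesh points because the three-point stencil is exact for quadratics.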
12 2D Poisson's equation: $\partial^2 u/\partial x^2 + \partial^2 u/\partial y^2 = f(x,y)$
Similar to the 1D case, but with a different matrix T (shown as a figure on the slide).
Grid points are numbered left to right, top row to bottom row.
(Figure: graph and stencil)
3D is analogous.
Similar adjacency matrix for an arbitrary graph.
13 Composite mesh from a mechanical structure
14 Converting the mesh to a matrix
15 Irregular mesh: NASA Airfoil in 2D (direct solution)
16 Adaptive Mesh Refinement (AMR) Adaptive mesh around an explosion John Bell and Phil Colella at LBL
17 Problem Definition
$\nabla^2 u = f(x,y)$ in the interior
$u(x,y) = g(x,y)$ on the boundary
Define a square mesh (grid):
$x_i = \frac{i}{n+1},\ i = 0,\dots,n+1$
$y_j = \frac{j}{n+1},\ j = 0,\dots,n+1$
18 Discretization
Poisson Equation: $[u_{i-1,j} + u_{i+1,j} + u_{i,j-1} + u_{i,j+1} - 4u_{i,j}] / h^2 = f_{i,j}$
Jacobi Iterations: $u^{k+1}_{i,j} = \frac{1}{4}\left(u^k_{i-1,j} + u^k_{i+1,j} + u^k_{i,j-1} + u^k_{i,j+1} - h^2 f_{i,j}\right)$
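One Jacobi sweep over the interior of the mesh can be sketched in C as follows. This is an illustrative serial version, not the code from the slides: the function name `jacobi_sweep` and the flat row-major storage of the (n+2) x (n+2) arrays are assumptions made for the example.

```c
/* Hypothetical serial sketch of one Jacobi sweep for the 2D Poisson problem.
 * u, f and unew hold (n+2) x (n+2) values in row-major order, so element
 * (i,j) is at index i*(n+2)+j. Rows and columns 0 and n+1 are boundary
 * values and are not modified. */
void jacobi_sweep(int n, double h, const double *u, const double *f,
                  double *unew)
{
    int w = n + 2;   /* row length including the two boundary points */

    for (int i = 1; i <= n; i++) {
        for (int j = 1; j <= n; j++) {
            unew[i * w + j] = 0.25 * (u[(i - 1) * w + j] + u[(i + 1) * w + j]
                                    + u[i * w + (j - 1)] + u[i * w + (j + 1)]
                                    - h * h * f[i * w + j]);
        }
    }
}
```

Reading from `u` and writing to a separate `unew` is what distinguishes Jacobi from Gauss-Seidel, which would update the array in place.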
19 5 point stencil approx. for 2-D Poisson problem
20 Jacobi Iteration Serial Version (Fortran)
21 Jacobi Iteration Serial Version (C)
22 Finite Difference Algorithm
23 Jacobi Iteration for a Slice (s = start, e = end)
24 1-D Decomposition of the Domain
25
26 Ghost Points double precision u(0:n+1,s-1:e+1)
27 Topology Virtual Topology Cartesian Topology In the next slides: 2-D Cartesian Topology
28 2D Cartesian Decomposition 4 x 3 domain
29 Defining Cartesian Topologies
Our next task is to define how to assign processes to each part of the decomposed domain.
MPI lets the user specify various application topologies.
The routine MPI_Cart_create() creates a Cartesian decomposition of the processes, with the number of dimensions given by the ndim argument.
It creates a new communicator with the same processes as the input communicator, but with the specified topology.
    dims[0]=4; dims[1]=3;
    periods[0]=0; periods[1]=0; /* specify if connection is with wrap round */
    ndim=2;
    MPI_Cart_create(MPI_COMM_WORLD, ndim, dims, periods, reorder, &comm2d);
30 Domain Decomposition C bindings: MPI_Cart_create(MPI_Comm comm_old, int ndims, int *dims, int *isperiodic, int reorder, MPI_Comm *new_comm) MPI_Cart_get -
31 MPI_CART_CREATE
      integer dims(2)
      logical isperiodic(2), reorder
      dims(1) = 4
      dims(2) = 3
      isperiodic(1) = .false.
      isperiodic(2) = .false.
      reorder = .true.
      ndim = 2
      call MPI_CART_CREATE(MPI_COMM_WORLD, ndim, dims, isperiodic,
     $                     reorder, comm2d, ierr)
32 To determine the coordinates of a calling process
FORTRAN examples:
      call MPI_CART_GET(comm2d, 2, dims, periods, coords, ierr)
      print *, '(', coords(1), ',', coords(2), ')'
      call MPI_COMM_RANK(comm2d, myrank, ierr)
      call MPI_CART_COORDS(comm2d, myrank, 2, coords, ierr)
33 2-Step Process to Transfer Data
34 More Exotic MPI Functions MPI_Cart_shift(MPI_Comm comm, int direction, int displ, int *src, int *dest)
35 MPI_Cart_shift
    /* create cartesian topology for processes */
    dims[0] = nrow;     /* number of rows */
    dims[1] = mcol;     /* number of columns */
    period[0] = 1;      /* cyclic in this direction */
    period[1] = 0;      /* no cyclic in this direction */
    MPI_Cart_create(MPI_COMM_WORLD, ndim, dims, period, reorder, &comm2d);
    MPI_Comm_rank(comm2d, &me);
    MPI_Cart_coords(comm2d, me, ndim, coords);
    source = me;        /* calling process rank in 2D communicator */
    index = 0;          /* shift along the 1st index (out of 2) */
    displ = 1;          /* shift by 1 */
    MPI_Cart_shift(comm2d, index, displ, &source, &dest1);
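To make the result of MPI_Cart_shift concrete without requiring an MPI installation, the following plain-C model computes the source and destination ranks for a 2D grid. It is a sketch under stated assumptions: ranks are laid out in row-major order over the coordinates (the ordering MPI uses for Cartesian communicators), and the constant `PROC_NULL_MODEL` merely stands in for MPI_PROC_NULL, whose real value is implementation-defined. All function names here are hypothetical and are not part of MPI.

```c
#define PROC_NULL_MODEL (-1)   /* stand-in for MPI_PROC_NULL (assumption) */

/* Row-major rank of grid position (row, col) in a dims[0] x dims[1] grid. */
int cart_rank_model(const int dims[2], int row, int col)
{
    return row * dims[1] + col;
}

/* Rank of the process `off` steps away along `direction`, or
 * PROC_NULL_MODEL if that step leaves a non-periodic grid. */
int cart_neighbor_model(const int dims[2], const int periods[2],
                        const int coords[2], int direction, int off)
{
    int c[2] = { coords[0], coords[1] };

    c[direction] += off;
    if (c[direction] < 0 || c[direction] >= dims[direction]) {
        if (!periods[direction])
            return PROC_NULL_MODEL;
        /* wrap around, keeping the result non-negative */
        c[direction] = ((c[direction] % dims[direction]) + dims[direction])
                       % dims[direction];
    }
    return cart_rank_model(dims, c[0], c[1]);
}

/* Model of MPI_Cart_shift: src is the rank we would receive from,
 * dest the rank we would send to, for a shift by displ. */
void cart_shift_model(const int dims[2], const int periods[2], int rank,
                      int direction, int displ, int *src, int *dest)
{
    int coords[2] = { rank / dims[1], rank % dims[1] };

    *src  = cart_neighbor_model(dims, periods, coords, direction, -displ);
    *dest = cart_neighbor_model(dims, periods, coords, direction, displ);
}
```

On a 4 x 3 grid without wrap-around, rank 0 shifted by 1 along the first index has no source and sends to rank 3. Making the first dimension cyclic turns the source into rank 9, the process at the opposite edge.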
36 MPI_PROC_NULL
      ! Compute neighbors
      IF (myrank .eq. 0) THEN
         left = MPI_PROC_NULL
      ELSE
         left = myrank - 1
      END IF
      IF (myrank .eq. p-1) THEN
         right = MPI_PROC_NULL
      ELSE
         right = myrank + 1
      END IF
37 MPE_DECOMP1D
Determine the array limits (s and e in our code):
      call MPE_DECOMP1D(n, nprocs, myrank, s, e)
Where:
nprocs = number of processes in the Cartesian coordinates
myrank = Cartesian coordinate of the calling process
n = size of the array (1..n)
38 MPE_DECOMP1D
Similar to:
      s = 1 + myrank*(n/nprocs)
      e = s + (n/nprocs) - 1
39 MPE_DECOMP1D
C  This file contains a routine for producing a decomposition of
C  a 1-d array when given a number of processors.
C  It may be used in "direct" product decomposition.
C  The values returned assume a "global" domain in [1:n]
      subroutine MPE_DECOMP1D(n, numprocs, myid, s, e )
      integer n, numprocs, myid, s, e
      integer nlocal
      integer deficit
40 MPE_DECOMP1D
      nlocal = n / numprocs
      s = myid * nlocal + 1
      deficit = mod(n,numprocs)
      s = s + min(myid,deficit)
      if (myid .lt. deficit) then
         nlocal = nlocal + 1
      endif
      e = s + nlocal - 1
      if (e .gt. n .or. myid .eq. numprocs-1) e = n
      return
      end
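The same decomposition logic can be sketched in C. This is a direct transcription of the Fortran routine above into a hypothetical function `decomp1d`; it is not part of MPI or MPE. The first `deficit` processes each receive one extra element, so the block sizes differ by at most one.

```c
/* Hypothetical C transcription of MPE_DECOMP1D: split the global index
 * range [1, n] into numprocs contiguous blocks and return the bounds
 * [*s, *e] owned by process myid (0-based). */
void decomp1d(int n, int numprocs, int myid, int *s, int *e)
{
    int nlocal  = n / numprocs;    /* base block size */
    int deficit = n % numprocs;    /* leftover elements to spread out */

    /* Shift the start by the number of extra elements given to lower ranks. */
    *s = myid * nlocal + 1 + (myid < deficit ? myid : deficit);
    if (myid < deficit)
        nlocal = nlocal + 1;       /* this rank takes one leftover element */

    *e = *s + nlocal - 1;
    if (*e > n || myid == numprocs - 1)
        *e = n;                    /* last rank always ends at n */
}
```

For n = 10 on 3 processes this gives the ranges 1..4, 5..7 and 8..10, whereas the simpler formula s = 1 + myrank*(n/nprocs), e = s + (n/nprocs) - 1 would leave element 10 unassigned.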
41 A code to exchange data for ghost points using blocking send/recv
      subroutine exchng1(a,nx,s,e,comm1d,nbrbottom,nbrtop )
      include "mpif.h"
      integer nx, s, e
      double precision a(0:nx+1,s-1:e+1)
      integer comm1d, nbrbottom, nbrtop
      integer status(MPI_STATUS_SIZE), ierr
      call MPI_SEND(a(1,e),nx,MPI_DOUBLE_PRECISION,
     $              nbrtop,0,comm1d,ierr)
      call MPI_RECV(a(1,s-1),nx,MPI_DOUBLE_PRECISION,
     $              nbrbottom,0,comm1d,status,ierr)
      call MPI_SEND(a(1,s),nx,MPI_DOUBLE_PRECISION,
     $              nbrbottom,1,comm1d,ierr)
      call MPI_RECV(a(1,e+1),nx,MPI_DOUBLE_PRECISION,
     $              nbrtop,1,comm1d,status,ierr)
      return
      end
42 The previous example was simple, but it is not necessarily the best way to implement the exchange of ghost points.
43 sendrecv (exchange data ver. 2)
      subroutine exchng1( a, nx, s, e, comm1d, nbrbottom, nbrtop )
      include "mpif.h"
      integer nx, s, e
      double precision a(0:nx+1,s-1:e+1)
      integer comm1d, nbrbottom, nbrtop
      integer status(MPI_STATUS_SIZE), ierr
      call MPI_SENDRECV(
     $     a(1,e), nx, MPI_DOUBLE_PRECISION, nbrtop, 0,
     $     a(1,s-1), nx, MPI_DOUBLE_PRECISION, nbrbottom, 0,
     $     comm1d, status, ierr )
      call MPI_SENDRECV(
     $     a(1,s), nx, MPI_DOUBLE_PRECISION, nbrbottom, 1,
     $     a(1,e+1), nx, MPI_DOUBLE_PRECISION, nbrtop, 1,
     $     comm1d, status, ierr )
      return
      end
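The effect of the two exchange steps can be illustrated with a serial C model in which every "process" is simply a slab of one shared array. This is only a sketch of the data movement, not MPI code: the function name `exchange_ghosts_model` and the storage layout are assumptions, with local column 1 playing the role of `s` and local column `nloc` the role of `e`.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical serial model of the ghost-point exchange.
 * slabs holds nprocs slabs back to back; each slab has nloc + 2 columns
 * of nx doubles. Column 0 is the bottom ghost (s-1), columns 1..nloc are
 * owned (s..e), and column nloc+1 is the top ghost (e+1). */
void exchange_ghosts_model(int nprocs, int nx, int nloc, double *slabs)
{
    size_t col  = (size_t)nx;                  /* doubles per column */
    size_t slab = (size_t)(nloc + 2) * col;    /* doubles per slab   */

    for (int p = 0; p + 1 < nprocs; p++) {
        double *lower = slabs + (size_t)p * slab;   /* process p   */
        double *upper = lower + slab;               /* process p+1 */

        /* Column e of p goes to ghost column s-1 of p+1 (tag 0 step). */
        memcpy(upper, lower + (size_t)nloc * col, col * sizeof(double));

        /* Column s of p+1 goes to ghost column e+1 of p (tag 1 step). */
        memcpy(lower + (size_t)(nloc + 1) * col, upper + col,
               col * sizeof(double));
    }
}
```

The outermost ghost columns (bottom of the first slab, top of the last) are left untouched, which corresponds to the neighbor being MPI_PROC_NULL in the parallel code.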
44 Implementation of the Jacobi Iteration-1
      program main
      include "mpif.h"
      integer maxn
      parameter (maxn = 128)
      double precision a(maxn,maxn), b(maxn,maxn), f(maxn,maxn)
      integer nx, ny
      integer myid, numprocs, ierr
      integer comm1d, nbrbottom, nbrtop, s, e, it
      double precision diff, diffnorm, dwork
      double precision t1, t2
      double precision MPI_WTIME
      external MPI_WTIME
      external diff
      call MPI_INIT( ierr )
      call MPI_COMM_RANK( MPI_COMM_WORLD, myid, ierr )
      call MPI_COMM_SIZE( MPI_COMM_WORLD, numprocs, ierr )
45 Implementation of the Jacobi Iteration-2
      if (myid .eq. 0) then
c
c        Get the size of the problem
c
         print *, 'Enter nx'
         read *, nx
         nx = 110
      endif
      call MPI_BCAST(nx,1,MPI_INTEGER,0,MPI_COMM_WORLD,ierr)
      ny = nx
c     Get a new communicator for a decomposition of the domain
      call MPI_CART_CREATE(MPI_COMM_WORLD,1,numprocs,.false.,
     $                     .true.,comm1d,ierr )
46 Implementation of the Jacobi Iteration-3
c
c     Get my position in this communicator, and my neighbors
c
      call MPI_COMM_RANK(comm1d,myid,ierr)
      call MPI_Cart_shift(comm1d,0,1,nbrbottom,nbrtop,ierr)
c
c     Compute the actual decomposition
c
      call MPE_DECOMP1D(ny,numprocs,myid,s,e )
c
c     Initialize the right-hand-side (f) and the initial solution
c     guess (a)
c
      call onedinit( a, b, f, nx, s, e )
47 Implementation of the Jacobi Iteration-4
C     Actually do the computation. Note the use of a collective
C     operation to check for convergence, and a do-loop to bound the
C     number of iterations.
      call MPI_BARRIER( MPI_COMM_WORLD, ierr )
      t1 = MPI_WTIME()
      do 10 it=1, 100
         call exchng1( a, nx, s, e, comm1d, nbrbottom, nbrtop )
         call sweep1d( a, f, nx, s, e, b )
         call exchng1( b, nx, s, e, comm1d, nbrbottom, nbrtop )
         call sweep1d( b, f, nx, s, e, a )
         dwork = diff( a, b, nx, s, e )
         call MPI_Allreduce( dwork, diffnorm, 1,
     $        MPI_DOUBLE_PRECISION, MPI_SUM, comm1d, ierr )
         if (diffnorm .lt. 1.0e-5) goto 20
         if (myid .eq. 0) print *, 2*it, ' Difference is ', diffnorm
10    continue
48 Implementation of the Jacobi Iteration-5
      if (myid .eq. 0) print *, 'Failed to converge'
20    continue
      t2 = MPI_WTIME()
      if (myid .eq. 0) then
         print *, 'Converged after ', 2*it, ' Iterations in ',
     $        t2 - t1, ' secs '
      endif
c
      call MPI_FINALIZE(ierr)
      end
49 Implementation of the Jacobi Iteration-6
c     Perform a Jacobi sweep for a 1-d decomposition.
c     Sweep from a into b
      subroutine sweep1d( a, f, nx, s, e, b )
      integer nx, s, e
      double precision a(0:nx+1,s-1:e+1), f(0:nx+1,s-1:e+1),
     +                 b(0:nx+1,s-1:e+1)
      integer i, j
      double precision h
      h = 1.0d0 / dble(nx+1)
      do 10 j=s, e
         do 10 i=1, nx
            b(i,j) = 0.25 * (a(i-1,j)+a(i,j+1)+a(i,j-1)+a(i+1,j)
     +               - h * h * f(i,j))
10    continue
      return
      end
50 Implementation of the Jacobi Iteration-7
c
c     The rest of the 1-d program
      double precision function diff( a, b, nx, s, e )
      integer nx, s, e
      double precision a(0:nx+1,s-1:e+1), b(0:nx+1,s-1:e+1)
      double precision sum
      integer i, j
      sum = 0.0d0
      do 10 j=s,e
         do 10 i=1,nx
            sum = sum + (a(i,j) - b(i,j)) ** 2
10    continue
      diff = sum
      return
      end
51 Timing for variants of the 1-D decomposition of the Poisson problem
Table columns: P, Blocking Send, Ordered Send, Sendrecv, Buffered Send, Nonblocking Isend
e~20% (1/14 faster than 1proc)
53 Grid computing
More informationNumerical Modelling in Fortran: day 6. Paul Tackley, 2017
Numerical Modelling in Fortran: day 6 Paul Tackley, 2017 Today s Goals 1. Learn about pointers, generic procedures and operators 2. Learn about iterative solvers for boundary value problems, including
More informationSlides prepared by : Farzana Rahman 1
Introduction to MPI 1 Background on MPI MPI - Message Passing Interface Library standard defined by a committee of vendors, implementers, and parallel programmers Used to create parallel programs based
More informationRecap of Parallelism & MPI
Recap of Parallelism & MPI Chris Brady Heather Ratcliffe The Angry Penguin, used under creative commons licence from Swantje Hess and Jannis Pohlmann. Warwick RSE 13/12/2017 Parallel programming Break
More informationAn Investigation into Iterative Methods for Solving Elliptic PDE s Andrew M Brown Computer Science/Maths Session (2000/2001)
An Investigation into Iterative Methods for Solving Elliptic PDE s Andrew M Brown Computer Science/Maths Session (000/001) Summary The objectives of this project were as follows: 1) Investigate iterative
More informationLecture 6. Floating Point Arithmetic Stencil Methods Introduction to OpenMP
Lecture 6 Floating Point Arithmetic Stencil Methods Introduction to OpenMP Announcements Section and Lecture will be switched next week Thursday: section and Q2 Friday: Lecture 2 Today s lecture Floating
More informationBuffering in MPI communications
Buffering in MPI communications Application buffer: specified by the first parameter in MPI_Send/Recv functions System buffer: Hidden from the programmer and managed by the MPI library Is limitted and
More informationIntroduction to TDDC78 Lab Series. Lu Li Linköping University Parts of Slides developed by Usman Dastgeer
Introduction to TDDC78 Lab Series Lu Li Linköping University Parts of Slides developed by Usman Dastgeer Goals Shared- and Distributed-memory systems Programming parallelism (typical problems) Goals Shared-
More informationMA471. Lecture 5. Collective MPI Communication
MA471 Lecture 5 Collective MPI Communication Today: When all the processes want to send, receive or both Excellent website for MPI command syntax available at: http://www-unix.mcs.anl.gov/mpi/www/ 9/10/2003
More informationParallel I/O. and split communicators. David Henty, Fiona Ried, Gavin J. Pringle
Parallel I/O and split communicators David Henty, Fiona Ried, Gavin J. Pringle Dr Gavin J. Pringle Applications Consultant gavin@epcc.ed.ac.uk +44 131 650 6709 4x4 array on 2x2 Process Grid Parallel IO
More informationParallel Poisson Solver in Fortran
Parallel Poisson Solver in Fortran Nilas Mandrup Hansen, Ask Hjorth Larsen January 19, 1 1 Introduction In this assignment the D Poisson problem (Eq.1) is to be solved in either C/C++ or FORTRAN, first
More informationPractical Introduction to Message-Passing Interface (MPI)
1 Outline of the workshop 2 Practical Introduction to Message-Passing Interface (MPI) Bart Oldeman, Calcul Québec McGill HPC Bart.Oldeman@mcgill.ca Theoretical / practical introduction Parallelizing your
More informationMPI introduction - exercises -
MPI introduction - exercises - Introduction to Parallel Computing with MPI and OpenMP P. Ramieri May 2015 Hello world! (Fortran) As an ice breaking activity try to compile and run the Helloprogram, either
More informationMPI. Communicating non-contiguous data or mixed datatypes Collective communications Virtual topologies
MPI 1 Communicating non-contiguous data or mixed datatypes Collective communications Virtual topologies 2 Communicating Non-contiguous Data or Mixed Datatypes The following strategies and features in MPI
More informationParallel Programming using MPI. Supercomputing group CINECA
Parallel Programming using MPI Supercomputing group CINECA Contents Programming with message passing Introduction to message passing and MPI Basic MPI programs MPI Communicators Send and Receive function
More informationMessage Passing with MPI
Message Passing with MPI PPCES 2016 Hristo Iliev IT Center / JARA-HPC IT Center der RWTH Aachen University Agenda Motivation Part 1 Concepts Point-to-point communication Non-blocking operations Part 2
More informationParallel Programming Using MPI
Parallel Programming Using MPI Short Course on HPC 15th February 2019 Aditya Krishna Swamy adityaks@iisc.ac.in SERC, Indian Institute of Science When Parallel Computing Helps? Want to speed up your calculation
More informationMultigrid Pattern. I. Problem. II. Driving Forces. III. Solution
Multigrid Pattern I. Problem Problem domain is decomposed into a set of geometric grids, where each element participates in a local computation followed by data exchanges with adjacent neighbors. The grids
More information