
1 ASTROPHYSIKALISCHES INSTITUT POTSDAM AIP Helmholtz school Introduction to MPI Stefan Gottlöber

2 Topics
Basics of parallel programming
Calculation of π (an example for the basic structure of MPI programs and the possible combination with OpenMP)
Direct integration (an example for message passing in MPI programs and for the scaling of MPI programs; developing this MPI program could be your exercise during the first week)
ART-MPI (an example for some more elaborate programs)

3 Modern methods in science
Numerical simulations are used to prove or disprove observations, because experiments
are impossible (astrophysics)
are too expensive
are too time consuming
...

4 Methods of parallelization
OpenMP: needs a computer with shared memory (JUMP at NIC, 107 Gb; NASA's COLUMBIA)
MPI: works on distributed memory (in general more memory available)

5 What is OpenMP?
If you don't know it already, you will learn about OpenMP tomorrow during Anatoly's lecture.

6 What is MPI?
Message Passing Interface: libraries designed to be a standard for parallel computing on distributed memory. Goal: to be practical, portable, efficient, and flexible.
MPI history:
1980s - early 1990s: distributed memory parallel computing develops, the need for a standard arose
April 1992: Workshop on Standards for Message Passing in a Distributed Memory Environment
November 1992: meeting in Minneapolis, MPI draft proposal (MPI1)
November 1993: Supercomputing 93, draft MPI standard
1995: MPI1 standard
1997: MPI2 standard

7 OpenMP vs MPI: Overview

8 How to install MPI?
The MPI home page is maintained at Argonne National Laboratory. Standards, archives, documentation and links to implementations are available.
MPI is a library of
subroutines for Fortran
functions for C
classes and methods for C++

9 How to install MPI?
User programs are compiled as usual and then linked with the appropriate MPI libraries. Implementations are:
MPICH ( is available from Argonne National Laboratory. It is free, easily downloaded and can be installed at the user level (i.e., without superuser privileges). Subroutines are provided for Fortran 90, C and C++. The CH in MPICH stands for Chameleon, symbol of adaptability to one's environment and thus of portability. Chameleons are fast, and from the beginning a secondary goal was to give up as little efficiency as possible for the portability.
LAM/MPI ( is available from Indiana University. LAM stands for Local Area Multicomputer.
WMPI II ( is a commercial (but free to academics) implementation for Windows.

10 Which tasks can be parallelized by MPI?
Trivial parallel programs
parameter studies
analysis of many time steps
image processing
Independent tasks
N-body interaction
halo finding and treatment
density field (smoothed)
Problems in both cases
Are the tasks scalable over many CPUs? Over how many?
Load balance (do all CPUs work, or do many lie idle?)

11 Examples: Trivial parallel programs
calculation of π by different methods (one CPU rivals the others: which method is faster or more accurate?); this little task is an excellent exercise for the combination of OpenMP and MPI
Evolution of many clusters of galaxies (Hitachi project: 8 nodes with 8 processors on each node, 8 MPI processes, each with 8 OpenMP threads)

12 Calculation of π
      IMPLICIT REAL*8 (A-H,O-Z)
      IMPLICIT INTEGER*4 (I-N)
      include 'mpif.h'
      N   =
      pii =                     ! Num. Rec. p. 914
      CALL mpi_init(ierr)
      CALL mpi_comm_size(mpi_comm_world, msize, ierr)
      CALL mpi_comm_rank(mpi_comm_world, mrank, ierr)
      CALL Calc_PI(pi, N, mrank)
      ...
      CALL mpi_finalize(ierr)
      end

      SUBROUTINE Calc_PI(pi, N, mrank)
      ...

13 on mpif.h
/* -*- Mode: Fortran; -*- */
!
!  (C) 2001 by Argonne National Laboratory.
!  See COPYRIGHT in top-level directory.
!
!  DO NOT EDIT
!  This file created by buildiface
!
      INTEGER MPI_SOURCE, MPI_TAG, MPI_ERROR
      PARAMETER (MPI_SOURCE=3, MPI_TAG=4, MPI_ERROR=5)
      ...
Your compiler will see that file if you have the right environment:
source /opt/env/pgi-mpich sh

14 mpi_init
CALL mpi_init(ierr)
Initializes the MPI execution environment. This function must be called in every MPI program, must be called before any other MPI function, and must be called only once in an MPI program.

15 mpi_comm_size
CALL mpi_comm_size(mpi_comm_world, msize, ierr)
Determines the number of processes msize in the group associated with a communicator. Generally used within the communicator MPI_COMM_WORLD to determine the number of processes being used by your application.

16 What is MPI_COMM_WORLD?
MPI uses objects called communicators and groups to define which collection of processes may communicate with each other. Most MPI routines require you to specify a communicator as an argument. MPI_COMM_WORLD is the predefined communicator which includes all of your MPI processes.

17 MPI_COMM_WORLD extension

18 mpi_comm_rank
CALL mpi_comm_rank(mpi_comm_world, mrank, ierr)
Determines the rank mrank of the calling process within the communicator. Initially, each process will be assigned a unique integer rank between 0 and number of processes - 1 within the communicator MPI_COMM_WORLD. This rank is often referred to as a task ID. If a process becomes associated with other communicators, it will have a unique rank within each of these as well.
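As a quick illustration, here is a minimal, self-contained program of my own (not part of the lecture material) that combines mpi_init, mpi_comm_size, mpi_comm_rank and mpi_finalize; the program name and the print statement are arbitrary.
      PROGRAM hello_rank
      IMPLICIT NONE
      include 'mpif.h'
      INTEGER ierr, msize, mrank
c     set up the MPI environment
      CALL mpi_init(ierr)
c     how many processes are there, and which one am I?
      CALL mpi_comm_size(MPI_COMM_WORLD, msize, ierr)
      CALL mpi_comm_rank(MPI_COMM_WORLD, mrank, ierr)
      write(*,*) 'Hello from rank', mrank, 'of', msize
c     clean shutdown of MPI
      CALL mpi_finalize(ierr)
      END
Compile it with the MPI wrapper of your Fortran compiler (e.g. mpif77 or mpif90) and start it with mpirun.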

19 mpi_finalize
CALL mpi_finalize(ierr)
Terminates the MPI execution environment. This function should be the last MPI routine called in every MPI program; no other MPI routines may be called after it.

20 Coming back to the calculation of π
      include 'mpif.h'
      N   =
      pii =                     ! Num. Rec. p. 914
      CALL mpi_init(ierr)
      CALL mpi_comm_size(mpi_comm_world, msize, ierr)
      CALL mpi_comm_rank(mpi_comm_world, mrank, ierr)
      CALL Calc_PI(pi, N, mrank)
      ...
      CALL mpi_finalize(ierr)
      end

      SUBROUTINE Calc_PI(pi, N, mrank)
Now you can use different series on different processors to calculate π and check speed, convergence, ...

21 Important note
Don't use within MPI programs commands like the following:
      IF (error .gt. error_max) STOP 'increase accuracy'
One node would stop. During parallelization of your serial program you should replace such lines by something similar to
      IF (error .gt. error_max) THEN
         write(*,*) 'increase accuracy'
         call mpi_abort(mpi_comm_world, ierr1, ierr2)
         STOP
      ENDIF
which terminates all MPI processes associated with the communicator. In most MPI implementations it terminates ALL processes, regardless of the communicator specified.

22 Summary: structure of MPI programs

23 first ART MPI code: Evolution of many clusters of galaxies
c     ====================================================
c
c         Adaptive Refinement Tree (ART) N-body solver
c
c         Version 3 - February 1997
c
c         Andrey Kravtsov, Anatoly Klypin, Alexei Khokhlov
c
c     ====================================================
c
c     this is a simple test version for MPI
c     changes only in ART_Main.f and ART_IO.f
c
      program ART
      include 'mpif.h'

24
      ... some more initialisation for ART
      CALL mpi_init(ierr)
      CALL mpi_comm_size(mpi_comm_world, isize, ierr)
      CALL mpi_comm_rank(mpi_comm_world, irank, ierr)
      ... the main ART program
      ... read data
      ... do loop n steps
      ...    integrate one step
      ...    decide whether results should be written to disk
      ... enddo
      CALL mpi_finalize(ierr)
      STOP
      END

25
c
      SUBROUTINE construct_name(name, in1, jn1)
c
c     purpose: construct file names for the output from different nodes
c              into different directories
c
      include 'mpif.h'
      CHARACTER*120 name, tmp3
      CHARACTER*5   tmp1
      CHARACTER*1   tmp2
      tmp1 = 'node_'
      tmp2 = '/'
      CALL mpi_comm_rank(mpi_comm_world, i_node, ierr)
      tmp3 = name
      CALL get_name(tmp3, in1, jn1)
      write(name, '(a,i1,a,a)') tmp1, i_node, tmp2, tmp3(in1:jn1)
      CALL get_name(name, in1, jn1)
      write(*,*) name(in1:jn1)

26    END
That's all the changes! Each MPI process reads its own data (one cluster of galaxies) and integrates it completely independently of the other tasks. No communication.
You will run into problems if there is any STOP in the code.
Also, nothing is done here concerning load balance: less massive clusters will finish earlier than more massive ones.

27 Examples: Independent tasks
N-body code: the interaction between any two particles does not depend on all the other particles.
Straightforward parallelization (for example the direct integration code): more communication
Parallelization of tasks in different sub-volumes (for example the MPI version of ART): less communication, but problems with load balance

28 Direct integration
N particles, position x, velocity v
move all particles to a new position after Δt
use the leap-frog scheme
calculate the movement of a subset of N_p ≈ N/N_CPU particles on each of the N_CPU processors
simple to parallelize; however, all nodes need to know all positions and velocities (not really a disadvantage on present-day computers with large memory)

29 Leap frog scheme
Define positions x and forces at time t, time step n. Define velocities v at time t + Δt/2, time step n + 1/2. Then we have for particle i
x_i^{n+1} = x_i^n + v_i^{n+1/2} \, \Delta t   (1)
v_i^{n+1/2} = v_i^{n-1/2} + F_i(x_i^n) \, \Delta t / m   (2)
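Written as code, one leap-frog step looks like the following; this is my own sketch (not the lecture code), and the array names as well as the assumption that the forces F have already been evaluated at x^n are illustrative.
c     one leap-frog step for Np particles in one dimension:
c     on entry x holds x^n, v holds v^{n-1/2}, F holds F(x^n)
      SUBROUTINE leapfrog_step(Np, x, v, F, m, dt)
      IMPLICIT NONE
      INTEGER Np, i
      REAL*8  x(Np), v(Np), F(Np), m, dt
      DO i = 1, Np
c        kick: v^{n-1/2} -> v^{n+1/2}, eq. (2)
         v(i) = v(i) + F(i)*dt/m
c        drift: x^n -> x^{n+1} with the half-step velocity, eq. (1)
         x(i) = x(i) + v(i)*dt
      ENDDO
      RETURN
      END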

30 Initial conditions
To start the integration we need the initial positions of all particles x and their velocities v at two separate times: x(t_0) and v(t_0 - Δt/2).
See Anatoly's lecture about initial conditions (PMstartM.f).

31 Accuracy of the leap frog scheme
x_i^{n+1} = x_i^n + v_i^{n+1/2} \, \Delta t   (3)
v_i^{n+1/2} = v_i^{n-1/2} + F_i(x_i^n) \, \Delta t / m   (4)
Substitute v_i^{n-1/2} in the second equation using the first:
v_i^{n+1/2} = ( x_i^n - x_i^{n-1} ) / \Delta t + F_i(x_i^n) \, \Delta t / m   (5)
Substitute back into the first equation,
x_i^{n+1} = x_i^n + ( x_i^n - x_i^{n-1} ) + F_i(x_i^n) \, (\Delta t)^2 / m   (6)
and we get the central difference formula for F = ma:
( x_i^{n+1} - 2 x_i^n + x_i^{n-1} ) / \Delta t^2 = F_i(x_i^n) / m   (7)

32 Accuracy of the leap frog scheme
Let us assume that X is the true solution:
( X_i^{n+1} - 2 X_i^n + X_i^{n-1} ) / \Delta t^2 = F_i(X_i^n) / m + \delta   (8)
Insert the Taylor expansion for X_i^{n+1} and X_i^{n-1}, thus
X_i^{n+1} - 2 X_i^n + X_i^{n-1} = \Delta t^2 \, \frac{d^2 X}{dt^2} + \frac{\Delta t^4}{12} \, \frac{d^4 X}{dt^4} + ...   (9)
Substitute back and get the truncation error O(\Delta t^2),
\delta = \frac{\Delta t^2}{12} \, \frac{d^4 X}{dt^4} + ...   (10)

33 Consistency of the leap frog scheme
As \Delta t \to 0 the difference equation converges to the differential equation
\frac{d^2 X}{dt^2} = F(x)/m   (11)
and it is also a symplectic method (time symmetric): the scheme has the same accuracy for negative \Delta t.

34 Truncation error vs. round-off error
Truncation error
can be reduced by a smaller step \Delta t
can be reduced by a higher-order algorithm
is not related to round-off error
Round-off error
representation of real numbers with a finite number of bits
can be reduced by higher precision (64 bit, REAL*8)
can be reduced also by careful ordering of operations

35 nbody_par.f
Reading by root
Distribution of tasks
Load balance: N_p particles per processor out of N particles
Broadcast to all processors
move particles on each processor
distribute moved particles to all processors
root writes to disk

36 nbody_par.f
      INTEGER Np_on_rank(maxrank+1)
      ...
      CALL MPI_COMM_RANK( MPI_COMM_WORLD, mrank, ierr )
      mroot = 0
      IF (mrank .eq. mroot) THEN
         ... read the data
      ENDIF
      Np_per_process = N/msize
Write the first and the last particle number for each processor into the integer array Np_on_rank(maxrank+1). Note that in this construction Np_per_process * msize is not necessarily equal to N, thus the last CPU may get (much) more particles than the others = bad load balance (a more even split is sketched below).
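A possible remedy, sketched here as my own example rather than the lecture code: spread the remainder N - (N/msize)*msize over the first ranks, so that no processor gets more than one extra particle. Only the array name Np_on_rank is taken from the slide.
      SUBROUTINE fill_np_on_rank(N, msize, maxrank, Np_on_rank)
      IMPLICIT NONE
      INTEGER N, msize, maxrank, Np_on_rank(maxrank+1)
      INTEGER i, nadd, Np_per_process, nrest
c     rank i (counting from 0) handles particles
c     Np_on_rank(i+1)+1 ... Np_on_rank(i+2)
      Np_per_process = N/msize
      nrest          = N - Np_per_process*msize
      Np_on_rank(1)  = 0
      DO i = 1, msize
         nadd = Np_per_process
c        the first nrest ranks get one extra particle
         IF (i .le. nrest) nadd = nadd + 1
         Np_on_rank(i+1) = Np_on_rank(i) + nadd
      ENDDO
      RETURN
      END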

37 nbody_par.f
Now the root process has all the necessary information. This information has to be distributed to all the other processors. Root has to tell them which processor has which tasks.
= Message passing in systems with distributed memory

38 Message passing
Every processor has its own local memory which can be accessed directly only by its own CPU. We have to distribute data from root to all processors over the network.

39 Message passing
A synchronous send operation will complete only after acknowledgment that the message was safely received by the receiving process. Asynchronous send operations may complete even though the receiving process has not actually received the message.

40 Point to Point Communication
MPI_SEND (buf,count,datatype,dest,tag,comm,ierr)
The basic blocking send operation returns only after the application buffer in the sending task is free for reuse. Note that this routine may be implemented differently on different systems. The MPI standard permits the use of a system buffer but does not require it. Some implementations may actually use a synchronous send (block longer, until the destination process has started to receive the message) to implement the basic blocking send.

41 Using a system buffer

42 Point to Point Communication
MPI_SEND (buf,count,datatype,dest,tag,comm,ierr)
Buffer: address space which references the data that is to be sent or received = variable name that is to be sent/received
Count: number of data elements of the particular type to be sent
Data Type: MPI data type (next slide)
Destination: this argument indicates the process where the message should be delivered (rank of the receiving process).

43 Tag: arbitrary non-negative integer ( ) assigned by the programmer to uniquely identify a message. Send and receive operations should match message tags.
Communicator: the predefined communicator MPI_COMM_WORLD is usually used

44 Message passing - MPI data types
MPI data types           Fortran data types
MPI_INTEGER              INTEGER
MPI_REAL                 REAL
MPI_DOUBLE_PRECISION     DOUBLE PRECISION
MPI_COMPLEX              COMPLEX
MPI_LOGICAL              LOGICAL
MPI_CHARACTER            CHARACTER(1)
MPI_BYTE                 8 binary digits
MPI_PACKED               data (un)packed with MPI_Pack (MPI_Unpack)

45 Point to Point Communication
MPI_RECV (buf,count,datatype,source,tag,comm,status,ierr)
Source: this argument indicates the originating process of the message (rank of the sending process). This may be set to the wild card MPI_ANY_SOURCE to receive a message from any task.
Status: for a receive operation, indicates the source of the message and the tag of the message. In Fortran it is an integer array of size MPI_STATUS_SIZE.
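A minimal point-to-point example of my own (not from the slides): rank 0 sends one double precision value to rank 1; the tag 17 and the variable names are arbitrary.
      PROGRAM send_recv_demo
      IMPLICIT NONE
      include 'mpif.h'
      INTEGER ierr, mrank, msize, istatus(MPI_STATUS_SIZE)
      REAL*8  val
      CALL mpi_init(ierr)
      CALL mpi_comm_size(MPI_COMM_WORLD, msize, ierr)
      CALL mpi_comm_rank(MPI_COMM_WORLD, mrank, ierr)
      IF (mrank .eq. 0) THEN
c        rank 0 sends one number to rank 1
         val = 3.14159d0
         CALL mpi_send(val, 1, MPI_DOUBLE_PRECISION, 1, 17,
     +                 MPI_COMM_WORLD, ierr)
      ELSE IF (mrank .eq. 1) THEN
c        rank 1 receives it; istatus holds source and tag
         CALL mpi_recv(val, 1, MPI_DOUBLE_PRECISION, 0, 17,
     +                 MPI_COMM_WORLD, istatus, ierr)
         write(*,*) 'rank 1 received', val
      ENDIF
      CALL mpi_finalize(ierr)
      END
Run it with at least two processes; all other ranks simply pass through to mpi_finalize.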

46 nbody_par.f
Distribute data from root to all processors
      ...
      CALL MPI_Bcast(Np, 1, MPI_INTEGER, mroot, MPI_COMM_WORLD, ierr)
      CALL MPI_Bcast(dt, 1, MPI_DOUBLE_PRECISION,
     +               mroot, MPI_COMM_WORLD, ierr)
      nsend = 10*Nmax
      CALL MPI_Bcast(Coords, nsend, MPI_DOUBLE_PRECISION,
     +               mroot, MPI_COMM_WORLD, ierr)
      CALL MPI_Bcast(Np_on_rank, maxrank,
     +               MPI_INTEGER, mroot, MPI_COMM_WORLD, ierr)
      ...
where we have defined in the original serial program
      PARAMETER (Nmax = 50000)    ! maximum number of particles
      REAL*8 Coords
      COMMON /MAINDATA/ Coords(10,Nmax)

47 MPI_Bcast
MPI_BCAST (buffer,count,datatype,root,comm,ierr)
Broadcasts (sends) a message from the process with rank root to all other processes in the group.
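Since MPI_BCAST is a collective call, every rank of the communicator must execute it (compare the deadlock slide near the end). A small sketch of my own, with Nsteps and the read statement as placeholders (mrank, mroot and ierr as on the previous slides):
      INTEGER Nsteps
c     only root reads the input value ...
      IF (mrank .eq. mroot) read(*,*) Nsteps
c     ... but all ranks, including root, take part in the broadcast
      CALL MPI_Bcast(Nsteps, 1, MPI_INTEGER, mroot,
     +               MPI_COMM_WORLD, ierr)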

48 nbody_par.f
      Do i = 1, nsteps                 ! main loop
         Call GetAccelerations_NP
         Call MoveParticles
         time  = time + dt
         istep = istep + 1
c        distribute particles
         CALL Send_Receive()
Distribute new positions and velocities after each time step to all processors. Each processor has to send data to and receive data from all other processors.

49 MPI_ALLGATHER
Collect data from all tasks and distribute them to all tasks in a group. Each task in the group, in effect, performs a one-to-all broadcasting operation within the group.
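A minimal MPI_ALLGATHER sketch of my own (not from the lecture): every rank contributes its rank number, and afterwards every rank holds the complete list; maxrank and the array names are illustrative.
      PROGRAM allgather_demo
      IMPLICIT NONE
      include 'mpif.h'
      INTEGER maxrank
      PARAMETER (maxrank = 128)
      INTEGER ierr, mrank, msize, i, myval, allvals(maxrank)
      CALL mpi_init(ierr)
      CALL mpi_comm_size(MPI_COMM_WORLD, msize, ierr)
      CALL mpi_comm_rank(MPI_COMM_WORLD, mrank, ierr)
c     each rank sends one integer and receives one integer per rank
      myval = mrank
      CALL mpi_allgather(myval,   1, MPI_INTEGER,
     +                   allvals, 1, MPI_INTEGER,
     +                   MPI_COMM_WORLD, ierr)
      write(*,*) 'rank', mrank, 'sees', (allvals(i), i=1,msize)
      CALL mpi_finalize(ierr)
      END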

50 MPI_ALLGATHER
MPI_ALLGATHER (sendbuf,sendcount,sendtype,recvbuf, recvcount,recvtype,comm,ierr)
sendbuf: starting address of the send buffer (Fortran variable)
sendcount: number of data elements in the send buffer (integer)
sendtype: MPI data type
recvbuf: address of the receive buffer (Fortran variable)
recvcount: number of elements received from any process (integer)

51 recvtype: MPI data type ( = sendtype)

52 MPI_ALLGATHERV
MPI_ALLGATHERV extends the functionality of MPI_ALLGATHER by allowing a varying count of data to be sent from each process.
MPI_ALLGATHERV (sendbuf,sendcount,sendtype,recvbuf, recvcounts,displs,recvtype,comm,ierr)
recvcounts: integer array of length group size (msize) containing the number of elements that are received from each process
displs: integer array of length group size (msize). The entry i specifies the displacement (relative to recvbuf) at which to place the incoming data from process i.

53 subroutine Send_Receive()
c
c     distribute particles
c
      SUBROUTINE Send_Receive()
c
      INCLUDE 'nbody_par.h'
      INTEGER sendcount, recvcount(msize), rdispl(msize)
      REAL*8  send(np), receive(np)
      Do i = 1, msize
         rdispl(i)    = Np_on_rank(i)
         recvcount(i) = Np_on_rank(i+1) - Np_on_rank(i)
      ENDDO
      istart    = Np_on_rank(mrank+1) + 1
      iend      = Np_on_rank(mrank+2)
      sendcount = Np_on_rank(mrank+2) - Np_on_rank(mrank+1)

54 Note: In our example all processes already know how many particles are handled by the different processors (the information is stored in the array Np_on_rank). Thus each processor can calculate the amount of data which it receives and the corresponding displacement. In a more general case this information must be distributed before calling MPI_ALLGATHERV.

55
      DO k = 1, 10
         DO i = 1, sendcount
            ii      = istart - 1 + i
            send(i) = Coords(k,ii)
         ENDDO
         CALL mpi_allgatherv(send, sendcount, MPI_REAL8,
     +        receive, recvcount, rdispl, MPI_REAL8,
     +        MPI_COMM_WORLD, ierr)
         CALL MPI_BARRIER(MPI_COMM_WORLD, ierrbar)
         DO i = 1, Np
            Coords(k,i) = receive(i)
         ENDDO
      ENDDO
      RETURN
      End

56 MPI_BARRIER
CALL MPI_BARRIER(MPI_COMM_WORLD, ierrbar)
Creates a barrier synchronization in a group. It blocks the calling process until all group members have called it; i.e. the call returns at any process only after all group members have entered the call.
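One common use, shown here as my own example rather than lecture material, is to synchronize all ranks before timing a code section with MPI_WTIME, so that every process starts the clock at the same point (mrank, mroot and ierr as above):
      REAL*8 t0, t1
c     make sure everybody has finished the previous phase
      CALL MPI_BARRIER(MPI_COMM_WORLD, ierr)
      t0 = MPI_WTIME()
c     ... the work to be timed, e.g. one integration step ...
      CALL MPI_BARRIER(MPI_COMM_WORLD, ierr)
      t1 = MPI_WTIME()
      IF (mrank .eq. mroot) write(*,*) 'step took', t1-t0, 'seconds'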

57 nbody_par.f
      Do i = 1, nsteps             ! main loop
         ...
         if (mod(istep,1000) .eq. 0) then
            Open(ifile, file=log_file, position='append')
            ...
            close(ifile)
         endif
         ...
It is useful to write information (for example timing) for each processor into separate log files constructed like:
      WRITE(a6, '(I6)') mrank
      log_file = 'DATA/timing_' // a6(5:6) // '.log'
      ifile    = 100 + mrank

58 Scaling behavior on octopus
Computation (black) and communication (gray) times. This is only an example!
200 particles, ... steps:
1 CPU: ... s;  2 CPUs: ... s;  4 CPUs: ... s;  8 CPUs: ... s;  16 CPUs: ... s (200/16 = 12.5, i.e. ...)
too simple distribution of tasks

59 Scaling behavior on octopus
2000 particles, time measured for ... integration steps
processors | time | speedup | efficiency | particles per CPU (table values not recovered)
speedup = \frac{\text{sequential execution time}}{\text{parallel execution time}}   (12)
efficiency = \frac{\text{sequential execution time}}{\text{processors used} \times \text{parallel execution time}}   (13)
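As a quick illustration with invented numbers: a serial run of 100 s and an 8-processor run of 16 s give
speedup = \frac{100\ \text{s}}{16\ \text{s}} = 6.25, \qquad efficiency = \frac{6.25}{8} \approx 0.78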

60 test it!
Parallelization and testing of the direct integration N-body code will be your homework during the first week.
A serial version of the code is available at:

61 Performance analysis
Speedup ψ(n, p) for a problem of size n on p processors. We have three categories of operations:
Computations that must be performed sequentially: σ(n)
Computations that can be performed in parallel: ϕ(n)
Parallel overhead (communication operations, redundant computations, load balance): κ(n, p)
Then the speedup ψ(n, p) is
\psi(n,p) \le \frac{\sigma(n) + \varphi(n)}{\sigma(n) + \varphi(n)/p + \kappa(n,p)}   (14)
and the efficiency 0 \le \epsilon(n,p) \le 1 is
\epsilon(n,p) \le \frac{\sigma(n) + \varphi(n)}{p\,\sigma(n) + \varphi(n) + p\,\kappa(n,p)}   (15)

62 Amdahl's law
Let us neglect the overhead κ(n, p) and define the inherently sequential portion
f = \frac{\sigma(n)}{\sigma(n) + \varphi(n)}   (16)
of the computation. Then the speedup on a parallel computer with p processors is (Amdahl's law)
\psi \le \frac{1}{f + (1-f)/p}   (17)
This is particularly interesting for estimating the maximum speedup as p \to \infty.
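A worked example (the sequential fraction is chosen only for illustration): with f = 0.05, Amdahl's law gives
\psi \le \frac{1}{0.05 + 0.95/16} \approx 9.1 \quad \text{for } p = 16, \qquad \psi \le \frac{1}{0.05} = 20 \quad \text{for } p \to \infty
so even a small sequential portion limits the achievable speedup severely.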

63 Amdahl's law
(figure: speedup versus number of processors; parallel fraction = 1 - f)

64 ART MPI
Basic concept: to run the simulation using N_MPI MPI processes we divide the box into N_MPI sub-boxes in such a way that all sub-boxes will need approximately the same amount of computational time for one integration step. Each MPI process uses N_OMP CPUs within OpenMP, thus N_CPU = N_MPI * N_OMP. After each basic integration step the box is divided again into sub-boxes according to the best forecast of load balance.
Input/output via parallel reading/writing of N_MPI processors on N_MPI files. The files contain for each primary particle 9 variables (3 coordinates, 3 velocities, mass, individual time step, particle id).
Finding of sub-boxes is
easy for the initial conditions, where matter is distributed almost homogeneously
almost impossible after structures have developed
even more complicated for multi-mass realizations in the original box

65 an artist's view of the ART MPI simulation box

66 ART MPI
Example for the load balance in the WMAP run (80 h^-1 Mpc box size, ... particles, 64 MPI processes, 512 CPUs, done on COLUMBIA)

67 ART MPI
each sub-box is surrounded by a thin shell with primary particles m_p
more shells contain particles with increasing mass m_2 > m_1 > m_p
the rest of the box is filled with the most massive particles m_b > m_2

68 ART MPI
periodicity of the box is taken into account
each sub-box runs one integration step of the multi-mass version of ART (tidal fields represented by the more massive particles)
after each integration step new sub-boxes are determined

69 ART MPI
Main tasks for parallelization:
determine on each node which particles have to be sent to which nodes (Fortran)
construct the corresponding massive particles from the primary ones (Fortran)
each node has to inform all others about the particles it wishes to send (MPI_allgather)
all nodes send to all nodes their particles (MPI_alltoallv)
Advantages:
less communication
communication only after each basic integration step

70 ART MPI
Example:
5 sends only massive box particles m_b to ...
... sends only massive box particles m_b to 4
10 sends primary particles m_p to 11, as well as massive ones m_1, m_2, m_b
5 sends primary particles m_p to 4, as well as massive ones m_1, m_2, m_b

71 ART MPI - the main program
c     ART_MPI_Main.f
      ...
      CALL mpi_init(ierr)
      CALL mpi_comm_size(mpi_comm_world, mpisize, ierr)
      CALL mpi_comm_rank(mpi_comm_world, irank, ierr)
      IF (mpisize .NE. n_nodes) THEN
         write(*,*) 'mpisize .ne. n_nodes', mpisize, n_nodes
         call mpi_abort(mpi_comm_world, ierr1, ierr2)
         STOP
      ENDIF
      ...

72
      CALL Read_ART_MPI_Inp ()
C$OMP PARALLEL DO DEFAULT(SHARED)
C$OMP+PRIVATE ( ic1)
      do ic1 = 1, mcell
         var(1,ic1) = -1.0
         var(2,ic1) = -1.0
         ref(ic1)   = zero
         pot(ic1)   = zero
      enddo
      ...
      call Read_Control()
      call Read_Particles()
      ...

73
c     ... main loop over mstep (read from input) integration steps
      DO ijkl = 1, mstep
         ...
         CALL Send_Small ()
         CALL Send_Large ()
c        integrate one time step
         ...
         If (aexpn .lt. 0.6) Then
            call LoadBalance2
         Else
            call LoadBalance1
         EndIf
c        redistribution of primary particles
         call Redistribute_Primaries()

74
c        write output, if necessary
         call Save_Check ()
         ...
      ENDDO
 999  Continue
      CALL Save(0)
      CALL mpi_finalize(ierr)
      END

75 ART MPI - send small particles
c
      SUBROUTINE Send_Small()
c
c     purpose: gathers and sends small particles
c     input:   iN0(3) - coordinates of sending node
c     output:  sends particles, sets n_refin = number of particles
      ...
      node = irank + 1
      CALL Node_to_IJK(node, iN0)
      ...
      Do kn = 1, n_divz          ! Loop over other nodes
       Do jn = 1, n_divy
        Do in = 1, n_divx

76
c         find boundaries of two nodes in 3D
          CALL BoundNode(iN0, iN1, Nbound)
c         find primary particles which node iN0 will send to node iN1
          CALL Find_Small(iN0, Nbound, np_node, nn, ncount)
          ...
        EndDo
       EndDo
      EndDo
      CALL Send_Receive()
      ...
      RETURN
      End
An analogous routine exists for sending large particles.

77 ART MPI - send small particles
c
      SUBROUTINE Send_Receive()
c
c     purpose: sends and receives data for particles
      INTEGER sendcount(n_nodes), recvcount(n_nodes)
      INTEGER sendcount_all(n_nodes*n_nodes)
      ...
c     send integers with the lengths of all arrays which will be
c     sent (sendcount) and received (recvcount) to/from all nodes
      CALL mpi_allgather(sendcount, n_nodes, MPI_INTEGER,
     +                   sendcount_all, n_nodes, MPI_INTEGER,
     +                   MPI_COMM_WORLD, ierr)

78
      rdispl_new = 0
      DO i = 1, n_nodes
         rdispl(i)    = rdispl_new
         recvcount(i) = sendcount_all(irank+1+n_nodes*(i-1))
         rdispl_new   = rdispl(i) + recvcount(i)
      ENDDO
      CALL mpi_alltoallv(x_se, sendcount, sdispl, MPI_REAL8,
     +     x_re, recvcount, rdispl, MPI_REAL8,
     +     MPI_COMM_WORLD, ierr)
      CALL mpi_alltoallv(y_se, sendcount, sdispl, MPI_REAL8,
     +     y_re, recvcount, rdispl, MPI_REAL8,
     +     MPI_COMM_WORLD, ierr)
      CALL mpi_alltoallv(z_se, sendcount, sdispl, MPI_REAL8,
     +     z_re, recvcount, rdispl, MPI_REAL8,
     +     MPI_COMM_WORLD, ierr)
      CALL mpi_alltoallv(vx_se, sendcount, sdispl, MPI_REAL8,
     +     vx_re, recvcount, rdispl, MPI_REAL8,
     +     MPI_COMM_WORLD, ierr)

79
      CALL mpi_alltoallv(vy_se, sendcount, sdispl, MPI_REAL8,
     +     vy_re, recvcount, rdispl, MPI_REAL8,
     +     MPI_COMM_WORLD, ierr)
      CALL mpi_alltoallv(vz_se, sendcount, sdispl, MPI_REAL8,
     +     vz_re, recvcount, rdispl, MPI_REAL8,
     +     MPI_COMM_WORLD, ierr)
      CALL mpi_alltoallv(pt_se, sendcount, sdispl, MPI_REAL,
     +     pt_re, recvcount, rdispl, MPI_REAL,
     +     MPI_COMM_WORLD, ierr)
      CALL mpi_alltoallv(wpar_se, sendcount, sdispl, MPI_REAL,
     +     wpar_re, recvcount, rdispl, MPI_REAL,
     +     MPI_COMM_WORLD, ierr)
      CALL mpi_alltoallv(ip_se, sendcount, sdispl, MPI_INTEGER,
     +     ip_re, recvcount, rdispl, MPI_INTEGER,
     +     MPI_COMM_WORLD, ierr)
      RETURN
      END

80 MPI_ALLGATHER
Has been used to distribute particles in the direct integration example. Here we distribute information about how many particles each node is going to send to the other nodes (sendcount(n_nodes)), so that all nodes know how many particles arrive from the others (sendcount_all(n_nodes*n_nodes)). Having this information, each processor can calculate where it has to put the arriving recvcount particles, i.e. rdispl.

81 MPI_ALLTOALL
Each task in a group performs a scatter operation, sending a distinct message to all the tasks in the group in order by index.

82 MPI_ALLTOALL
MPI_ALLTOALL (sendbuf,sendcount,sendtype,recvbuf, recvcnt,recvtype,comm,ierr)
sendbuf: starting address of the send buffer (Fortran variable)
sendcount: number of data elements in the send buffer (integer)
sendtype: MPI data type
recvbuf: address of the receive buffer (Fortran variable)

83 recvcount: number of elements received from any process (integer)
recvtype: MPI data type ( = sendtype)
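A small MPI_ALLTOALL sketch of my own (not from the lecture): each rank sends one integer to every other rank, so with msize processes both buffers hold msize elements; the encoding 100*mrank + destination is arbitrary.
      PROGRAM alltoall_demo
      IMPLICIT NONE
      include 'mpif.h'
      INTEGER maxrank
      PARAMETER (maxrank = 128)
      INTEGER ierr, mrank, msize, i
      INTEGER sendbuf(maxrank), recvbuf(maxrank)
      CALL mpi_init(ierr)
      CALL mpi_comm_size(MPI_COMM_WORLD, msize, ierr)
      CALL mpi_comm_rank(MPI_COMM_WORLD, mrank, ierr)
c     element i of sendbuf goes to rank i-1
      DO i = 1, msize
         sendbuf(i) = 100*mrank + (i-1)
      ENDDO
      CALL mpi_alltoall(sendbuf, 1, MPI_INTEGER,
     +                  recvbuf, 1, MPI_INTEGER,
     +                  MPI_COMM_WORLD, ierr)
c     element i of recvbuf arrived from rank i-1
      write(*,*) 'rank', mrank, ':', (recvbuf(i), i=1,msize)
      CALL mpi_finalize(ierr)
      END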

84 MPI_ALLTOALLV
MPI_ALLTOALLV adds flexibility to MPI_ALLTOALL in that the location of the data for the send is specified by sdispls and the location of data on the receive side is specified by rdispls.
MPI_ALLTOALLV (sendbuf,sendcounts,sdispls,sendtype,recvbuf, recvcnts,rdispls,recvtype,comm,ierr)
sendcounts: now an integer array of length msize: the number of data elements to send to each processor
recvcnts: now an integer array of length msize: the number of data elements which can be received from each processor

85 sdispls: new: integer array of length msize specifying the displacement relative to sendbuf from which the data destined for process j has to be taken
rdispls: new: integer array of length msize specifying the displacement relative to recvbuf at which to place the incoming data from process i

86 After running MPI ART = analyze the data
an MPI version of the BDM halo finder exists (Arman Khalatyan)
an MPI version of the minimum spanning tree and friends-of-friends halo finder exists (Victor Turchaninov)

87 Bugs leading to a deadlock
A single process calls a collective function. Example: only root calls MPI_Bcast. Prevention: do not put collective communications inside conditionally executed parts of the code.
Two or more processes are trying to exchange data, but all call a blocking receive function (MPI_Recv) before sending. Prevention: you could use MPI_Sendrecv.
A process tries to receive data from a process that never will send it. Prevention: use collective communications whenever it is possible; if using point-to-point communication, use simple communication patterns.
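To illustrate the second point, a deadlock-free ring exchange with MPI_SENDRECV; this is my own sketch, not part of the lecture code: every rank sends one value to its right neighbour and receives one from its left neighbour in a single combined call.
      PROGRAM ring_demo
      IMPLICIT NONE
      include 'mpif.h'
      INTEGER ierr, mrank, msize, iright, ileft
      INTEGER istatus(MPI_STATUS_SIZE)
      REAL*8  sendval, recvval
      CALL mpi_init(ierr)
      CALL mpi_comm_size(MPI_COMM_WORLD, msize, ierr)
      CALL mpi_comm_rank(MPI_COMM_WORLD, mrank, ierr)
c     periodic neighbours in a ring of msize ranks
      iright  = mod(mrank+1, msize)
      ileft   = mod(mrank-1+msize, msize)
      sendval = dble(mrank)
c     the combined send+receive cannot deadlock, unlike two
c     blocking MPI_RECVs posted before the matching sends
      CALL mpi_sendrecv(sendval, 1, MPI_DOUBLE_PRECISION, iright, 0,
     +                  recvval, 1, MPI_DOUBLE_PRECISION, ileft,  0,
     +                  MPI_COMM_WORLD, istatus, ierr)
      write(*,*) 'rank', mrank, 'received', recvval, 'from', ileft
      CALL mpi_finalize(ierr)
      END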

88 Web pages about MPI
Writing Message-Passing Parallel Programs with MPI
SP Parallel Programming Workshop
The Message Passing Interface (MPI) standard
MPI: A Message-Passing Interface Standard
MPI-2: Extensions to the Message-Passing Interface

89 Books about MPI
Michael J. Quinn, Parallel Programming in C with MPI and OpenMP, McGraw-Hill, 2004, ISBN
Peter S. Pacheco, Parallel Programming with MPI, Morgan Kaufmann Publishers, 1997, ISBN
MPI: The Complete Reference (The MIT Press, ISBN )

More information

CS 6230: High-Performance Computing and Parallelization Introduction to MPI

CS 6230: High-Performance Computing and Parallelization Introduction to MPI CS 6230: High-Performance Computing and Parallelization Introduction to MPI Dr. Mike Kirby School of Computing and Scientific Computing and Imaging Institute University of Utah Salt Lake City, UT, USA

More information

Lecture 7: More about MPI programming. Lecture 7: More about MPI programming p. 1

Lecture 7: More about MPI programming. Lecture 7: More about MPI programming p. 1 Lecture 7: More about MPI programming Lecture 7: More about MPI programming p. 1 Some recaps (1) One way of categorizing parallel computers is by looking at the memory configuration: In shared-memory systems

More information

Scientific Computing

Scientific Computing Lecture on Scientific Computing Dr. Kersten Schmidt Lecture 21 Technische Universität Berlin Institut für Mathematik Wintersemester 2014/2015 Syllabus Linear Regression, Fast Fourier transform Modelling

More information

A short overview of parallel paradigms. Fabio Affinito, SCAI

A short overview of parallel paradigms. Fabio Affinito, SCAI A short overview of parallel paradigms Fabio Affinito, SCAI Why parallel? In principle, if you have more than one computing processing unit you can exploit that to: -Decrease the time to solution - Increase

More information

Introduction to Parallel Programming with MPI

Introduction to Parallel Programming with MPI Introduction to Parallel Programming with MPI PICASso Tutorial October 25-26, 2006 Stéphane Ethier (ethier@pppl.gov) Computational Plasma Physics Group Princeton Plasma Physics Lab Why Parallel Computing?

More information

CINES MPI. Johanne Charpentier & Gabriel Hautreux

CINES MPI. Johanne Charpentier & Gabriel Hautreux Training @ CINES MPI Johanne Charpentier & Gabriel Hautreux charpentier@cines.fr hautreux@cines.fr Clusters Architecture OpenMP MPI Hybrid MPI+OpenMP MPI Message Passing Interface 1. Introduction 2. MPI

More information

Programming with MPI

Programming with MPI Programming with MPI p. 1/?? Programming with MPI More on Datatypes and Collectives Nick Maclaren nmm1@cam.ac.uk May 2008 Programming with MPI p. 2/?? Less Basic Collective Use A few important facilities

More information

Optimization of MPI Applications Rolf Rabenseifner

Optimization of MPI Applications Rolf Rabenseifner Optimization of MPI Applications Rolf Rabenseifner University of Stuttgart High-Performance Computing-Center Stuttgart (HLRS) www.hlrs.de Optimization of MPI Applications Slide 1 Optimization and Standardization

More information

Decomposing onto different processors

Decomposing onto different processors N-Body II: MPI Decomposing onto different processors Direct summation (N 2 ) - each particle needs to know about all other particles No locality possible Inherently a difficult problem to parallelize in

More information

Advanced Parallel Programming

Advanced Parallel Programming Advanced Parallel Programming Networks and All-to-All communication David Henty, Joachim Hein EPCC The University of Edinburgh Overview of this Lecture All-to-All communications MPI_Alltoall MPI_Alltoallv

More information

Lecture 9: MPI continued

Lecture 9: MPI continued Lecture 9: MPI continued David Bindel 27 Sep 2011 Logistics Matrix multiply is done! Still have to run. Small HW 2 will be up before lecture on Thursday, due next Tuesday. Project 2 will be posted next

More information

MPI MESSAGE PASSING INTERFACE

MPI MESSAGE PASSING INTERFACE MPI MESSAGE PASSING INTERFACE David COLIGNON, ULiège CÉCI - Consortium des Équipements de Calcul Intensif http://www.ceci-hpc.be Outline Introduction From serial source code to parallel execution MPI functions

More information

a. Assuming a perfect balance of FMUL and FADD instructions and no pipeline stalls, what would be the FLOPS rate of the FPU?

a. Assuming a perfect balance of FMUL and FADD instructions and no pipeline stalls, what would be the FLOPS rate of the FPU? CPS 540 Fall 204 Shirley Moore, Instructor Test November 9, 204 Answers Please show all your work.. Draw a sketch of the extended von Neumann architecture for a 4-core multicore processor with three levels

More information