Parallel Computing MPI. Christoph Beetz. September 7, Parallel Computing. Introduction. Parallel Computing
|
|
- Martin Welch
- 6 years ago
- Views:
Transcription
1 Christoph Beetz Theoretical Physics I, Ruhr-Universität Bochum September 7, 2010 What is Basic
2 Overview to code What is Basic
3 Why Calculation is too big to fit in memory of one machine domain on several machines takes too much time with only one CPU speed up with lots of CPU s What is Basic
4 Parallel Computer Architectures SMP: Symmetric Multiprocessor uses shared system resources What is Basic
5 Parallel Computer Architectures MPP: Massively Parallel Processors consists of nodes connected by a high-speed network What is Basic
6 Parallel Computer Architectures Mixed SMP and MPP consists of nodes with several cores What is Basic
7 Parallel Computer Architectures Hardware (Examples) Name CPUs CPU/Node Mem/Node Paddy GB portable Euler GB DaVinci x 516 GPUs/Node BlueGene / P GB RoadRunner x16 GB Opteron + Cell Jaguar x6 16 GB What is Basic
8 Software SMP Multithreaded ms What is Basic
9 Software SMP Multithreaded ms Posix-Threads ( Threads) OpenMP ( What is Basic
10 Software MPP Local serial s with communication What is Basic
11 Software MPP Local serial s with communication High Performance Fortran ( (main focus of this lecture) CH Open What is Basic
12 Heat transfer Problem Solve heat transfer equation in 2 dimensions on a cartesian grid with periodic boundary conditions: tu = a u (1) What is Basic
13 Heat transfer Serial heat implicit none integer, parameter :: mx=32, my=32 real(kind=8) :: u(0:mx+1,0:my+1) real(kind=8) :: unp1(0:mx+1,0:my+1) integer :: i,j,ic real(kind=8) :: x,y,xb,xe,yb,ye,dx,dy real(kind=8) :: ax,ay,pi,time,dt ax = 1 ay = 1 pi = 4.*atan(1.d0) dt = ! critical value dt = xb = 0.d0 xe = 1 yb = 0.d0 ye = 1 dx = (xe-xb)/mx dy = (ye-yb)/my write(*, (a,e12.6) ) cfl =, dt/min(dx,dy)**2 u=0. do j = -1,my+1 y = yb + j*dy do i = -1,mx+1 x = xb + i*dx if ((x-0.5)**2+(y-0.5)**2.lt. 0.2) then u(i,j)= 1 end if end do end do time = 0. ic = 0 call VTKout(u,mx,my) c --> simple Euler step do while (time.le. 1000*dt) c--> heat equation unp1(1:mx,1:my) = u(1:mx,1:my) & +dt*((u(2:mx+1,1:my)-2.*u(1:mx,1:my) & +u(0:mx-1,1:my))/dx**2 & +(u(1:mx,2:my+1)-2.*u(1:mx,1:my) & +u(1:mx,0:my-1))/dy**2) call boundary(unp1,mx,my) u = unp1 time = time+dt ic = ic+1 c write(*,*) time =,time if (ic > 10) then ic = 0 call VTKout(u,mx,my) end if end do return end subroutine boundary(u,mx,my) implicit none integer :: mx,my real(kind=8) :: u(0:mx+1,0:my+1) integer :: i,j do i=1,mx u(i,0 ) = u(i,my) u(i,my+1) = u(i,1 ) end do do j=0,my+1 u(0,j ) = u(mx,j) u(mx+1,j) = u(1,j) end do return end What is Basic
14 Serial The Grid What is Basic real(kind=8) :: u(0:mx+1,0:my+1)
15 Serial 1.Step: Computation What is Basic & & unp1(1:mx,1:my) = u(1:mx,1:my) +dt*((u(2:mx+1,1:my)-2.*u(1:mx,1:my)+u(0:mx-1,1:my))/dx**2 +(u(1:mx,2:my+1)-2.*u(1:mx,1:my)+u(1:mx,0:my-1))/dy**2)
16 Serial 2.Step: Boundary Exchange What is Basic call boundary(unp1,mx,my)
17 Serial Parallel The Grid mpirun -np 4 heatmpi call _INIT(ierror) call _COMM_SIZE(_COMM_WORLD, & nprocs,ierror) call _COMM_RANK(_COMM_WORLD, & myrank,ierror) What is Basic
18 Parallel The Grid! GLOBAL FIELD DIMENSIONS integer, parameter :: nx=1024, ny=1024! LOCAL FIELD DIMENSIONS integer :: mx,my nprocsxy(1)=int(sqrt(dble(nprocs))) nprocsxy(2)=nprocs/nprocsxy(2) if mod(nx,nprocsxy(1).ne.0) then stop mx=nx/nprocsxy(1) my=ny/nprocsxy(2) What is Basic fields must be allocatable now. Size (1:mx,1:my) depends on number of processes.
19 Parallel 1.Step: Computation For each process exactly the same computation as in serial case. What is Basic & & unp1(1:mx,1:my) = u(1:mx,1:my) +dt*((u(2:mx+1,1:my)-2.*u(1:mx,1:my)+u(0:mx-1,1:my))/dx**2 +(u(1:mx,2:my+1)-2.*u(1:mx,1:my)+u(1:mx,0:my-1))/dy**2)
20 Parallel 2.Step: Boundary Exchange What is Basic absolutely necessary! Each process has to send to his upper and lower neighbour in each direction. In one direction the data to be send is not contiguous in memory.
21 What is is a tool to consolidate what parallelization has seperated a library for communication between processes available for Fortran / C / C++ portable What is Basic
22 Basic Over 150 subroutines Only need a dozen to pararellize Classification Environment Management Collective Data Transfer Point to Point Data Transfer... What is Basic
23 Environment Management A minimal in C++ #include <mpi.h> void main(int* argc, char*** argv) { _Init(&argc, &argv); int nprocs,myrank; _Comm_size(_COMM_WORLD, &nprocs); _Comm_rank(_COMM_WORLD, &myrank); if (myrank==0) { std::cout << nproc << processes started. << std::endl; } std::cout << Hello from process << myrank << std::endl; _Finalize(); } What is Basic
24 Environment Management A minimal in C++ #include <mpi.h> void main(int* argc, char*** argv) { _Init(&argc, &argv); int nprocs,myrank; _Comm_size(_COMM_WORLD, &nprocs); _Comm_rank(_COMM_WORLD, &myrank); if (myrank==0) { std::cout << nproc << processes started. << std::endl; } std::cout << Hello from process << myrank << std::endl; _Finalize(); } What is Basic mpi.h provides parameter like COMM WORLD and variable types like INTEGER
25 Environment Management A minimal in C++ #include <mpi.h> void main(int* argc, char*** argv) { _Init(&argc, &argv); int nprocs,myrank; _Comm_size(_COMM_WORLD, &nprocs); _Comm_rank(_COMM_WORLD, &myrank); if (myrank==0) { std::cout << nproc << processes started. << std::endl; } std::cout << Hello from process << myrank << std::endl; _Finalize(); } What is Basic Init has to be called once and only once, before calling any other -subroutine to initialize the environment.
26 Environment Management A minimal in C++ #include <mpi.h> void main(int* argc, char*** argv) { _Init(&argc, &argv); int nprocs,myrank; _Comm_size(_COMM_WORLD, &nprocs); _Comm_rank(_COMM_WORLD, &myrank); if (myrank==0) { std::cout << nproc << processes started. << std::endl; } std::cout << Hello from process << myrank << std::endl; _Finalize(); } What is Basic In C++ all parameters have to be passed as pointers.
27 Environment Management A minimal in C++ #include <mpi.h> void main(int* argc, char*** argv) { _Init(&argc, &argv); int nprocs,myrank; _Comm_size(_COMM_WORLD, &nprocs); _Comm_rank(_COMM_WORLD, &myrank); if (myrank==0) { std::cout << nproc << processes started. << std::endl; } std::cout << Hello from process << myrank << std::endl; _Finalize(); } What is Basic With Communicators you can define groups of processes. COMM WORLD is the Communicator for all processes.
28 Environment Management A minimal in C++ #include <mpi.h> void main(int* argc, char*** argv) { _Init(&argc, &argv); int nprocs,myrank; _Comm_size(_COMM_WORLD, &nprocs); _Comm_rank(_COMM_WORLD, &myrank); if (myrank==0) { std::cout << nproc << processes started. << std::endl; } std::cout << Hello from process << myrank << std::endl; _Finalize(); } What is Basic Comm size returns number of processes in variable nprocs.
29 Environment Management A minimal in C++ #include <mpi.h> void main(int* argc, char*** argv) { _Init(&argc, &argv); int nprocs,myrank; _Comm_size(_COMM_WORLD, &nprocs); _Comm_rank(_COMM_WORLD, &myrank); if (myrank==0) { std::cout << nproc << processes started. << std::endl; } std::cout << Hello from process << myrank << std::endl; _Finalize(); } What is Basic Comm rank returns the identification number of the process in variable myrank.
30 Environment Management A minimal in C++ #include <mpi.h> void main(int* argc, char*** argv) { _Init(&argc, &argv); int nprocs,myrank; _Comm_size(_COMM_WORLD, &nprocs); _Comm_rank(_COMM_WORLD, &myrank); if (myrank==0) { std::cout << nproc << processes started. << std::endl; } std::cout << Hello from process << myrank << std::endl; _Finalize(); } What is Basic All processes run the same code. Different behaviour can be achived by using the rank of the processes.
31 Environment Management A minimal in C++ #include <mpi.h> void main(int* argc, char*** argv) { _Init(&argc, &argv); int nprocs,myrank; _Comm_size(_COMM_WORLD, &nprocs); _Comm_rank(_COMM_WORLD, &myrank); if (myrank==0) { std::cout << nproc << processes started. << std::endl; } std::cout << Hello from process << myrank << std::endl; _Finalize(); } What is Basic Finalize terminates the environment.
32 Environment Management A minimal in C++ #include <mpi.h> void main(int* argc, char*** argv) { _Init(&argc, &argv); int nprocs,myrank; _Comm_size(_COMM_WORLD, &nprocs); _Comm_rank(_COMM_WORLD, &myrank); if (myrank==0) { std::cout << nproc << processes started. << std::endl; } std::cout << Hello from process << myrank << std::endl; _Finalize(); } What is Basic Sample Output: 2 processes started Hello from process 1 Hello from process 0
33 Environment Management A minimal in Fortran PROGRAM hello IMPLICIT NONE include mpif.h INTEGER ierror, nprocs, myrank CALL _INIT(ierror) CALL _COMM_SIZE(_COMM_WORLD, nprocs, ierror) CALL _COMM_RANK(_COMM_WORLD, myrank, ierror) IF (myrank==0) PRINT *,nprocs, processes started PRINT *, Hello from process,myrank CALL _FINALIZE(ierror) END PROGRAM hello What is Basic use Fortran include file.
34 Environment Management A minimal in Fortran PROGRAM hello IMPLICIT NONE include mpif.h INTEGER ierror, nprocs, myrank CALL _INIT(ierror) CALL _COMM_SIZE(_COMM_WORLD, nprocs, ierror) CALL _COMM_RANK(_COMM_WORLD, myrank, ierror) IF (myrank==0) PRINT *,nprocs, processes started PRINT *, Hello from process,myrank CALL _FINALIZE(ierror) END PROGRAM hello What is Basic Return code is given in the last argument. Note, that the wrong number of arguments will result in segmentation fault during run-time.
35 Environment Management BARRIER Blocks all processes until the last one calls this routine. CALL _BARRIER(communicator, ierror) Can be used to serialize parallel s. e.g. to write in a file in order of processes. DO (i=0,nprocs) IF (i==myrank) THEN do some IO-Routines END IF CALL _BARRIER(_COMM_WORLD,ierror) END DO What is Basic
36 Collective Communication BROADCAST Broadcasts data from one process to all processes in the communicator. CALL _BCAST(buffer, count, datatype, root, communicator, ierror) buffer count datatype root comm ierror First element of send / receive data Number of elements to be send datatype of elements in buffer Which process sends the data Communicator of Processes data will be send to Return value What is Basic
37 Collective Communication BROADCAST Broadcasts data from one process to all processes in the communicator. IF (myrank==0) buffer=42 ELSE buffer=0 CALL _BCAST(buffer, 1, _INTEGER, 0, _COMM_WORLD, ierror) What is Basic
38 Collective Communication GATHER Collects individual messages from each process at the root process CALL _GATHER(sendbuf, sendcnt, sendtype, recvbuf, recvcnt, recvtype, root, comm, ierror) sendbuf sendcnt sendtype recvbuf recvcnt recvtype root comm ierror First element of send data Number of elements to be send datatype of elements in sendbuf Buffer, where data is collected. Cannot overlap sendbuf Number of elements in recvbuf datatype of elements in recvbuf Which process collects the data Communicator of Processes Return value What is Basic
39 Collective Communication GATHER Collects individual messages from each process at the root process INTEGER :: irecv(3) INTEGER :: isend isend=myrank+1 CALL _GATHER(isend, 1, _INTEGER, irecv, 3, _INTEGER, 2, _COMM_WORLD, ierror) What is Basic
40 Collective Communication REDUCE Reduces data from all processes to one structure via data reduction operation CALL _REDUCE(sendbuf, recvbuf, count, datatype, op, root, communicator, ierror) sendbuf recvbuf count datatype op root comm ierror First element of send data buffer First element of receive data buffer Number of elements to be send datatype of elements in buffer reduction operation Which process receives the data Communicator of Processes data will be reduced from Return value What is Basic
41 Collective Communication REDUCE Reduces data from all processes to one structure via data reduction operation CALL _REDUCE(sendbuf, recvbuf, count, datatype, op, root, communicator, ierror) Reduction Operations: Write SUM the sum PROD the product MAX the maximum MIN the minimum What is Basic of sendbuf of all processes to recvbuf of process root.
42 Collective Communication REDUCE Reduces data from all processes to one structure via data reduction operation CALL _REDUCE(sendbuf, recvbuf, 1, _INTEGER, _SUM, 1, _COMM_WORLD, ierror) What is Basic
43 Collective Communication ALLREDUCE / ALLGATHER Same as REDUCE and GATHER but result is stored on every process CALL _ALLREDUCE(sendbuf, recvbuf, count, datatype, op, communicator, ierror) What is Basic
44 Collective Communication ALLTOALL Send a distinct message from each process to every process. Process i sends j-th block of sendbuffer to process j. Process j stores data from process i in i-th block of recvbuffer. CALL _ALLTOALL(sendbuf, sendcnt, sendtype, recvbuf, recvcnt, recvtype, comm, ierror) sendbuf sendcount sendtype recvbuf recvcount recvtype comm ierror First element of data to be send Number of elements to be send to each process Data type Starting adress of data to be received Number of elements received from any process Data type Communicator of Processes data will be send to Return value What is Basic
45 Collective Communication ALLTOALL Send a distinct message from each process to every process. Process i sends j-th block of sendbuffer to process j. Process j stores data from process i in i-th block of recvbuffer. CALL _ALLTOALL(sendbuf, 3, _INTEGER, recvbuf, 3, _INTEGER, _COMM_WORLD, ierror) What is Basic
46 Point to Point Communication SEND, RECV Used to send data to one specific process CALL _SEND(buffer, count, datatype, dest, tag, communicator, ierror) buffer count datatype dest tag comm ierror First element of data to be send Number of elements in buffer datatype of elements in buffer Rank of destination process Message tag. Any positiv integer number. Communicator of Processes data will be send to Return value What is Basic
47 Point to Point Communication SEND, RECV Used to receive data from one specific process CALL _RECV(buffer, count, datatype, source, tag, communicator, status, ierror) buffer count datatype source tag comm status ierror Received data will be stored here Number of elements to be received datatype of elements in buffer Rank of sending process Tag of the message or ANY TAG Communicator of Processes data will be send to Integer( STATUS SIZE) Return value What is Basic
48 Point to Point Communication SEND, RECV Send an INTEGER value from Proc0 to Proc1 INTEGER :: B INTEGER, DIMENSION(_STATUS_SIZE) :: status... B=0 IF(myRank==0) THEN B=42 CALL _SEND(B, 1, _INTEGER, 1, 0, _COMM_WORLD, ierror) ELSE IF (myrank==1) THEN CALL _RECV(B, 1, _INTEGER, 0, _ANY_TAG,_COMM_WORLD, status, ierror) END IF... What is Basic
49 Point to Point Communication SENDRECV Send and receive data in a combined operation CALL _SENDRECV(sendbuf,sendcount,sendtype,dest,sendtag, recvbuf,recvcount,recvtype,source,recvtag, comm,status,ierror) Parameter of SEND What is Basic
50 Point to Point Communication SENDRECV Send and receive data in a combined operation CALL _SENDRECV(sendbuf,sendcount,sendtype,dest,sendtag, recvbuf,recvcount,recvtype,source,recvtag, comm,status,ierror) Parameter of RECV What is Basic
51 Point to Point Communication SENDRECV Send and receive data in a combined operation src=myrank-1 dest=myrank+1 if (myrank==0) src=nprocs-1 if (myrank==nprocs-1) dest=0 call _SENDRECV(sendbuf, 1, _INTEGER, dest, 1, recvbuf, 1, _INTEGER, src, _ANY_TAG, _COMM_WORLD, stat, ierror) What is Basic
52 Point to Point Communication What happens on Send and Receive Send Receive Copy data into user buffer Call SEND copies user buffer into system buffer System buffer is send to receiving process Call RECV Receive data in system buffer copies data into user buffer Use data in user buffer What is Basic
53 Point to Point Communication Blocking/ Nonblocking Blocking: procedure blocks until copy to system buffer has completed Nonblocking : procedure returns immediately What is Basic all collective operations send, recv isend, irecv exceedingly effective with communication co-processor
54 Point to Point Communication IRECV / ISEND / WAIT / WAITALL _ISEND(buffer, count, datatype, dest, tag, communicator, request, ierror) _IRECV(buffer, count, datatype, source, tag, communicator, request, ierror) _WAIT(request, status, ierror) _WAITALL(count, request, status, ierror) request Communication handle status Integer( STATUS SIZE) WAIT blocks the m until the request-operation is finished. ISEND/ IRECV followed immediatly by WAIT is the same as a blocking operation. You can mix blocking with nonblocking operations. e.g.: send with ISEND and receive with RECV WAITALL waits for serveral ISEND/IRECV. request has to be an array then. What is Basic
55 Point to Point Communication IRECV / ISEND / WAIT / WAITALL left=myrank-1 right=myrank+1 if (myrank==0) left=nprocs-1 if (myrank==nprocs-1) right=0 call _IRECV(lRank, 1, _INTEGER, left, _ANY_TAG, _COMM_WORLD, req(1), ierror) call _IRECV(rRank, 1, _INTEGER, right, _ANY_TAG, _COMM_WORLD, req(2), ierror) call _ISEND(myRank, 1, _INTEGER, right, myrank, _COMM_WORLD, req(3), ierror) call _ISEND(myRank, 1, _INTEGER, left, myrank, _COMM_WORLD, req(4), ierror) call _WAITALL(4, req, stat, ierror) What is Basic MyRank= 0 left Rank= 3 right Rank= 1 MyRank= 1 left Rank= 0 right Rank= 2 MyRank= 2 left Rank= 1 right Rank= 3 MyRank= 3 left Rank= 2 right Rank= 0
56 Point to Point Communication Bidirectional Communication Three cases of bidirectional communication Both processes call send routine first, then receive Both processes call receive routine first, then send One process calls receive first then send and the other calls in opposite order What is Basic
57 Point to Point Communication First Send then Receive Blocking Program is returning from send when the copy is finished. Can cause deadlocks, if system buffer is smaller than sendbuffer Nonblocking Both processes: ISEND(.., req,...), RECV, WAIT(..,req,..) free from deadlock What is Basic
58 Point to Point Communication First Receive then Send Blocking Blocking receive returns when copy from system buffer completed Deadlock on any machine Nonblocking Both processes: IRECV(.., req,...), SEND, WAIT(..,req,..) free from deadlock What is Basic
59 Point to Point Communication Receive / Send and Send / Receive Always save Use blocking or non-blocking subroutines Considering performance and the avoidance of deadlocks, use if (myrank==0) then call _ISEND(sendbuf,...,ireq1,...) call _IRECV(recvbuf,...,ireq2,...) elseif (myrank==1) then call _IRECV(recvbuf,...,ireq1,...) call _ISEND(sendbuf,...,ireq2,...) endif call _WAIT(ireq1,...) call _WAIT(ireq2,...) What is Basic
60 Whenever possible use collective operations Mind the succession of Send and Recvs Don t reuse buffer, until operation is finished Don t mix nonblocking point to point with collective operations Care about scalabillity IF (myrank==0) THEN _SEND(myrank,..,myrank+1,..) _RECV(sum,..,nproc-1,..) PRINT *,"The sum is ",sum ELSE IF (myrank==nproc-1) THEN _RECV(sum,..,myrank-1,..) sum=sum+myrank _SEND(sum,..,0,..) ELSE _RECV(sum,..,myrank-1,..) sum=sum+myrank _SEND(sum,..,myrank+1,..) END IF CALL _REDUCE(myrank, sum,..., _SUM, 0,...) IF (myrank==0) PRINT *, The sum is,sum What is Basic
61 Whenever possible use collective operations Mind the succession of Send and Recvs Don t reuse buffer, until operation is finished Don t mix nonblocking point to point with collective operations Care about scalabillity CPU0 _ISEND(..,cpu1,..,tag1,..) _ISEND(..,cpu1,..,tag2,..) _ISEND(..,cpu1,..,tag3,..) _ISEND(..,cpu1,..,tag4,..) CPU1 _RECV(..,cpu0,..,tag4,..) _RECV(..,cpu0,..,tag3,..) _RECV(..,cpu0,..,tag2,..) _RECV(..,cpu0,..,tag1,..) What is Basic Will allocate huge amount of memory. System crash possible.
62 Whenever possible use collective operations Mind the succession of Send and Recvs Don t reuse buffer, until operation is finished Don t mix nonblocking point to point with collective operations Care about scalabillity _ISEND(buffer,..,req,..)... buffer(0)=something... _WAIT(req,ierror) _IRECV(buffer,..,req,..)... z=buffer(0)... _WAIT(req,ierror) What is Basic Results in data race and wrong results.
63 Whenever possible use collective operations Mind the succession of Send and Recvs Don t reuse buffer, until operation is finished Don t mix nonblocking point to point with collective operations Care about scalabillity CPU0 _ISEND(buffer,..,req,..)... _BARRIER(..)... _WAIT(req,ierror) CPU1 _RECV(buffer,...)... _BARRIER(..) What is Basic On the ragged edge of legality. Will your survive it?
64 Whenever possible use collective operations Mind the succession of Send and Recvs Don t reuse buffer, until operation is finished Don t mix nonblocking point to point with collective operations Care about scalabillity CPU1 to CPUn _SEND(buffer,..,CPU0,..) CPU0 do i=1,n _RECV(buffer,..,CPUi,..) What is Basic Not scalable! Don t write such code.
65 Parallelization Strategy Speed up Increase fraction of that can be parallelized Balance the workload. Sometimes hard but worth it. Minimize communication time by decreasing amount of data transmitted decreasing number of times data is transmitted Minimize cache misses in loops What is Basic
66 Decrease amount of data transmitted What is Basic
67 Decrease number of times data is transmitted What is Basic prefere domain with contiguous boundary copy uncontiguous memory to contiguous buffer and send it in one operation use derived data type, but it may be not optimal
68 Minimize cache misses do j=jstart, jend do i=1, n a(i,j)=b(i,j)+c(i,j) end do end do What is Basic minimize cache misses.
69 Minimize cache misses do j=jstart, jend do i=1, n a(i,j)=b(i,j)+c(i,j) end do end do do j=1, n do i=istart, iend a(i,j)=b(i-1,j)+b(i+1,j) end do end do What is Basic minimize cache misses. but, communication latency is much larger than memory latency.
70 Virtual topologies CART CREATE Makes a new communicator to which topology information of an cartesian grid has been attached. _CART_CREATE ( comm, dim, procperdim, periods, reorder, newcomm, ierror ) comm dim procperdim periods reorder newcomm ierror Input communicator Number of dimensions of cartesian grid Number of processes in each dimension, integer(dim) Grid is periodic (true) or not (false), bool(dim) Ranking of processes may be reordered (true) or not (false) New communicator Return value What is Basic
71 Virtual topologies CART CREATE Makes a new communicator to which topology information of an cartesian grid has been attached. procs=(/4,2,3/) periods=(/1,1,1/) _CART_CREATE (_COMM_WORLD, 3, procs, periods, &.true., comm3d, ierror ) What is Basic
72 Virtual topologies CART SHIFT Returns shifted source and destination ranks. _CART_SHIFT (comm, direction, displ, source, dest, ierror ) comm direction displ source dest ierror Communicator with cartesian structure Coordinate dimension of shift Displacement(>0: upwards, <0 downwards) Rank of source process Rank of destination process Return value What is Basic If the dimension specified by direction is non-periodic, off-end shifts result in the value PROC NULL.
73 Virtual topologies CART COORDS Translates process rank in a communicator into cartesian coordinates. _CART_COORDS(comm,rank,maxdims,coords,ierror); comm Communicator with cartesian structure. rank Rank of process within group comm. dim Number of dimensions of cartesian grid. coords Cartesian coordinates of the process. ierror Return value What is Basic
74 Virtual topologies Example: Dimension=3, 24 Processes call _Cart_create(_COMM_WORLD, dim, nproc, periods, 0, & comm3d, ierror) call _Cart_coords(comm3d, myrank, dim, coords,ierror) call _Cart_rank(comm3d, coords, rank3d, ierror) call _Cart_shift(comm3d, 0, 1, left,right,ierror) call _Cart_shift(comm3d, 1, 1, down, up,ierror) call _Cart_shift(comm3d, 2, 1, south,north,ierror) I am proc 13: coords(1)= 2 I am proc 13: coords(2)= 0 I am proc 13: coords(3)= 1 I am proc 13: left = 7 I am proc 13: right = 19 I am proc 13: down = 16 I am proc 13: up = 16 I am proc 13: north = 14 I am proc 13: south = 12 What is Basic
75 analysis tool MPE Method 0 copy to buffer irecv, isend, waitall What is Basic
76 analysis tool MPE Method 1 copy to buffer (irecv,isend / isend,irecv), waitall What is Basic
77 analysis tool MPE Method 2 copy to buffer sendrecv What is Basic
78 racoon II Scaling of a magnetohydrodynamic turbulence simulation Speedup: how much faster is the computation on x processors than on 256 Speedup mixed hard/weak scaling of driven compmhd3d 1/256 x 512 3, BS , BS , BS , BS 32 What is Basic # cores
Introduction to MPI, the Message Passing Library
Chapter 3, p. 1/57 Basics of Basic Messages -To-? Introduction to, the Message Passing Library School of Engineering Sciences Computations for Large-Scale Problems I Chapter 3, p. 2/57 Outline Basics of
More informationMasterpraktikum - Scientific Computing, High Performance Computing
Masterpraktikum - Scientific Computing, High Performance Computing Message Passing Interface (MPI) Thomas Auckenthaler Wolfgang Eckhardt Technische Universität München, Germany Outline Hello World P2P
More informationMasterpraktikum - Scientific Computing, High Performance Computing
Masterpraktikum - Scientific Computing, High Performance Computing Message Passing Interface (MPI) and CG-method Michael Bader Alexander Heinecke Technische Universität München, Germany Outline MPI Hello
More informationHPC Parallel Programing Multi-node Computation with MPI - I
HPC Parallel Programing Multi-node Computation with MPI - I Parallelization and Optimization Group TATA Consultancy Services, Sahyadri Park Pune, India TCS all rights reserved April 29, 2013 Copyright
More informationTopic Notes: Message Passing Interface (MPI)
Computer Science 400 Parallel Processing Siena College Fall 2008 Topic Notes: Message Passing Interface (MPI) The Message Passing Interface (MPI) was created by a standards committee in the early 1990
More informationIntroduction to MPI. May 20, Daniel J. Bodony Department of Aerospace Engineering University of Illinois at Urbana-Champaign
Introduction to MPI May 20, 2013 Daniel J. Bodony Department of Aerospace Engineering University of Illinois at Urbana-Champaign Top500.org PERFORMANCE DEVELOPMENT 1 Eflop/s 162 Pflop/s PROJECTED 100 Pflop/s
More informationAdvanced Message-Passing Interface (MPI)
Outline of the workshop 2 Advanced Message-Passing Interface (MPI) Bart Oldeman, Calcul Québec McGill HPC Bart.Oldeman@mcgill.ca Morning: Advanced MPI Revision More on Collectives More on Point-to-Point
More informationMessage Passing Interface: Basic Course
Overview of DM- HPC2N, UmeåUniversity, 901 87, Sweden. April 23, 2015 Table of contents Overview of DM- 1 Overview of DM- Parallelism Importance Partitioning Data Distributed Memory Working on Abisko 2
More informationMessage Passing Interface. most of the slides taken from Hanjun Kim
Message Passing Interface most of the slides taken from Hanjun Kim Message Passing Pros Scalable, Flexible Cons Someone says it s more difficult than DSM MPI (Message Passing Interface) A standard message
More informationMPI Collective communication
MPI Collective communication CPS343 Parallel and High Performance Computing Spring 2018 CPS343 (Parallel and HPC) MPI Collective communication Spring 2018 1 / 43 Outline 1 MPI Collective communication
More informationMessage Passing Interface
Message Passing Interface DPHPC15 TA: Salvatore Di Girolamo DSM (Distributed Shared Memory) Message Passing MPI (Message Passing Interface) A message passing specification implemented
More informationParallel Programming
Parallel Programming for Multicore and Cluster Systems von Thomas Rauber, Gudula Rünger 1. Auflage Parallel Programming Rauber / Rünger schnell und portofrei erhältlich bei beck-shop.de DIE FACHBUCHHANDLUNG
More informationAgenda. MPI Application Example. Praktikum: Verteiltes Rechnen und Parallelprogrammierung Introduction to MPI. 1) Recap: MPI. 2) 2.
Praktikum: Verteiltes Rechnen und Parallelprogrammierung Introduction to MPI Agenda 1) Recap: MPI 2) 2. Übungszettel 3) Projektpräferenzen? 4) Nächste Woche: 3. Übungszettel, Projektauswahl, Konzepte 5)
More informationMessage Passing Interface - MPI
Message Passing Interface - MPI Parallel and Distributed Computing Department of Computer Science and Engineering (DEI) Instituto Superior Técnico October 24, 2011 Many slides adapted from lectures by
More informationMPI: Parallel Programming for Extreme Machines. Si Hammond, High Performance Systems Group
MPI: Parallel Programming for Extreme Machines Si Hammond, High Performance Systems Group Quick Introduction Si Hammond, (sdh@dcs.warwick.ac.uk) WPRF/PhD Research student, High Performance Systems Group,
More informationMessage Passing Interface
MPSoC Architectures MPI Alberto Bosio, Associate Professor UM Microelectronic Departement bosio@lirmm.fr Message Passing Interface API for distributed-memory programming parallel code that runs across
More informationHigh-Performance Computing: MPI (ctd)
High-Performance Computing: MPI (ctd) Adrian F. Clark: alien@essex.ac.uk 2015 16 Adrian F. Clark: alien@essex.ac.uk High-Performance Computing: MPI (ctd) 2015 16 1 / 22 A reminder Last time, we started
More informationECE 574 Cluster Computing Lecture 13
ECE 574 Cluster Computing Lecture 13 Vince Weaver http://web.eece.maine.edu/~vweaver vincent.weaver@maine.edu 21 March 2017 Announcements HW#5 Finally Graded Had right idea, but often result not an *exact*
More informationPractical Introduction to Message-Passing Interface (MPI)
1 Outline of the workshop 2 Practical Introduction to Message-Passing Interface (MPI) Bart Oldeman, Calcul Québec McGill HPC Bart.Oldeman@mcgill.ca Theoretical / practical introduction Parallelizing your
More informationIntroduction to MPI. Ricardo Fonseca. https://sites.google.com/view/rafonseca2017/
Introduction to MPI Ricardo Fonseca https://sites.google.com/view/rafonseca2017/ Outline Distributed Memory Programming (MPI) Message Passing Model Initializing and terminating programs Point to point
More informationDistributed Memory Systems: Part IV
Chapter 5 Distributed Memory Systems: Part IV Max Planck Institute Magdeburg Jens Saak, Scientific Computing II 293/342 The Message Passing Interface is a standard for creation of parallel programs using
More informationProgramming with MPI Collectives
Programming with MPI Collectives Jan Thorbecke Type to enter text Delft University of Technology Challenge the future Collectives Classes Communication types exercise: BroadcastBarrier Gather Scatter exercise:
More informationMPI. (message passing, MIMD)
MPI (message passing, MIMD) What is MPI? a message-passing library specification extension of C/C++ (and Fortran) message passing for distributed memory parallel programming Features of MPI Point-to-point
More informationIntroduction to Lab Series DMS & MPI
TDDC 78 Labs: Memory-based Taxonomy Introduction to Lab Series DMS & Mikhail Chalabine Linköping University Memory Lab(s) Use Distributed 1 Shared 2 3 Posix threads OpenMP Distributed 4 2011 LAB 5 (tools)
More informationOutline. Communication modes MPI Message Passing Interface Standard. Khoa Coâng Ngheä Thoâng Tin Ñaïi Hoïc Baùch Khoa Tp.HCM
THOAI NAM Outline Communication modes MPI Message Passing Interface Standard TERMs (1) Blocking If return from the procedure indicates the user is allowed to reuse resources specified in the call Non-blocking
More informationReview of MPI Part 2
Review of MPI Part Russian-German School on High Performance Computer Systems, June, 7 th until July, 6 th 005, Novosibirsk 3. Day, 9 th of June, 005 HLRS, University of Stuttgart Slide Chap. 5 Virtual
More informationPraktikum: Verteiltes Rechnen und Parallelprogrammierung Introduction to MPI
Praktikum: Verteiltes Rechnen und Parallelprogrammierung Introduction to MPI Agenda 1) MPI für Java Installation OK? 2) 2. Übungszettel Grundidee klar? 3) Projektpräferenzen? 4) Nächste Woche: 3. Übungszettel,
More informationCollective Communication: Gatherv. MPI v Operations. root
Collective Communication: Gather MPI v Operations A Gather operation has data from all processes collected, or gathered, at a central process, referred to as the root Even the root process contributes
More informationLecture 9: MPI continued
Lecture 9: MPI continued David Bindel 27 Sep 2011 Logistics Matrix multiply is done! Still have to run. Small HW 2 will be up before lecture on Thursday, due next Tuesday. Project 2 will be posted next
More informationParallel programming MPI
Parallel programming MPI Distributed memory Each unit has its own memory space If a unit needs data in some other memory space, explicit communication (often through network) is required Point-to-point
More informationNon-Blocking Communications
Non-Blocking Communications Deadlock 1 5 2 3 4 Communicator 0 2 Completion The mode of a communication determines when its constituent operations complete. - i.e. synchronous / asynchronous The form of
More informationOutline. Communication modes MPI Message Passing Interface Standard
MPI THOAI NAM Outline Communication modes MPI Message Passing Interface Standard TERMs (1) Blocking If return from the procedure indicates the user is allowed to reuse resources specified in the call Non-blocking
More informationCollective Communication in MPI and Advanced Features
Collective Communication in MPI and Advanced Features Pacheco s book. Chapter 3 T. Yang, CS240A. Part of slides from the text book, CS267 K. Yelick from UC Berkeley and B. Gropp, ANL Outline Collective
More informationMore about MPI programming. More about MPI programming p. 1
More about MPI programming More about MPI programming p. 1 Some recaps (1) One way of categorizing parallel computers is by looking at the memory configuration: In shared-memory systems, the CPUs share
More informationNon-Blocking Communications
Non-Blocking Communications Reusing this material This work is licensed under a Creative Commons Attribution- NonCommercial-ShareAlike 4.0 International License. http://creativecommons.org/licenses/by-nc-sa/4.0/deed.en_us
More informationLecture 7: More about MPI programming. Lecture 7: More about MPI programming p. 1
Lecture 7: More about MPI programming Lecture 7: More about MPI programming p. 1 Some recaps (1) One way of categorizing parallel computers is by looking at the memory configuration: In shared-memory systems
More informationCSE 613: Parallel Programming. Lecture 21 ( The Message Passing Interface )
CSE 613: Parallel Programming Lecture 21 ( The Message Passing Interface ) Jesmin Jahan Tithi Department of Computer Science SUNY Stony Brook Fall 2013 ( Slides from Rezaul A. Chowdhury ) Principles of
More informationRecap of Parallelism & MPI
Recap of Parallelism & MPI Chris Brady Heather Ratcliffe The Angry Penguin, used under creative commons licence from Swantje Hess and Jannis Pohlmann. Warwick RSE 13/12/2017 Parallel programming Break
More informationThe MPI Message-passing Standard Practical use and implementation (V) SPD Course 6/03/2017 Massimo Coppola
The MPI Message-passing Standard Practical use and implementation (V) SPD Course 6/03/2017 Massimo Coppola Intracommunicators COLLECTIVE COMMUNICATIONS SPD - MPI Standard Use and Implementation (5) 2 Collectives
More informationCOMP 322: Fundamentals of Parallel Programming. Lecture 34: Introduction to the Message Passing Interface (MPI), contd
COMP 322: Fundamentals of Parallel Programming Lecture 34: Introduction to the Message Passing Interface (MPI), contd Vivek Sarkar, Eric Allen Department of Computer Science, Rice University Contact email:
More informationCollective Communication: Gather. MPI - v Operations. Collective Communication: Gather. MPI_Gather. root WORKS A OK
Collective Communication: Gather MPI - v Operations A Gather operation has data from all processes collected, or gathered, at a central process, referred to as the root Even the root process contributes
More informationMPI - v Operations. Collective Communication: Gather
MPI - v Operations Based on notes by Dr. David Cronk Innovative Computing Lab University of Tennessee Cluster Computing 1 Collective Communication: Gather A Gather operation has data from all processes
More informationStandard MPI - Message Passing Interface
c Ewa Szynkiewicz, 2007 1 Standard MPI - Message Passing Interface The message-passing paradigm is one of the oldest and most widely used approaches for programming parallel machines, especially those
More informationWhat s in this talk? Quick Introduction. Programming in Parallel
What s in this talk? Parallel programming methodologies - why MPI? Where can I use MPI? MPI in action Getting MPI to work at Warwick Examples MPI: Parallel Programming for Extreme Machines Si Hammond,
More informationCOMP 322: Fundamentals of Parallel Programming
COMP 322: Fundamentals of Parallel Programming https://wiki.rice.edu/confluence/display/parprog/comp322 Lecture 37: Introduction to MPI (contd) Vivek Sarkar Department of Computer Science Rice University
More informationDistributed Memory Programming with MPI
Distributed Memory Programming with MPI Moreno Marzolla Dip. di Informatica Scienza e Ingegneria (DISI) Università di Bologna moreno.marzolla@unibo.it Algoritmi Avanzati--modulo 2 2 Credits Peter Pacheco,
More informationCS 179: GPU Programming. Lecture 14: Inter-process Communication
CS 179: GPU Programming Lecture 14: Inter-process Communication The Problem What if we want to use GPUs across a distributed system? GPU cluster, CSIRO Distributed System A collection of computers Each
More informationIntroduction to MPI Part II Collective Communications and communicators
Introduction to MPI Part II Collective Communications and communicators Andrew Emerson, Fabio Affinito {a.emerson,f.affinito}@cineca.it SuperComputing Applications and Innovation Department Collective
More informationMPI 5. CSCI 4850/5850 High-Performance Computing Spring 2018
MPI 5 CSCI 4850/5850 High-Performance Computing Spring 2018 Tae-Hyuk (Ted) Ahn Department of Computer Science Program of Bioinformatics and Computational Biology Saint Louis University Learning Objectives
More informationFirst day. Basics of parallel programming. RIKEN CCS HPC Summer School Hiroya Matsuba, RIKEN CCS
First day Basics of parallel programming RIKEN CCS HPC Summer School Hiroya Matsuba, RIKEN CCS Today s schedule: Basics of parallel programming 7/22 AM: Lecture Goals Understand the design of typical parallel
More informationHigh Performance Computing
High Performance Computing Course Notes 2009-2010 2010 Message Passing Programming II 1 Communications Point-to-point communications: involving exact two processes, one sender and one receiver For example,
More informationIntroduction to MPI. HY555 Parallel Systems and Grids Fall 2003
Introduction to MPI HY555 Parallel Systems and Grids Fall 2003 Outline MPI layout Sending and receiving messages Collective communication Datatypes An example Compiling and running Typical layout of an
More informationIntroduction to MPI: Part II
Introduction to MPI: Part II Pawel Pomorski, University of Waterloo, SHARCNET ppomorsk@sharcnetca November 25, 2015 Summary of Part I: To write working MPI (Message Passing Interface) parallel programs
More informationMPI - The Message Passing Interface
MPI - The Message Passing Interface The Message Passing Interface (MPI) was first standardized in 1994. De facto standard for distributed memory machines. All Top500 machines (http://www.top500.org) are
More informationDistributed Systems + Middleware Advanced Message Passing with MPI
Distributed Systems + Middleware Advanced Message Passing with MPI Gianpaolo Cugola Dipartimento di Elettronica e Informazione Politecnico, Italy cugola@elet.polimi.it http://home.dei.polimi.it/cugola
More informationIntroduction to MPI part II. Fabio AFFINITO
Introduction to MPI part II Fabio AFFINITO (f.affinito@cineca.it) Collective communications Communications involving a group of processes. They are called by all the ranks involved in a communicator (or
More informationAdvanced MPI. Andrew Emerson
Advanced MPI Andrew Emerson (a.emerson@cineca.it) Agenda 1. One sided Communications (MPI-2) 2. Dynamic processes (MPI-2) 3. Profiling MPI and tracing 4. MPI-I/O 5. MPI-3 11/12/2015 Advanced MPI 2 One
More informationBasic MPI Communications. Basic MPI Communications (cont d)
Basic MPI Communications MPI provides two non-blocking routines: MPI_Isend(buf,cnt,type,dst,tag,comm,reqHandle) buf: source of data to be sent cnt: number of data elements to be sent type: type of each
More informationFor developers. If you do need to have all processes write e.g. debug messages, you d then use channel 12 (see below).
For developers A. I/O channels in SELFE You need to exercise caution when dealing with parallel I/O especially for writing. For writing outputs, you d generally let only 1 process do the job, e.g. if(myrank==0)
More informationUNIVERSITY OF MORATUWA
UNIVERSITY OF MORATUWA FACULTY OF ENGINEERING DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING B.Sc. Engineering 2012 Intake Semester 8 Examination CS4532 CONCURRENT PROGRAMMING Time allowed: 2 Hours March
More informationNUMERICAL PARALLEL COMPUTING
Lecture 5, March 23, 2012: The Message Passing Interface http://people.inf.ethz.ch/iyves/pnc12/ Peter Arbenz, Andreas Adelmann Computer Science Dept, ETH Zürich E-mail: arbenz@inf.ethz.ch Paul Scherrer
More informationIntroduction to the Message Passing Interface (MPI)
Introduction to the Message Passing Interface (MPI) CPS343 Parallel and High Performance Computing Spring 2018 CPS343 (Parallel and HPC) Introduction to the Message Passing Interface (MPI) Spring 2018
More informationData parallelism. [ any app performing the *same* operation across a data stream ]
Data parallelism [ any app performing the *same* operation across a data stream ] Contrast stretching: Version Cores Time (secs) Speedup while (step < NumSteps &&!converged) { step++; diffs = 0; foreach
More informationIntroduzione al Message Passing Interface (MPI) Andrea Clematis IMATI CNR
Introduzione al Message Passing Interface (MPI) Andrea Clematis IMATI CNR clematis@ge.imati.cnr.it Ack. & riferimenti An Introduction to MPI Parallel Programming with the Message Passing InterfaceWilliam
More informationMPI MESSAGE PASSING INTERFACE
MPI MESSAGE PASSING INTERFACE David COLIGNON CÉCI - Consortium des Équipements de Calcul Intensif http://hpc.montefiore.ulg.ac.be Outline Introduction From serial source code to parallel execution MPI
More informationOptimization of MPI Applications Rolf Rabenseifner
Optimization of MPI Applications Rolf Rabenseifner University of Stuttgart High-Performance Computing-Center Stuttgart (HLRS) www.hlrs.de Optimization of MPI Applications Slide 1 Optimization and Standardization
More informationmpi4py HPC Python R. Todd Evans January 23, 2015
mpi4py HPC Python R. Todd Evans rtevans@tacc.utexas.edu January 23, 2015 What is MPI Message Passing Interface Most useful on distributed memory machines Many implementations, interfaces in C/C++/Fortran
More informationCS 6230: High-Performance Computing and Parallelization Introduction to MPI
CS 6230: High-Performance Computing and Parallelization Introduction to MPI Dr. Mike Kirby School of Computing and Scientific Computing and Imaging Institute University of Utah Salt Lake City, UT, USA
More informationINTRODUCTION TO MPI COMMUNICATORS AND VIRTUAL TOPOLOGIES
INTRODUCTION TO MPI COMMUNICATORS AND VIRTUAL TOPOLOGIES Introduction to Parallel Computing with MPI and OpenMP 24 november 2017 a.marani@cineca.it WHAT ARE COMMUNICATORS? Many users are familiar with
More informationThe Message Passing Interface (MPI) TMA4280 Introduction to Supercomputing
The Message Passing Interface (MPI) TMA4280 Introduction to Supercomputing NTNU, IMF January 16. 2017 1 Parallelism Decompose the execution into several tasks according to the work to be done: Function/Task
More informationSlides prepared by : Farzana Rahman 1
Introduction to MPI 1 Background on MPI MPI - Message Passing Interface Library standard defined by a committee of vendors, implementers, and parallel programmers Used to create parallel programs based
More informationParallel Programming, MPI Lecture 2
Parallel Programming, MPI Lecture 2 Ehsan Nedaaee Oskoee 1 1 Department of Physics IASBS IPM Grid and HPC workshop IV, 2011 Outline 1 Point-to-Point Communication Non Blocking PTP Communication 2 Collective
More informationWelcome to the introductory workshop in MPI programming at UNICC
Welcome...... to the introductory workshop in MPI programming at UNICC Schedule: 08.00-12.00 Hard work and a short coffee break Scope of the workshop: We will go through the basics of MPI-programming and
More informationINTRODUCTION TO MPI VIRTUAL TOPOLOGIES
INTRODUCTION TO MPI VIRTUAL TOPOLOGIES Introduction to Parallel Computing with MPI and OpenMP 18-19-20 november 2013 a.marani@cineca.it g.muscianisi@cineca.it l.ferraro@cineca.it VIRTUAL TOPOLOGY Topology:
More informationMessage Passing Interface - MPI
Message Passing Interface - MPI Parallel and Distributed Computing Department of Computer Science and Engineering (DEI) Instituto Superior Técnico March 31, 2016 Many slides adapted from lectures by Bill
More informationL15: Putting it together: N-body (Ch. 6)!
Outline L15: Putting it together: N-body (Ch. 6)! October 30, 2012! Review MPI Communication - Blocking - Non-Blocking - One-Sided - Point-to-Point vs. Collective Chapter 6 shows two algorithms (N-body
More informationPart - II. Message Passing Interface. Dheeraj Bhardwaj
Part - II Dheeraj Bhardwaj Department of Computer Science & Engineering Indian Institute of Technology, Delhi 110016 India http://www.cse.iitd.ac.in/~dheerajb 1 Outlines Basics of MPI How to compile and
More informationMPI and OpenMP (Lecture 25, cs262a) Ion Stoica, UC Berkeley November 19, 2016
MPI and OpenMP (Lecture 25, cs262a) Ion Stoica, UC Berkeley November 19, 2016 Message passing vs. Shared memory Client Client Client Client send(msg) recv(msg) send(msg) recv(msg) MSG MSG MSG IPC Shared
More informationAn Introduction to MPI
An Introduction to MPI Parallel Programming with the Message Passing Interface William Gropp Ewing Lusk Argonne National Laboratory 1 Outline Background The message-passing model Origins of MPI and current
More informationCS4961 Parallel Programming. Lecture 16: Introduction to Message Passing 11/3/11. Administrative. Mary Hall November 3, 2011.
CS4961 Parallel Programming Lecture 16: Introduction to Message Passing Administrative Next programming assignment due on Monday, Nov. 7 at midnight Need to define teams and have initial conversation with
More informationAdvanced Parallel Programming
Advanced Parallel Programming Networks and All-to-All communication David Henty, Joachim Hein EPCC The University of Edinburgh Overview of this Lecture All-to-All communications MPI_Alltoall MPI_Alltoallv
More informationChip Multiprocessors COMP Lecture 9 - OpenMP & MPI
Chip Multiprocessors COMP35112 Lecture 9 - OpenMP & MPI Graham Riley 14 February 2018 1 Today s Lecture Dividing work to be done in parallel between threads in Java (as you are doing in the labs) is rather
More informationParallel Numerical Algorithms
Parallel Numerical Algorithms http://sudalabissu-tokyoacjp/~reiji/pna16/ [ 5 ] MPI: Message Passing Interface Parallel Numerical Algorithms / IST / UTokyo 1 PNA16 Lecture Plan General Topics 1 Architecture
More informationParallel Programming Using MPI
Parallel Programming Using MPI Short Course on HPC 15th February 2019 Aditya Krishna Swamy adityaks@iisc.ac.in SERC, Indian Institute of Science When Parallel Computing Helps? Want to speed up your calculation
More informationCEE 618 Scientific Parallel Computing (Lecture 5): Message-Passing Interface (MPI) advanced
1 / 32 CEE 618 Scientific Parallel Computing (Lecture 5): Message-Passing Interface (MPI) advanced Albert S. Kim Department of Civil and Environmental Engineering University of Hawai i at Manoa 2540 Dole
More informationPARALLEL AND DISTRIBUTED COMPUTING
PARALLEL AND DISTRIBUTED COMPUTING 2013/2014 1 st Semester 1 st Exam January 7, 2014 Duration: 2h00 - No extra material allowed. This includes notes, scratch paper, calculator, etc. - Give your answers
More informationProgramming Scalable Systems with MPI. Clemens Grelck, University of Amsterdam
Clemens Grelck University of Amsterdam UvA / SurfSARA High Performance Computing and Big Data Course June 2014 Parallel Programming with Compiler Directives: OpenMP Message Passing Gentle Introduction
More informationParallel Computing Paradigms
Parallel Computing Paradigms Message Passing João Luís Ferreira Sobral Departamento do Informática Universidade do Minho 31 October 2017 Communication paradigms for distributed memory Message passing is
More informationPractical Course Scientific Computing and Visualization
July 5, 2006 Page 1 of 21 1. Parallelization Architecture our target architecture: MIMD distributed address space machines program1 data1 program2 data2 program program3 data data3.. program(data) program1(data1)
More informationMA471. Lecture 5. Collective MPI Communication
MA471 Lecture 5 Collective MPI Communication Today: When all the processes want to send, receive or both Excellent website for MPI command syntax available at: http://www-unix.mcs.anl.gov/mpi/www/ 9/10/2003
More informationCluster Computing MPI. Industrial Standard Message Passing
MPI Industrial Standard Message Passing MPI Features Industrial Standard Highly portable Widely available SPMD programming model Synchronous execution MPI Outer scope int MPI_Init( int *argc, char ** argv)
More informationMPI Optimisation. Advanced Parallel Programming. David Henty, Iain Bethune, Dan Holmes EPCC, University of Edinburgh
MPI Optimisation Advanced Parallel Programming David Henty, Iain Bethune, Dan Holmes EPCC, University of Edinburgh Overview Can divide overheads up into four main categories: Lack of parallelism Load imbalance
More informationAcknowledgments. Programming with MPI Basic send and receive. A Minimal MPI Program (C) Contents. Type to enter text
Acknowledgments Programming with MPI Basic send and receive Jan Thorbecke Type to enter text This course is partly based on the MPI course developed by Rolf Rabenseifner at the High-Performance Computing-Center
More informationProgramming with MPI Basic send and receive
Programming with MPI Basic send and receive Jan Thorbecke Type to enter text Delft University of Technology Challenge the future Acknowledgments This course is partly based on the MPI course developed
More informationMPI Tutorial. Shao-Ching Huang. High Performance Computing Group UCLA Institute for Digital Research and Education
MPI Tutorial Shao-Ching Huang High Performance Computing Group UCLA Institute for Digital Research and Education Center for Vision, Cognition, Learning and Art, UCLA July 15 22, 2013 A few words before
More informationParallel Programming in C with MPI and OpenMP
Parallel Programming in C with MPI and OpenMP Michael J. Quinn Chapter 8 Matrix-vector Multiplication Chapter Objectives Review matrix-vector multiplication Propose replication of vectors Develop three
More informationAdvanced MPI. Andrew Emerson
Advanced MPI Andrew Emerson (a.emerson@cineca.it) Agenda 1. One sided Communications (MPI-2) 2. Dynamic processes (MPI-2) 3. Profiling MPI and tracing 4. MPI-I/O 5. MPI-3 22/02/2017 Advanced MPI 2 One
More informationParallel Programming. Matrix Decomposition Options (Matrix-Vector Product)
Parallel Programming Matrix Decomposition Options (Matrix-Vector Product) Matrix Decomposition Sequential algorithm and its complexity Design, analysis, and implementation of three parallel programs using
More informationA few words about MPI (Message Passing Interface) T. Edwald 10 June 2008
A few words about MPI (Message Passing Interface) T. Edwald 10 June 2008 1 Overview Introduction and very short historical review MPI - as simple as it comes Communications Process Topologies (I have no
More informationCopyright The McGraw-Hill Companies, Inc. Permission required for reproduction or display. Chapter 8
Chapter 8 Matrix-vector Multiplication Chapter Objectives Review matrix-vector multiplicaiton Propose replication of vectors Develop three parallel programs, each based on a different data decomposition
More information