Message-Passing Thought Exercise

1 Message-Passing Thought Exercise: Traffic Modelling
David Henty, EPCC

2 traffic flow
- we want to predict traffic flow, to look for effects such as congestion
- build a computer model

3 simple traffic model
- divide the road into a series of cells, each either occupied or unoccupied
- perform a number of steps
- in each step, cars move forward if the space ahead is empty
- could do this by moving pawns on a chess board

4 traffic behaviour
- the model predicts a number of interesting features: traffic lights, congestion
- more complicated models are used in practice
[figure: average speed (0.0 to 1.0) plotted against density of cars (0% to 100%)]

5 how fast can we run the model?
- measure speed in Car Operations Per Second: how many COPs?
- around 2 COPs
- but what about three people? can they do six COPs?

6 a parallel traffic model
[figure: the road divided into sections worked on in parallel, labelled A, S, B, C]

7 State Table

If R_t(i) = 0, then R_{t+1}(i) is given by:

                   R_t(i+1) = 0    R_t(i+1) = 1
    R_t(i-1) = 0        0               0
    R_t(i-1) = 1        1               1

If R_t(i) = 1, then R_{t+1}(i) is given by:

                   R_t(i+1) = 0    R_t(i+1) = 1
    R_t(i-1) = 0        0               1
    R_t(i-1) = 1        0               1

(an empty cell is filled when the cell behind holds a car; an occupied cell empties when the cell ahead is free)

8 Pseudo Code (serial)

  declare arrays old(i) and new(i), i = 0,1,...,N,N+1
  initialise old(i) for i = 1,2,...,N-1,N (e.g. randomly)
  loop over iterations
    set old(0) = old(N) and set old(N+1) = old(1)
    loop over i = 1,...,N
      if old(i) = 1
        if old(i+1) = 1 then new(i) = 1 else new(i) = 0
      if old(i) = 0
        if old(i-1) = 1 then new(i) = 1 else new(i) = 0
    end loop over i
    set old(i) = new(i) for i = 1,2,...,N-1,N
  end loop over iterations
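The serial pseudocode translates almost line for line into C. A minimal runnable sketch, assuming a road of 100 cells, 20 steps and a random initial density of about 50% (these numbers, and the moved-car counter, are illustrative choices rather than anything from the slides):

  #include <stdio.h>
  #include <stdlib.h>

  #define N     100   /* number of road cells (illustrative)   */
  #define NITER 20    /* number of update steps (illustrative) */

  int main(void)
  {
      int old[N+2], new_[N+2];   /* cells 1..N plus boundary cells 0 and N+1 */

      /* initialise the road randomly: roughly half the cells occupied */
      for (int i = 1; i <= N; i++) old[i] = rand() % 2;

      for (int iter = 1; iter <= NITER; iter++) {
          /* periodic boundary conditions */
          old[0]   = old[N];
          old[N+1] = old[1];

          int nmove = 0;
          for (int i = 1; i <= N; i++) {
              if (old[i] == 1) {              /* car moves forward if space ahead */
                  new_[i] = (old[i+1] == 1);
                  if (old[i+1] == 0) nmove++;
              } else {                        /* empty cell filled from behind    */
                  new_[i] = (old[i-1] == 1);
              }
          }

          /* prepare for the next iteration */
          for (int i = 1; i <= N; i++) old[i] = new_[i];

          printf("step %2d: %d cars moved\n", iter, nmove);
      }
      return 0;
  }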

9 Pseudo Code (serial with subroutines)

  declare arrays old(i) and new(i), i = 0,1,...,N,N+1
  initialise old(i) for i = 1,2,...,N-1,N (e.g. randomly)
  loop over iterations
    ! Implement boundary conditions
    set old(0) = old(N) and set old(N+1) = old(1)
    ! Update road
    call newroad(new, old, N)
    ! Prepare for next iteration
    set old(i) = new(i) for i = 1,2,...,N-1,N
  end loop over iterations
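In C the update routine might look as follows; the signature, and returning a count of moved cars, are assumptions made for illustration, since the slides only name the routine:

  /* Update cells 1..n of the road from old into new_, returning the
     number of cars that moved this step (a sketch of newroad). */
  int newroad(int new_[], const int old[], int n)
  {
      int nmove = 0;
      for (int i = 1; i <= n; i++) {
          if (old[i] == 1) {                  /* move only if space ahead */
              new_[i] = (old[i+1] == 1);
              if (old[i+1] == 0) nmove++;
          } else {                            /* car arrives from behind  */
              new_[i] = (old[i-1] == 1);
          }
      }
      return nmove;
  }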

10 Pseudo Code (distributed memory)

  ! assume we are running on P processes
  declare arrays old(i) and new(i), i = 0,1,...,N/P,N/P+1
  initialise old(i) for i = 1,2,...,N/P-1,N/P (e.g. randomly)
  loop over iterations
    ! Implement boundary conditions (processes arranged as a ring)
    set old(0) on this process to old(N/P) from previous process
    set old(N/P+1) on this process to old(1) from next process
    ! Update road
    call newroad(new, old, N/P)
    ! Prepare for next iteration
    set old(i) = new(i) for i = 1,2,...,N/P-1,N/P
  end loop over iterations
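The ring of processes is easy to set up from the MPI rank. A sketch (the function and variable names are assumptions, reused by the later examples, not names from the slides):

  #include <mpi.h>

  /* Compute the ring neighbours for the 1-D domain decomposition. */
  void ring_neighbours(int *prev, int *next)
  {
      int rank, size;
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);
      MPI_Comm_size(MPI_COMM_WORLD, &size);
      *next = (rank + 1) % size;          /* holds the cells ahead of mine  */
      *prev = (rank - 1 + size) % size;   /* holds the cells behind mine    */
  }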

11 Halo swapping

  ! Implement boundary conditions
  set old(0) on this process to old(N/P) from previous process
  set old(N/P+1) on this process to old(1) from next process

Implement this using blocking receives (e.g. MPI_Recv) and one of:
- synchronous send (routine blocks until the message is received), e.g. MPI_Ssend
- asynchronous send (message copied into a buffer, returns straight away), e.g. MPI_Bsend
- non-blocking synchronous send (no buffering but immediate return), e.g. MPI_Issend / MPI_Wait

12 Synchronous sends

  ! Implement boundary conditions
  Ssend(old(N/P), up)
  Recv (old(0), down)
  Ssend(old(1), down)
  Recv (old(N/P+1), up)

Guaranteed to deadlock: every process blocks in its first Ssend, which cannot complete until a matching receive is posted, and no process ever reaches its Recv.
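The same pattern in MPI C (a sketch: nlocal stands for N/P, prev/next come from ring_neighbours() above, and old[] holds cells 0..nlocal+1):

  /* every rank blocks in this synchronous send, so no rank ever
     reaches its receive: guaranteed deadlock */
  MPI_Ssend(&old[nlocal], 1, MPI_INT, next, 0, MPI_COMM_WORLD);
  MPI_Recv(&old[0], 1, MPI_INT, prev, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
  MPI_Ssend(&old[1], 1, MPI_INT, prev, 1, MPI_COMM_WORLD);
  MPI_Recv(&old[nlocal+1], 1, MPI_INT, next, 1, MPI_COMM_WORLD, MPI_STATUS_IGNORE);

(One standard fix, not covered on these slides, is to use MPI_Sendrecv or to break the symmetry so that, say, even ranks send first while odd ranks receive first.)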

13 Asynchronous (buffered) sends

  ! Implement boundary conditions
  Bsend(old(N/P), up)
  Recv (old(0), down)
  Bsend(old(1), down)
  Recv (old(N/P+1), up)

Where do synchronisation issues become important?
- call newroad(new, old, N/P)? OK because we are writing new but only reading old
- set old(i) = new(i)? only OK because Bsend has copied old(1) and old(N/P)

We don't really care if/when the message is received; we do really care if/when we can safely reuse the local send buffers.
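With buffered sends the receives are always reached, but MPI requires the buffer space to be attached explicitly. A C sketch under the same assumptions as before:

  #include <mpi.h>
  #include <stdlib.h>

  /* Buffered-send halo swap: MPI_Bsend copies each message into the
     attached buffer and returns straight away. */
  void halo_swap_bsend(int *old, int nlocal, int prev, int next)
  {
      int bufsize = 2 * (sizeof(int) + MPI_BSEND_OVERHEAD);
      char *buffer = malloc(bufsize);
      MPI_Buffer_attach(buffer, bufsize);

      MPI_Bsend(&old[nlocal], 1, MPI_INT, next, 0, MPI_COMM_WORLD);
      MPI_Recv(&old[0], 1, MPI_INT, prev, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
      MPI_Bsend(&old[1], 1, MPI_INT, prev, 1, MPI_COMM_WORLD);
      MPI_Recv(&old[nlocal+1], 1, MPI_INT, next, 1, MPI_COMM_WORLD, MPI_STATUS_IGNORE);

      /* detach blocks until all buffered messages have been delivered */
      MPI_Buffer_detach(&buffer, &bufsize);
      free(buffer);
  }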

14 Non-blocking (immediate) sends

  ! Implement boundary conditions
  Issend(old(N/P), up)
  Recv  (old(0), down)
  Issend(old(1), down)
  Recv  (old(N/P+1), up)

  call newroad(new, old, N/P)
  set old(i) = new(i) for i = 1,2,...,N/P-1,N/P

15 Non-blocking (immediate) sends

  ! Implement boundary conditions
  Issend(old(N/P), up)
  Recv  (old(0), down)
  Issend(old(1), down)
  Recv  (old(N/P+1), up)

  call newroad(new, old, N/P)
  set old(i) = new(i) for i = 1,2,...,N/P-1,N/P

  ! Wait for communications to complete before next iteration
  wait(up)
  wait(down)

16 Non-blocking (immediate) sends

  ! Implement boundary conditions
  Issend(old(N/P), up)
  Recv  (old(0), down)
  Issend(old(1), down)
  Recv  (old(N/P+1), up)

  call newroad(new, old, N/P)
  set old(i) = new(i) for i = 1,2,...,N/P-1,N/P

  ! Wait for communications to complete before next iteration
  wait(up)
  wait(down)

Incorrect! Overwriting old is the key issue: we need to know that the boundary values of old have been sent before overwriting them.

17 Non-blocking sends: correct

  ! Implement boundary conditions
  Issend(old(N/P), up)
  Recv  (old(0), down)
  Issend(old(1), down)
  Recv  (old(N/P+1), up)

  call newroad(new, old, N/P)

  wait(up)
  wait(down)

  set old(i) = new(i) for i = 1,2,...,N/P-1,N/P
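Putting the corrected ordering into MPI C: a sketch under the same assumptions, using MPI_Issend with blocking receives and waiting on both sends before old is overwritten. newroad() is the routine sketched after slide 9:

  #include <mpi.h>

  int newroad(int new_[], const int old[], int n);   /* from the earlier sketch */

  /* One complete iteration with non-blocking synchronous sends. */
  void traffic_step(int *old, int *new_, int nlocal, int prev, int next)
  {
      MPI_Request up_req, down_req;

      /* post the sends without blocking, then do the blocking receives */
      MPI_Issend(&old[nlocal], 1, MPI_INT, next, 0, MPI_COMM_WORLD, &up_req);
      MPI_Recv(&old[0], 1, MPI_INT, prev, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
      MPI_Issend(&old[1], 1, MPI_INT, prev, 1, MPI_COMM_WORLD, &down_req);
      MPI_Recv(&old[nlocal+1], 1, MPI_INT, next, 1, MPI_COMM_WORLD, MPI_STATUS_IGNORE);

      newroad(new_, old, nlocal);   /* safe: writes new_, only reads old */

      /* old(1) and old(N/P) must not be overwritten until they are sent */
      MPI_Wait(&up_req, MPI_STATUS_IGNORE);
      MPI_Wait(&down_req, MPI_STATUS_IGNORE);

      for (int i = 1; i <= nlocal; i++) old[i] = new_[i];
  }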

18 Delaying the waits

  ! Implement boundary conditions
  Issend(old(N/P), up)
  Recv  (old(0), down)
  Issend(old(1), down)
  Recv  (old(N/P+1), up)

  call newroad(new, old, N/P)

  set old(i) = new(i) for i = 2,3,...,N/P-1
  wait(up)
  old(N/P) = new(N/P)
  wait(down)
  old(1) = new(1)
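In the C sketch above, only the tail of traffic_step() changes: the interior copy needs no waiting, and each boundary cell is copied as soon as its own send has completed:

  /* interior cells are not involved in any send */
  for (int i = 2; i <= nlocal - 1; i++) old[i] = new_[i];

  MPI_Wait(&up_req, MPI_STATUS_IGNORE);     /* old[nlocal] now safely sent */
  old[nlocal] = new_[nlocal];
  MPI_Wait(&down_req, MPI_STATUS_IGNORE);   /* old[1] now safely sent */
  old[1] = new_[1];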

19 RMA synchronisation
Similar synchronisation issues to non-blocking message passing, but worse!

20 Remote Memory Access
Imagine we can do halo swaps directly with remote reads or writes:
- where do synchronisation issues become important?
- what assumptions are you making about remote reads and writes?

Consider remote reads first:

  old(0) = old(N/P) from previous process
  old(N/P+1) = old(1) from next process
  call newroad(new, old, N/P)
  set old(i) = new(i) for i = 1,2,...,N/P-1,N/P

21 Remote Memory Access
Imagine we can do halo swaps directly with remote reads or writes:
- where do synchronisation issues become important?
- what assumptions are you making about remote reads and writes?

Consider remote reads first (assuming reads are blocking, like Recv):

  old(0) = old(N/P) from previous process
  old(N/P+1) = old(1) from next process
  call newroad(new, old, N/P)
  ! synchronise to ensure my old values have all been read
  set old(i) = new(i) for i = 1,2,...,N/P-1,N/P
  ! synchronise to ensure neighbours' old values have been
  ! updated before I read them on the next iteration

22 Remote Memory Access
Imagine we can do halo swaps directly with remote reads or writes:
- where do synchronisation issues become important?
- what assumptions are you making about remote reads and writes?

Consider remote writes:

  set old(0) on next process = old(N/P)
  set old(N/P+1) on previous process = old(1)
  call newroad(new, old, N/P)
  set old(i) = new(i) for i = 1,2,...,N/P-1,N/P

23 Remote Memory Access
Imagine we can do halo swaps directly with remote reads or writes:
- where do synchronisation issues become important?
- what assumptions are you making about remote reads and writes?

Consider remote writes:

  set old(0) on next process = old(N/P)
  set old(N/P+1) on previous process = old(1)
  ! synchronise to ensure my halos on old have been updated
  call newroad(new, old, N/P)
  set old(i) = new(i) for i = 1,2,...,N/P-1,N/P

24 Remote Memory Access
Imagine we can do halo swaps directly with remote reads or writes:
- where do synchronisation issues become important?
- what assumptions are you making about remote reads and writes?

Consider remote writes (assuming writes behave like a Bsend):

  set old(0) on next process = old(N/P)
  set old(N/P+1) on previous process = old(1)
  ! synchronise to ensure my halos on old have been updated
  call newroad(new, old, N/P)
  set old(i) = new(i) for i = 1,2,...,N/P-1,N/P
  ! synchronise to ensure my neighbours have finished with their
  ! old arrays (in newroad) before overwriting them
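MPI's one-sided interface is one way to realise the remote-write version. A sketch, assuming old[] (cells 0..nlocal+1) has been exposed in the window win with displacement unit sizeof(int); here a pair of MPI_Win_fence calls provides both synchronisations on the slide, with the opening fence of one iteration doubling as the closing synchronisation of the previous one:

  #include <mpi.h>

  int newroad(int new_[], const int old[], int n);   /* from the earlier sketch */

  /* One iteration of the remote-write scheme using MPI_Put. */
  void traffic_step_rma(int *old, int *new_, int nlocal, int prev, int next,
                        MPI_Win win)
  {
      /* opening fence: neighbours have finished with their old arrays
         (in newroad), so it is now safe to write into their halos */
      MPI_Win_fence(0, win);
      MPI_Put(&old[nlocal], 1, MPI_INT, next, 0, 1, MPI_INT, win);
      MPI_Put(&old[1], 1, MPI_INT, prev, nlocal + 1, 1, MPI_INT, win);
      /* closing fence: my halos on old have now been updated */
      MPI_Win_fence(0, win);

      newroad(new_, old, nlocal);
      for (int i = 1; i <= nlocal; i++) old[i] = new_[i];
  }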

25 Summary
- synchronisation in PGAS approaches is not simple: it is easy to write programs with subtle synchronisation errors
- think first, code later!
