Bryan Carpenter, School of Computing
1 Bryan Carpenter, School of Computing 1
2 Plan
Brief reprise of parallel computers and programming models.
Overview of MPI, with illustrative example.
Brief comparison with OpenMP.
Use of MPI in GADGET 2.
3 What is MPI?
MPI is a programming interface for sending and receiving messages in C or Fortran programs.
Beyond this, it effectively provides a general framework for Single Program Multiple Data (SPMD) computing on distributed memory parallel computers.
4 Parallel Computers
Traditional dichotomy: shared vs distributed memory.
[Figure: a shared memory machine and a distributed memory machine, showing processors, memory and fast interconnect]
5 Programming Frameworks
Shared memory (cooperating threads):
    OpenMP
    POSIX Threads
    Cilk
    Threading Building Blocks
    etc, etc
Distributed memory (cooperating processes):
    MPI (PVM, etc)
    Co-array Fortran, UPC, etc
    Global Array Toolkit (etc)
    Adlib and HPspmd?!
    etc, etc
6 Real World Hybrids
Clusters of shared memory nodes are typical of large modern systems, e.g. SCIAMA.
[Figure: node 0 and node 1, each with its own memory, joined by a fast interconnect; "... repeat 42 times"]
7 Programming Hybrid Machines
Can be programmed using a combination of a shared memory framework (e.g. OpenMP) within nodes and a distributed memory framework (e.g. MPI) between nodes.
But it is easier to adopt the lowest common denominator: use MPI (say) across all cores.
8 General MIMD Programming
A program contains different types of interacting process, each coded separately, e.g.:
Routine for process A:
    ... some computation ...
    sendto(b, values) ;
    ... etc ...
Routine for process B:
    recvfrom(a, values) ;
    ... process values ...
    ... etc ...
MIMD: Multiple Instruction, Multiple Data.
9 Single Program, Multiple Data
The general MIMD style is not usually appropriate for programming massively parallel systems (perhaps OK for distributed systems).
The mindset, rather, is that all cores work collectively on a single task.
The same program is executed by all cores, though a different subset of the data is processed by each; hence SPMD.
10 SPMD For Two Processes
Contrived example (see subsequent discussion):
    me = getid();   // numeric id of this process
    if(me == 0) {
        ... some computation ...
        sendto(1, values) ;
    }
    else if (me == 1) {
        recvfrom(0, values) ;
        ... process values ...
    }
    ... etc ...
11 Comments
Illustrates a principle, but real SPMD programs are almost never like this.
Instead, all processes do similar processing (on a different subset of data), then they all exchange portions of their data with peers.
Behaviour is conditioned by the me value, but all processes are doing similar kinds of things most of the time.
The exception is often node 0, which may at least part of the time be engaged in unique I/O or coordination roles.
Here node = process (note the overloaded terminology!).
12 Collective Behaviour
The program as a whole behaves collectively.
Non-trivial parallel tasks can also be subdivided ("horizontally") into individual phases that are collective, e.g. a single iteration of a relaxation solver.
Each iteration can be further broken down into smaller collective phases:
    1. Update locally held elements
    2. Exchange edge elements with neighbours
13 Collective Behaviour 2
Some data is global: the same for all processes.
A trivial example is the overall iteration count for our relaxation solver.
Every process holds a copy of the iteration number, but it is updated identically by all processes as iterations unfold.
Contrast with shared memory programming, e.g. OpenMP, where typically this count would be maintained just by the master thread.
In (distributed memory) SPMD, copies of global information are often maintained and updated redundantly by all processes. And this is the right way to do things!
15 Features of MPI
The first MPI standard introduced the important abstraction of a communicator, which is an object something like an N-way communication channel, connecting all members of a group of cooperating processes.
This was partly to support using multiple parallel libraries without interference.
It also introduced a novel concept of datatypes, used to describe the contents of communication buffers, partly to support zero-copy message transfer.
16 Minimal MPI Program (C language)
    #include <stdio.h>
    #include <string.h>
    #include <mpi.h>

    int main(int argc, char* argv []) {
        int me ;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &me);
        if(me == 1) {
            char* message = "Hello from process 1\n";
            /* strlen + 1 so that the terminating '\0' is sent too */
            MPI_Send(message, (int) strlen(message) + 1, MPI_BYTE, 0, 111, MPI_COMM_WORLD);
        } else if(me == 0) {
            char buffer [100];
            MPI_Status status ;
            MPI_Recv(buffer, 100, MPI_BYTE, 1, 111, MPI_COMM_WORLD, &status);
            printf("%s", buffer);   /* never pass received data as the format string */
        }
        MPI_Finalize();
        return 0;
    }
17 Running on SCIAMA
Setting up environment and compiling on login node:
    $ module load compilers/intel/intel-64 mpi/intel/openmpi/1.4.3
    $ mpicc example1.c
Following job script in (say) jobscript1.sh:
    #!/bin/bash
    #PBS -l nodes=2:ppn=1
    #PBS -l walltime=00:01:00
    #PBS -d /users/dbc/creche/examples
    . /etc/profile.d/modules.sh
    module load compilers/intel/intel-64 mpi/intel/openmpi/1.4.3
    mpirun a.out
18 Running on SCIAMA 2
    $ qsub ./jobscript1.sh
    9470.headnode1.sciama.icg.port.ac.uk
    $ ls
    [...] jobscript1.sh.e9470 jobscript1.sh.o9470
    $ cat jobscript1.sh.o9470
    [...]
    =======================================================
    Job Output Follows:
    =======================================================
    Hello from process 1
    =================================================================
    Torque job completed on Sun Feb 27 14:49:02 GMT 2011 with exit status of 0
    =================================================================
19 The Same in Fortran
    PROGRAM main
    USE mpi
    ! the status argument must be an array of MPI_STATUS_SIZE integers
    INTEGER me, ierr, stat(MPI_STATUS_SIZE)
    CHARACTER*20 message, buffer
    CALL MPI_INIT(ierr)
    CALL MPI_COMM_RANK(MPI_COMM_WORLD, me, ierr)
    IF (me .eq. 1) THEN
        message = 'Hello from process 1'
        CALL MPI_SEND(message, 20, MPI_CHARACTER, 0, 111, MPI_COMM_WORLD, ierr)
    ELSE IF (me .eq. 0) THEN
        CALL MPI_RECV(buffer, 20, MPI_CHARACTER, 1, 111, MPI_COMM_WORLD, stat, ierr)
        PRINT *, buffer
    ENDIF
    CALL MPI_FINALIZE(ierr)
    STOP
    END
20 MPI Program Structure
An MPI program is an ordinary C application; the code that runs on every node starts in the main() function.
Functions, types and constants of MPI are declared in mpi.h, which should be included.
MPI is initialized by calling MPI_Init(). You should forward the parameters of main() to MPI_Init().
Call MPI_Finalize() to shut down MPI before main() returns. Failing to do this may cause your executable to not terminate properly.
21 Simple send and receive
Basic send and receive:
    int MPI_Send(void* buf, int count, MPI_Datatype type, int dst, int tag, MPI_Comm comm)
    int MPI_Recv(void* buf, int count, MPI_Datatype type, int src, int tag, MPI_Comm comm, MPI_Status *status)
The parameters buf, count, type describe the data buffer, i.e. the storage of the data that is sent or received (see next slide).
dst is the rank of the destination process relative to communicator comm. Similarly in MPI_Recv(), src is the rank of the source process.
An arbitrarily chosen tag value can be used in MPI_Recv() to select between several incoming messages: the call will wait until a message sent with a matching tag value arrives.
MPI_Recv() fills in an MPI_Status value through its status argument, discussed later.
22 Communication Buffers
Most of the communication operations take a sequence of parameters like
    void* buf, int count, MPI_Datatype type
In the actual arguments passed to these functions, buf should be interpreted as an array of elements whose type is consistent with the type parameter.
MPI_Datatype can describe many primitive and user-defined types that occur in C or Fortran.
count is the number of items to send.
23 Communicators
The MPI_Comm type represents an MPI communicator.
All communication operations logically go through communicators.
A communicator spans a group of processes: the participants in some kind of parallel task or subtask.
In MPI, process ids are called ranks, and they are always relative to a particular process group.
Many programmers only ever use the predefined, global communicator MPI_COMM_WORLD!
Ranks relative to MPI_COMM_WORLD are the obvious process ids between 0 and P-1.
24 Rank and Size Example
    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char* argv []) {
        int me, p ;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &me);   /* rank of this process */
        MPI_Comm_size(MPI_COMM_WORLD, &p);    /* total number of processes */
        printf("hello from process %d of %d\n", me, p);
        MPI_Finalize();
        return 0;
    }
25 Discrete Poisson Solver
Solves $a_{i,j-1} + a_{i,j+1} + a_{i-1,j} + a_{i+1,j} - 4a_{i,j} = f_{i,j}$ by red-black relaxation.
Consider an N by N grid with periodic BCs, for simplicity.
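The update used in the relaxation loops on later slides follows from solving the discrete equation for the centre value. A short derivation, using the same symbols as the equation above:

```latex
\begin{align*}
a_{i,j-1} + a_{i,j+1} + a_{i-1,j} + a_{i+1,j} - 4a_{i,j} &= f_{i,j} \\
4a_{i,j} &= a_{i-1,j} + a_{i+1,j} + a_{i,j-1} + a_{i,j+1} - f_{i,j} \\
a_{i,j} &= \tfrac{1}{4}\left(a_{i-1,j} + a_{i+1,j} + a_{i,j-1} + a_{i,j+1} - f_{i,j}\right)
\end{align*}
```

Red-black relaxation applies the last line to the cells with $i + j$ even in one sweep and to the cells with $i + j$ odd in the next. Every neighbour of a cell has the opposite colour, so within a sweep the cells being updated only read values that are not being written.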
26 Data Decomposition
Divide a and f arrays over P processes, each with B = N/P rows.
[Figure: the grid split into horizontal strips owned by process 0, process 1, ..., process P-1]
27 Local Update
Initially ignoring edges; assume B is even:
    for (iter = 0 ; iter < NITER ; iter++) {
        for(i = 0 ; i < B ; i++)
            for(j = (iter + i) % 2 ; j < N ; j += 2) {
                a [i][j] = 0.25 * (a [i - 1][j] + a [i + 1][j] +
                                   a [i][j - 1] + a [i][j + 1] - f [i][j]);
            }
    }
28 Ghost Regions
The biggest problem is with access to elements a[i-1][j] and a[i+1][j], which may fall in the segment of the array held by another process.
Deal with this by declaring local arrays with ghost extensions (extra rows, in this example).
29 Data with Ghost Extensions
[Figure: the strips of process 0, process 1, ..., process P-1, each extended by ghost regions]
30 Possible C Declarations
    #define P 4          // number of processors
    #define N 8          // array size (multiple of P)
    #define B (N/P)      // local block size
    float ag [B + 2][N] ;      // local block plus ghost regions
    float f [B][N] ;           // local block
    float (*a)[N] = ag + 1 ;   // just local block
                               // (pointer magic to skip lower ghost row)
31 Edge Swap
[Figure: edge rows exchanged between neighbouring processes 0, 1, ..., P-1]
32 MPI Code
    void edgeswap() {
        int prev, next ;
        MPI_Status status ;
        prev = (me + P - 1) % P ;
        next = (me + 1) % P ;
        // First row to high ghost row on previous processor
        MPI_Sendrecv(a [0], N, MPI_FLOAT, prev, 111,
                     ag [B + 1], N, MPI_FLOAT, next, 111,
                     MPI_COMM_WORLD, &status) ;
        // Last row to low ghost row on next processor
        MPI_Sendrecv(a [B - 1], N, MPI_FLOAT, next, 111,
                     ag [0], N, MPI_FLOAT, prev, 111,
                     MPI_COMM_WORLD, &status) ;
    }
33 Send-receive
Combines a basic send and a basic receive in a single operation:
    int MPI_Sendrecv(
        void* sendbuf, int sendcount, MPI_Datatype sendtype, int dst, int sendtag,
        void* recvbuf, int recvcount, MPI_Datatype recvtype, int src, int recvtag,
        MPI_Comm comm, MPI_Status *status)
Can be more efficient.
More importantly, avoids a potential deadlock in this example.
34 Final Main Loop
    for (iter = 0 ; iter < NITER ; iter++) {
        edgeswap() ;
        for(i = 0 ; i < B ; i++)
            for(j = (iter + i) % 2 ; j < N ; j += 2) {
                a [i][j] = 0.25 * (a [i - 1][j] + a [i + 1][j] +
                                   a [i][(j + N - 1) % N] + a [i][(j + 1) % N] -
                                   f [i][j]);
            }
    }
    ... Optional debugging output ...
35 Initialization
    /* Point source and initial approximation to solution */
    for(i = 0 ; i < B ; i++)
        for(j = 0 ; j < N ; j++) {
            int x = me * B + i ;   /* global indices from local */
            int y = j ;
            if (x == N / 2 && y == N / 2) {
                f [i][j] = 1 ;
            } else {
                f [i][j] = 0 ;
            }
            a [i][j] = 0 ;
        }
36 Printing a Distributed Array
Simplified: assumes enough memory to hold the whole array on the root process.
    float a0 [N][N] ;
    int i, j ;
    MPI_Gather(a, N * B, MPI_FLOAT,
               a0, N * B, MPI_FLOAT, 0, MPI_COMM_WORLD) ;
    if(me == 0) {
        for(i = 0 ; i < N ; i++) {
            for(j = 0 ; j < N ; j++) {
                printf("%8.3f", a0 [i] [j]) ;
            }
            printf("\n");
        }
    }
37 Gather: a Collective Operation
Collectives must be called by all processes (spanned by the communicator):
    int MPI_Gather(
        void* sendbuf, int sendcount, MPI_Datatype sendtype,
        void* recvbuf, int recvcount, MPI_Datatype recvtype,
        int root, MPI_Comm comm)
Unlike point-to-point calls, MPI_Gather takes no destination, source or tag arguments; recvcount is the number of items received from each process.
[Figure: blocks from process 0, ..., process P-1 collected into one array on process root]
38 Other Important Collectives
MPI_Bcast: broadcast from some root process to all processes.
MPI_Scatter: the opposite of gather.
MPI_Reduce: sum, product, etc of values from all processes.
etc: a handful of others.
40 OpenMP
OpenMP (which rather misleadingly styles itself an "API") is a set of compiler directives and supporting libraries for exploiting shared memory computers.
Typically more concise than parallelization with MPI, but there are still pitfalls, e.g. race conditions.
Works within a node (at most 12 cores on SCIAMA, for example).
41 Poisson Main Loop with OMP
    for (iter = 0 ; iter < NITER ; iter++) {
        #pragma omp parallel for private (j)
        for(i = 0 ; i < N ; i++)
            for(j = (iter + i) % 2 ; j < N ; j += 2) {
                a [i][j] = 0.25 * (
                    a [(i + N - 1) % N][j] + a [(i + 1) % N][j] +
                    a [i][(j + N - 1) % N] + a [i][(j + 1) % N] -
                    f [i][j]);
            }
    }
    ... Optional debugging output ...
42 Running on SCIAMA
Setting up environment and compiling on login node:
    $ module load compilers/intel/intel-64 mpi/intel/openmpi/1.4.3
    $ gcc -fopenmp omp_poisson.c
Submit using e.g. the following job script:
    #!/bin/bash
    #PBS -l nodes=1:ppn=12
    #PBS -l walltime=00:01:00
    #PBS -d /users/dbc/creche/examples
    . /etc/profile.d/modules.sh
    module load compilers/intel/intel-64 mpi/intel/openmpi/1.4.3
    export OMP_NUM_THREADS=12
    ./a.out
43 Code Complexity
For what it's worth, my MPI version of the Poisson solver is 99 lines of code; my OpenMP version is 62 lines.
Note the private (j) clause in the OMP directive: try running without it!
45 Gadget 2
Gadget 2 is a free-software production code for cosmological N-body (and hydrodynamic) computations.
Written by Volker Springel, of the Max Planck Institute for Astrophysics, Garching.
It is written in C and is already parallelized using MPI.
46 Gadget Main Loop
Simplified view of the Gadget code:
    Initialize
    while (not done) {
        move_particles();               // update positions
        domain_decomposition();
        compute_accelerations();
        advance_and_find_timesteps();   // update velocities
    }
Most of the interesting work happens in compute_accelerations and domain_decomposition.
47 Domain Decomposition
Need to divide space and/or the particle set into domains, each domain handled by a single processor.
Can't just divide space evenly, because some regions will have many more particles than others: poor load balancing.
Can't just divide particles evenly, because particles move throughout space, and we want to keep physically close particles on the same processor as far as practical: a communication problem.
48 Peano-Hilbert Curve Warren and Salmon originally suggested using a space-filling curve : Picture borrowed from 48
49 Peano-Hilbert Key
Gadget applies the recursion 20 times, logically dividing space into cells on the Peano-Hilbert curve.
Each cell can then be labelled by its location along the Peano-Hilbert curve; the $2^{60}$ possible locations comfortably fit into a 64-bit word.
50 Distribution of BH Tree in Gadget Ibid. 50
51 Distributed Representation of Tree
Every processor holds a copy of the root nodes, and a copy of all child nodes down to the point where all particles of a node are held on a single remote processor (cf. the discussion of global data in the introduction, and the last slide).
Remotely held nodes are called pseudo-particles.
To compute the force on a single local target particle, traverse the tree from the root as usual, and accumulate contributions from locally held particles.
Build an export list containing the target particle and the hosts of the pseudo-particles encountered in the walk.
52 Communication
    1. After local computation for all target particles, process the export list and send the list of local target particles to all hosts that own pseudo-particle nodes needed for those particles.
    2. All processors do another tree walk to compute their contributions to remotely owned (from their point of view) target particles.
    3. These contributions are returned to the original processor, and added into the accelerations for the target particles.
53 Use of MPI in GADGET 2
Uses collective communications wherever possible throughout the code.
But some sections have more intricate patterns of communication, not supported by MPI collectives.
These sections are generally harder to understand.
54 The Hard Parts
Export of particles to other nodes, for calculation of remote contribution to force, density, etc, and retrieval of results (see above).
A partial distributed sort of particles, according to Peano-Hilbert key: this implements domain decomposition.
For TreePM: projection of particle density to a regular grid for calculation of the long range force; scatter of results back to irregularly distributed particles.
55 Use of Point-to-point
In these difficult sections of code, it is notable that there is extensive use of low-level MPI point-to-point communication (i.e. send/recv, not collective).
The code is difficult to understand, and presumably difficult to maintain.
56 A Program of Development Devise higher-level libraries of operations akin to MPI collectives, but more application-specific e.g. for 1. Collective Asynchronous Remote Invocation. 56
More informationMPI Lab. Steve Lantz Susan Mehringer. Parallel Computing on Ranger and Longhorn May 16, 2012
MPI Lab Steve Lantz Susan Mehringer Parallel Computing on Ranger and Longhorn May 16, 2012 1 MPI Lab Parallelization (Calculating p in parallel) How to split a problem across multiple processors Broadcasting
More informationData parallelism. [ any app performing the *same* operation across a data stream ]
Data parallelism [ any app performing the *same* operation across a data stream ] Contrast stretching: Version Cores Time (secs) Speedup while (step < NumSteps &&!converged) { step++; diffs = 0; foreach
More informationThe Message Passing Interface (MPI): Parallelism on Multiple (Possibly Heterogeneous) CPUs
1 The Message Passing Interface (MPI): Parallelism on Multiple (Possibly Heterogeneous) s http://mpi-forum.org https://www.open-mpi.org/ Mike Bailey mjb@cs.oregonstate.edu Oregon State University mpi.pptx
More informationChip Multiprocessors COMP Lecture 9 - OpenMP & MPI
Chip Multiprocessors COMP35112 Lecture 9 - OpenMP & MPI Graham Riley 14 February 2018 1 Today s Lecture Dividing work to be done in parallel between threads in Java (as you are doing in the labs) is rather
More informationLecture 6: Message Passing Interface
Lecture 6: Message Passing Interface Introduction The basics of MPI Some simple problems More advanced functions of MPI A few more examples CA463D Lecture Notes (Martin Crane 2013) 50 When is Parallel
More informationParallel Programming. Using MPI (Message Passing Interface)
Parallel Programming Using MPI (Message Passing Interface) Message Passing Model Simple implementation of the task/channel model Task Process Channel Message Suitable for a multicomputer Number of processes
More informationMPI MESSAGE PASSING INTERFACE
MPI MESSAGE PASSING INTERFACE David COLIGNON, ULiège CÉCI - Consortium des Équipements de Calcul Intensif http://www.ceci-hpc.be Outline Introduction From serial source code to parallel execution MPI functions
More informationDistributed Memory Parallel Programming
COSC Big Data Analytics Parallel Programming using MPI Edgar Gabriel Spring 201 Distributed Memory Parallel Programming Vast majority of clusters are homogeneous Necessitated by the complexity of maintaining
More informationThe Message Passing Interface (MPI) TMA4280 Introduction to Supercomputing
The Message Passing Interface (MPI) TMA4280 Introduction to Supercomputing NTNU, IMF January 16. 2017 1 Parallelism Decompose the execution into several tasks according to the work to be done: Function/Task
More informationHigh Performance Computing Course Notes Message Passing Programming I
High Performance Computing Course Notes 2008-2009 2009 Message Passing Programming I Message Passing Programming Message Passing is the most widely used parallel programming model Message passing works
More informationIntroduction to Parallel and Distributed Systems - INZ0277Wcl 5 ECTS. Teacher: Jan Kwiatkowski, Office 201/15, D-2
Introduction to Parallel and Distributed Systems - INZ0277Wcl 5 ECTS Teacher: Jan Kwiatkowski, Office 201/15, D-2 COMMUNICATION For questions, email to jan.kwiatkowski@pwr.edu.pl with 'Subject=your name.
More informationWorking with IITJ HPC Environment
Working with IITJ HPC Environment by Training Agenda for 23 Dec 2011 1. Understanding Directory structure of IITJ HPC 2. User vs root 3. What is bash_profile 4. How to install any source code in your user
More informationIntroduction to MPI. SuperComputing Applications and Innovation Department 1 / 143
Introduction to MPI Isabella Baccarelli - i.baccarelli@cineca.it Mariella Ippolito - m.ippolito@cineca.it Cristiano Padrin - c.padrin@cineca.it Vittorio Ruggiero - v.ruggiero@cineca.it SuperComputing Applications
More informationMPI MPI. Linux. Linux. Message Passing Interface. Message Passing Interface. August 14, August 14, 2007 MPICH. MPI MPI Send Recv MPI
Linux MPI Linux MPI Message Passing Interface Linux MPI Linux MPI Message Passing Interface MPI MPICH MPI Department of Science and Engineering Computing School of Mathematics School Peking University
More informationCS 6230: High-Performance Computing and Parallelization Introduction to MPI
CS 6230: High-Performance Computing and Parallelization Introduction to MPI Dr. Mike Kirby School of Computing and Scientific Computing and Imaging Institute University of Utah Salt Lake City, UT, USA
More informationL15: Putting it together: N-body (Ch. 6)!
Outline L15: Putting it together: N-body (Ch. 6)! October 30, 2012! Review MPI Communication - Blocking - Non-Blocking - One-Sided - Point-to-Point vs. Collective Chapter 6 shows two algorithms (N-body
More informationComputer Architecture
Jens Teubner Computer Architecture Summer 2016 1 Computer Architecture Jens Teubner, TU Dortmund jens.teubner@cs.tu-dortmund.de Summer 2016 Jens Teubner Computer Architecture Summer 2016 2 Part I Programming
More informationMPI Tutorial. Shao-Ching Huang. High Performance Computing Group UCLA Institute for Digital Research and Education
MPI Tutorial Shao-Ching Huang High Performance Computing Group UCLA Institute for Digital Research and Education Center for Vision, Cognition, Learning and Art, UCLA July 15 22, 2013 A few words before
More informationPeter Pacheco. Chapter 3. Distributed Memory Programming with MPI. Copyright 2010, Elsevier Inc. All rights Reserved
An Introduction to Parallel Programming Peter Pacheco Chapter 3 Distributed Memory Programming with MPI 1 Roadmap Writing your first MPI program. Using the common MPI functions. The Trapezoidal Rule in
More informationDistributed Memory Programming with MPI. Copyright 2010, Elsevier Inc. All rights Reserved
An Introduction to Parallel Programming Peter Pacheco Chapter 3 Distributed Memory Programming with MPI 1 Roadmap Writing your first MPI program. Using the common MPI functions. The Trapezoidal Rule in
More informationCSE 160 Lecture 15. Message Passing
CSE 160 Lecture 15 Message Passing Announcements 2013 Scott B. Baden / CSE 160 / Fall 2013 2 Message passing Today s lecture The Message Passing Interface - MPI A first MPI Application The Trapezoidal
More informationMore about MPI programming. More about MPI programming p. 1
More about MPI programming More about MPI programming p. 1 Some recaps (1) One way of categorizing parallel computers is by looking at the memory configuration: In shared-memory systems, the CPUs share
More informationCOMP 322: Fundamentals of Parallel Programming
COMP 322: Fundamentals of Parallel Programming https://wiki.rice.edu/confluence/display/parprog/comp322 Lecture 37: Introduction to MPI (contd) Vivek Sarkar Department of Computer Science Rice University
More informationAdvanced MPI. Andrew Emerson
Advanced MPI Andrew Emerson (a.emerson@cineca.it) Agenda 1. One sided Communications (MPI-2) 2. Dynamic processes (MPI-2) 3. Profiling MPI and tracing 4. MPI-I/O 5. MPI-3 22/02/2017 Advanced MPI 2 One
More informationCOSC 6374 Parallel Computation. Message Passing Interface (MPI ) I Introduction. Distributed memory machines
Network card Network card 1 COSC 6374 Parallel Computation Message Passing Interface (MPI ) I Introduction Edgar Gabriel Fall 015 Distributed memory machines Each compute node represents an independent
More informationWhat s in this talk? Quick Introduction. Programming in Parallel
What s in this talk? Parallel programming methodologies - why MPI? Where can I use MPI? MPI in action Getting MPI to work at Warwick Examples MPI: Parallel Programming for Extreme Machines Si Hammond,
More informationThe Message Passing Model
Introduction to MPI The Message Passing Model Applications that do not share a global address space need a Message Passing Framework. An application passes messages among processes in order to perform
More informationlslogin3$ cd lslogin3$ tar -xvf ~train00/mpibasic_lab.tar cd mpibasic_lab/pi cd mpibasic_lab/decomp1d
MPI Lab Getting Started Login to ranger.tacc.utexas.edu Untar the lab source code lslogin3$ cd lslogin3$ tar -xvf ~train00/mpibasic_lab.tar Part 1: Getting Started with simple parallel coding hello mpi-world
More informationLecture 7: Distributed memory
Lecture 7: Distributed memory David Bindel 15 Feb 2010 Logistics HW 1 due Wednesday: See wiki for notes on: Bottom-up strategy and debugging Matrix allocation issues Using SSE and alignment comments Timing
More informationIntroduction to the Message Passing Interface (MPI)
Applied Parallel Computing LLC http://parallel-computing.pro Introduction to the Message Passing Interface (MPI) Dr. Alex Ivakhnenko March 4, 2018 Dr. Alex Ivakhnenko (APC LLC) Introduction to MPI March
More informationProgramming with MPI. Pedro Velho
Programming with MPI Pedro Velho Science Research Challenges Some applications require tremendous computing power - Stress the limits of computing power and storage - Who might be interested in those applications?
More informationO.I. Streltsova, D.V. Podgainy, M.V. Bashashin, M.I.Zuev
High Performance Computing Technologies Lecture, Practical training 9 Parallel Computing with MPI: parallel algorithm for linear algebra https://indico-hlit.jinr.ru/event/120/ O.I. Streltsova, D.V. Podgainy,
More informationAssignment 3 MPI Tutorial Compiling and Executing MPI programs
Assignment 3 MPI Tutorial Compiling and Executing MPI programs B. Wilkinson: Modification date: February 11, 2016. This assignment is a tutorial to learn how to execute MPI programs and explore their characteristics.
More informationParallel Programming in C with MPI and OpenMP
Parallel Programming in C with MPI and OpenMP Michael J. Quinn Chapter 4 Message-Passing Programming Learning Objectives n Understanding how MPI programs execute n Familiarity with fundamental MPI functions
More informationMPI MESSAGE PASSING INTERFACE
MPI MESSAGE PASSING INTERFACE David COLIGNON CÉCI - Consortium des Équipements de Calcul Intensif http://hpc.montefiore.ulg.ac.be Outline Introduction From serial source code to parallel execution MPI
More informationITCS 4/5145 Parallel Computing Test 1 5:00 pm - 6:15 pm, Wednesday February 17, 2016 Solutions Name:...
ITCS 4/5145 Parallel Computing Test 1 5:00 pm - 6:15 pm, Wednesday February 17, 016 Solutions Name:... Answer questions in space provided below questions. Use additional paper if necessary but make sure
More informationA few words about MPI (Message Passing Interface) T. Edwald 10 June 2008
A few words about MPI (Message Passing Interface) T. Edwald 10 June 2008 1 Overview Introduction and very short historical review MPI - as simple as it comes Communications Process Topologies (I have no
More informationDistributed Memory Programming with Message-Passing
Distributed Memory Programming with Message-Passing Pacheco s book Chapter 3 T. Yang, CS240A Part of slides from the text book and B. Gropp Outline An overview of MPI programming Six MPI functions and
More information