Review of MPI Part 2


Review of MPI Part 2. Russian-German School on High Performance Computer Systems, June 27th until July 6th, 2005, Novosibirsk. 3rd Day, 29th of June, 2005. HLRS, University of Stuttgart. Slide 1

Chap. 5 Virtual Topologies. 1. MPI Overview. 2. Process model and language bindings: MPI_Init(), MPI_Comm_rank(). 3. Messages and point-to-point communication. 4. Non-blocking communication. 5. Virtual topologies. 8. Collective communication. Slide 2

MPI Communicators. Groups are subsets of the processes of the program. Communicators are the handles on groups used for communication. The initial communicator MPI_COMM_WORLD consists of all processes, and every process within a communicator has a distinct rank. There are functions to split communicators or to cut processes out of them, e.g. MPI_Comm_split(MPI_COMM_WORLD, comm_rank%2, comm_rank, &new_comm); (figure: ranks 0-7 of MPI_COMM_WORLD, split into two new communicators). Slide 3

MPI Communicators. C: int MPI_Comm_split(MPI_Comm old, int color, int key, MPI_Comm *new); Fortran: MPI_COMM_SPLIT(old, color, key, new, IERROR) with INTEGER OLD, COLOR, KEY, NEW, IERROR. Splits the communicator into one new communicator per distinct color value; each subgroup contains all processes with the same color. The value of key determines the rank ordering within the new communicator. Example: MPI_Comm_split(MPI_COMM_WORLD, comm_rank%3, comm_rank, &new_comm); (figure: ranks 0-7 split into the subgroups {0,3,6}, {1,4,7}, {2,5}). Slide 4
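
A minimal sketch of such a split in C, assuming the two-way even/odd split of the previous slide (names such as new_comm are illustrative):

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int world_rank, new_rank;
    MPI_Comm new_comm;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);

    /* color = world_rank % 2: even ranks form one communicator, odd ranks the other */
    MPI_Comm_split(MPI_COMM_WORLD, world_rank % 2, world_rank, &new_comm);
    MPI_Comm_rank(new_comm, &new_rank);
    printf("world rank %d has rank %d in its sub-communicator\n", world_rank, new_rank);

    MPI_Comm_free(&new_comm);
    MPI_Finalize();
    return 0;
}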

MPI Communicators. C: int MPI_Comm_group(MPI_Comm comm, MPI_Group *group); Fortran: MPI_COMM_GROUP(comm, group, IERROR) with INTEGER comm, group, IERROR. C: int MPI_Group_excl(MPI_Group group, int n, int *ranks, MPI_Group *new); Fortran: MPI_GROUP_EXCL(group, n, ranks, new, IERROR) with INTEGER group, n, ranks, new, IERROR. Extract the group of processes of comm with MPI_Comm_group. Then one may modify it with MPI_Group_incl, MPI_Group_excl, MPI_Group_range_incl, ... Afterwards, the group is converted back into a communicator through the collective operation MPI_Comm_create. Slide 5
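
A hedged sketch of this group workflow, assuming we simply exclude rank 0 from MPI_COMM_WORLD (sub_group and sub_comm are illustrative names):

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, excl_ranks[1] = { 0 };
    MPI_Group world_group, sub_group;
    MPI_Comm sub_comm;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* extract the group of MPI_COMM_WORLD and exclude rank 0 from it */
    MPI_Comm_group(MPI_COMM_WORLD, &world_group);
    MPI_Group_excl(world_group, 1, excl_ranks, &sub_group);

    /* collective over MPI_COMM_WORLD; processes not in sub_group get MPI_COMM_NULL */
    MPI_Comm_create(MPI_COMM_WORLD, sub_group, &sub_comm);

    if (sub_comm != MPI_COMM_NULL) {
        int sub_rank;
        MPI_Comm_rank(sub_comm, &sub_rank);
        printf("world rank %d -> sub rank %d\n", rank, sub_rank);
        MPI_Comm_free(&sub_comm);
    }

    MPI_Group_free(&sub_group);
    MPI_Group_free(&world_group);
    MPI_Finalize();
    return 0;
}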

MPI Virtual Topologies. Convenient process naming: a naming scheme chosen to fit the communication pattern. Simplifies writing of code and can allow MPI to optimize communications. Creating a topology produces a new communicator. MPI provides mapping functions to compute process ranks from the topology naming scheme, and vice versa. Graph and Cartesian topologies are available. Slide 6

MPI Topology Types. Cartesian topologies: each process is connected to its neighbors in a virtual grid; boundaries can be cyclic or not; processes are identified by Cartesian coordinates; of course, communication between any two processes is still allowed. int MPI_Cart_create(MPI_Comm comm_old, int ndims, int *dims, int *periods, int reorder, MPI_Comm *out); Example: comm_old = MPI_COMM_WORLD, ndims = 2, dims = (4, 3), periods = (1/.true., 0/.false.), reorder = see next slide. (figure: the resulting 4x3 grid of ranks 0-11, each labelled with its Cartesian coordinates (0,0) to (3,2).) Slide 7
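
A minimal sketch of creating this 4x3 Cartesian communicator in C (assumes it is run with 12 processes, e.g. mpirun -np 12; comm_cart is an illustrative name):

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, coords[2];
    int dims[2]    = { 4, 3 };   /* 4 x 3 virtual grid, as on the slide */
    int periods[2] = { 1, 0 };   /* first dimension cyclic, second not */
    MPI_Comm comm_cart;

    MPI_Init(&argc, &argv);

    /* reorder = 1 allows MPI to renumber the ranks to match the hardware */
    MPI_Cart_create(MPI_COMM_WORLD, 2, dims, periods, 1, &comm_cart);

    if (comm_cart != MPI_COMM_NULL) {   /* extra processes (if any) are not in the grid */
        MPI_Comm_rank(comm_cart, &rank);
        MPI_Cart_coords(comm_cart, rank, 2, coords);
        printf("rank %d has coordinates (%d,%d)\n", rank, coords[0], coords[1]);
        MPI_Comm_free(&comm_cart);
    }

    MPI_Finalize();
    return 0;
}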

MPI: A 2-dimensional Cylinder. (figure: ranks and Cartesian process coordinates in comm_cart; MPI_Cart_rank maps coordinates to ranks and MPI_Cart_coords maps ranks to coordinates; the ranks in comm and in comm_cart are shown side by side.) Ranks in comm and comm_cart may differ if reorder = 1 or .true. This reordering may allow MPI to optimize communications. Slide 8

MPI_Cart_shift. MPI_Cart_shift(cart, direction, displace, rank_source, rank_dest, ierror); invisible input argument: my_rank in cart. Example with displacement +1 on the 4x3 grid for the process with rank 7: direction 0 yields rank_source = 4 and rank_dest = 10; direction 1 yields rank_source = 6 and rank_dest = 8. (figure: the 4x3 grid of ranks 0-11 with their Cartesian coordinates.) Slide 9
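
A hedged sketch of how MPI_Cart_shift is typically paired with MPI_Sendrecv on such a grid (again assuming 12 processes; all names beyond the MPI calls are illustrative):

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int dims[2] = { 4, 3 }, periods[2] = { 1, 0 };
    int my_rank, source, dest, sendval, recvval = -1;
    MPI_Comm comm_cart;

    MPI_Init(&argc, &argv);
    MPI_Cart_create(MPI_COMM_WORLD, 2, dims, periods, 1, &comm_cart);

    if (comm_cart != MPI_COMM_NULL) {
        MPI_Comm_rank(comm_cart, &my_rank);
        sendval = my_rank;

        /* ranks of the neighbors one step away in direction 0 (the cyclic dimension) */
        MPI_Cart_shift(comm_cart, 0, 1, &source, &dest);

        /* send my rank to dest and receive from source in one call;
           MPI_PROC_NULL would be returned at a non-periodic boundary */
        MPI_Sendrecv(&sendval, 1, MPI_INT, dest, 0,
                     &recvval, 1, MPI_INT, source, 0,
                     comm_cart, MPI_STATUS_IGNORE);

        printf("rank %d received %d from rank %d\n", my_rank, recvval, source);
        MPI_Comm_free(&comm_cart);
    }

    MPI_Finalize();
    return 0;
}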

MPI Cartesian Partitioning. Cut a grid up into slices: a new communicator is produced for each slice, and each slice can then perform its own collective communications. int MPI_Cart_sub(MPI_Comm comm_cart, int *remain_dims, MPI_Comm *comm_slice); Splitting with the first dimension remaining: each slice is one column of the grid. (figure: the 4x3 grid split into the column slices {0,3,6,9}, {1,4,7,10}, {2,5,8,11}.) Slide 10
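
A sketch of this partitioning under the same assumptions as above (12 processes, 4x3 grid); each column slice then runs its own collective, here an MPI_Allreduce:

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int dims[2] = { 4, 3 }, periods[2] = { 1, 0 };
    int remain_dims[2] = { 1, 0 };   /* keep the first dimension: one slice per column */
    int my_rank, slice_rank, slice_sum;
    MPI_Comm comm_cart, comm_slice;

    MPI_Init(&argc, &argv);
    MPI_Cart_create(MPI_COMM_WORLD, 2, dims, periods, 1, &comm_cart);

    if (comm_cart != MPI_COMM_NULL) {
        MPI_Comm_rank(comm_cart, &my_rank);

        /* one 4-process communicator per column of the 4x3 grid */
        MPI_Cart_sub(comm_cart, remain_dims, &comm_slice);
        MPI_Comm_rank(comm_slice, &slice_rank);

        /* each slice performs its own collective operation */
        MPI_Allreduce(&my_rank, &slice_sum, 1, MPI_INT, MPI_SUM, comm_slice);
        printf("rank %d: rank %d in its slice, slice sum = %d\n",
               my_rank, slice_rank, slice_sum);

        MPI_Comm_free(&comm_slice);
        MPI_Comm_free(&comm_cart);
    }

    MPI_Finalize();
    return 0;
}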

Chap. 6 Collective Communication. 1. MPI Overview. 2. Process model and language bindings: MPI_Init(), MPI_Comm_rank(). 3. Messages and point-to-point communication. 4. Non-blocking communication. 5. Virtual topologies. 8. Collective communication. Slide 11

MPI Collective Communication. Communications involving a group of processes, called by all processes in a communicator. Examples: barrier synchronization; broadcast, scatter, gather; reduction (global sum, global maximum, etc.). Collective action over a communicator: all processes of the communicator must communicate, i.e. must call the collective routine. Synchronization may or may not occur, therefore all processes must be able to start the collective routine. All collective operations are blocking. No tags. Slide 12

MPI Barrier Synchronization. MPI_Barrier is normally never needed: all synchronization is done automatically by the data communication, since a process cannot continue before it has the data that it needs. If used for debugging: remove it in production code. If used for profiling: to separate the time measurement of the load imbalance of the computation (MPI_Wtime(); MPI_Barrier(); MPI_Wtime();) from that of the communication epochs (MPI_Wtime(); MPI_Allreduce(); ...; MPI_Wtime();). If used for synchronizing external communication (e.g. I/O): exchanging tokens may be more efficient and scalable than a barrier on MPI_COMM_WORLD. Slide 13
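
A hedged sketch of the profiling pattern above: the barrier is inserted only so that the waiting time caused by load imbalance is charged to the computation phase rather than to the following collective (the dummy work loop is purely illustrative):

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int my_rank, local, global;
    double t0, t1, t2;
    volatile double x = 0.0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
    local = my_rank;

    t0 = MPI_Wtime();
    for (long i = 0; i < 1000000L * (my_rank + 1); i++)   /* unbalanced dummy computation */
        x += 1.0;
    MPI_Barrier(MPI_COMM_WORLD);      /* profiling only: exposes the load imbalance here */
    t1 = MPI_Wtime();
    MPI_Allreduce(&local, &global, 1, MPI_INT, MPI_SUM, MPI_COMM_WORLD);
    t2 = MPI_Wtime();

    printf("rank %d: computation + imbalance %.6f s, communication %.6f s\n",
           my_rank, t1 - t0, t2 - t1);

    MPI_Finalize();
    return 0;
}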

MPI Broadcast. int MPI_Bcast(void *buf, int count, MPI_Datatype datatype, int root, MPI_Comm comm); (figure: before the bcast only the root holds the data "red"; after the bcast every process holds "red"; e.g. root=1.) root is the rank of the sending process (i.e., the root process) and must be given identically by all processes. Slide 14
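
A minimal broadcast sketch matching the figure (root = 1, payload "red"; run with at least two processes):

#include <mpi.h>
#include <stdio.h>
#include <string.h>

int main(int argc, char **argv)
{
    int my_rank;
    char buf[4] = "";

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);

    if (my_rank == 1)                 /* root = 1, as in the slide's figure */
        strcpy(buf, "red");

    /* every process calls MPI_Bcast with the same root; afterwards all hold "red" */
    MPI_Bcast(buf, 4, MPI_CHAR, 1, MPI_COMM_WORLD);
    printf("rank %d now has \"%s\"\n", my_rank, buf);

    MPI_Finalize();
    return 0;
}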

MPI Gather. (figure: e.g. root=1; before the gather, processes 0-4 hold A, B, C, D, E in their send buffers; after the gather, the root additionally holds ABCDE in its receive buffer.) int MPI_Gather(void *sbuf, int scount, MPI_Datatype stype, void *recvbuf, int recvcount, MPI_Datatype recvtype, int root, MPI_Comm comm); Slide 15

MPI Scatter. (figure: e.g. root=1; before the scatter, the root holds ABCDE in its send buffer; after the scatter, processes 0-4 hold A, B, C, D, E in their receive buffers.) int MPI_Scatter(void *sendbuf, int sendcount, MPI_Datatype stype, void *recvbuf, int recvcount, MPI_Datatype recvtype, int root, MPI_Comm comm); Slide 16
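
A combined sketch of scatter and gather corresponding to the two figures (root = 1; letter and all are illustrative names; run with at least two processes):

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int my_rank, size;
    char letter, all[64];   /* room for up to 64 processes in this sketch */

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (my_rank == 1)                       /* root = 1, as in the figures */
        for (int i = 0; i < size; i++)
            all[i] = 'A' + i;               /* "ABCDE..." in the root's send buffer */

    /* each process receives one character from the root's buffer */
    MPI_Scatter(all, 1, MPI_CHAR, &letter, 1, MPI_CHAR, 1, MPI_COMM_WORLD);
    printf("rank %d received '%c'\n", my_rank, letter);

    /* the inverse operation: collect one character from every process at the root */
    MPI_Gather(&letter, 1, MPI_CHAR, all, 1, MPI_CHAR, 1, MPI_COMM_WORLD);
    if (my_rank == 1)
        printf("root gathered \"%.*s\"\n", size, all);

    MPI_Finalize();
    return 0;
}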

MPI Global Reduction Operations. To perform a global reduce operation across all members of a group: d_0 o d_1 o d_2 o d_3 o ... o d_(s-2) o d_(s-1), where d_i is the data in the process with rank i (a single variable, or a vector) and o is an associative operation. Examples: global sum or product, global maximum or minimum, global user-defined operation. The floating-point rounding may depend on the usage of the associative law: [(d_0 o d_1) o (d_2 o d_3)] o [... o (d_(s-2) o d_(s-1))] versus ((((((d_0 o d_1) o d_2) o d_3) o ...) o d_(s-2)) o d_(s-1)). Slide 17

MPI Operators for Global Reduction. The sum of all inbuf values should be returned in resultbuf (only at the root): MPI_Reduce(&inbuf, &resultbuf, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD); Predefined operation handles: MPI_MAX (maximum), MPI_MIN (minimum), MPI_SUM (sum), MPI_PROD (product), MPI_LAND (logical AND), MPI_BAND (bitwise AND), MPI_LOR (logical OR), MPI_BOR (bitwise OR), MPI_LXOR (logical exclusive OR), MPI_BXOR (bitwise exclusive OR), MPI_MAXLOC (maximum and location of the maximum), MPI_MINLOC (minimum and location of the minimum). Slide 18
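
The call above, embedded in a complete minimal program as a sketch (each process contributes its rank, rank 0 receives the sum):

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int my_rank, inbuf, resultbuf = 0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
    inbuf = my_rank;                      /* each process contributes its own rank */

    /* global sum of inbuf over all processes; only rank 0 receives the result */
    MPI_Reduce(&inbuf, &resultbuf, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);

    if (my_rank == 0)
        printf("sum of all ranks = %d\n", resultbuf);

    MPI_Finalize();
    return 0;
}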

MPI Reduce. (figure: before MPI_REDUCE, the five processes hold the inbuf vectors (A,B,C), (D,E,F), (G,H,I), (J,K,L), (M,N,O); after the reduction with root=1, the root's result buffer holds the element-wise reduction, e.g. AoDoGoJoM for the first element; the inbuf values are unchanged.) Slide 19

MPI Variants of Reduction Operations. MPI_ALLREDUCE: no root, returns the result in all processes. MPI_REDUCE_SCATTER: the result vector of the reduction operation is scattered to the processes into their result buffers. MPI_SCAN: prefix reduction; the result at the process with rank i is the reduction of the inbuf values from rank 0 to rank i. Slide 20

MPI Allreduce. (figure: before MPI_ALLREDUCE, the five processes hold the inbuf vectors (A,B,C), (D,E,F), (G,H,I), (J,K,L), (M,N,O); after the operation, every process holds the element-wise reduction, e.g. AoDoGoJoM for the first element.) Slide 21

MPI Scan. (figure: before MPI_SCAN, the five processes hold the inbuf vectors (A,B,C), (D,E,F), (G,H,I), (J,K,L), (M,N,O); after the scan, the result buffers of ranks 0-4 hold A, AoD, AoDoG, AoDoGoJ, AoDoGoJoM for the first element; done in parallel.) Slide 22
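
A small sketch contrasting MPI_Allreduce and MPI_Scan with MPI_SUM (each process contributes rank+1; variable names are illustrative):

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int my_rank, inbuf, total, prefix;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
    inbuf = my_rank + 1;                  /* each process contributes rank+1 */

    /* MPI_Allreduce: every process receives the global sum */
    MPI_Allreduce(&inbuf, &total, 1, MPI_INT, MPI_SUM, MPI_COMM_WORLD);

    /* MPI_Scan: rank i receives the sum of the contributions of ranks 0..i */
    MPI_Scan(&inbuf, &prefix, 1, MPI_INT, MPI_SUM, MPI_COMM_WORLD);

    printf("rank %d: total = %d, prefix sum = %d\n", my_rank, total, prefix);

    MPI_Finalize();
    return 0;
}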

Exercise: Rotating information around a ring. A set of processes is arranged in a ring. Each process stores its rank in MPI_COMM_WORLD into an integer variable snd_buf. Each process passes this value on to its neighbor on the right, and each process calculates the sum of all values. Keep passing the values around the ring until each value is back where it started, i.e. each process calculates the sum of all ranks. Use the non-blocking MPI_Issend to avoid deadlocks and to verify the correctness, because a blocking synchronous send would cause a deadlock. (figure: initialization with snd_buf = my_rank and sum = 0 on every process.) Slide 23

Exercise: Rotating information around a ring. (figure: initialization and one iteration, showing my_rank, snd_buf, rcv_buf and sum on three neighboring processes.) Fortran: dest = mod(my_rank+1, size), source = mod(my_rank-1+size, size). C: dest = (my_rank+1) % size; source = (my_rank-1+size) % size; Single Program!!! Slide 24 (see also the login slides)
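
One possible sketch of the ring in C, using the Issend-Recv-Wait scheme suggested above (this is only one of several valid ways to solve the exercise):

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int my_rank, size, dest, source;
    int snd_buf, rcv_buf, sum = 0;
    MPI_Request request;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    dest    = (my_rank + 1) % size;            /* right neighbor */
    source  = (my_rank - 1 + size) % size;     /* left neighbor  */
    snd_buf = my_rank;

    for (int i = 0; i < size; i++) {
        /* non-blocking synchronous send avoids the deadlock of a blocking Ssend */
        MPI_Issend(&snd_buf, 1, MPI_INT, dest, 0, MPI_COMM_WORLD, &request);
        MPI_Recv(&rcv_buf, 1, MPI_INT, source, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        MPI_Wait(&request, MPI_STATUS_IGNORE);

        snd_buf = rcv_buf;                     /* pass the received value on next time */
        sum += rcv_buf;
    }

    /* every process should now hold 0 + 1 + ... + (size-1) */
    printf("rank %d: sum = %d\n", my_rank, sum);

    MPI_Finalize();
    return 0;
}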

Advanced Exercises: Irecv instead of Issend. Substitute the Issend-Recv-Wait method by the Irecv-Ssend-Wait method in your ring program. Or substitute the Issend-Recv-Wait method by the Irecv-Issend-Waitall method in your ring program. Or use the collective call MPI_Reduce to compute the result instead. Slide 25