Intermediate MPI features


- Advanced message passing
- Collective communication
- Topologies
- Group communication

Forms of message passing (1)

Communication modes:
- Standard: the system decides whether the message is buffered
- Buffered: the user explicitly controls buffering
- Synchronous: the send waits for a matching receive
- Ready: the send may be started only if a matching receive has already been posted
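These modes correspond to MPI_Send, MPI_Bsend, MPI_Ssend, and MPI_Rsend in the C bindings. Below is a minimal sketch (not from the slides) exercising all four; the destination and tags are illustrative, and the ready-mode send assumes the receiver has already posted a matching receive:

    /* Sketch: the four send modes applied to the same message. */
    #include <mpi.h>
    #include <stdlib.h>

    void send_modes(int *data, int count, int dest, MPI_Comm comm)
    {
        /* Buffered mode needs a user buffer sized with MPI_BSEND_OVERHEAD. */
        int bufsize = count * sizeof(int) + MPI_BSEND_OVERHEAD;
        void *buf = malloc(bufsize);

        MPI_Send(data, count, MPI_INT, dest, 0, comm);   /* standard */

        MPI_Buffer_attach(buf, bufsize);                 /* buffered */
        MPI_Bsend(data, count, MPI_INT, dest, 1, comm);
        MPI_Buffer_detach(&buf, &bufsize);               /* waits for buffered sends */

        MPI_Ssend(data, count, MPI_INT, dest, 2, comm);  /* synchronous */

        /* Ready mode: erroneous unless the matching receive is already posted. */
        MPI_Rsend(data, count, MPI_INT, dest, 3, comm);

        free(buf);
    }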

Forms of message passing (2)

Non-blocking communication:
- When a blocking send returns, the memory buffer can be reused
- A blocking receive waits for a message
- A non-blocking send returns immediately (dangerous)
- A non-blocking receive can be built with a probe

Non-blocking receive:
- MPI_IPROBE: check for a pending message
- MPI_PROBE: wait for a pending message
- MPI_GET_COUNT: number of data elements in the message

MPI_PROBE(source, tag, comm, &status) => status
MPI_GET_COUNT(status, datatype, &count) => message size
status.MPI_SOURCE => identity of sender
status.MPI_TAG => tag of message
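As a complement to the probe-based approach, here is a minimal sketch of non-blocking send and receive using MPI_Isend/MPI_Irecv/MPI_Waitall; the pairing of ranks is an assumption for illustration:

    /* Sketch: non-blocking exchange between paired ranks (0-1, 2-3, ...).
       Assumes an even number of processes so every rank has a peer. */
    #include <mpi.h>

    void exchange(int rank, MPI_Comm comm)
    {
        int sendbuf = rank, recvbuf;
        MPI_Request reqs[2];
        MPI_Status stats[2];
        int peer = rank ^ 1;

        /* Both calls return immediately; the buffers must not be touched
           until MPI_Waitall completes the requests. */
        MPI_Irecv(&recvbuf, 1, MPI_INT, peer, 0, comm, &reqs[0]);
        MPI_Isend(&sendbuf, 1, MPI_INT, peer, 0, comm, &reqs[1]);

        /* ... overlap computation here ... */

        MPI_Waitall(2, reqs, stats);
    }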

Example: Check for Pending Message

    int buf[1], flag, source, minimum;
    MPI_Status status;

    while (...) {
        MPI_Iprobe(MPI_ANY_SOURCE, NEW_MINIMUM, comm, &flag, &status);
        if (flag) {
            /* handle new minimum */
            source = status.MPI_SOURCE;
            MPI_Recv(buf, 1, MPI_INT, source, NEW_MINIMUM, comm, &status);
            minimum = buf[0];
        }
        ... /* compute */
    }

Example: Receiving a Message of Unknown Size

    int count, *buf, source;
    MPI_Status status;

    MPI_Probe(MPI_ANY_SOURCE, 0, comm, &status);
    source = status.MPI_SOURCE;
    MPI_Get_count(&status, MPI_INT, &count);
    buf = malloc(count * sizeof(int));
    MPI_Recv(buf, count, MPI_INT, source, 0, comm, &status);

Global Operations - Collective Communication

Coordinated communication involving all processes. Functions:

- MPI_BARRIER: synchronize all processes
- MPI_BCAST: send data to all processes
- MPI_GATHER: gather data from all processes
- MPI_SCATTER: scatter data to all processes
- MPI_REDUCE: reduction operation
- MPI_ALLREDUCE: reduction, all processes get the result

Barrier

MPI_Barrier(comm)

- Synchronizes a group of processes
- All processes block until all have reached the barrier
- Often invoked at the end of a loop in iterative algorithms
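A minimal sketch (not from the slides) combining MPI_Scatter, MPI_Gather, and MPI_Barrier; the doubling of each element is just illustrative work:

    /* Sketch: scatter one integer per rank, do local work, gather back. */
    #include <mpi.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main(int argc, char **argv)
    {
        int rank, size, x;
        int *data = NULL;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        if (rank == 0) {                     /* root prepares one element per rank */
            data = malloc(size * sizeof(int));
            for (int i = 0; i < size; i++) data[i] = i;
        }
        MPI_Scatter(data, 1, MPI_INT, &x, 1, MPI_INT, 0, MPI_COMM_WORLD);
        x *= 2;                              /* local work */
        MPI_Gather(&x, 1, MPI_INT, data, 1, MPI_INT, 0, MPI_COMM_WORLD);

        MPI_Barrier(MPI_COMM_WORLD);         /* everyone waits here */
        if (rank == 0) {
            for (int i = 0; i < size; i++) printf("%d ", data[i]);
            printf("\n");
        }

        free(data);
        MPI_Finalize();
        return 0;
    }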

Broadcast operation

Sends the same data to all processes in the communicator.

MPI_Bcast(void *buffer, int count, MPI_Datatype datatype, int root, MPI_Comm comm)
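A short usage sketch, assuming a hypothetical parameter "iterations" that only the root knows initially:

    /* Sketch: root determines a value and broadcasts it to all ranks. */
    #include <mpi.h>

    int get_iterations(int rank, MPI_Comm comm)
    {
        int iterations = 0;
        if (rank == 0)
            iterations = 1000;   /* e.g. read from a config file */
        /* After the call, every rank holds the root's value. */
        MPI_Bcast(&iterations, 1, MPI_INT, 0, comm);
        return iterations;
    }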

Reduction

- Combines values provided by different processes
- Result is sent to one process (MPI_Reduce) or to all processes (MPI_Allreduce)
- Used with commutative and associative operators: MAX, MIN, +, x, AND, OR

MPI_Reduce(void *sendbuf, void *recvbuf, int count, MPI_Datatype datatype, MPI_Op op, int root, MPI_Comm comm)

Example 1: global minimum operation

    MPI_Reduce(inbuf, outbuf, 2, MPI_INT, MPI_MIN, 0, MPI_COMM_WORLD);

- outbuf[0] = minimum over all inbuf[0]'s
- outbuf[1] = minimum over all inbuf[1]'s
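For contrast, a sketch of MPI_Allreduce, where every rank receives the reduced value:

    /* Sketch: global sum; unlike MPI_Reduce, there is no root rank. */
    #include <mpi.h>

    double global_sum(double local, MPI_Comm comm)
    {
        double total;
        /* All ranks get 'total' after the call. */
        MPI_Allreduce(&local, &total, 1, MPI_DOUBLE, MPI_SUM, comm);
        return total;
    }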

Reduce operation

(Figure: illustration of the reduce operation)

SOR communication scheme

- Each CPU communicates with its left and right neighbor (if they exist)
- Also need to determine the convergence criterion

Expressing SOR in MPI

- Use a ring topology
- Each processor exchanges rows with its left and right neighbors
- Use MPI_Allreduce to determine whether the grid has changed less than epsilon during the last iteration
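A sketch of the per-iteration communication under these choices; it assumes each rank stores its block of rows with ghost rows at indices 0 and nrows-1, and all names (grid, ncols, epsilon) are illustrative:

    /* Sketch: exchange boundary rows with ring neighbors, then check
       convergence with MPI_Allreduce. Returns 1 once converged. */
    #include <mpi.h>

    int sor_step(double **grid, int nrows, int ncols, double epsilon,
                 int rank, int size, MPI_Comm comm)
    {
        int left  = (rank == 0)        ? MPI_PROC_NULL : rank - 1;
        int right = (rank == size - 1) ? MPI_PROC_NULL : rank + 1;
        MPI_Status status;
        double local_change = 0.0, global_change;

        /* Exchange border rows; MPI_PROC_NULL turns edge sends into no-ops. */
        MPI_Sendrecv(grid[1],         ncols, MPI_DOUBLE, left,  0,
                     grid[nrows - 1], ncols, MPI_DOUBLE, right, 0,
                     comm, &status);
        MPI_Sendrecv(grid[nrows - 2], ncols, MPI_DOUBLE, right, 1,
                     grid[0],         ncols, MPI_DOUBLE, left,  1,
                     comm, &status);

        /* ... update interior points, accumulating local_change ... */

        /* All ranks agree on convergence. */
        MPI_Allreduce(&local_change, &global_change, 1, MPI_DOUBLE,
                      MPI_MAX, comm);
        return global_change < epsilon;
    }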

(Figure, one SOR iteration: send iteration number, send data, exchange border data, check convergence error, collect result)

Topologies

E.g. a Cartesian coordinate system:

int MPI_Cart_create(MPI_Comm comm_old, int ndims, int *dims, int *periods, int reorder, MPI_Comm *comm_cart)

Makes a new communicator to which topology information has been attached.

Input parameters:
- comm_old: input communicator (handle)
- ndims: number of dimensions of the Cartesian grid (integer)
- dims: integer array of size ndims specifying the number of processes in each dimension
- periods: logical array of size ndims specifying whether the grid is periodic (true) or not (false) in each dimension
- reorder: ranking may be reordered (true) or not (false) (logical)
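A usage sketch, assuming a 2D periodic grid; MPI_Dims_create picks the factorization and MPI_Cart_shift finds each rank's neighbors:

    /* Sketch: create a 2D periodic Cartesian grid and locate neighbors. */
    #include <mpi.h>

    void make_grid(MPI_Comm comm)
    {
        int dims[2] = {0, 0};      /* let MPI choose the factorization */
        int periods[2] = {1, 1};   /* periodic in both dimensions */
        int size, left, right, up, down;
        MPI_Comm grid_comm;

        MPI_Comm_size(comm, &size);
        MPI_Dims_create(size, 2, dims);
        MPI_Cart_create(comm, 2, dims, periods, 1, &grid_comm);

        /* Neighbors along each dimension (displacement 1). */
        MPI_Cart_shift(grid_comm, 0, 1, &up, &down);
        MPI_Cart_shift(grid_comm, 1, 1, &left, &right);

        /* ... communicate with neighbors on grid_comm ... */
        MPI_Comm_free(&grid_comm);
    }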

Group communication

Several groups can be created in an MPI program (see the sketch after this section).

int MPI_Comm_create(MPI_Comm comm, MPI_Group group, MPI_Comm *comm_out)

Creates a new communicator.

Input parameters:
- comm: communicator (handle)
- group: group, which is a subset of the group of comm (handle)

Other features

- Parallel I/O
- Threads
- Complex data types
- Shared memory operations
- Dynamic process management
- Error handling
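The sketch referenced above builds a communicator over the even-numbered ranks via MPI_Comm_group, MPI_Group_incl, and MPI_Comm_create; the even/odd split is purely illustrative:

    /* Sketch: communicator containing only the even-numbered ranks. */
    #include <mpi.h>
    #include <stdlib.h>

    MPI_Comm even_comm(MPI_Comm comm)
    {
        MPI_Group world_group, even_group;
        MPI_Comm newcomm;
        int size, i, n = 0;
        int *ranks;

        MPI_Comm_size(comm, &size);
        ranks = malloc(((size + 1) / 2) * sizeof(int));
        for (i = 0; i < size; i += 2)
            ranks[n++] = i;

        MPI_Comm_group(comm, &world_group);   /* group of all ranks in comm */
        MPI_Group_incl(world_group, n, ranks, &even_group);
        /* Collective over comm; ranks outside the group get MPI_COMM_NULL. */
        MPI_Comm_create(comm, even_group, &newcomm);

        MPI_Group_free(&world_group);
        MPI_Group_free(&even_group);
        free(ranks);
        return newcomm;
    }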