Introduction to Parallel Programming with MPI


PICASso Tutorial, October 25-26, 2006
Stéphane Ethier (ethier@pppl.gov)
Computational Plasma Physics Group, Princeton Plasma Physics Lab

Why Parallel Computing?
- We want to speed up a calculation.
- Solution: split the work between several processors.
- How? It depends on the type of parallel computer:
  - Shared memory (usually thread-based)
  - Distributed memory (process-based)
- MPI works on all of them!

Shared memory parallelism
- The program runs inside a single process.
- Several execution threads are created within that process and the work is split between them.
- The threads run on different processors.
- All threads have access to the shared data through shared memory.
- Care must be taken that threads do not overwrite each other's data.

Shared memory programming
- Easy to do loop-level parallelism.
- Compiler-based automatic parallelization: easy, but not always efficient. Better to do it yourself with OpenMP.
- Coarse-grain parallelism can be difficult.
[Diagram: a main process does the serial work, spawns threads at "start parallel loop", and joins them again at "stop parallel loop".]
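For comparison, a minimal C sketch of loop-level parallelism with OpenMP (the array size and loop body are purely illustrative):

#include <stdio.h>

#define N 1000000

int main(void)
{
    static double a[N], b[N];   /* static arrays are zero-initialized */
    int i;

    /* The iterations of this loop are split among the threads;
       each thread writes a disjoint set of elements of b. */
    #pragma omp parallel for
    for (i = 0; i < N; i++)
        b[i] = 2.0 * a[i] + 1.0;

    printf("b[N-1] = %f\n", b[N - 1]);
    return 0;
}

Compiled with the compiler's OpenMP flag (e.g. -fopenmp for GCC), the loop runs on as many threads as the runtime allows; without the flag, the pragma is simply ignored and the code runs serially.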

Distributed memory parallelism
- Process-based programming.
- Each process has its own memory space that cannot be accessed by the other processes.
- The work is split between several processes.
- For efficiency, each processor runs a single process.
- Communication between the processes must be explicit, e.g. message passing.

How to split the work between processors?
- Most widely used method for grid-based calculations: DOMAIN DECOMPOSITION.
- Split the particles in particle-in-cell (PIC) or molecular dynamics codes.
- Split the arrays in PDE solvers, etc.
- Keep it LOCAL.
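As an illustration, a minimal C sketch of a 1-D block decomposition that assigns each process a contiguous range of a global index space (the routine name decomp1d and its argument names are hypothetical):

/* 1-D block decomposition: process "myrank" out of "nproc" owns global
   indices [*istart, *iend) of an array of length n.  The first
   n % nproc processes each get one extra element. */
void decomp1d(int n, int nproc, int myrank, int *istart, int *iend)
{
    int chunk = n / nproc;
    int rest  = n % nproc;

    *istart = myrank * chunk + (myrank < rest ? myrank : rest);
    *iend   = *istart + chunk + (myrank < rest ? 1 : 0);
}

For example, with n = 10 and nproc = 3 the processes own index ranges [0,4), [4,7), and [7,10), so each process only allocates and works on its local chunk.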

What is MPI?
- MPI stands for Message Passing Interface.
- It is a message-passing specification, a standard, for vendors to implement.
- In practice, MPI is a set of functions (C) and subroutines (Fortran) used for exchanging data between processes.
- An MPI library exists on most, if not all, parallel computing platforms, so it is highly portable.

How much do I need to know?
- MPI is small (6 functions): many parallel programs can be written with just 6 basic functions (typically MPI_Init, MPI_Finalize, MPI_Comm_size, MPI_Comm_rank, MPI_Send, and MPI_Recv).
- MPI is large (125 functions): MPI's extensive functionality requires many functions, but the number of functions is not necessarily a measure of complexity.
- MPI is just right: one can access the flexibility when it is required, and one need not master all parts of MPI to use it.

How MPI works
- Launch the parallel calculation with:
  mpirun -np #proc a.out
  mpiexec -n #proc a.out
- Copies of the same program run on each processor within its own process (private address space).
- Each processor works on a subset of the problem.
- The processes exchange data when needed, either through the network interconnect or through the shared memory on SMP machines (bus).
- Coarse-grain parallelism is easy to express this way, which makes it scalable.

Good MPI web sites
- http://www.llnl.gov/computing/tutorials/mpi/
- http://www.nersc.gov/nusers/help/tutorials/mpi/intro/
- http://www-unix.mcs.anl.gov/mpi/tutorial/gropp/talk.html
- http://www-unix.mcs.anl.gov/mpi/tutorial/
MPI on Linux clusters:
- MPICH (http://www-unix.mcs.anl.gov/mpi/mpich/)
- Open MPI (http://www.open-mpi.org/)

Structure of an MPI program (Fortran 90 style)

program mpi_code
  ! Load MPI definitions
  use mpi        ! (or include 'mpif.h')
  ! Initialize MPI
  call MPI_Init(ierr)
  ! Get the number of processes
  call MPI_Comm_size(MPI_COMM_WORLD, nproc, ierr)
  ! Get my process number (rank)
  call MPI_Comm_rank(MPI_COMM_WORLD, myrank, ierr)

  ! ... do work and make message passing calls ...

  ! Finalize
  call MPI_Finalize(ierr)
end program mpi_code

Structure of an MPI program (C style)

#include "mpi.h"

int main(int argc, char *argv[])
{
    int nproc, myrank;

    /* Initialize MPI */
    MPI_Init(&argc, &argv);
    /* Get the number of processes */
    MPI_Comm_size(MPI_COMM_WORLD, &nproc);
    /* Get my process number (rank) */
    MPI_Comm_rank(MPI_COMM_WORLD, &myrank);

    /* ... do work and make message passing calls ... */

    /* Finalize */
    MPI_Finalize();
    return 0;
}

Compilation
- MPICH provides wrapper scripts that take care of the include directories and of linking the right libraries: mpicc (C), mpif77 / mpif90 (Fortran).
- Otherwise, you must compile and link against the right MPI library yourself.

Makefile
It is always a good idea to have a Makefile:

% cat Makefile
CC=mpicc
CFLAGS=-O

% : %.c
	$(CC) $(CFLAGS) $< -o $@

mpirun and mpiexec
- Both are used for starting an MPI job.
- If you don't have a batch system, use mpirun:
  mpirun -np #proc -machinefile mfile a.out >& out < in &
  % cat mfile
  machine1.princeton.edu
  machine2.princeton.edu
  machine3.princeton.edu
  machine4.princeton.edu
- PBS usually takes care of the arguments to mpiexec.

Batch System: PBS primer
- Submit a job script:       qsub script
- Check the status of jobs:  qstat -a   (for all jobs)
- Stop a job:                qdel job_id

### --- PBS SCRIPT ---
#PBS -l nodes=4:ppn=2,walltime=02:00:00
#PBS -q dque
#PBS -V
#PBS -N job_name
#PBS -m abe
cd $PBS_O_WORKDIR
mpiexec -np 8 a.out

Basic MPI calls to exchange data
Point to point: 2 processes at a time (Fortran form shown, with the trailing ierr error argument; the C versions return the error code instead).

MPI_Send(buf, count, datatype, dest, tag, comm, ierr)
MPI_Recv(buf, count, datatype, source, tag, comm, status, ierr)
MPI_Sendrecv(sendbuf, sendcount, sendtype, dest, sendtag,
             recvbuf, recvcount, recvtype, source, recvtag, comm, status, ierr)

where the datatypes are: MPI_INTEGER, MPI_REAL, MPI_DOUBLE_PRECISION, MPI_COMPLEX, MPI_CHARACTER, MPI_LOGICAL, etc.
Predefined communicator: MPI_COMM_WORLD
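As an illustration, a minimal C sketch in which rank 0 sends an array of doubles to rank 1 (the message length of 10 and the tag value 99 are arbitrary choices; C code uses the C datatypes such as MPI_DOUBLE):

#include <stdio.h>
#include "mpi.h"

int main(int argc, char *argv[])
{
    int myrank, i;
    double buf[10];
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &myrank);

    if (myrank == 0) {
        for (i = 0; i < 10; i++) buf[i] = (double)i;
        /* Send 10 doubles to rank 1, message tag 99 */
        MPI_Send(buf, 10, MPI_DOUBLE, 1, 99, MPI_COMM_WORLD);
    } else if (myrank == 1) {
        /* Receive 10 doubles from rank 0, matching tag 99 */
        MPI_Recv(buf, 10, MPI_DOUBLE, 0, 99, MPI_COMM_WORLD, &status);
        printf("Rank 1 received buf[9] = %g\n", buf[9]);
    }

    MPI_Finalize();
    return 0;
}

Run it with at least two processes, e.g. mpirun -np 2 a.out.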

Collective MPI calls
Collective calls: all processes in the communicator participate.

One process sends to everybody:
  MPI_Bcast(buffer, count, datatype, root, comm, ierr)

All processes send to the root process and the operation op is applied:
  MPI_Reduce(sendbuf, recvbuf, count, datatype, op, root, comm, ierr)
  where op = MPI_SUM, MPI_MAX, MPI_MIN, MPI_PROD, etc.
  You can create your own reduction operation with MPI_Op_create().

All processes send to everybody and the operation op is applied (equivalent to an MPI_Reduce followed by an MPI_Bcast):
  MPI_Allreduce(sendbuf, recvbuf, count, datatype, op, comm, ierr)

Synchronize all processes:
  MPI_Barrier(comm, ierr)
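For example, a minimal C sketch in which every process contributes one partial value and MPI_Reduce sums them onto rank 0 (the partial values here are made up for illustration):

#include <stdio.h>
#include "mpi.h"

int main(int argc, char *argv[])
{
    int nproc, myrank;
    double partial, total;

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &nproc);
    MPI_Comm_rank(MPI_COMM_WORLD, &myrank);

    /* Each process computes its own partial result */
    partial = (double)(myrank + 1);

    /* Sum all partial results onto root rank 0 */
    MPI_Reduce(&partial, &total, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

    if (myrank == 0)
        printf("Sum over %d processes = %g\n", nproc, total);

    /* MPI_Allreduce(&partial, &total, 1, MPI_DOUBLE, MPI_SUM,
       MPI_COMM_WORLD) would leave the result on every rank instead. */

    MPI_Finalize();
    return 0;
}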

More MPI collective calls
All processes send a different piece of data to one single root process, which gathers everything (messages ordered by rank):
  MPI_Gather(sendbuf, sendcnt, sendtype, recvbuf, recvcount, recvtype, root, comm, ierr)

All processes gather everybody else's pieces of data:
  MPI_Allgather(sendbuf, sendcount, sendtype, recvbuf, recvcount, recvtype, comm, ierr)

One root process sends a different piece of the data to each one of the other processes:
  MPI_Scatter(sendbuf, sendcnt, sendtype, recvbuf, recvcnt, recvtype, root, comm, ierr)

Each process performs a scatter operation, sending a distinct message to every process in the group, in rank order:
  MPI_Alltoall(sendbuf, sendcount, sendtype, recvbuf, recvcnt, recvtype, comm, ierr)
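To tie scatter and gather together, a minimal C sketch in which rank 0 hands each process one chunk of an array, every process works on its chunk, and rank 0 collects the results back in rank order (the chunk size CHUNK is an arbitrary choice):

#include <stdio.h>
#include <stdlib.h>
#include "mpi.h"

#define CHUNK 4   /* elements handled by each process (illustrative) */

int main(int argc, char *argv[])
{
    int nproc, myrank, i;
    double local[CHUNK];
    double *global = NULL;

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &nproc);
    MPI_Comm_rank(MPI_COMM_WORLD, &myrank);

    if (myrank == 0) {
        /* Root builds the full array: nproc chunks of CHUNK elements */
        global = malloc(nproc * CHUNK * sizeof(double));
        for (i = 0; i < nproc * CHUNK; i++) global[i] = (double)i;
    }

    /* Distribute one chunk to each process (root keeps the first chunk) */
    MPI_Scatter(global, CHUNK, MPI_DOUBLE, local, CHUNK, MPI_DOUBLE, 0, MPI_COMM_WORLD);

    /* Each process works only on its own chunk */
    for (i = 0; i < CHUNK; i++) local[i] *= 2.0;

    /* Collect the chunks back onto root, ordered by rank */
    MPI_Gather(local, CHUNK, MPI_DOUBLE, global, CHUNK, MPI_DOUBLE, 0, MPI_COMM_WORLD);

    if (myrank == 0) {
        printf("global[%d] = %g\n", nproc * CHUNK - 1, global[nproc * CHUNK - 1]);
        free(global);
    }

    MPI_Finalize();
    return 0;
}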