Parallel Programming with MPI. Jianfeng Yang, Internet and Information Technology Lab, Wuhan University
1 Parallel programming with MPI Jianfeng Yang Internet and Information Technology Lab Wuhan university
2 Agenda Part Ⅰ: Seeking Parallelism/Concurrency Part Ⅱ: Parallel Algorithm Design Part Ⅲ: Message-Passing Programming 2
3 Part Ⅰ Seeking Parallel/Concurrency
4 Outline 1 Introduction 2 Seeking Parallelism 4
5 1 Introduction(1/6) Well done is quickly done. Caesar Augustus. Fast, fast, fast is not fast enough. How to get higher performance? Parallel computing. 5
6 1 Introduction(2/6) What is parallel computing? It is the use of a parallel computer to reduce the time needed to solve a single computational problem. It is now considered a standard way for computational scientists and engineers to solve problems in areas as diverse as galactic evolution, climate modeling, aircraft design, molecular dynamics and economic analysis. 6
7 Parallel Computing A problem is broken down into tasks, performed by separate workers or processes. Processes interact by exchanging information. What do we basically need? The ability to start the tasks; a way for them to communicate. 7
8 1 Introduction(3/6) What's a parallel computer? A multi-processor computer system supporting parallel programming. Multi-computer: a parallel computer constructed out of multiple computers and an interconnection network; the processors on different computers interact by passing messages to each other. Centralized multiprocessor (SMP: symmetrical multiprocessor): a more highly integrated system in which all CPUs share access to a single global memory; the shared memory supports communication and synchronization among processors. 8
9 1 Introduction(4/6) Multi-core platform Integrates two, four, or more cores in one processor. Each core has its own registers and Level 1 cache; all cores share the Level 2 cache, which supports communication and synchronization among cores. All cores share access to a global memory. 9
10 1 Introduction(5/6) What's parallel programming? Programming in a language that allows you to explicitly indicate how different portions of the computation may be executed in parallel/concurrently by different processors/cores. Do I really need parallel programming? YES, because: although a lot of research has been invested and many experimental parallelizing compilers have been developed, there is still no commercial system thus far; the alternative is for you to write your own parallel programs. 10
11 1 Introduction(6/6) Why should I program using MPI and OpenMP? MPI (Message Passing Interface) is a standard specification for message-passing libraries. It is available on virtually every parallel computer system, and free. If you develop programs using MPI, you will be able to reuse them when you get access to a newer, faster parallel computer. On a multi-core platform or SMP, the cores/CPUs have a shared memory space; while MPI is a perfectly satisfactory way for cores/processors to communicate with each other, OpenMP is a better way for cores/processors within a single processor/SMP to interact. A hybrid MPI/OpenMP program can get even higher performance. 11
12 2 Seeking Parallel(1/7) In order to take advantage of multi-core/multiple processors, programmers must be able to identify operations that may be performed in parallel. Several ways: Data Dependence Graphs; Data Parallelism; Functional Parallelism; Pipelining. 12
13 2 Seeking Parallel(2/7) Data Dependence Graphs A directed graph. Each vertex represents a task to be completed. An edge from vertex u to vertex v means task u must be completed before task v begins: task v is dependent on task u. If there is no path from u to v, then the tasks are independent and may be performed in parallel. 13
14 2 Seeking Parallel(3/7) Data Dependence Graphs 14
15 2 Seeking Parallel(4/7) Data Parallelism Independent tasks applying the same operation to different elements of a data set. e.g. 15
16 2 Seeking Parallel(5/7) Functional Parallelism Independent tasks applying different operations to different data elements of a data set. 16
17 2 Seeking Parallel(6/7) Pipelining A data dependence graph forming a simple path/chain admits no parallelism if only a single problem instance must be processed. If multiple problem instances are to be processed, and the computation can be divided into several stages with the same time consumption, then it can support parallelism. E.g., an assembly line. 17
18 2 Seeking Parallel(7/7) Pipelining p[0] = a[0]; p[1] = p[0]+a[1]; p[2] = p[1]+a[2]; p[3] = p[2]+a[3]; 18
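The partial-sums computation above can be written as a plain C loop; this is just the arithmetic the pipeline stages perform, with a function name of our choosing:

```c
#include <assert.h>

/* p[i] = a[0] + a[1] + ... + a[i]; one addition per pipeline stage.
 * With n problem instances streaming through an m-stage pipeline,
 * the stages overlap across instances. */
void prefix_sums(const int *a, int *p, int n) {
    if (n <= 0) return;
    p[0] = a[0];
    for (int i = 1; i < n; i++)
        p[i] = p[i - 1] + a[i];
}
```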
19 For example: landscape maintenance; preparing dinner; data clustering. 19
20 Homework Given a task that can be divided into m subtasks, each requiring one unit of time, how much time is needed for an m-stage pipeline to process n tasks? Consider the data dependence graph in the figure below: identify all sources of data parallelism; identify all sources of functional parallelism. (Figure: a dependence graph with input I, three tasks A, then tasks B, C, D, then three more tasks A, and output O.) 20
21 Part Ⅱ Parallel Algorithm Design
22 Outline 1.Introduction 2.The Task/Channel Model 3.Foster's Design Methodology 22
23 1.Introduction Foster, Ian. Designing and Building Parallel Programs: Concepts and Tools for Parallel Software Engineering. Reading, MA: Addison-Wesley, 1995. Describes the task/channel model; a few simple problems. 23
24 2.The Task/Channel Model The model represents a parallel computation as a set of tasks that may interact with each other by sending messages through channels. Task: a program, its local memory, and a collection of I/O ports. Local memory: instructions; private data. 24
25 2.The Task/Channel Model Channel: A task can send local data to other tasks via output ports; a task can receive data values from other tasks via input ports. A channel is a message queue that connects one task's output port with another task's input port. Data values appear at the input port in the same order in which they were placed in the output port at the other end of the channel. Receiving data can block: synchronous. Sending data never blocks: asynchronous. Access to local memory is faster than non-local data access. 25
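The queue semantics of a channel can be sketched in plain C, independent of MPI; the type and function names below are illustrative only. Values are received in exactly the order they were sent, receiving blocks when the queue is empty, and sending is modeled as non-blocking (the fixed capacity is a limit of this sketch, not of the model):

```c
#include <assert.h>

/* A minimal FIFO "channel": values appear at the input port in the
 * same order they were placed in the output port. */
#define CAP 16
typedef struct { int buf[CAP]; int head, tail, count; } channel_t;

void ch_send(channel_t *c, int v) {       /* asynchronous: never blocks */
    assert(c->count < CAP);               /* sketch-only capacity limit */
    c->buf[c->tail] = v;
    c->tail = (c->tail + 1) % CAP;
    c->count++;
}

int ch_recv_ready(const channel_t *c) {   /* a real receive would block */
    return c->count > 0;                  /* while this returns 0       */
}

int ch_recv(channel_t *c) {
    assert(ch_recv_ready(c));             /* would block otherwise */
    int v = c->buf[c->head];
    c->head = (c->head + 1) % CAP;
    c->count--;
    return v;
}
```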
26 3.Foster's Design Methodology Four-step process: partitioning, communication, agglomeration, mapping. (Figure: problem, then partitioning, communication, agglomeration, mapping.) 26
27 3.Foster's Design Methodology Partitioning Is the process of dividing the computation and the data into pieces. More, smaller pieces are better. How? A data-centric approach (domain decomposition) or a function-centric approach (functional decomposition). Domain Decomposition: first, divide data into pieces; then, determine how to associate computations with the data. Focus on: the largest and/or most frequently accessed data structure in the program. 27
28 3.Foster's Design Methodology Domain Decomposition (Figure: 1-D, 2-D, and 3-D decompositions of a data grid into primitive tasks; the 3-D decomposition yields the most primitive tasks and is the better choice.) 28
29 3.Foster's Design Methodology Functional Decomposition Yields collections of tasks that achieve parallelism through pipelining. E.g., a system supporting interactive image-guided surgery. 29
30 3.Foster's Design Methodology The quality of the partition (evaluation): At least an order of magnitude more primitive tasks than processors in the target parallel computer; otherwise, later design options may be too constrained. Redundant computations and redundant data structure storage are minimized; otherwise, the design may not work well when the size of the problem increases. Primitive tasks are roughly the same size; otherwise, it may be hard to balance work among the processors/cores. The number of tasks is an increasing function of the problem size; otherwise, it may be impossible to use more processors/cores to solve a larger problem. 30
31 3.Foster's Design Methodology Communication After identifying the primitive tasks, the communication among those primitive tasks should be determined. Two kinds of communication: local and global. 31
32 3.Foster's Design Methodology Communication Local: a task needs values from a small number of other tasks in order to perform a computation; a channel is created from the tasks supplying the data to the task consuming the data. Global: a significant number of the primitive tasks must contribute data in order to perform a computation. E.g., computing the sum of the values held by the primitive processes. 32
33 3.Foster's Design Methodology Communication Evaluate the communication structure of the designed parallel algorithm: the communication operations are balanced among the tasks; each task communicates with only a small number of neighbors; tasks can perform their communications in parallel/concurrently; tasks can perform their computations in parallel/concurrently. 33
34 3.Foster's Design Methodology Agglomeration Why do we need agglomeration? If the number of tasks exceeds the number of processors/cores by several orders of magnitude, simply creating these tasks would be a source of significant overhead. So, we combine primitive tasks into larger tasks and map them onto physical processors/cores to reduce the amount of parallel overhead. What's agglomeration? The process of grouping tasks into larger tasks in order to improve performance or simplify programming. When developing MPI programs, ONE task per core/processor is better. 34
35 3.Foster's Design Methodology Agglomeration Goal 1: lower communication overhead. Eliminate communication among tasks; increase the locality of parallelism; combine groups of sending and receiving tasks. 35
36 3.Foster's Design Methodology Agglomeration Goal 2: maintain the scalability of the parallel design. Ensure that we have not combined so many tasks that we will not be able to port our program, at some point in the future, to a computer with more processors/cores. E.g., a 3-D matrix operation of size 8*128*258. 36
37 3.Foster's Design Methodology Agglomeration Goal 3: reduce software engineering costs. Make greater use of the existing sequential code, reducing time and expense. 37
38 3.Foster's Design Methodology Agglomeration evaluation: Has it increased the locality of the parallel algorithm? Replicated computations take less time than the communications they replace. The amount of replicated data is small enough to allow the algorithm to scale. Agglomerated tasks have similar computational and communication costs. The number of tasks is an increasing function of the problem size. The number of tasks is as small as possible, yet at least as great as the number of cores/processors in the target computer. The trade-off between the chosen agglomeration and the cost of modifications to existing sequential code is reasonable. 38
39 3.Foster's Design Methodology Mapping Goals: increasing processor utilization; minimizing inter-processor communication. 39
40 Part Ⅲ Message-Passing Programming
41 Preface 41
43 (Figure: processes 0, 1, and 2 each load and process their share of the data; the results are then gathered and stored.) 43
44 Hello World!
#include <stdio.h>
#include "mpi.h"
int main(int argc, char *argv[]) {
    int size, rank;
    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    printf("Hello world from process %d of %d\n", rank, size);
    MPI_Finalize();
    return 0;
}
Output with 4 processes:
Hello world from process 0 of 4
Hello world from process 1 of 4
Hello world from process 2 of 4
Hello world from process 3 of 4
44
45 Outline Introduction The Message-Passing Model The Message-Passing Interface (MPI) Communication Mode Circuit satisfiability Point-to-point Communication Collective Communication Benchmarking parallel performance 45
46 Introduction MPI: Message Passing Interface Is a library, not a parallel language: C&MPI, Fortran&MPI. Is a standard, not an implementation; implementations include MPICH, Intel MPI, MS MPI, LAM MPI. Is a message-passing model. 46
47 Introduction The history of MPI: Draft: 1992. MPI-1: 1994. MPI-2: 1997. 47
48 Introduction MPICH: unix.mcs.anl.gov/mpi/mpich1/download.html; unix.mcs.anl.gov/mpi/mpich2/index.htm#download Main features: open source; synchronized with the MPI standard; supports MPMD (Multiple Program Multiple Data) and heterogeneous clusters; supports C/C++, Fortran 77 and Fortran 90; supports Unix and Windows NT platforms; supports multi-core, SMP, cluster, and large-scale parallel computer systems. 48
49 Introduction Intel MPI Conforms to the MPI-2 standard. Latest version: 3.1. Uses DAPL (Direct Access Programming Library). 49
50 Introduction: Intel MPI The Intel MPI Library supports multiple hardware fabrics. 50
51 Introduction: Intel MPI Features: is a multi-fabric message-passing library; implements the Message Passing Interface, v2 (MPI-2) specification; provides a standard library across Intel platforms that focuses on making applications perform best on IA-based clusters, enables adoption of the MPI-2 functions as customer needs dictate, and delivers best-in-class performance for enterprise, divisional, departmental and workgroup high performance computing. 51
52 Introduction: Intel MPI Why the Intel MPI Library? High-performance MPI-2 implementation; Linux and Windows CCS support; interconnect independence; smart fabric selection; easy installation; free runtime environment; close integration with Intel and 3rd-party development tools; Internet-based licensing and technical support. 52
53 Introduction: Intel MPI Standards based: Argonne National Laboratory's MPICH-2 implementation. Integration: can be easily integrated with: Platform LSF 6.1 and higher; Altair PBS Pro* 7.1 and higher; OpenPBS* 2.3; Torque* and higher; Parallelnavi* NQS* for Linux V2.0L10 and higher; Parallelnavi for Linux Advanced Edition V1.0L10A and higher; NetBatch* 6.x and higher. 53
54 Introduction: Intel MPI System requirements: host and target systems hardware: IA-32, Intel 64, or IA-64 architecture using Intel Pentium 4, Intel Xeon processor, Intel Itanium processor family and compatible platforms; 1 GB of RAM (4 GB recommended); minimum 100 MB of free hard disk space (10 GB recommended). 54
55 Introduction: Intel MPI Operating systems requirements: Microsoft Windows* Compute Cluster Server 2003 (Intel 64 architecture only); Red Hat Enterprise Linux* 3.0, 4.0, or 5.0; SUSE* Linux Enterprise Server 9 or 10; SUSE Linux 9.0 through 10.0 (all except Intel 64 architecture, which starts at 9.1); HaanSoft Linux 2006 Server*; Miracle Linux* 4.0; Red Flag* DC Server 5.0; Asianux* Linux 2.0; Fedora Core 4, 5, or 6 (IA-32 and Intel 64 architectures only); TurboLinux* 10 (IA-32 and Intel 64 architecture); Mandriva/Mandrake* 10.1 (IA-32 architecture only); SGI* ProPack 4.0 (IA-64 architecture only) or 5.0 (IA-64 and Intel 64 architectures). 55
56 The Message-Passing Model (Figure: processors, each with its own private memory, connected by an interconnection network.) 56
57 The Message-Passing Model A task in the task/channel model becomes a process in the message-passing model. The number of processes: is specified by the user; is specified when the program begins; is constant throughout the execution of the program. Each process has a unique ID number. 57
58 The Message-Passing Model Goals of the message-passing model: communication among processes; synchronization among processes. 58
59 The Message-Passing Interface (MPI) Advantages: runs well on a wide variety of MPMD architectures; easier debugging; thread safe. 59
60 What is in MPI Point-to-point message passing Collective communication Support for process groups Support for communication contexts Support for application topologies Environmental inquiry routines Profiling interface 60
61 Introduction to Groups & Communicator Process model and groups Communication scope Communicators 61
62 Process model and groups The fundamental computational unit is the process. Each process has: an independent thread of control; a separate address space. MPI processes execute in MIMD style, but: there is no mechanism for loading code onto processors or assigning processes to processors; no mechanism for creating or destroying processes. MPI supports dynamic process groups: process groups can be created and destroyed; membership is static; groups may overlap. No explicit support for multithreading, but MPI is designed to be thread-safe. 62
63 Communication scope In MPI, a process is specified by: a group; a rank relative to the group. A message label is specified by: a message context; a message tag relative to the context. Groups are used to partition process space. Contexts are used to partition "message label space". Groups and contexts are bound together to form a communicator object. Contexts are not visible at the application level. A communicator defines the scope of a communication operation. 63
64 Communicators Communicators are used to create independent "message universes". Communicators are used to disambiguate message selection when an application calls a library routine that performs message passing. Nondeterminacy may arise: if processes enter the library routine asynchronously, or if processes enter the library routine synchronously but there are outstanding communication operations. A communicator: binds together groups and contexts; defines the scope of a communication operation; is represented by an opaque object. 64
65 A communicator handle defines which processes a particular command will apply to. All MPI communication calls take a communicator handle as a parameter, which is effectively the context in which the communication will take place. MPI_INIT defines a communicator called MPI_COMM_WORLD for each process that calls it. 65
66 Every communicator contains a group, which is a list of processes. The processes are ordered and numbered consecutively from 0. The number of each process is known as its rank. The rank identifies each process within the communicator. The group of MPI_COMM_WORLD is the set of all MPI processes. 66
67 Skeleton MPI Program
#include <mpi.h>
int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    /* main part of the program */
    MPI_Finalize();
    return 0;
}
67
68 Circuit satisfiability What combinations of input values will make the circuit output the value 1? (Figure: a combinational circuit with 16 inputs, a through p.) 68
69 Circuit satisfiability Analysis: 16 inputs, a-p, each taking one of 2 values (0 or 1): 2^16 = 65536 combinations. Design a parallel algorithm. Partition: functional decomposition; no channels between tasks; tasks are independent; well suited for parallelism. 69
70 Circuit satisfiability Communication: tasks are independent, so no communication is needed. 70
71 Circuit satisfiability Agglomeration and Mapping Fixed number of tasks; the time for each task to complete is variable. Why? A combination that fails an early clause is rejected quickly, while others take longer. How to balance the computation load? Map tasks in a cyclic fashion. 71
72 Circuit satisfiability Each process will examine a combination of inputs in turn. 72
73 Circuit satisfiability
#define EXTRACT_BIT(n,i) ((n & (1 << i)) ? 1 : 0)
void check_circuit(int id, int z) {
    int v[16];
    int i;
    for (i = 0; i < 16; i++) v[i] = EXTRACT_BIT(z, i);
    if ((v[0] || v[1]) && (!v[1] || !v[3]) && (v[2] || v[3])
        && (!v[3] || !v[4]) && (v[4] || !v[5]) && (v[5] || !v[6])
        && (v[5] || v[6]) && (v[6] || !v[15]) && (v[7] || !v[8])
        && (!v[7] || !v[13]) && (v[8] || v[9]) && (v[9] || v[11])
        && (v[10] || v[11]) && (v[12] || v[13]) && (v[13] || !v[14])
        && (v[14] || v[15])) {
        printf("%d) %d%d%d%d%d%d%d%d%d%d%d%d%d%d%d%d\n", id,
               v[0], v[1], v[2], v[3], v[4], v[5], v[6], v[7],
               v[8], v[9], v[10], v[11], v[12], v[13], v[14], v[15]);
        fflush(stdout);
    }
}
73
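The cyclic mapping can be checked without any MPI calls: if each of p processes with rank id examines inputs id, id+p, id+2p, ..., then every one of the 65536 combinations is examined exactly once. A sketch (the function name is ours; in the MPI program the inner statement would be a call to check_circuit(id, z)):

```c
#include <assert.h>
#include <string.h>

#define N_COMB 65536   /* 2^16 input combinations */

/* Model of the cyclic mapping: rank id (of p) examines inputs
 * id, id+p, id+2p, ...; hits[z] counts how often input z is checked. */
void cyclic_hits(int p, int hits[N_COMB]) {
    memset(hits, 0, N_COMB * sizeof(int));
    for (int id = 0; id < p; id++)
        for (int z = id; z < N_COMB; z += p)
            hits[z]++;   /* the MPI program would call check_circuit(id, z) */
}
```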
74 Point-to-point Communication Overview Blocking Behaviors Non-Blocking Behaviors 74
75 Overview A message is sent from a sender to a receiver. There are several variations on how the sending of a message can interact with the program. 75
76 Synchronous: does not complete until the message has been received. A fax or registered mail. 76
77 Asynchronous: completes as soon as the message is on the way. A post card or e-mail. 77
78 Communication modes The mode is selected with the send routine: synchronous mode ("safest"); ready mode (lowest system overhead); buffered mode (decouples sender from receiver); standard mode (compromise). Calls are also blocking or non-blocking: blocking stops the program until the message buffer is safe to use; non-blocking separates communication from computation. 78
79 Blocking Behavior int MPI_Send(void *buf, int count, MPI_Datatype datatype, int dest, int tag, MPI_Comm comm) buf is the beginning of the buffer containing the data to be sent. For Fortran, this is often the name of an array in your program. For C, it is an address. count is the number of elements to be sent (not bytes) datatype is the type of data dest is the rank of the process which is the destination for the message tag is an arbitrary number which can be used to distinguish among messages comm is the communicator 79
80 Temporary Knowledge A message consists of data and an envelope. Message data: buf, count, datatype. Message envelope: dest, tag, comm. Why a tag? To distinguish messages that otherwise have identical envelopes (same source, destination, and communicator). 80
82 When using standard-mode send: It is up to MPI to decide whether outgoing messages will be buffered. Completes once the message has been sent, which may or may not imply that the message has arrived at its destination. Can be started whether or not a matching receive has been posted; it may complete before a matching receive is posted. Has non-local completion semantics, since successful completion of the send operation may depend on the occurrence of a matching receive. 82
83 Blocking Standard Send 83
84 MPI_Recv int MPI_Recv(void *buf, int count, MPI_Datatype datatype, int source, int tag, MPI_Comm comm, MPI_Status *status) buf is the beginning of the buffer where the incoming data are to be stored. For Fortran, this is often the name of an array in your program. For C, it is an address. count is the number of elements (not bytes) in your receive buffer. datatype is the type of data. source is the rank of the process from which data will be accepted (this can be a wildcard, by specifying the parameter MPI_ANY_SOURCE). tag is an arbitrary number which can be used to distinguish among messages (this can be a wildcard, by specifying the parameter MPI_ANY_TAG). comm is the communicator. status is an array or structure of information that is returned. For example, if you specify a wildcard for source or tag, status will tell you the actual rank or tag for the message received. 84
87 Blocking Synchronous Send 87
88 Cont. A synchronous-mode send: can be started whether or not a matching receive was posted; will complete successfully only if a matching receive is posted and the receive operation has started to receive the message sent by the synchronous send; provides synchronous communication semantics: a communication does not complete at either end before both processes rendezvous at the communication; has non-local completion semantics. 88
89 Blocking Ready Send 89
90 A ready-mode send: completes immediately; may be started only if the matching receive has already been posted; has the same semantics as a standard-mode send; saves on overhead by avoiding handshaking and buffering. 90
91 Blocking Buffered Send 91
92 A buffered-mode send: can be started whether or not a matching receive has been posted, and may complete before a matching receive is posted. Has local completion semantics: its completion does not depend on the occurrence of a matching receive. In order to complete the operation, it may be necessary to buffer the outgoing message locally; for that purpose, buffer space is provided by the application. 92
93 Non-Blocking Behavior MPI_Isend(buf, count, dtype, dest, tag, comm, request) MPI_Wait(request, status) request: matches the request on Isend or Irecv. status: returns a status equivalent to the status for Recv when complete. For a send, MPI_Wait blocks until the message is buffered or sent, so the message variable is free. For a receive, it blocks until the message is received and ready. 93
94 Non-blocking Synchronous Send int MPI_Issend (void *buf, int count, MPI_Datatype datatype, int dest, int tag, MPI_Comm comm, MPI_Request *request) IN = provided by programmer, OUT = set by routine. buf: starting address of message buffer (IN) count: number of elements in message (IN) datatype: type of elements in message (IN) dest: rank of destination task in communicator comm (IN) tag: message tag (IN) comm: communicator (IN) request: identifies a communication event (OUT) 94
95 Non-blocking Ready Send int MPI_Irsend (void *buf, int count, MPI_Datatype datatype, int dest, int tag, MPI_Comm comm, MPI_Request *request) 95
96 Non-blocking Buffered Send int MPI_Ibsend (void *buf, int count, MPI_Datatype datatype, int dest, int tag, MPI_Comm comm, MPI_Request *request) 96
97 Non-blocking Standard Send int MPI_Isend (void *buf, int count, MPI_Datatype datatype, int dest, int tag, MPI_Comm comm, MPI_Request *request) 97
98 Non-blocking Receive IN = provided by programmer, OUT = set by routine. buf: starting address of message buffer (OUT; buffer contents written) count: number of elements in message (IN) datatype: type of elements in message (IN) source: rank of source task in communicator comm (IN) tag: message tag (IN) comm: communicator (IN) request: identifies a communication event (OUT) 98
99 int MPI_Irecv (void* buf, int count, MPI_Datatype datatype, int source, int tag, MPI_Comm comm, MPI_Request *request) 99
100 request: identifies a communication event (INOUT) status: status of communication event (OUT) count: number of communication events (IN) index: index in array of requests of completed event (OUT) incount: number of communication events (IN) outcount: number of completed events (OUT) 100
101 int MPI_Wait (MPI_Request *request, MPI_Status *status) int MPI_Waitall (int count, MPI_Request *array_of_requests, MPI_Status *array_of_statuses) int MPI_Waitany (int count, MPI_Request *array_of_requests, int *index, MPI_Status *status) int MPI_Waitsome (int incount, MPI_Request *array_of_requests, int *outcount, int* array_of_indices, MPI_Status *array_of_statuses) 101
102 Communication Mode Blocking routines: MPI_SSEND (synchronous), MPI_RSEND (ready), MPI_BSEND (buffered), MPI_SEND (standard), MPI_RECV. Non-blocking routines: MPI_ISSEND, MPI_IRSEND, MPI_IBSEND, MPI_ISEND, MPI_IRECV. 102
103 Synchronous. Advantages: safest, and therefore most portable; SEND/RECV order not critical; amount of buffer space irrelevant. Disadvantages: can incur substantial synchronization overhead.
Ready. Advantages: lowest total overhead; SEND/RECV handshake not required. Disadvantages: RECV must precede SEND.
Buffered. Advantages: decouples SEND from RECV; no synchronization overhead on SEND; order of SEND/RECV irrelevant; programmer can control size of buffer space. Disadvantages: additional system overhead incurred by copy to buffer.
Standard. Advantages: good for many cases. Disadvantages: your program may not be suitable. 103
104 MPI Quick Start Basics: MPI_Init, MPI_Comm_rank, MPI_Comm_size, MPI_Send, MPI_Recv, MPI_Finalize. Collectives: MPI_Bcast, MPI_Scatter, MPI_Gather, MPI_Reduce. Timing and synchronization: MPI_Wtime, MPI_Wtick, MPI_Barrier. MPI_Xxxxx 104
105 MPI Routines MPI_Init Initializes the MPI execution environment. argc: pointer to the number of arguments. argv: pointer to the argument vector. The first MPI function call; allows the system to do any setup needed to handle further calls to the MPI library. Defines a communicator called MPI_COMM_WORLD for each process that calls it. MPI_Init must be called before any other MPI function. Exception: MPI_Initialized, which checks whether MPI has been initialized, may be called before MPI_Init. 105
106 MPI Routines MPI_Comm_rank To determine a process's ID number. Returns the process's ID (its rank). Communicator (MPI_Comm): MPI_COMM_WORLD includes all processes when MPI is initialized. 106
107 MPI Routines MPI_Comm_size To find the number of processes: size. 107
108 MPI Routines MPI_Send The source process sends the data in its buffer to the destination process. buf: the starting address of the data to be transmitted. count: the number of data items. datatype: the type of the data items (all of the data items must be of the same type). dest: the rank of the process to receive the data. tag: an integer label for the message, allowing messages serving different purposes to be identified. comm: the communicator in which this message is being sent. 108
109 MPI Routines MPI_Send Blocks until the message buffer is once again available. MPI constants for C data types. 109
110 MPI Routines MPI_Recv buf: the starting address where the received data is to be stored. count: the maximum number of data items the receiving process is willing to receive. datatype: the type of the data items. source: the rank of the process sending this message. tag: the desired tag value for the message. comm: the communicator in which this message is being passed. status: MPI data structure in which the status is returned. 110
111 MPI Routines MPI_Recv Receives the message from the source process. The data type and tag of the message received must agree with the data type and tag defined in the MPI_Recv function. The count of data items received must not exceed the count defined in this function; otherwise an overflow error condition occurs. If count equals zero, the message is empty. Blocks until the message has been received, or until an error condition causes the function to return. 111
112 MPI Routines MPI_Recv status->MPI_SOURCE: the rank of the process sending the message. status->MPI_TAG: the message's tag value. status->MPI_ERROR: the error condition. int MPI_Abort(MPI_Comm comm, int errorcode) 112
113 MPI Routines MPI_Finalize Allows the system to free up resources, such as memory, that have been allocated to MPI. Without MPI_Finalize, the result of the program is undefined. 113
114 summary 114
115 Collective communication Communication operation A group of processes work together to distribute or gather together a set of one or more values. 115
116 Collective communication MPI_Bcast A root process broadcasts one or more data items of the same type to all other processes in a communicator. 116
117 Collective communication MPI_Bcast int MPI_Bcast( void* buffer, //addr of 1st broadcast element int count, // #element to be broadcast MPI_Datatype datatype, // type of element to be broadcast int root, // ID of process doing broadcast MPI_Comm comm) //communicator 117
118 Collective communication MPI_Scatter The root process sends a different part of the data to each of the other processes. 118
119 Collective communication MPI_Scatter 119
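What MPI_Scatter does to the data can be modeled serially (the helper name below is ours, not an MPI routine): the root's send buffer of p*count elements is split into equal blocks, and the process with rank r receives block r:

```c
#include <assert.h>
#include <string.h>

/* Serial model of MPI_Scatter: sendbuf holds p*count elements at the
 * root; the process with the given rank receives elements
 * [rank*count, (rank+1)*count) into recvbuf. */
void scatter_model(const int *sendbuf, int count, int rank, int *recvbuf) {
    memcpy(recvbuf, sendbuf + rank * count, count * sizeof(int));
}
```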
120 Collective communication MPI_Gather Each process sends the data in its buffer to the root process. 120
121 Collective communication MPI_Gather 121
122 Collective communication MPI_Reduce After a process has completed its share of the work, it is ready to participate in the reduction operation. MPI_Reduce performs one or more reduction operations on values submitted by all the processes in a communicator. 122
123 Collective communication MPI_Reduce 123
124 Collective communication MPI_Reduce MPI's built-in reduction operators: MPI_BAND: bitwise and. MPI_BOR: bitwise or. MPI_BXOR: bitwise exclusive or. MPI_LAND: logical and. MPI_LOR: logical or. MPI_LXOR: logical exclusive or. MPI_MAX: maximum. MPI_MAXLOC: maximum and location of maximum. MPI_MIN: minimum. MPI_MINLOC: minimum and location of minimum. MPI_PROD: product. MPI_SUM: sum. 124
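The effect of MPI_Reduce with MPI_SUM can likewise be modeled serially (the helper name is ours): the root ends up with the elementwise sum of every process's contribution:

```c
#include <assert.h>

/* Serial model of MPI_Reduce with MPI_SUM: the root's result[j] is the
 * sum over all p ranks of that rank's j-th contribution.
 * contrib is laid out rank-major: contrib[r*count + j]. */
void reduce_sum_model(int p, int count, const int *contrib, int *result) {
    for (int j = 0; j < count; j++) {
        result[j] = 0;
        for (int r = 0; r < p; r++)
            result[j] += contrib[r * count + j];
    }
}
```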
125 summary 125
129 Benchmarking parallel performance Measure the performance of a parallel application. How? By measuring the number of seconds that elapse from the time we initiate execution until the program terminates. double MPI_Wtime(void): returns the number of seconds that have elapsed since some point of time in the past. double MPI_Wtick(void): returns the precision of the result returned by MPI_Wtime. 129
130 Benchmarking parallel performance MPI_Barrier int MPI_Barrier(MPI_Comm comm) comm: indicates in which communicator the processes will participate in the barrier synchronization. MPI_Barrier blocks the caller until all processes in the communicator have reached it. 130
131 For example Send and receive operation 131
132 For example Compute pi ∫₀¹ 1/(1+x²) dx = arctan(x) |₀¹ = arctan(1) − arctan(0) = arctan(1) = π/4. Let f(x) = 4/(1+x²); then ∫₀¹ f(x) dx = π. 132
133 For example
\[ \pi \approx \sum_{i=1}^{N} f\!\left(\frac{2i-1}{2N}\right)\frac{1}{N} = \frac{1}{N}\sum_{i=1}^{N} f\!\left(\frac{i-0.5}{N}\right) \]
133
134 For example Compute pi 134
135 For example Matrix Multiplication
MPI_Scatter(&iaA[0][0], N, MPI_INT, &iaA[iRank][0], N, MPI_INT, 0, MPI_COMM_WORLD);
MPI_Bcast(&iaB[0][0], N*N, MPI_INT, 0, MPI_COMM_WORLD);
for (i = 0; i < N; i++) {
    temp = 0;
    for (j = 0; j < N; j++) {
        temp = temp + iaA[iRank][j] * iaB[j][i];
    }
    iaC[iRank][i] = temp;
}
MPI_Gather(&iaC[iRank][0], N, MPI_INT, &iaC[0][0], N, MPI_INT, 0, MPI_COMM_WORLD);
135
136 136
137 \[ C_{i,j} = \sum_{k=0}^{l-1} a_{i,k}\, b_{k,j} \] where A is an n x l matrix and B is an l x m matrix. 137
138 138
139 139
140 140
141 141
142 Summary MPI is a library. The six foundational functions of MPI. Collective communication. The MPI communication model. 142
143 Thanks! Feel free to contact me with any questions or suggestions. And welcome to Wuhan University!