COSC 6374 Parallel Computation. Remote Direct Memory Access
COSC 6374 Parallel Computation
Remote Direct Memory Access
Edgar Gabriel
Fall 2015

Communication Models

[Figure: moving a data item B on process P1 into a variable A on process P0 under three models]
- Message Passing Model: P1 sends B, P0 receives it into A
- Shared Memory Model: P0 simply executes A = B
- Remote Memory Access: a single put operation transfers B into A
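For concreteness, a minimal sketch contrasting the two MPI variants (an illustration, assuming two ranks, variables a and b, and a window win over a double buffer on each rank; MPI_Put and windows are introduced on the following slides):

    /* two-sided: both processes issue a communication call */
    if (rank == 0)
        MPI_Send(&a, 1, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD);
    else if (rank == 1)
        MPI_Recv(&b, 1, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);

    /* one-sided: only rank 0 issues a communication call */
    MPI_Win_fence(0, win);
    if (rank == 0)
        MPI_Put(&a, 1, MPI_DOUBLE, 1, 0, 1, MPI_DOUBLE, win);
    MPI_Win_fence(0, win);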
Data Movement

[Figure: data paths through memory, CPU, and NIC on two nodes for both models]
- Message Passing Model: two-sided communication
- Remote Memory Access: one-sided communication

Remote Direct Memory Access
- Direct Memory Access (DMA) allows data to be transferred directly between an attached device and the memory on the computer's motherboard. The CPU is freed from involvement with the data transfer, thus speeding up overall computer operation.
- Remote Direct Memory Access (RDMA): two or more computers communicate directly from the main memory of one system to the main memory of another.
One-sided Communication in MPI

MPI-2 defines one-sided communication:
- A process can put data into the main memory of another process (MPI_Put)
- A process can get data from the main memory of another process (MPI_Get)
- A process can perform operations on a data item in the main memory of another process (MPI_Accumulate)
- The target process is not actively involved in the communication

RDMA in MPI: Problems
- How can a process define which parts of its main memory are available for RDMA?
- How can a process define when this part of its main memory is available for RDMA?
- How can a process define who is allowed to access its memory?
- How can a process define which elements in a remote memory it wants to access?
The window concept of MPI-2 (I)

    MPI_Win_create(void *base, MPI_Aint size, int disp_unit,
                   MPI_Info info, MPI_Comm comm, MPI_Win *win);

An MPI_Win defines the group of processes allowed to access a certain memory area. Arguments:
- base: starting address of the public memory region
- size: size of the public memory region in bytes
- disp_unit: unit size in bytes used to scale displacements at the target (target address = base + disp * disp_unit)
- info: hint to the MPI library on how the window will be used (e.g. only reading or only writing)
- comm: communicator defining the group of processes allowed to access the memory window

The window concept of MPI-2 (II)

Definition of a temporal window:
- Access epoch: time slot in which a process accesses remote memory of another process
- Exposure epoch: time slot in which a process allows access to a memory window by other processes

Does a process have control over when other processes access its memory window?
- yes: active target communication
- no: passive target communication
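A minimal sketch of creating and freeing a window (names such as buf and nelems are illustrative, not from the slides):

    #include <mpi.h>
    #include <stdlib.h>

    int main(int argc, char **argv) {
        MPI_Init(&argc, &argv);

        int nelems = 100;                              /* illustrative size */
        double *buf = malloc(nelems * sizeof(double));

        /* expose buf to all processes in MPI_COMM_WORLD; displacements
           at the target are scaled by sizeof(double) */
        MPI_Win win;
        MPI_Win_create(buf, nelems * sizeof(double), sizeof(double),
                       MPI_INFO_NULL, MPI_COMM_WORLD, &win);

        /* ... access / exposure epochs go here ... */

        MPI_Win_free(&win);   /* collective; no epoch may still be open */
        free(buf);
        MPI_Finalize();
        return 0;
    }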
Active Target Communication (I)

    MPI_Win_fence(int assert, MPI_Win win);

- Synchronizes all operations within a window
- Collective across all processes of win
- No difference between access and exposure epoch
- Starts or closes an access and exposure epoch
- assert: hint to the library on the usage (default: 0)

Data exchange (I)

    MPI_Put(void *oaddr, int ocount, MPI_Datatype otype,
            int rank, MPI_Aint disp, int tcount, MPI_Datatype ttype,
            MPI_Win win);

- A single process controls the data parameters of both processes
- Puts the data described by (oaddr, ocount, otype) into the main memory of the process with rank rank in the window win at the position (base + disp*disp_unit, tcount, ttype)
- base and disp_unit have been defined in MPI_Win_create
- The values of base and disp_unit are not known by the process calling MPI_Put!
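A minimal active-target sketch (assuming two ranks and the window win over a double buffer buf from the earlier example): rank 0 writes into rank 1's window between two fences.

    double value = 42.0;     /* illustrative payload */

    MPI_Win_fence(0, win);   /* open access/exposure epoch on all ranks */
    if (rank == 0) {
        /* write one double at displacement 0 of rank 1's window */
        MPI_Put(&value, 1, MPI_DOUBLE, 1, 0, 1, MPI_DOUBLE, win);
    }
    MPI_Win_fence(0, win);   /* close epoch; transfer is now complete */
    /* rank 1 may now read buf[0] and will see 42.0 */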
Example: Ghost-cell update

Parallel matrix-vector multiply for band matrices, e.g. a tridiagonal system distributed across two processes:

    Process 0:   50*x1 + 30*x2                 = rhs1
                 20*x1 + 50*x2 + 30*x3         = rhs2
    Process 1:           20*x2 + 50*x3 + 30*x4 = rhs3
                                 20*x3 + 50*x4 = rhs4

- Process 0 needs x3
- Process 1 needs x2

Example: Ghost-cell update (II)

Ghost cells: (read-only) copies of elements held by another process.

[Figure: 1-D case - Process 0 holds x1, x2 plus a ghost copy of x3; Process 1 holds x3, x4 plus a ghost copy of x2]

Ghost cells for 2-D matrices: one additional row of data per neighbor.

[Figure: Processes 0-2 each hold a block of nxlocal rows of length ny, plus ghost rows]
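A possible allocation for the 2-D layout (a sketch; the slides do not show it): nxlocal local rows of length ny plus one ghost row at each end, stored contiguously.

    /* rows 1..nxlocal hold local data; rows 0 and nxlocal+1 are ghost
       rows, so that &u[i][0] sits at byte offset i*ny*sizeof(double) */
    double (*u)[ny] = malloc((nxlocal + 2) * sizeof *u);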
Example: Ghost-cell update (III)

Data structure:
- u[i][j] is stored in a matrix
- nxlocal: number of data points in x direction
- ny: number of data points in y direction
- Extent of u: u[nxlocal+2][ny], with u[1:nxlocal][0:ny-1] containing the local data

Example: Ghost-cell update (IV)

    /* disp_unit = 1: displacements in the puts below are given in bytes */
    MPI_Win_create(u, (nxlocal+2)*ny*sizeof(double), 1,
                   MPI_INFO_NULL, MPI_COMM_WORLD, &win);

    MPI_Win_fence(0, win);
    /* first local row into the upper neighbor's bottom ghost row */
    MPI_Put(&u[1][0], ny, MPI_DOUBLE, rank-1,
            (nxlocal+1)*ny*sizeof(double), ny, MPI_DOUBLE, win);
    /* last local row into the lower neighbor's top ghost row */
    MPI_Put(&u[nxlocal][0], ny, MPI_DOUBLE, rank+1,
            0, ny, MPI_DOUBLE, win);
    MPI_Win_fence(0, win);

    MPI_Win_free(&win);
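The puts above assume interior ranks; on the boundary processes, rank-1 or rank+1 does not exist. One way to handle this (an assumption, not shown on the slides) is to map missing neighbors to MPI_PROC_NULL, for which RMA operations are no-ops:

    int up   = (rank > 0)        ? rank - 1 : MPI_PROC_NULL;
    int down = (rank < size - 1) ? rank + 1 : MPI_PROC_NULL;
    /* use up/down instead of rank-1/rank+1 as target ranks in MPI_Put */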
Comments on the example
- Modifications to the data items might only become visible after closing the corresponding epochs
- There is no guarantee whether a data item is actually transferred during MPI_Put or during MPI_Win_fence
- If multiple processes modify the very same memory address on the very same process, no guarantees are given on which data item will be visible. It is the user's responsibility to get this right.

Passive Target Communication

    MPI_Win_lock(int lock_type, int rank, int assert, MPI_Win win);
    MPI_Win_unlock(int rank, MPI_Win win);

- MPI_Win_lock starts an access epoch to the main memory of the process with rank rank
- All RDMA operations between a lock/unlock pair appear atomic
- lock_type: MPI_LOCK_EXCLUSIVE or MPI_LOCK_SHARED
- Updates to the local memory exposed through the MPI window should also happen using MPI_Win_lock/MPI_Put; otherwise the access order between local updates and RDMA accesses is undefined (race condition)
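A sketch of the last point (assuming win exposes a double buffer on every rank; myrank and new_value are illustrative names): even a purely local update should go through the lock.

    /* update my own exposed memory under a lock to avoid racing with
       incoming RMA operations */
    MPI_Win_lock(MPI_LOCK_EXCLUSIVE, myrank, 0, win);
    MPI_Put(&new_value, 1, MPI_DOUBLE, myrank, 0, 1, MPI_DOUBLE, win);
    MPI_Win_unlock(myrank, win);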
Example: Ghost-cell update (V)

    MPI_Win_create(u, (nxlocal+2)*ny*sizeof(double), 1,
                   MPI_INFO_NULL, MPI_COMM_WORLD, &win);

    MPI_Win_lock(MPI_LOCK_EXCLUSIVE, rank-1, 0, win);
    MPI_Put(&u[1][0], ny, MPI_DOUBLE, rank-1,
            (nxlocal+1)*ny*sizeof(double), ny, MPI_DOUBLE, win);
    MPI_Win_unlock(rank-1, win);

    MPI_Win_lock(MPI_LOCK_EXCLUSIVE, rank+1, 0, win);
    MPI_Put(&u[nxlocal][0], ny, MPI_DOUBLE, rank+1,
            0, ny, MPI_DOUBLE, win);
    MPI_Win_unlock(rank+1, win);

One-sided vs. Two-sided communication
- One-sided communication needs no message matching and no unexpected message queues
- Only one of the processes is actively involved: potentially faster!
- One-sided communication in MPI can potentially optimize multiple transactions between multiple processes
Limitations of the MPI-2 model
- Synchronization costs (e.g. MPI_Win_fence) can be significant
- Static model: the size of a memory window cannot be altered after creating an MPI_Win
- Difficult to support dynamic data structures such as a linked list
- The passive target model has limited usability, but that is what most other RDMA libraries focus on

In MPI-3:
- Introduction of dynamic windows
- Extended functionality for passive target operations

Use case: distributed linked list

A linked list maintained across multiple processes, e.g.:
- after a global sort operation of all elements
- with fixed rules for the keys, e.g. rank 0 holds keys starting with a to d, rank 1 keys starting with e to h (see the sketch below)

[Figure: list elements chained across ranks 0, 1, and 2]
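A hypothetical mapping for the fixed-rule key distribution (illustrative only; the slides do not define it):

    #include <ctype.h>

    /* map a key to the rank that owns it: 'a'-'d' -> 0, 'e'-'h' -> 1, ... */
    int key_to_rank(const char *key, int nranks) {
        int bucket = (tolower((unsigned char)key[0]) - 'a') / 4;
        return (bucket < nranks) ? bucket : nranks - 1;
    }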
Use case: Distributed linked list

    typedef struct {
        char      key[max_key_size];
        char      value[max_value_size];
        MPI_Aint  next_disp;    /* together with next_rank: equivalent to
                                   the next pointer in a non-distributed
                                   linked list */
        int       next_rank;
        void     *next_local;   /* next local element */
    } ListElem;

    /* Create an MPI data type describing this structure using
       MPI_Type_create_struct. Not shown here for brevity. */

Traversing a distributed linked list

    ListElem local_copy, *current;
    ListElem *head;             /* assumed to be already set */
    int found = 0;
    current = head;

    /* get a shared (read-only) lock on all processes that are part of win */
    MPI_Win_lock_all(0, win);
    while (!found) {            /* end-of-list handling omitted, as on the slide */
        if (current->next_rank != myrank) {
            MPI_Get(&local_copy, 1, ListElem_type, current->next_rank,
                    current->next_disp, 1, ListElem_type, win);
            /* enforce completion of all pending operations to that
               process without having to release the lock(s) */
            MPI_Win_flush(current->next_rank, win);
            current = &local_copy;
        } else {
            current = (ListElem *) current->next_local;
        }
        if (strcmp(current->key, key) == 0)
            found = 1;          /* element located */
    }
    MPI_Win_unlock_all(win);
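The slides omit the construction of the ListElem_type used above; a possible sketch (the last, purely local field next_local is not communicated, so the type is resized to cover the full struct):

    #include <stddef.h>   /* offsetof */

    MPI_Datatype ListElem_type, tmp;
    int          blocklens[4] = { max_key_size, max_value_size, 1, 1 };
    MPI_Aint     displs[4]    = { offsetof(ListElem, key),
                                  offsetof(ListElem, value),
                                  offsetof(ListElem, next_disp),
                                  offsetof(ListElem, next_rank) };
    MPI_Datatype types[4]     = { MPI_CHAR, MPI_CHAR, MPI_AINT, MPI_INT };

    MPI_Type_create_struct(4, blocklens, displs, types, &tmp);
    /* resize so the extent spans the whole struct, including next_local */
    MPI_Type_create_resized(tmp, 0, sizeof(ListElem), &ListElem_type);
    MPI_Type_commit(&ListElem_type);
    MPI_Type_free(&tmp);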
Inserting elements into a linked list

- Assuming that only the local process is allowed to insert an element (e.g. after a global sort operation)
- Remote processes are only allowed to read elements on other processes
- Requires dynamically allocating memory and extending a memory region:

    MPI_Win_create_dynamic(MPI_Info info, MPI_Comm comm, MPI_Win *win);
    MPI_Win_attach(MPI_Win win, void *base, MPI_Aint size);

- A dynamic window defines only the participating group of processes
- More than one memory region can be attached to a single window

Inserting elements into a linked list (II)

    ListElem *t, *t2, *current;

    /* create the window instance once */
    MPI_Win_create_dynamic(MPI_INFO_NULL, comm, &win);

    /* insert an element into the memory window */
    t = (ListElem *) malloc(sizeof(ListElem));
    strncpy(t->key,   key,   max_key_size);
    strncpy(t->value, value, max_value_size);

    current = find_prev_element(head, key, value);
    t2 = current->next_local;
    current->next_local = t;    /* similarly update next_rank and
                                   next_disp on current and t */
    t->next_local = t2;
    MPI_Win_attach(win, t, sizeof(ListElem));

    /* add another element */
    t = (ListElem *) malloc(sizeof(ListElem));
    /* ... fill in and link as above ... */
    MPI_Win_attach(win, t, sizeof(ListElem));

    MPI_Barrier(comm);
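One detail the slide leaves implicit: for dynamic windows, target displacements are absolute addresses in the target's address space. A sketch of how the owning process could publish the displacement of a newly attached element (the distribution mechanism is an assumption):

    /* the owner of t obtains its absolute address ... */
    MPI_Aint t_disp;
    MPI_Get_address(t, &t_disp);
    /* ... and makes it known to the other processes (e.g. via
       point-to-point messages or a put into a control window) so
       they can store it in their next_disp fields */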