Enhanced Memory debugging of MPI-parallel Applications in Open MPI
1 Enhanced Memory Debugging of MPI-parallel Applications in Open MPI. 4th Parallel Tools Workshop 2010. Shiqing Fan, HLRS, High Performance Computing Center, University of Stuttgart, Germany. Slide 1
2 Introduction: Open MPI 1/3. A new MPI implementation written from scratch, without the cruft of previous implementations. Design started in early 2004, drawing on PACX-MPI, LAM/MPI, LA-MPI and FT-MPI. Project goals: a full, fast and extensible MPI-2 implementation; thread safety; prevent the forking problem; combine the best ideas and technologies. Open-source license based on the BSD license. Slide 2
3 Introduction: Open MPI 2/3. Current status: stable version v1.2.6 (April 2008); release v1.3 is coming very soon. 14 members and 6 contributors: 4 US DOE labs, 8 universities, 7 vendors, 1 individual. Slide 3
4 Introduction: Open MPI 3/3. Open MPI consists of three sub-packages on top of the operating system: Open MPI, Open RTE (Open RunTime Environment) and Open PAL (Open Portable Access Layer). Modular Component Architecture (MCA): dynamically loads available modules like plug-ins and checks for hardware; selects the best plug-in and unloads the others (e.g. if the hardware is not available); fast indirect calls into each plug-in. [Figure: Modular Component Architecture. The user application calls the MPI API, which sits on top of the frameworks; the BTL framework is shown with the components OpenIB, TCP, Myrinet and SM.] Slide 4
5 Introduction: Valgrind 1/2. An open-source debugging and profiling tool for x86/Linux, AMD64/Linux, PPC32/Linux and PPC64/Linux. Works with any dynamically or statically linked application. Memcheck, a heavyweight memory checker: runs the program on a synthetic CPU that is identical to a real CPU but additionally stores information about memory. Valid-value bits (V-bits) for each bit: does the bit hold a valid value or not. Address bits (A-bits) for each byte: is it possible to read/write that location. All reads and writes of memory are checked. Calls to malloc/new/free/delete are intercepted. Slide 5
6 Introduction: Valgrind 2/2. Use of uninitialized memory: Memcheck reports the error only when the uninitialized value is actually used, e.g.: int c[2]; int i = c[0]; /* OK!! */ if (i == 0) /* Memcheck: use of uninitialized value!! */ Use of free'd memory. Mismatched use of malloc/new with free/delete. Memory leaks. Overlapping src and dst blocks in memcpy(), strcpy(), strncpy(), strcat(), strncat(). Slide 6
7 Valgrind MPI Example 1/2. Open MPI readily supports execution of applications with Valgrind: mpirun -np 2 valgrind ./mpi_murks Slide 7
8 Valgrind MPI Example 2/2. With Valgrind (the number in ==11278== is the PID): mpirun -np 2 valgrind ./mpi_murks
==11278== Invalid read of size 1
==11278==    at 0x E: memcpy (../../memcheck/mac_replace_strmem.c:256)
==11278==    by 0x80690F6: MPID_SHMEM_Eagerb_send_short (mpich/../shmemshort.c:70)
... 2 lines of calls to MPICH functions deleted ...
==11278==    by 0x80492BA: MPI_Send (/usr/src/mpich/src/pt2pt/send.c:91)
==11278==    by 0x8048F28: main (mpi_murks.c:44)
==11278== Address 0x4158B0EF is 3 bytes after a block of size 40 alloc'd
==11278==    at 0x4002BBCE: malloc (../../coregrind/vg_replace_malloc.c:160)
==11278==    by 0x8048EB0: main (mpi_murks.c:39)
Found: buffer overrun by 4 bytes in MPI_Send.
==11278== Conditional jump or move depends on uninitialised value(s)
==11278==    at 0x402985C4: _IO_vfprintf_internal (in /lib/libc so)
==11278==    by 0x402A15BD: _IO_printf (in /lib/libc so)
==11278==    by 0x8048F44: main (mpi_murks.c:46)
Found: printing of an uninitialized variable.
Valgrind cannot find: when run with 1 process, one pending Recv (Marmot); when run with more than 2 processes, unmatched Sends (Marmot). Slide 8
9 Design and implementation 1/3. Memchecker: a new concept that uses Valgrind's API internally in Open MPI to reveal bugs in the application and in Open MPI itself. Implements a generic memchecker interface as an MCA framework, located in the Open PAL layer. Configure option: --enable-memchecker; optionally pass the installed Valgrind with --with-valgrind=/path/to/valgrind. Then simply run the command, e.g.: mpirun -np 2 valgrind ./my_mpi [Figure: layer stack of Open MPI, Open RTE and Open PAL above the operating system; the memchecker framework in Open PAL is shown with the components valgrind, solaris_rtc* and further MCA components. *Currently no API is implemented in rtc.] Slide 9
10 Design and implementation 2/3. Detect the application's memory violations of the MPI standard: the application's usage of undefined data, and the application's memory accesses that are illegal under MPI semantics. Detect non-blocking/one-sided communication buffer errors: functions in the BTL layer for both kinds of communication; set memory accessibility independently of MPI operations, i.e. only set accessibility for the fragment to be sent/received; handles derived datatypes. MPI object checking: check the definedness of MPI objects passed to the MPI API (MPI_Status, MPI_Comm, MPI_Request and MPI_Datatype); can be disabled for better performance. Slide 10
11 Design and implementation 3/3. Non-blocking send/receive buffer error checking. [Figure: Proc0 calls MPI_Isend and Proc1 calls MPI_Irecv; in the MPI layer the buffer is marked inaccessible (unaddressable). Below it, the PML (P2P Management Layer), the BML (BTL Management Layer) and the BTL (Byte Transfer Layer) handle the buffer as fragments Frag0, Frag1, ..., Fragn, which are marked not accessible until MPI_Wait on each side.] Slide 11
12 Detectable bug-classes 1/3. Non-blocking buffer accessed/modified before the operation has finished: MPI_Isend (buffer, SIZE, MPI_INT, ..., &request); buffer[1] = 4711; MPI_Wait (&request, &status); The standard does not (yet) allow read access either: MPI_Isend (buffer, SIZE, MPI_INT, ..., &request); result[1] = buffer[1]; MPI_Wait (&request, &status); Side note: MPI-1, p30, rationale for restrictive access rules: they allow better performance on some systems. Slide 12
13 Detectable bug-classes 2/3. Access to a buffer under control of MPI: MPI_Irecv (buffer, SIZE, MPI_CHAR, ..., &request); buffer[1] = 4711; MPI_Wait (&request, &status); Side note: CRC-based methods do not reliably catch these cases. Memory outside the receive buffer is overwritten: buffer = malloc (SIZE * sizeof(char)); memset (buffer, 0, SIZE * sizeof(char)); MPI_Recv (buffer, SIZE+1, MPI_CHAR, ..., &status); Side note: MPI-1, p21, rationale of overflow situations: no memory outside the receive buffer will ever be overwritten. Slide 13
14 Detectable bug-classes 3/3. Usage of undefined memory passed from Open MPI: MPI_Wait (&request, &status); if (status.MPI_ERROR != MPI_SUCCESS) Side note: this field should remain undefined. MPI-1, p22 (not needed for calls that return only one status); MPI-2, p24 (clarification of status in single-completion calls). Write to a buffer before the accumulate has finished: MPI_Accumulate (A, NROWS*NCOLS, MPI_INT, 1, 0, 1, xpose, MPI_SUM, win); A[0][1] = 4711; MPI_Win_fence (0, win); Slide 14
15 Performance 1/2. Benchmarks: Intel MPI Benchmark. Environment: Dgrid cluster at HLRS; dual-processor Intel Woodcrest; InfiniBand DDR network with the OpenFabrics stack. Test cases: plain Open MPI; with the memchecker component, without MPI object checking. Slide 15
16 Performance 2/2. Intel MPI Benchmark, bi-directional Get test. Uses 2 nodes with TCP connections over the IP-over-IB interface. Run with and without Valgrind. Slide 16
17 Valgrind (Memcheck) Extension 1/2. New client requests for: watching memory read operations; watching memory write operations; initiating callback functions on memory read/write; making memory readable and/or writable. Uses a fast ordered-set algorithm, checks memory byte-wise, and handles memory with mixed registered and unregistered blocks. Slide 17
18 Valgrind (Memcheck) Extension 2/2. VALGRIND_REG_USER_MEM_WATCH (addr, len, op, cb, info); VALGRIND_UNREG_USER_MEM_WATCH (addr, len). The watch op can be WATCH_MEM_READ, WATCH_MEM_WRITE or WATCH_MEM_RW. [Figure: the user application's Alloc_mem and Read_mem operations pass through Valgrind; a read of watched memory triggers the Read_cb callback.] Slide 18
19 Thank you very much! Slide 19