DDT: A visual, parallel debugger on Ra

Similar documents
Debugging with GDB and DDT

Debugging with GDB and DDT

DDT Debugging Techniques

User Guide of High Performance Computing Cluster in School of Physics

Allinea DDT Debugger. Dan Mazur, McGill HPC March 5,

Guillimin HPC Users Meeting July 14, 2016

Welcome. HRSK Practical on Debugging, Zellescher Weg 12 Willers-Bau A106 Tel

Parallel Debugging with TotalView BSC-CNS

Parallel Debugging. ª Objective. ª Contents. ª Learn the basics of debugging parallel programs

Debugging with TotalView

Debugging. John Lockman Texas Advanced Computing Center

How to run applications on Aziz supercomputer. Mohammad Rafi System Administrator Fujitsu Technology Solutions

Working on the NewRiver Cluster

COSC 6374 Parallel Computation. Debugging MPI applications. Edgar Gabriel. Spring 2008

Introduction to PICO Parallel & Production Enviroment

DDT User Guide. Version 2.2.1

STARTING THE DDT DEBUGGER ON MIO, AUN, & MC2. (Mouse over to the left to see thumbnails of all of the slides)

Introduction to GALILEO

Compilation and Parallel Start

Quick Start Guide. by Burak Himmetoglu. Supercomputing Consultant. Enterprise Technology Services & Center for Scientific Computing

MPI Runtime Error Detection with MUST

CS354 gdb Tutorial Written by Chris Feilbach

MPI Runtime Error Detection with MUST and Marmot For the 8th VI-HPS Tuning Workshop

Distributed Memory Programming With MPI Computer Lab Exercises

MPI Runtime Error Detection with MUST

Batch Systems & Parallel Application Launchers Running your jobs on an HPC machine

Introduction to GALILEO

Image Sharpening. Practical Introduction to HPC Exercise. Instructions for Cirrus Tier-2 System

New User Seminar: Part 2 (best practices)

XSEDE New User Tutorial

Improving the Productivity of Scalable Application Development with TotalView May 18th, 2010

Before We Start. Sign in hpcxx account slips Windows Users: Download PuTTY. Google PuTTY First result Save putty.exe to Desktop

Introduction to GALILEO

Our new HPC-Cluster An overview

Effective Use of CCV Resources

Batch Systems. Running calculations on HPC resources

Intel Parallel Studio XE 2017 Composer Edition BETA C++ - Debug Solutions Release Notes

XSEDE New User Tutorial

Introduction to debugging. Martin Čuma Center for High Performance Computing University of Utah

XSEDE New User Tutorial

Practical Introduction to Message-Passing Interface (MPI)

Tool for Analysing and Checking MPI Applications

The Distributed Debugging Tool

TotalView. Debugging Tool Presentation. Josip Jakić

CSE 374 Programming Concepts & Tools. Brandon Myers Winter 2015 Lecture 11 gdb and Debugging (Thanks to Hal Perkins)

IBM PSSC Montpellier Customer Center. Content

OBTAINING AN ACCOUNT:

NBIC TechTrack PBS Tutorial

High Performance Beowulf Cluster Environment User Manual

Profiling and debugging. Carlos Rosales September 18 th 2009 Texas Advanced Computing Center The University of Texas at Austin

CSE 374 Programming Concepts & Tools

XSEDE New User Tutorial

Computing with the Moore Cluster

GPU Debugging Made Easy. David Lecomber CTO, Allinea Software

Intel MPI Cluster Edition on Graham A First Look! Doug Roberts

Knights Landing production environment on MARCONI

Debugging and Profiling

COSC 4397 Parallel Computation. Debugging and Performance Analysis of Parallel MPI Applications

Module 3: Working with C/C++

Welcomes PRACE/LinkSCEEM 2011 Winter School Jacques Philouze Vice President Sales & Marketing

To connect to the cluster, simply use a SSH or SFTP client to connect to:

Practical Introduction to Message-Passing Interface (MPI)

HPCC - Hrothgar. Getting Started User Guide TotalView. High Performance Computing Center Texas Tech University

Minnesota Supercomputing Institute Regents of the University of Minnesota. All rights reserved.

Batch Systems. Running your jobs on an HPC machine

Introduction to CINECA HPC Environment

Cluster Clonetroop: HowTo 2014

Debugging with Totalview. Martin Čuma Center for High Performance Computing University of Utah

First steps on using an HPC service ARCHER

Viglen NPACI Rocks. Getting Started and FAQ

We first learn one useful option of gcc. Copy the following C source file to your

Programming Studio #9 ECE 190

GDB Tutorial. A Walkthrough with Examples. CMSC Spring Last modified March 22, GDB Tutorial

Introduction to HPC Numerical libraries on FERMI and PLX

TotalView Release Notes

Hybrid MPI/OpenMP parallelization. Recall: MPI uses processes for parallelism. Each process has its own, separate address space.

Sharpen Exercise: Using HPC resources and running parallel applications

Introduction to CINECA Computer Environment

Preventing and Finding Bugs in Parallel Programs

Introduction to Parallel Programming. Martin Čuma Center for High Performance Computing University of Utah

Running Jobs, Submission Scripts, Modules

Chapter 12 Visual Program Debugger

Guillimin HPC Users Meeting March 17, 2016

Implementation of Parallelization

Quick Start Guide. by Burak Himmetoglu. Supercomputing Consultant. Enterprise Technology Services & Center for Scientific Computing

Running LAMMPS on CC servers at IITM

Debugging. Marcelo Ponce SciNet HPC Consortium University of Toronto. July 15, /41 Ontario HPC Summerschool 2016 Central Edition: Toronto

Debugging for the hybrid-multicore age (A HPC Perspective) David Lecomber CTO, Allinea Software

Sharpen Exercise: Using HPC resources and running parallel applications

Beginner's Guide for UK IBM systems

Intel Manycore Testing Lab (MTL) - Linux Getting Started Guide

Laboratory Assignment #4 Debugging in Eclipse CDT 1

Introduction to Parallel Programming. Martin Čuma Center for High Performance Computing University of Utah

Debugging at Scale Lindon Locks

Tutorial: Compiling, Makefile, Parallel jobs

MUST. MPI Runtime Error Detection Tool

Intro to MS Visual C++ Debugging

PACE. Instructional Cluster Environment (ICE) Orientation. Research Scientist, PACE

Blue Gene/Q User Workshop. Debugging

UF Research Computing: Overview and Running STATA

Transcription:

DDT: A visual, parallel debugger on Ra David M. Larue dlarue@mines.edu High Performance & Research Computing Campus Computing, Communications, and Information Technologies Colorado School of Mines March, 2010 David M. Larue DDT on Ra 1/44

Outline 1 Ancient Medieval Modern 2 3 David M. Larue DDT on Ra 2/44

Programs hang, give incorrect output or exhibit unexpected behavior. Parallel computing is not deterministic. Things can happen in different orders or with different timings. available resources can differ different optimizations are applied job sizes and inputs can exacerbate these issues. These differences can alter outcomes. Hence programs can work some of the time, and fail some of the time. Debugging is needed David M. Larue DDT on Ra 3/44

Outline Ancient Medieval Modern 1 Ancient Medieval Modern 2 3 David M. Larue DDT on Ra 4/44

Old Style Ancient Medieval Modern Insertion of statements that write to a file or standard output help ascertain that the program is doing what it was intended to do. sometimes surprisingly effective often easy to implement However it has serious limitations. slows and interrupts your thoughts requires many recompiles and restarts prevents interactive use David M. Larue DDT on Ra 5/44

Outline Ancient Medieval Modern 1 Ancient Medieval Modern 2 3 David M. Larue DDT on Ra 6/44

Middle Style Ancient Medieval Modern Command-line text-oriented serial tools Example: GDB (the GNU Project debugger) GDB was designed for serial processing It can be started multiple times to inspect and control multiple processes. Many algorithms in use in a parallel environment have their beginnings as serial code correctness there is needed serial debuggers are then immediately useful David M. Larue DDT on Ra 7/44

Outline Ancient Medieval Modern 1 Ancient Medieval Modern 2 3 David M. Larue DDT on Ra 8/44

High Style Ancient Medieval Modern Debuggers with Graphical User Interfaces Support for Parallel Execution Advantages: Intuitive Interactive Manipulate multiple processes as if one entity. Don t forget Profilers David M. Larue DDT on Ra 9/44

DDT Ancient Medieval Modern Product du jour: DDT commercial graphical interface built on top of GDB designed with debugging parallel programs in mind. made by Allinea. Documentation for DDT: http://allinea.com/downloads/userguide.pdf DDT User Guide, Version 2.5.1 Its home on Ra is /lustre/home/apps/ddt/bin/ddt David M. Larue DDT on Ra 10/44

Other notes Ancient Medieval Modern Ra s DDT License 16 processes More information on debugging on Ra http://geco.mines.edu/workshop/class2/03wed/debuggers.pdf Debugging on Ra Timothy H. Kaiser, Ph.D., Revised January 2010 http://geco.mines.edu/guide/errortracking/index.shtml Debugging Programs whilst not bugging yourself Timothy H. Kaiser, Ph.D., January 2010 http://geco.mines.edu/guide/body.shtml#h5450 Ra User s Guide, section on Debugging, including references to workshop resources. David M. Larue DDT on Ra 11/44

Outline 1 Ancient Medieval Modern 2 3 David M. Larue DDT on Ra 12/44

Things to have and do ddt tools that support debugging and ddt compiler mpi library gdb access Ra with X Windows setup ddt compile with debugging support David M. Larue DDT on Ra 13/44

Versions Combinations of Versions various combinations work many do not one will be detailed here OpenMPI 1.3.4 Intel Compilers 11.1 DDT 2.5.1 BASH ssh -X gdb 6.8 Non-standard combination of versions requires extra setup David M. Larue DDT on Ra 14/44

Modes Modes DDT submits a job to the queue and attaches to its processes DDT attaches to the processes of an already running job We will let DDT submit the job need a PBS template script configure DDT to msub David M. Larue DDT on Ra 15/44

Outline 1 Ancient Medieval Modern 2 3 David M. Larue DDT on Ra 16/44

One-Time System Setup Append the following to the end of ~/.bashrc (then source it or logout/login). ### openmpi 1.3.4, intel 11.1 compiler wrapper settings ### ### (LD_LIBRARY_PATH, PATH) ### source /lustre/home/apps/mpi/db/openmpi1.3.4/intel11.1/setup.sh ### ddt 2.5.1 settings ### ### (LD_LIBRARY_PATH, PATH, DMALLOCPATH) ### source /lustre/home/apps/ddt/setup.sh This sets the paths and environment variables that point to the executables and libraries for the specific versions of OpenMPI, the compiler, GDB and DDT. David M. Larue DDT on Ra 17/44

One-Time DDT Setup, Part 1/2 Login to Ra enabling X-Windows From most Unixes, type ssh -X ra. Start DDT for the first time type ddt & this creates a ~/.ddt/ directory where settings are stored. go through the Configuration Wizard. Alternately, start DDT select Session / Options... or Session / Configuration Wizard... David M. Larue DDT on Ra 18/44

One-Time DDT Setup, Part 2/2 Options for DDT: System Settings MPI Implementation: OpenMPI (Compatibility) Debugger: GNU (gdb) Job Submission Settings Submit job through queue: checked Submission template file: /lustre/home/[username]/workshop/ra.qtf Submit command: msub Regexpr for job id: (\d+) Cancel command: qdel JOB_ID_TAG Display command: qstat Template Uses: NUM_NODES_TAG, PROCE_PER_NODE_TAG: selected PROCS_PER_NODE_TAG: 8 (or less...) (Version 11 of the Intel Compilers requires selecting the OpenMPI (Compatibility) MPI Implementation.) David M. Larue DDT on Ra 19/44

One-Time Submission template file setup, Part 1/4 A simple PBS script prog.pbs to submit a job on Ra reads #!/bin/bash #PBS -l nodes=2:ppn=8 #PBS -l walltime=00:010:00 #PBS -N testddt #PBS -o $PBS_JOBID.out.pbs #PBS -e $PBS_JOBID.err.pbs #PBS -r n #PBS -V #----------------------------------------------------- cd $PBS_O_WORKDIR mpirun -np 16./prog One submits this by executing msub prog.pbs David M. Larue DDT on Ra 20/44

One-Time Submission template file setup, Part 2/4 To have DDT submit the job, we need a template ra.qtf for the PBS file with named tags that DDT will fill in. #!/bin/bash #PBS -l nodes=num_nodes_tag:ppn=procs_per_node_tag #PBS -l walltime=00:10:00 #PBS -N testddt #PBS -o $PBS_JOBID.out.pbs #PBS -e $PBS_JOBID.err.pbs #PBS -r n #PBS -V #----------------------------------------------------- cd $PBS_O_WORKDIR MPIRUN_TAG \ AUTO_MPI_ARGUMENTS_TAG EXTRA_MPI_ARGUMENTS_TAG \ DDTPATH_TAG/bin/ddt-client \ DDT_DEBUGGER_ARGUMENTS_TAG \ PROGRAM_TAG \ PROGRAM_ARGUMENTS_TAG David M. Larue DDT on Ra 21/44

One-Time Submission template file setup, Part 3/4 Here are what these tags might evaluate to: PROGRAM_TAG /lustre/home/[username]/workshop/prog PROGRAM_ARGUMENTS_TAG NUM_NODES_TAG 2 PROCS_PER_NODE_TAG 8 DDT_DEBUGGER_ARGUMENTS_TAG --ddthost ra.mines.edu --ddtport 4242 --ddtsession 1 --ddtsessionfile /lustre/home/[username]/.ddt/session/ra.mines.edu-1 MPIRUN_TAG mpirun AUTO_MPI_ARGUMENTS_TAG -np 16 EXTRA_MPI_ARGUMENTS_tag DDTPATH_TAG /lustre/home/apps/ddt2.5.1 Other tags are pre-defined, and user-defined tags can be made. David M. Larue DDT on Ra 22/44

One-Time Submission template file setup, Part 4/4 Takeaway: create a submission template file name and place it as specified in the DDT options example: /lustre/home/[username]/workshop/ra.qtf David M. Larue DDT on Ra 23/44

Outline 1 Ancient Medieval Modern 2 3 David M. Larue DDT on Ra 24/44

Compile with the debug flag Using the appropriate MPI Compiler Wrapper for the Intel Compilers ifort icc compile your program with the debug flag, -g: mpif90 -g -o prog prog.f90 mpicc -g -o prog prog.c Probably you want to compile without optimizations, as these make the connection between your source code and the running code more tenuous. (A warning about feupdateenv may be harmless, but can be removed by adding the flag -shared-intel.) David M. Larue DDT on Ra 25/44

Start DDT, Submit Job Login to Ra with X-Windows enabled, ssh -X ra. Change to the program directory, cd ~/workshop Start DDT, ddt & Prepare to run a program From the Welcome screen menu, select Run and Debug a Program From the main menu, select Session / New Session... / Run... At the DDT - Run (queue submission mode) window: Set Application to your executable, say /lustre/home/[username]/workshop/prog Set Arguments as needed Verify Number of Nodes (1 or 2, say) Press Submit David M. Larue DDT on Ra 26/44

Wait for Queue, DDT DDT will submit your job to the queue DDT displays the result of qstat, highlighting your Job ID You wait until your job runs DDT connects to the processes your job has started Your processes have been paused after MPI_Init Sample screen with successful connection follows David M. Larue DDT on Ra 27/44

Successful Submission, Run and Connection David M. Larue DDT on Ra 28/44

Outline 1 Ancient Medieval Modern 2 3 David M. Larue DDT on Ra 29/44

Serial Bugs, 1/2 Parallel programs suffer from all the same flaws as serial programs: out of bound memory access division by 0 and other improper arguments to functions overflow or underflow syntactical and semantical errors other algorithm flaws David M. Larue DDT on Ra 30/44

Serial Bugs, 2/2 All the usual debugging techniques for serial programs are available here in DDT: one can set break points examine or change the values of variables step over or into statements and function calls One can do these things in individual processes, or keep all the processes in sync. David M. Larue DDT on Ra 31/44

Parallel Bugs Three often cited classes of new types of bugs in parallel code: Data Race Conditions Deadlock LiveLock David M. Larue DDT on Ra 32/44

Data Race Conditions Shared Memory (OpenMP) one process can write to memory at the same time another process can read it inadequate synchronization methods are present the result depends on the sequence of actions We are focusing here on message passing and MPI, this is not where we dwell, but mention that DDT has debugging support for OpenMP threads memory Debuggers change timing, so these can be very difficicult to locate. David M. Larue DDT on Ra 33/44

Deadlock Shared Memory (OpenMP) and Message Passing (OpenMPI) multiple blocked processes are waiting on each other for release of locks for messages or terminating prematurely circular locking: A waits for B waits for C waits for A, and the like. protocol mismatch: A expecting from B, B not expecting to send, or sends with wrong tag David M. Larue DDT on Ra 34/44

Livelock Shared Memory (OpenMP) and Message Passing (OpenMPI) multiple non-blocked processes are repeatedly waiting on each other timing coincidence keeps from resolution This can manifest rarely and occasionaly, without a blocked process, and so be hard to find. David M. Larue DDT on Ra 35/44

Outline 1 Ancient Medieval Modern 2 3 David M. Larue DDT on Ra 36/44

: User Interface User Interface GUI Tabbed documents Dockable windows Save/Restore preferences Search/Go to/highlight current/view Multiple source code Control methods Icons/Buttons Menus Function keys Mouse clicks, double-clicks, right-clicks David M. Larue DDT on Ra 37/44

: Groups Groups Default All Root Workers Individual Processes Can add/edit/delete groups Current Group Affects Which Processes you start/stop/step Which Variables displayed Which Code and Line, and which Stack, is visible David M. Larue DDT on Ra 38/44

: Controlling Your Processes Controlling Your Processes Advancing by a line (Stepping over) Stepping into a function Stepping out of a function Following if branches Pausing at breakpoints Manually pausing and resuming Control individually, or by group David M. Larue DDT on Ra 39/44

: Viewing and Editing Data Viewing and editing data Current line Local data Keeping an expression in view: Evaluate Watch: break on change Editing data Structured data Examine pointers Multi-dimensional arrays Visualizations, Statistics View Message Queues Memory debugging Intercept [de]-allocations Statistics David M. Larue DDT on Ra 40/44

Outline 1 Ancient Medieval Modern 2 3 David M. Larue DDT on Ra 41/44

Time and Technology permitting, let s try a sample run David M. Larue DDT on Ra 42/44

We have looked at Why debug? Different debugging technologies A modern product available on Ra: DDT GUI Parallel Multiple modes Setup One time: configure tools, template PBS script Bugs Serial : compile with debugging, use X Parallel: Race Conditions, Deadlock DDT User Interface Process Control Data and Memory Control David M. Larue DDT on Ra 43/44

The End Questions? David M. Larue DDT on Ra 44/44