Center for Information Services and High Performance Computing (ZIH) Welcome HRSK Practical on Debugging, 03.04.2009 Zellescher Weg 12 Willers-Bau A106 Tel. +49 351-463 - 31945 Matthias Lieber (matthias.lieber@tu-dresden.de) Tobias Hilbrich (tobias.hilbrich@zih.tu-dresden.de)
Agenda 08:00 Introduction 08:30 Typical Bugs and how to avoid them 09:00 The Heat Equation Example 09:30 COFFEE BREAK 09:45 First Steps with GDB 10:30 DDT and Totalview I 11:30 LUNCH BREAK 12:30 DDT and Totalview II 13:30 Memory Debugging 14:00 COFFEE BREAK 14:15 Debugging Multithreaded Applications 15:00 Debugging MPI Applications 16:00 THE END Introduction 2
Center for Information Services and High Performance Computing (ZIH) Introduction HRSK Practical on Debugging, 03.04.2009 Zellescher Weg 12 Willers-Bau A106 Tel. +49 351-463 - 31945 Matthias Lieber (matthias.lieber@tu-dresden.de) Tobias Hilbrich (tobias.hilbrich@zih.tu-dresden.de)
Why using a Debugger? Your program terminates abnormally produces wrong results shows incomprehensible behavior You want to know what your program is (really) doing Typical example: your program crashes with a segmentation fault % gcc myprog.c o myprog %./myprog Segmentation fault % What s going wrong? Introduction 4
What can a Debugger do? Observe a running program: Print variables (scalars, arrays, structures / derived types, classes) Inform about current source code line and function (function call stack) Control the program execution: Stop the program at a specific source code line (Breakpoints) Stop the program by evaluating variable expressions (Conditional Breakpoints and Watchpoints) Stop the program before terminating abnormally Execute the program line-by-line (Stepping) Introduction 5
Function call stack program main... call print_date... end program main subroutine print_date... call date_and_time... end subroutine print_date calls calls Function call stack: information of all active functions of a process which have been called but not yet finished. Stack: #0: date_and_time #1: print_date #2: main subroutine date_and_time... program is here Introduction 6
Typical Usage of a Debugger Compile the program with the g compiler flag gcc g myprog.c o myprog Run the program under control of the debugger: ddt./myprog Locate the position of the problem and examine variables Find out why the program shows unexpected behavior Edit the source code, modify parameters, etc. Repeat until problem is solved Introduction 7
Advanced Debugger Features Support for parallel applications, i.e. observe N processes at a time Graphical display of type structures and pointers Visualization of array contents Memory debugging Modify variables Reverse stepping Introduction 8
Debugging Tools at HRSK vendor / name command graphical user interface parallel support Mars Deimos GNU Debugger gdb Threads GNU Data Display Debugger ddd Threads Allinea Distributed Debugging Tool ddt Totalview totalview Threads, MPI Threads, MPI Valgrind valgrind Threads DUMA duma (yes) Introduction 9
Example: Allinea DDT (Distributed Debugging Tool) Introduction 10
Example: Totalview Introduction 11
Example: GNU Debugger (gdb) Introduction 12
Example: GNU DDD (GUI for GDB) Introduction 13
Debugger Usage at HRSK On login CPUs (not recommended) Works for short runs only (10 minutes runtime limit) As batch job Start your program like normal batch job, but under control of the debugger Closing the debugger finishes the batch job new job necessary for next debugging run From interactive batch session (recommended) Start a shell on compute CPUs and start program from there Very useful if you plan to do multiple debugging sessions: consecutive program runs in one batch job In the interactive batch session you can work like on the login CPUs but with your own private CPUs Introduction 14
DDT Usage at HRSK Compile your code with the g compiler flag (common across all compilers) icc g O0 myprog.c o myprog Set the environment for DDT module load ddt Start the program under control of DDT on the login CPUs (short runs only): ddt./myprog as batch job: bsub W 2:00 n 1 ddt./myprog from interactive batch session: bsub W 2:00 n 1 Is bash ddt./myprog Introduction 15
Debugger Operation Modes: Start program under Debugger Most common way to use a debugger Fully interactive Not useful if you want to observe what the program does after a long runtime Introduction 16
Debugger Operation Modes: Attach to a running Program Debuggers can attach to an already running program Useful if program has been running for a long time You observe incomprehensive behavior (hangs, wrong results) Program was not started under debugger You don t want to rerun the program Introduction 17
Debugger Operation Modes: Core File Debugging Core dumps are memory and register state of a crashed program written to disk These files can be analyzed after program termination with a debugger Must be enabled before starting the program ulimit c <corefile size limit> in the bash Shell Useful if you don t expect a crash or don t want to wait until a crash happens (probably after long runtime) Drawback: Program is not alive anymore Only static analysis of program s data Corefile can become huge, especially for parallel production codes Introduction 18
More Tools Memory debugger Detect dynamic memory management bugs Today: Valgrind, DUMA Correctness checking tools for multithreaded applications Detect faulty use of thread parallelization (e.g. deadlocks) Today: Intel Thread Checker Correctness checking tools for MPI applications Detect wrong use of the MPI library Today: MARMOT Introduction 19
Remarks on Compiler Flags Use compiler s check capabilities like -Wall etc. Read compiler s manual: man {gcc ifort pathcc pgf90...} Intel C Compiler: -Wall -Wp64 -Wuninitialized -strict-ansi Intel Fortran Compiler: -warn all -std95 -C -fpe0 -traceback Always compile your application with the g flag, especially during developing and testing Optimizations often interfere with debugging (e.g. functions or variables of interest are optimized away ) If necessary, compile with O0 to disable optimizations Introduction 20