Welcomes PRACE/LinkSCEEM 2011 Winter School Jacques Philouze Vice President Sales & Marketing

Size: px
Start display at page:

Download "Welcomes PRACE/LinkSCEEM 2011 Winter School Jacques Philouze Vice President Sales & Marketing"

Transcription

1 Welcomes PRACE/LinkSCEEM 2011 Winter School Jacques Philouze Vice President Sales & Marketing

2 Content Company Background Products in more depth Allinea OPT (Optimization and Profiling Tool) Allinea DDT (Distributed Debugging Tool) Presentation Let s practice on some examples

3 New Market Drivers Software has become the #1 roadblock Many applications will need a major redesign - IDC HPC Update, June 2010 Most ISV codes do not scale High programming costs are delaying GPU usage Development tools are a vital part of the solution

4 Allinea Software HPC tools company since 2001 Allinea DDT the scalable parallel debugger Allinea OPT the optimization tool for MPI and non-mpi Allinea DDTLite the parallel debugging plug-in for Microsoft Visual Studio Large European and US customer base 12 of top 20 systems run Allinea DDT in EMEA World's only Petascale debugger! Most scalable and cost effective debugger for Cuda Users debugging regularly at all scales at 1 or 100,000 cores.. but also easy to use on small clusters!

5 Academic Over 200 universities Clients and Partners Major research centres ANL, EPCC, CEA, Genci, Juelich, NERSC, ORNL... Aviation and Defence Airbus, AWE, BAE, Dassault, DLR, EADS... Energy CGG Veritas, IFP, Total... EDA Cadence, Intel, Synopsys Climate and Weather UK Met Office, Meteo France, NOAA

6

7 Optimization & Profiling Tool A powerful Optimization and Profiling Tool Supports MPI profiling Supports Scalar profiling Supports internal sampler, GProf and PAPI Cross-platform support Most flavours of Linux GNU, IBM, Intel, PGI and PathScale compilers x86, x86-64, ia64, PowerPC, Bluegene/P and Cell architectures Most major MPIs OpenMPI, MPICH, MPICH2, Intel, HP, Bull, LAM, SGI and SCALI Support for major scheduling systems

8 OPT Preparation Allinea OPT services need to run on physical server PostgreSQL database Logging server GUI server (client connections) Allinea OPT is a link time library Support for major MPIs or scalar OPT MPI library sits between program and MPI No need to carry out a full recompile, just re-link

9 OPT Launching Jobs Supports both GUI and CLI job launch GUI can be configured per job Launch method local, scheduler Enable sampler Sample interval Snapshot interval #procs stdin/out/err MPI functions CLI business as usual Use scheduler as normal Use your own launch script Uses opt.conf for configuration (user or system)

10 OPT Session Manager Web services interface Allows connectivity to remote GUI server Authentication via database or PAM Local Server requires no additional authentication

11 OPT Session Manager Web services interface Allows connectivity to remote GUI server Authentication via database or PAM Local Server requires no additional authentication

12 OPT Session Manager Web services interface Allows connectivity to remote GUI server Authentication via database or PAM Local Server requires no additional authentication Collect and analyse data from multiple sources Different clusters within environment Different compilers and their options Different MPIs and their options Different number of MPI processes Offers a TRUE Cloud enabled tool

13 OPT Standard Features MPI Timeline view

14 OPT Standard Features Metric Time View

15 OPT Standard Features Metric Statistics

16 OPT Standard Features Histogram

17 OPT Standard Features Job Comparison

18 OPT Bigger Picture FOCUS is the key... Too much visual information can be confusing Good parallel tools should make things simple Target the areas which consume most time! Functions which you spend most run-time in! Allinea OPT embraces this Focused approach See the big picture first Drill down successively for more information.. Don t drown users in too much data Move to a mixture of sampling and selective tracing

19 OPT Call Graph Highlights problem functions Directly linked with the time line Works with all of the following MPI function calls MPI Overview Bytes in/out Computation and communication time Profiling GProf Sampler Snapshot details (hw counters)

20 OPT Call Graph Example Select Communication and Computation time Flat Profile select function with greatest Self time Select Average, Self Seconds and Depth=3 The red cells are areas that are most significant Most of our time is spent in MPI_Alltoall Select Cumulative, see who is calling MPI_Alltoall Change depth to 4 to increase visibility Double click block_mtranspose Zoom in

21 OPT Call Graph Example Select Communication and Computation time Flat Profile select function with greatest Self time Select Average, Self Seconds and Depth=3 The red cells are areas that are most significant Most of our time is spent in MPI_Alltoall Select Cumulative, see who is calling MPI_Alltoall Change depth to 4 to increase visibility

22 OPT Call Graph Example Select Communication and Computation time Flat Profile select function with greatest Self time Select Average, Self Seconds and Depth=3 The red cells are areas that are most significant Most of our time is spent in MPI_Alltoall Select Cumulative, see who is calling MPI_Alltoall Change depth to 4 to increase visibility Double click block_mtranspose We now have bigger picture Far more Focused

23 OPT Call Graph Example Select Communication and Computation time Flat Profile select function with greatest Self time Select Average, Self Seconds and Depth=3 The red cells are areas that are most significant Most of our time is spent in MPI_Alltoall Select Cumulative, see who is calling MPI_Alltoall Change depth to 4 to increase visibility Double click block_mtranspose We now have bigger picture Far more Focused Could we have worked this out from a MPI timeline?

24 OPT Selective Logging Collecting far too much data!! Allinea OPT selective logging Logging is on by default On at MPI_Init Off by MPI_Finalize

25 OPT Selective Logging Collecting far too much data!! Allinea OPT selective logging Logging is on by default On at MPI_Init Off by MPI_Finalize Requirements C/C++ need to replace mpi.h for optmpi.h Turn ON and OFF at will In C/C++ OPT_Start_Logging(); and OPT_Stop_Logging(); In Fortran call OPT_Start_Logging and call OPT_Stop_Logging

26 OPT User Defined Regions Need to highlight a specific area of code!! Allinea OPT has user defined regions Off by default

27 OPT User Defined Regions Need to highlight a specific area of code!! Allinea OPT has user defined regions Off by default Requirements C/C++ need to replace mpi.h for optmpi.h Needs to be turned ON and OFF In C/C++ OPT_Start_Region( My Region ); and OPT_Stop_Region(); In Fortran call OPT_Start_Region('My Region') and call OPT_Stop_Region

28 Try it! Allinea OPT available for download at

29 Debugging at Target-Scale Hybrid code Lindon Locks Pre-sales engineer

30 Background Debugging: A necessary part of development Reproducing and fixing software problems Complexity of scaling and GPU architecture will introduce bugs Debuggers interactively examine processes and data Fastest way to debug with less chance of introducing more bugs Bugs at Target-scale need a debugger at Target-scale until recently debuggers limited to ~4,000-8,000 cores Bugs on GPUs need a debugger for GPUs until recently GPU software couldn't be debugged Allinea DDT is the first graphical debugger to do both

31 Problems at Target-Scale Increasing job sizes leads to unanticipated errors Regular bugs Data issues from larger data sets: i.e. garbage in..., overflow Logic issues and control flow Increasing probability of independent random error Memory errors/exhaustion: Random bugs! System problems: MPI and Operating System Pushing coded boundaries Algorithmic (performance) Hard-wired limits ( magic numbers ) Unknown unknowns

32 Bug Fixing Strategies: Part l Improved coding standards Unit tests, assertions and consistency checks Good practice: But tend to be single-process checks Parallel checks also valid and good practice Only checks for things you predict when developed Coverage is rarely perfect Unexpected problems, particularly random/system issues Often missed Debugger still required Combines well with debuggers Find why a failure occurs not just a pass/fail

33 Bug Fixing Strategies: Part ll Logging: printf and write The oldest debugger still in active use Tried and tested - as easy as hello world If you have good intuition into the problem Edit code, insert print, recompile and re-run Slow and iterative Use to log exceptions, progress or state Post-mortem analysis only Hard to establish real causal order of output of multiple processes Output can be lost by process termination Rapid growth in log output size Unscalable

34 Bug Fixing Strategies: Part lll Reproduce at a smaller scale Attempt to make problem happen on fewer nodes Often requires reduced data set the large one may not fit Didn't you already try the code at small scale? Smaller data set may not trigger the problem Does the bug even exist on smaller problems? Is it a system issue i.e. an MPI problem? Is probability stacking up against you? Example: 1 in 10,000 independent probability of error? Unlikely to spot on smaller runs without many many runs But near guaranteed to see it on a 10,000 core run What can a parallel debugger do to help? Debug at the scale of the problem - Now

35 Use a Parallel Debugger Many benefits to graphical parallel debuggers Large feature sets for common bugs Richness of user interface and real control of processes Historically all parallel debuggers hit scale problems Bottleneck at the frontend: Direct GUI nodes architectures Linear performance in number of processes Human factors limit mouse fatigue and brain overload Are tools ready for the task? Allinea DDT has changed the game

36 Distributed Debugging Tool Allinea DDT a mature and highly intuitive software tool Traditional focus has been HPC market Additional Client-Server capabilities Developing new features(multicore, GP-GPU) Cross-platform support Linux, AIX, Blue Gene/P, Super-UX, CLE GNU, Absoft, IBM, Intel, PGI, PathScale, Sun compilers x86, x86-64, ia64, PowerPC, UltraSparc, Cray XT4/5/6, NEC, NVIDIA CUDA architectures Across all MPIs OpenMP and Threads All major scheduling systems (PBS, LSF, SGE)

37 DDT : Basic Principles Sophisticated GUI helps the user to control parallel execution and helps to find and focus on potential problems User controls actions by groups.. Set breakpoints, step in / out, align stacks etc Focus in on individual threads / processes when necessary Create groups both Manually: select processes via drag and drop Automatically: by process stack, values etc

38 Start Debugging Ability to run in numerous modes Advanced interactive launch options or Simple clear interactive launch options

39 Options Select Change options

40 DDT Scheduler Launch Enhanced Queuing Submission Simple mechanism to customise options Submit non-mpi jobs to a queue Set pre-defined variables in submission script System administrator configures Alter the pre-defined variables at run-time User selectable values

41 Features Scalar Features Advanced C++ support including STL, namespaces, virtual functions and templates Advanced Fortran 90, 95 and 2003 support including modules, allocatable data, pointers and derived types

42 Features Scalar Features Advanced C++ support including STL, namespaces, virtual functions and templates Advanced Fortran 90, 95 and 2003 support including modules, allocatable data, pointers and derived types Multithreading & OpenMP features Perform actions individually or collectively

43 Features Scalar Features Advanced C++ support including STL, namespaces, virtual functions and templates Advanced Fortran 90, 95 and 2003 support including modules, allocatable data, pointers and derived types Multithreading & OpenMP features Perform actions individually or collectively MPI features Control processes Individually or by groups Visualize message queues

44 Breakpoints / Watchpoints Quickly and easily control you debug session Breakpoints Double click to add them For all processes, groups or a single process Enable/disable from window panel Add a condition Default settings (enable/disable)

45 Breakpoints / Watchpoints Quickly and easily control you debug session Breakpoints Double click to add them For all processes, groups or a single process Enable/disable from window panel Add a condition Default settings (enable/disable) Watchpoints Select from evaluate, locals or current line Set for a variable or expression Program halts if change occurs Single process only

46 Data Comparison Cross Process/Thread Comparison (CPC) Run from evaluate, locals or current line Variable or expression Create groups from results Visualize 3D OpenGL array viewer

47 Data Comparison Cross Process/Thread Comparison (CPC) Run from evaluate, locals or current line Variable or expression Create groups from results Visualize

48 Data Comparison Cross Process/Thread Comparison (CPC) Run from evaluate, locals or current line Variable or expression Create groups from results Visualize Multi Dimensional Array viewer 3D OpenGL array viewer Export csv file for post analysis

49 Memory Debugging Find memory leaks Or stop on read/write beyond end of array:

50 Scalable Process Control Parallel Stack View Find rogue processes quickly Identify classes of process behaviour Rapid grouping of processes Control Processes by Groups Set breakpoints, step, play, stop Scalable groups view: Compact group display

51 Large Array Support Browse arrays 1, 2, 3 dimensions Table view Filtering Look for an outlying value Export Save to a spreadsheet View arrays from multiple processes Search terabytes for rogue data: in parallel with v3.0

52 Checkpointing GDB Single session, no option to save/restore MPI support No thread support Still in the experimental stage

53 Checkpointing GDB Single session, no option to save/restore MPI support No thread support Still in the experimental stage OpenMPI/BLCR Prepared and ready for official OpenMPI release Ability to save/restore session Also in experimental stage

54 Checkpointing GDB Single session, no option to save/restore MPI support No thread support Still in the experimental stage OpenMPI/BLCR Prepared and ready for official OpenMPI release Ability to save/restore session Also in experimental stage

55 Petascale Debugging DDT is delivering Petascale debugging today 0,12 DDT Performance Collaborations with ORNL on Jaguar Cray XT and CEA 0,1 Logarithmic performance Many operations now faster at 220,000 than previously at 1,000 cores Time (s) 0,08 0,06 0,04 All Step All Breakpoint ~1/10 th of a second to step and gather all stacks at 220,000 cores 0, MPI Processes

56 Debugging GPU Applications

57 CUDA Debugging Options Old world printf NVIDIA SDK 3.2 allows this (new) but has limitations Fake it Run the kernel on the host x86_64 processor Languages often support targeting host CPU instead of GPU Different numeric precision different answer? Different scheduling different answer? A reasonable option for some bugs Or run on the GPU with Allinea DDT...

58 Debugging Kernels Debugging CPU and GPU concurrently Browse source, examine variables, control processes and threads Set breakpoints Or automatically stop on kernel launch Stop at a line of CUDA code Kernels stop when breakpoint reached Hover the mouse for more information Step a warp

59 Examine Thread Data At a glance display of variables Expressions, local variables and current line Also possible to edit values Displays the memory types Shared, parameter, constant, register

60 Kernel Scale Made Easy View all threads in parallel stack view At a glance, see all GPU and CPU threads together Links with thread selection Pick a tree node to select one of the CUDA threads at that location Full MPI support See GPU and CPU threads from multiple nodes

61 Current Limitations Mix of SDK and driver limitations Only one warp can be stepped per GPU at any time Rapid progress as SDKs are released Strong partnership with NVIDIA and others helping to extend capabilities Other languages and devices HMPP, PGI Accelerators, OpenCL Can only support when the compilers/drivers make it possible...

62 DDT CUDA News SDK 3.2 with DDT Released January 2011 Multi-device support Fermi and Tesla support CUDA Memcheck support for memory errors MPI and CUDA support for GPU clusters Breakpoints, thread control, and data evaluation Stop on kernel launch

63 Hands on session

64 Training at Allinea This workshop is part of Allinea regular training process Webinar Presentation Workshop Lecture Hands on On site Training

65 Expected walkthroughs Getting Started Straightforward Crashes Memory Errors Memory Leaks Incorrect Results GPU Debugging

Debugging at Scale Lindon Locks

Debugging at Scale Lindon Locks Debugging at Scale Lindon Locks llocks@allinea.com Debugging at Scale At scale debugging - from 100 cores to 250,000 Problems faced by developers on real systems Alternative approaches to debugging and

More information

GPU Debugging Made Easy. David Lecomber CTO, Allinea Software

GPU Debugging Made Easy. David Lecomber CTO, Allinea Software GPU Debugging Made Easy David Lecomber CTO, Allinea Software david@allinea.com Allinea Software HPC development tools company Leading in HPC software tools market Wide customer base Blue-chip engineering,

More information

Debugging for the hybrid-multicore age (A HPC Perspective) David Lecomber CTO, Allinea Software

Debugging for the hybrid-multicore age (A HPC Perspective) David Lecomber CTO, Allinea Software Debugging for the hybrid-multicore age (A HPC Perspective) David Lecomber CTO, Allinea Software david@allinea.com Agenda What is HPC? How is scale affecting HPC? Achieving tool scalability Scale in practice

More information

Debugging HPC Applications. David Lecomber CTO, Allinea Software

Debugging HPC Applications. David Lecomber CTO, Allinea Software Debugging HPC Applications David Lecomber CTO, Allinea Software david@allinea.com Agenda Bugs and Debugging Debugging parallel applications Debugging OpenACC and other hybrid codes Debugging for Petascale

More information

Improving the Productivity of Scalable Application Development with TotalView May 18th, 2010

Improving the Productivity of Scalable Application Development with TotalView May 18th, 2010 Improving the Productivity of Scalable Application Development with TotalView May 18th, 2010 Chris Gottbrath Principal Product Manager Rogue Wave Major Product Offerings 2 TotalView Technologies Family

More information

Debugging CUDA Applications with Allinea DDT. Ian Lumb Sr. Systems Engineer, Allinea Software Inc.

Debugging CUDA Applications with Allinea DDT. Ian Lumb Sr. Systems Engineer, Allinea Software Inc. Debugging CUDA Applications with Allinea DDT Ian Lumb Sr. Systems Engineer, Allinea Software Inc. ilumb@allinea.com GTC 2013, San Jose, March 20, 2013 Embracing GPUs GPUs a rival to traditional processors

More information

Understanding Dynamic Parallelism

Understanding Dynamic Parallelism Understanding Dynamic Parallelism Know your code and know yourself Presenter: Mark O Connor, VP Product Management Agenda Introduction and Background Fixing a Dynamic Parallelism Bug Understanding Dynamic

More information

Addressing the Increasing Challenges of Debugging on Accelerated HPC Systems. Ed Hinkel Senior Sales Engineer

Addressing the Increasing Challenges of Debugging on Accelerated HPC Systems. Ed Hinkel Senior Sales Engineer Addressing the Increasing Challenges of Debugging on Accelerated HPC Systems Ed Hinkel Senior Sales Engineer Agenda Overview - Rogue Wave & TotalView GPU Debugging with TotalView Nvdia CUDA Intel Phi 2

More information

Allinea Unified Environment

Allinea Unified Environment Allinea Unified Environment Allinea s unified tools for debugging and profiling HPC Codes Beau Paisley Allinea Software bpaisley@allinea.com 720.583.0380 Today s Challenge Q: What is the impact of current

More information

Developing, Debugging, and Optimizing GPU Codes for High Performance Computing with Allinea Forge

Developing, Debugging, and Optimizing GPU Codes for High Performance Computing with Allinea Forge Developing, Debugging, and Optimizing GPU Codes for High Performance Computing with Allinea Forge Ryan Hulguin Applications Engineer ryan.hulguin@arm.com Agenda Introduction Overview of Allinea Products

More information

Development tools to enable Multicore

Development tools to enable Multicore Development tools to enable Multicore From the desktop to the extreme A perspective on multicore looking in from HPC David Lecomber CTO, Allinea Software david@allinea.com Introduction The Multicore Challenge

More information

DDT: A visual, parallel debugger on Ra

DDT: A visual, parallel debugger on Ra DDT: A visual, parallel debugger on Ra David M. Larue dlarue@mines.edu High Performance & Research Computing Campus Computing, Communications, and Information Technologies Colorado School of Mines March,

More information

ECMWF Workshop on High Performance Computing in Meteorology. 3 rd November Dean Stewart

ECMWF Workshop on High Performance Computing in Meteorology. 3 rd November Dean Stewart ECMWF Workshop on High Performance Computing in Meteorology 3 rd November 2010 Dean Stewart Agenda Company Overview Rogue Wave Product Overview IMSL Fortran TotalView Debugger Acumem ThreadSpotter 1 Copyright

More information

Development Tools for Parallel Computing. David Lecomber CTO, Allinea Software

Development Tools for Parallel Computing. David Lecomber CTO, Allinea Software Development Tools for Parallel Computing David Lecomber CTO, Allinea Software david@allinea.com Agenda Introduction What is HPC Bugs and Debugging Debugging parallel applications Challenges for the future

More information

Tools and Methodology for Ensuring HPC Programs Correctness and Performance. Beau Paisley

Tools and Methodology for Ensuring HPC Programs Correctness and Performance. Beau Paisley Tools and Methodology for Ensuring HPC Programs Correctness and Performance Beau Paisley bpaisley@allinea.com About Allinea Over 15 years of business focused on parallel programming development tools Strong

More information

Parallel Debugging with TotalView BSC-CNS

Parallel Debugging with TotalView BSC-CNS Parallel Debugging with TotalView BSC-CNS AGENDA What debugging means? Debugging Tools in the RES Allinea DDT as alternative (RogueWave Software) What is TotalView Compiling Your Program Starting totalview

More information

Introduction to debugging. Martin Čuma Center for High Performance Computing University of Utah

Introduction to debugging. Martin Čuma Center for High Performance Computing University of Utah Introduction to debugging Martin Čuma Center for High Performance Computing University of Utah m.cuma@utah.edu Overview Program errors Simple debugging Graphical debugging DDT and Totalview Intel tools

More information

Scalable Debugging with TotalView on Blue Gene. John DelSignore, CTO TotalView Technologies

Scalable Debugging with TotalView on Blue Gene. John DelSignore, CTO TotalView Technologies Scalable Debugging with TotalView on Blue Gene John DelSignore, CTO TotalView Technologies Agenda TotalView on Blue Gene A little history Current status Recent TotalView improvements ReplayEngine (reverse

More information

TotalView. Debugging Tool Presentation. Josip Jakić

TotalView. Debugging Tool Presentation. Josip Jakić TotalView Debugging Tool Presentation Josip Jakić josipjakic@ipb.ac.rs Agenda Introduction Getting started with TotalView Primary windows Basic functions Further functions Debugging parallel programs Topics

More information

IBM High Performance Computing Toolkit

IBM High Performance Computing Toolkit IBM High Performance Computing Toolkit Pidad D'Souza (pidsouza@in.ibm.com) IBM, India Software Labs Top 500 : Application areas (November 2011) Systems Performance Source : http://www.top500.org/charts/list/34/apparea

More information

Parallel Programming and Debugging with CUDA C. Geoff Gerfin Sr. System Software Engineer

Parallel Programming and Debugging with CUDA C. Geoff Gerfin Sr. System Software Engineer Parallel Programming and Debugging with CUDA C Geoff Gerfin Sr. System Software Engineer CUDA - NVIDIA s Architecture for GPU Computing Broad Adoption Over 250M installed CUDA-enabled GPUs GPU Computing

More information

Debugging with GDB and DDT

Debugging with GDB and DDT Debugging with GDB and DDT Ramses van Zon SciNet HPC Consortium University of Toronto June 13, 2014 1/41 Ontario HPC Summerschool 2014 Central Edition: Toronto Outline Debugging Basics Debugging with the

More information

Debugging with GDB and DDT

Debugging with GDB and DDT Debugging with GDB and DDT Ramses van Zon SciNet HPC Consortium University of Toronto June 28, 2012 1/41 Ontario HPC Summerschool 2012 Central Edition: Toronto Outline Debugging Basics Debugging with the

More information

NightStar. NightView Source Level Debugger. Real-Time Linux Debugging and Analysis Tools BROCHURE

NightStar. NightView Source Level Debugger. Real-Time Linux Debugging and Analysis Tools BROCHURE NightStar Real-Time Linux Debugging and Analysis Tools Concurrent s NightStar is a powerful, integrated tool set for debugging and analyzing time-critical Linux applications. NightStar tools run with minimal

More information

Profiling and debugging. Carlos Rosales September 18 th 2009 Texas Advanced Computing Center The University of Texas at Austin

Profiling and debugging. Carlos Rosales September 18 th 2009 Texas Advanced Computing Center The University of Texas at Austin Profiling and debugging Carlos Rosales carlos@tacc.utexas.edu September 18 th 2009 Texas Advanced Computing Center The University of Texas at Austin Outline Debugging Profiling GDB DDT Basic use Attaching

More information

STARTING THE DDT DEBUGGER ON MIO, AUN, & MC2. (Mouse over to the left to see thumbnails of all of the slides)

STARTING THE DDT DEBUGGER ON MIO, AUN, & MC2. (Mouse over to the left to see thumbnails of all of the slides) STARTING THE DDT DEBUGGER ON MIO, AUN, & MC2 (Mouse over to the left to see thumbnails of all of the slides) ALLINEA DDT Allinea DDT is a powerful, easy-to-use graphical debugger capable of debugging a

More information

Debugging with TotalView

Debugging with TotalView Debugging with TotalView Le Yan HPC Consultant User Services Goals Learn how to start TotalView on Linux clusters Get familiar with TotalView graphic user interface Learn basic debugging functions of TotalView

More information

OPT User Guide. Version 1.4

OPT User Guide. Version 1.4 Version 1.4 Table of Contents 1 Introduction...1 1.1 Reading this User Guide...2 2 Preparing Your Application...4 2.1 MPI Profiling...4 2.1.1 Supported MPI Libraries...4 2.1.2 Compiling For Shared MPI

More information

Tesla GPU Computing A Revolution in High Performance Computing

Tesla GPU Computing A Revolution in High Performance Computing Tesla GPU Computing A Revolution in High Performance Computing Gernot Ziegler, Developer Technology (Compute) (Material by Thomas Bradley) Agenda Tesla GPU Computing CUDA Fermi What is GPU Computing? Introduction

More information

Accelerate HPC Development with Allinea Performance Tools

Accelerate HPC Development with Allinea Performance Tools Accelerate HPC Development with Allinea Performance Tools 19 April 2016 VI-HPS, LRZ Florent Lebeau / Ryan Hulguin flebeau@allinea.com / rhulguin@allinea.com Agenda 09:00 09:15 Introduction 09:15 09:45

More information

Performance Tools for Technical Computing

Performance Tools for Technical Computing Christian Terboven terboven@rz.rwth-aachen.de Center for Computing and Communication RWTH Aachen University Intel Software Conference 2010 April 13th, Barcelona, Spain Agenda o Motivation and Methodology

More information

Introduction to the SHARCNET Environment May-25 Pre-(summer)school webinar Speaker: Alex Razoumov University of Ontario Institute of Technology

Introduction to the SHARCNET Environment May-25 Pre-(summer)school webinar Speaker: Alex Razoumov University of Ontario Institute of Technology Introduction to the SHARCNET Environment 2010-May-25 Pre-(summer)school webinar Speaker: Alex Razoumov University of Ontario Institute of Technology available hardware and software resources our web portal

More information

OpenACC Course. Office Hour #2 Q&A

OpenACC Course. Office Hour #2 Q&A OpenACC Course Office Hour #2 Q&A Q1: How many threads does each GPU core have? A: GPU cores execute arithmetic instructions. Each core can execute one single precision floating point instruction per cycle

More information

[Scalasca] Tool Integrations

[Scalasca] Tool Integrations Mitglied der Helmholtz-Gemeinschaft [Scalasca] Tool Integrations Aug 2011 Bernd Mohr CScADS Performance Tools Workshop Lake Tahoe Contents Current integration of various direct measurement tools Paraver

More information

Debugging and Optimizing Programs Accelerated with Intel Xeon Phi Coprocessors

Debugging and Optimizing Programs Accelerated with Intel Xeon Phi Coprocessors Debugging and Optimizing Programs Accelerated with Intel Xeon Phi Coprocessors Chris Gottbrath Rogue Wave Software Boulder, CO Chris.Gottbrath@roguewave.com Abstract Intel Xeon Phi coprocessors present

More information

Debugging, benchmarking, tuning i.e. software development tools. Martin Čuma Center for High Performance Computing University of Utah

Debugging, benchmarking, tuning i.e. software development tools. Martin Čuma Center for High Performance Computing University of Utah Debugging, benchmarking, tuning i.e. software development tools Martin Čuma Center for High Performance Computing University of Utah m.cuma@utah.edu SW development tools Development environments Compilers

More information

DDT Debugging Techniques

DDT Debugging Techniques DDT Debugging Techniques Carlos Rosales carlos@tacc.utexas.edu Scaling to Petascale 2010 July 7, 2010 Debugging Parallel Programs Usual problems Memory access issues Special cases not accounted for in

More information

An Introduction to OpenACC

An Introduction to OpenACC An Introduction to OpenACC Alistair Hart Cray Exascale Research Initiative Europe 3 Timetable Day 1: Wednesday 29th August 2012 13:00 Welcome and overview 13:15 Session 1: An Introduction to OpenACC 13:15

More information

Welcome. HRSK Practical on Debugging, Zellescher Weg 12 Willers-Bau A106 Tel

Welcome. HRSK Practical on Debugging, Zellescher Weg 12 Willers-Bau A106 Tel Center for Information Services and High Performance Computing (ZIH) Welcome HRSK Practical on Debugging, 03.04.2009 Zellescher Weg 12 Willers-Bau A106 Tel. +49 351-463 - 31945 Matthias Lieber (matthias.lieber@tu-dresden.de)

More information

Progress on GPU Parallelization of the NIM Prototype Numerical Weather Prediction Dynamical Core

Progress on GPU Parallelization of the NIM Prototype Numerical Weather Prediction Dynamical Core Progress on GPU Parallelization of the NIM Prototype Numerical Weather Prediction Dynamical Core Tom Henderson NOAA/OAR/ESRL/GSD/ACE Thomas.B.Henderson@noaa.gov Mark Govett, Jacques Middlecoff Paul Madden,

More information

Allinea DDT Debugger. Dan Mazur, McGill HPC March 5,

Allinea DDT Debugger. Dan Mazur, McGill HPC  March 5, Allinea DDT Debugger Dan Mazur, McGill HPC daniel.mazur@mcgill.ca guillimin@calculquebec.ca March 5, 2015 1 Outline Introduction and motivation Guillimin login and DDT configuration Compiling for a debugger

More information

Le Yan Louisiana Optical Network Initiative. 8/3/2009 Scaling to Petascale Virtual Summer School

Le Yan Louisiana Optical Network Initiative. 8/3/2009 Scaling to Petascale Virtual Summer School Parallel Debugging Techniques Le Yan Louisiana Optical Network Initiative 8/3/2009 Scaling to Petascale Virtual Summer School Outline Overview of parallel debugging Challenges Tools Strategies Gtf Get

More information

Debugging Your CUDA Applications With CUDA-GDB

Debugging Your CUDA Applications With CUDA-GDB Debugging Your CUDA Applications With CUDA-GDB Outline Introduction Installation & Usage Program Execution Control Thread Focus Program State Inspection Run-Time Error Detection Tips & Miscellaneous Notes

More information

Compiler support for Fortran 2003 and 2008 standards Ian Chivers & Jane Sleightholme Fortranplus

Compiler support for Fortran 2003 and 2008 standards Ian Chivers & Jane Sleightholme Fortranplus Compiler support for Fortran 2003 and Ian Chivers & Jane Sleightholme Fortranplus ian@fortranplus.co.uk www.fortranplus.co.uk 15 th June 2012 jane@fortranplus.co.uk A bit about us Ian & Jane Worked in

More information

Productive Performance on the Cray XK System Using OpenACC Compilers and Tools

Productive Performance on the Cray XK System Using OpenACC Compilers and Tools Productive Performance on the Cray XK System Using OpenACC Compilers and Tools Luiz DeRose Sr. Principal Engineer Programming Environments Director Cray Inc. 1 The New Generation of Supercomputers Hybrid

More information

RWTH GPU-Cluster. Sandra Wienke March Rechen- und Kommunikationszentrum (RZ) Fotos: Christian Iwainsky

RWTH GPU-Cluster. Sandra Wienke March Rechen- und Kommunikationszentrum (RZ) Fotos: Christian Iwainsky RWTH GPU-Cluster Fotos: Christian Iwainsky Sandra Wienke wienke@rz.rwth-aachen.de March 2012 Rechen- und Kommunikationszentrum (RZ) The GPU-Cluster GPU-Cluster: 57 Nvidia Quadro 6000 (29 nodes) innovative

More information

Portable and Productive Performance with OpenACC Compilers and Tools. Luiz DeRose Sr. Principal Engineer Programming Environments Director Cray Inc.

Portable and Productive Performance with OpenACC Compilers and Tools. Luiz DeRose Sr. Principal Engineer Programming Environments Director Cray Inc. Portable and Productive Performance with OpenACC Compilers and Tools Luiz DeRose Sr. Principal Engineer Programming Environments Director Cray Inc. 1 Cray: Leadership in Computational Research Earth Sciences

More information

Guillimin HPC Users Meeting July 14, 2016

Guillimin HPC Users Meeting July 14, 2016 Guillimin HPC Users Meeting July 14, 2016 guillimin@calculquebec.ca McGill University / Calcul Québec / Compute Canada Montréal, QC Canada Outline Compute Canada News System Status Software Updates Training

More information

Intel Parallel Studio XE 2017 Composer Edition BETA C++ - Debug Solutions Release Notes

Intel Parallel Studio XE 2017 Composer Edition BETA C++ - Debug Solutions Release Notes Developer Zone Intel Parallel Studio XE 2017 Composer Edition BETA C++ - Debug Solutions Release Notes Submitted by Georg Z. (Intel) on August 5, 2016 This page provides the current Release Notes for the

More information

GPU Technology Conference Three Ways to Debug Parallel CUDA Applications: Interactive, Batch, and Corefile

GPU Technology Conference Three Ways to Debug Parallel CUDA Applications: Interactive, Batch, and Corefile GPU Technology Conference 2015 Three Ways to Debug Parallel CUDA Applications: Interactive, Batch, and Corefile Three Ways to Debug Parallel CUDA Applications: Interactive, Batch, and Corefile What do

More information

Using Java for Scientific Computing. Mark Bul EPCC, University of Edinburgh

Using Java for Scientific Computing. Mark Bul EPCC, University of Edinburgh Using Java for Scientific Computing Mark Bul EPCC, University of Edinburgh markb@epcc.ed.ac.uk Java and Scientific Computing? Benefits of Java for Scientific Computing Portability Network centricity Software

More information

DDT User Guide. Version 2.2.1

DDT User Guide. Version 2.2.1 DDT User Guide Version 2.2.1 Contents Contents...1 1 Introduction...5 2 Installation and Configuration...6 2.1 Installation...6 2.1.1 Graphical Install...6 2.1.2 Text-mode Install...8 2.1.3 Licence Files...8

More information

Parallel Debugging. ª Objective. ª Contents. ª Learn the basics of debugging parallel programs

Parallel Debugging. ª Objective. ª Contents. ª Learn the basics of debugging parallel programs ª Objective ª Learn the basics of debugging parallel programs ª Contents ª Launching a debug session ª The Parallel Debug Perspective ª Controlling sets of processes ª Controlling individual processes

More information

Facing the challenges of. New Approaches To Debugging Complex Codes! Ed Hinkel, Sales Engineer Rogue Wave Software

Facing the challenges of. New Approaches To Debugging Complex Codes! Ed Hinkel, Sales Engineer Rogue Wave Software Facing the challenges of or New Approaches To Debugging Complex Codes! Ed Hinkel, Sales Engineer Rogue Wave Software Agenda Introduction Rogue Wave! TotalView! Approaching the Debugging Challenge! 1 TVScript

More information

Hands-on Workshop on How To Debug Codes at the Institute

Hands-on Workshop on How To Debug Codes at the Institute Hands-on Workshop on How To Debug Codes at the Institute H. Birali Runesha, Shuxia Zhang and Ben Lynch (612) 626 0802 (help) help@msi.umn.edu October 13, 2005 Outline Debuggers at the Institute Totalview

More information

The Cray Programming Environment. An Introduction

The Cray Programming Environment. An Introduction The Cray Programming Environment An Introduction Vision Cray systems are designed to be High Productivity as well as High Performance Computers The Cray Programming Environment (PE) provides a simple consistent

More information

Short Introduction to Debugging Tools on the Cray XC40

Short Introduction to Debugging Tools on the Cray XC40 Short Introduction to Debugging Tools on the Cray XC40 Overview Debugging Get your code up and running correctly. Profiling Locate performance bottlenecks. Light weight At most relinking. Get a first picture

More information

Moab Workload Manager on Cray XT3

Moab Workload Manager on Cray XT3 Moab Workload Manager on Cray XT3 presented by Don Maxwell (ORNL) Michael Jackson (Cluster Resources, Inc.) MOAB Workload Manager on Cray XT3 Why MOAB? Requirements Features Support/Futures 2 Why Moab?

More information

Oracle Developer Studio Performance Analyzer

Oracle Developer Studio Performance Analyzer Oracle Developer Studio Performance Analyzer The Oracle Developer Studio Performance Analyzer provides unparalleled insight into the behavior of your application, allowing you to identify bottlenecks and

More information

OpenACC/CUDA/OpenMP... 1 Languages and Libraries... 3 Multi-GPU support... 4 How OpenACC Works... 4

OpenACC/CUDA/OpenMP... 1 Languages and Libraries... 3 Multi-GPU support... 4 How OpenACC Works... 4 OpenACC Course Class #1 Q&A Contents OpenACC/CUDA/OpenMP... 1 Languages and Libraries... 3 Multi-GPU support... 4 How OpenACC Works... 4 OpenACC/CUDA/OpenMP Q: Is OpenACC an NVIDIA standard or is it accepted

More information

CUDA Development Using NVIDIA Nsight, Eclipse Edition. David Goodwin

CUDA Development Using NVIDIA Nsight, Eclipse Edition. David Goodwin CUDA Development Using NVIDIA Nsight, Eclipse Edition David Goodwin NVIDIA Nsight Eclipse Edition CUDA Integrated Development Environment Project Management Edit Build Debug Profile SC'12 2 Powered By

More information

Debugging Programs Accelerated with Intel Xeon Phi Coprocessors

Debugging Programs Accelerated with Intel Xeon Phi Coprocessors Debugging Programs Accelerated with Intel Xeon Phi Coprocessors A White Paper by Rogue Wave Software. Rogue Wave Software 5500 Flatiron Parkway, Suite 200 Boulder, CO 80301, USA www.roguewave.com Debugging

More information

Automatic Tuning of HPC Applications with Periscope. Michael Gerndt, Michael Firbach, Isaias Compres Technische Universität München

Automatic Tuning of HPC Applications with Periscope. Michael Gerndt, Michael Firbach, Isaias Compres Technische Universität München Automatic Tuning of HPC Applications with Periscope Michael Gerndt, Michael Firbach, Isaias Compres Technische Universität München Agenda 15:00 15:30 Introduction to the Periscope Tuning Framework (PTF)

More information

The Distributed Debugging Tool

The Distributed Debugging Tool The Distributed Debugging Tool Userguide - PAGE 1 - - PAGE 2 - Contents Contents... 3 1. Introduction... 7 2. Installation... 8 Licence Files... 8 Floating Licences... 9 Getting Help... 10 3. Starting

More information

Analyzing the Performance of IWAVE on a Cluster using HPCToolkit

Analyzing the Performance of IWAVE on a Cluster using HPCToolkit Analyzing the Performance of IWAVE on a Cluster using HPCToolkit John Mellor-Crummey and Laksono Adhianto Department of Computer Science Rice University {johnmc,laksono}@rice.edu TRIP Meeting March 30,

More information

Debugging on Blue Waters

Debugging on Blue Waters Debugging on Blue Waters Debugging tools and techniques for Blue Waters are described here with example sessions, output, and pointers to small test codes. For tutorial purposes, this material will work

More information

Tool for Analysing and Checking MPI Applications

Tool for Analysing and Checking MPI Applications Tool for Analysing and Checking MPI Applications April 30, 2010 1 CONTENTS CONTENTS Contents 1 Introduction 3 1.1 What is Marmot?........................... 3 1.2 Design of Marmot..........................

More information

ME964 High Performance Computing for Engineering Applications

ME964 High Performance Computing for Engineering Applications ME964 High Performance Computing for Engineering Applications Debugging & Profiling CUDA Programs February 28, 2012 Dan Negrut, 2012 ME964 UW-Madison Everyone knows that debugging is twice as hard as writing

More information

Intel Parallel Studio XE 2015

Intel Parallel Studio XE 2015 2015 Create faster code faster with this comprehensive parallel software development suite. Faster code: Boost applications performance that scales on today s and next-gen processors Create code faster:

More information

Using Intel VTune Amplifier XE and Inspector XE in.net environment

Using Intel VTune Amplifier XE and Inspector XE in.net environment Using Intel VTune Amplifier XE and Inspector XE in.net environment Levent Akyil Technical Computing, Analyzers and Runtime Software and Services group 1 Refresher - Intel VTune Amplifier XE Intel Inspector

More information

Cray RS Programming Environment

Cray RS Programming Environment Cray RS Programming Environment Gail Alverson Cray Inc. Cray Proprietary Red Storm Red Storm is a supercomputer system leveraging over 10,000 AMD Opteron processors connected by an innovative high speed,

More information

Bright Cluster Manager Advanced HPC cluster management made easy. Martijn de Vries CTO Bright Computing

Bright Cluster Manager Advanced HPC cluster management made easy. Martijn de Vries CTO Bright Computing Bright Cluster Manager Advanced HPC cluster management made easy Martijn de Vries CTO Bright Computing About Bright Computing Bright Computing 1. Develops and supports Bright Cluster Manager for HPC systems

More information

HPC Middle East. KFUPM HPC Workshop April Mohamed Mekias HPC Solutions Consultant. Agenda

HPC Middle East. KFUPM HPC Workshop April Mohamed Mekias HPC Solutions Consultant. Agenda KFUPM HPC Workshop April 29-30 2015 Mohamed Mekias HPC Solutions Consultant Agenda 1 Agenda-Day 1 HPC Overview What is a cluster? Shared v.s. Distributed Parallel v.s. Massively Parallel Interconnects

More information

The GPU-Cluster. Sandra Wienke Rechen- und Kommunikationszentrum (RZ) Fotos: Christian Iwainsky

The GPU-Cluster. Sandra Wienke Rechen- und Kommunikationszentrum (RZ) Fotos: Christian Iwainsky The GPU-Cluster Sandra Wienke wienke@rz.rwth-aachen.de Fotos: Christian Iwainsky Rechen- und Kommunikationszentrum (RZ) The GPU-Cluster GPU-Cluster: 57 Nvidia Quadro 6000 (29 nodes) innovative computer

More information

Systems software design. Software build configurations; Debugging, profiling & Quality Assurance tools

Systems software design. Software build configurations; Debugging, profiling & Quality Assurance tools Systems software design Software build configurations; Debugging, profiling & Quality Assurance tools Who are we? Krzysztof Kąkol Software Developer Jarosław Świniarski Software Developer Presentation

More information

Evolving HPCToolkit John Mellor-Crummey Department of Computer Science Rice University Scalable Tools Workshop 7 August 2017

Evolving HPCToolkit John Mellor-Crummey Department of Computer Science Rice University   Scalable Tools Workshop 7 August 2017 Evolving HPCToolkit John Mellor-Crummey Department of Computer Science Rice University http://hpctoolkit.org Scalable Tools Workshop 7 August 2017 HPCToolkit 1 HPCToolkit Workflow source code compile &

More information

The Titan Tools Experience

The Titan Tools Experience The Titan Tools Experience Michael J. Brim, Ph.D. Computer Science Research, CSMD/NCCS Petascale Tools Workshop 213 Madison, WI July 15, 213 Overview of Titan Cray XK7 18,688+ compute nodes 16-core AMD

More information

HPC on Windows. Visual Studio 2010 and ISV Software

HPC on Windows. Visual Studio 2010 and ISV Software HPC on Windows Visual Studio 2010 and ISV Software Christian Terboven 19.03.2012 / Aachen, Germany Stand: 16.03.2012 Version 2.3 Rechen- und Kommunikationszentrum (RZ) Agenda

More information

CUDA Tools for Debugging and Profiling

CUDA Tools for Debugging and Profiling CUDA Tools for Debugging and Profiling CUDA Course, Jülich Supercomputing Centre Andreas Herten, Forschungszentrum Jülich, 3 August 2016 Overview What you will learn in this session Use cuda-memcheck to

More information

Parallel Programming. Libraries and implementations

Parallel Programming. Libraries and implementations Parallel Programming Libraries and implementations Reusing this material This work is licensed under a Creative Commons Attribution- NonCommercial-ShareAlike 4.0 International License. http://creativecommons.org/licenses/by-nc-sa/4.0/deed.en_us

More information

Enabling the Next Generation of Computational Graphics with NVIDIA Nsight Visual Studio Edition. Jeff Kiel Director, Graphics Developer Tools

Enabling the Next Generation of Computational Graphics with NVIDIA Nsight Visual Studio Edition. Jeff Kiel Director, Graphics Developer Tools Enabling the Next Generation of Computational Graphics with NVIDIA Nsight Visual Studio Edition Jeff Kiel Director, Graphics Developer Tools Computational Graphics Enabled Problem: Complexity of Computation

More information

Performance Analysis of MPI Programs with Vampir and Vampirtrace Bernd Mohr

Performance Analysis of MPI Programs with Vampir and Vampirtrace Bernd Mohr Performance Analysis of MPI Programs with Vampir and Vampirtrace Bernd Mohr Research Centre Juelich (FZJ) John von Neumann Institute of Computing (NIC) Central Institute for Applied Mathematics (ZAM) 52425

More information

CS2141 Software Development using C/C++ Debugging

CS2141 Software Development using C/C++ Debugging CS2141 Software Development using C/C++ Debugging Debugging Tips Examine the most recent change Error likely in, or exposed by, code most recently added Developing code incrementally and testing along

More information

Borland Optimizeit Enterprise Suite 6

Borland Optimizeit Enterprise Suite 6 Borland Optimizeit Enterprise Suite 6 Feature Matrix The table below shows which Optimizeit product components are available in Borland Optimizeit Enterprise Suite and which are available in Borland Optimizeit

More information

Batch Systems & Parallel Application Launchers Running your jobs on an HPC machine

Batch Systems & Parallel Application Launchers Running your jobs on an HPC machine Batch Systems & Parallel Application Launchers Running your jobs on an HPC machine Partners Funding Reusing this material This work is licensed under a Creative Commons Attribution- NonCommercial-ShareAlike

More information

Qualcomm Snapdragon Profiler

Qualcomm Snapdragon Profiler Qualcomm Technologies, Inc. Qualcomm Snapdragon Profiler User Guide September 21, 2018 Qualcomm Snapdragon is a product of Qualcomm Technologies, Inc. Other Qualcomm products referenced herein are products

More information

HPC and IT Issues Session Agenda. Deployment of Simulation (Trends and Issues Impacting IT) Mapping HPC to Performance (Scaling, Technology Advances)

HPC and IT Issues Session Agenda. Deployment of Simulation (Trends and Issues Impacting IT) Mapping HPC to Performance (Scaling, Technology Advances) HPC and IT Issues Session Agenda Deployment of Simulation (Trends and Issues Impacting IT) Discussion Mapping HPC to Performance (Scaling, Technology Advances) Discussion Optimizing IT for Remote Access

More information

Accelerator programming with OpenACC

Accelerator programming with OpenACC ..... Accelerator programming with OpenACC Colaboratorio Nacional de Computación Avanzada Jorge Castro jcastro@cenat.ac.cr 2018. Agenda 1 Introduction 2 OpenACC life cycle 3 Hands on session Profiling

More information

Oracle Developer Studio 12.6

Oracle Developer Studio 12.6 Oracle Developer Studio 12.6 Oracle Developer Studio is the #1 development environment for building C, C++, Fortran and Java applications for Oracle Solaris and Linux operating systems running on premises

More information

Debugging on Intel Platforms

Debugging on Intel Platforms White Paper Robert Mueller-Albrecht Developer Products Division Intel Corporation Debugging on Intel Platforms Introduction...3 Overview...3 Servers and Workstations...4 Support for Linux*, Mac OS X*,

More information

The PAPI Cross-Platform Interface to Hardware Performance Counters

The PAPI Cross-Platform Interface to Hardware Performance Counters The PAPI Cross-Platform Interface to Hardware Performance Counters Kevin London, Shirley Moore, Philip Mucci, and Keith Seymour University of Tennessee-Knoxville {london, shirley, mucci, seymour}@cs.utk.edu

More information

Introduction to Performance Tuning & Optimization Tools

Introduction to Performance Tuning & Optimization Tools Introduction to Performance Tuning & Optimization Tools a[i] a[i+1] + a[i+2] a[i+3] b[i] b[i+1] b[i+2] b[i+3] = a[i]+b[i] a[i+1]+b[i+1] a[i+2]+b[i+2] a[i+3]+b[i+3] Ian A. Cosden, Ph.D. Manager, HPC Software

More information

Coding Tools. (Lectures on High-performance Computing for Economists VI) Jesús Fernández-Villaverde 1 and Pablo Guerrón 2 March 25, 2018

Coding Tools. (Lectures on High-performance Computing for Economists VI) Jesús Fernández-Villaverde 1 and Pablo Guerrón 2 March 25, 2018 Coding Tools (Lectures on High-performance Computing for Economists VI) Jesús Fernández-Villaverde 1 and Pablo Guerrón 2 March 25, 2018 1 University of Pennsylvania 2 Boston College Compilers Compilers

More information

TotalView 2018 Release Notes

TotalView 2018 Release Notes These release notes contain a summary of new features and enhancements, late-breaking product issues, migration from earlier releases, and bug fixes. PLEASE NOTE: The version of this document in the product

More information

Intel Parallel Studio 2011

Intel Parallel Studio 2011 THE ULTIMATE ALL-IN-ONE PERFORMANCE TOOLKIT Studio 2011 Product Brief Studio 2011 Accelerate Development of Reliable, High-Performance Serial and Threaded Applications for Multicore Studio 2011 is a comprehensive

More information

Beyond Hardware IP An overview of Arm development solutions

Beyond Hardware IP An overview of Arm development solutions Beyond Hardware IP An overview of Arm development solutions 2018 Arm Limited Arm Technical Symposia 2018 Advanced first design cost (US$ million) IC design complexity and cost aren t slowing down 542.2

More information

Debugging. John Lockman Texas Advanced Computing Center

Debugging. John Lockman Texas Advanced Computing Center Debugging John Lockman Texas Advanced Computing Center Debugging Outline GDB Basic use Attaching to a running job DDT Identify MPI problems using Message Queues Catch memory errors PTP For the extremely

More information

AMD CodeXL 1.3 GA Release Notes

AMD CodeXL 1.3 GA Release Notes AMD CodeXL 1.3 GA Release Notes Thank you for using CodeXL. We appreciate any feedback you have! Please use the CodeXL Forum to provide your feedback. You can also check out the Getting Started guide on

More information

CRAY XK6 REDEFINING SUPERCOMPUTING. - Sanjana Rakhecha - Nishad Nerurkar

CRAY XK6 REDEFINING SUPERCOMPUTING. - Sanjana Rakhecha - Nishad Nerurkar CRAY XK6 REDEFINING SUPERCOMPUTING - Sanjana Rakhecha - Nishad Nerurkar CONTENTS Introduction History Specifications Cray XK6 Architecture Performance Industry acceptance and applications Summary INTRODUCTION

More information

TotalView. Users Guide. August 2001 Version 5.0

TotalView. Users Guide. August 2001 Version 5.0 TotalView Users Guide August 2001 Version 5.0 Copyright 1999 2001 by Etnus LLC. All rights reserved. Copyright 1998 1999 by Etnus Inc. All rights reserved. Copyright 1996 1998 by Dolphin Interconnect Solutions,

More information