Performance Analysis of Parallel Scientific Applications In Eclipse


1 Performance Analysis of Parallel Scientific Applications In Eclipse EclipseCon 2015 Wyatt Spear, University of Oregon

2 Supercomputing
- Big systems solving big problems
- Performance gains save time and money
- Development historically based in the command line
- Communications infrastructure to transfer data between compute nodes complicates software development
- Principles of parallel computing becoming principles of computing

3 Parallel Tools Platform (PTP)
The Parallel Tools Platform aims to provide a highly integrated environment specifically designed for parallel application development. Features include:
- An integrated development environment (IDE) that supports a wide range of parallel architectures and runtime systems
- A scalable parallel debugger
- Parallel programming tools (MPI, OpenMP, UPC, etc.)
- Support for the integration of parallel tools
- An environment that simplifies the end-user interaction with parallel systems

4 How Eclipse PTP is Used: Editing/Compiling
[Diagram labels: Local Source Code, Remote Source Code]

5 How Eclipse PTP is Used: Launching/Monitoring
[Diagram labels: Source Code, Executable]

6 How Eclipse PTP is Used: Debugging
[Diagram labels: Source Code, Executable]

7 How Eclipse PTP is Used: Performance Tuning
[Diagram labels: Source Code, Executable, Perf. Data]

8 Synchronized Projects
Project types can be:
[Diagram labels: Edit, Search/Index, Navigation; File Service, Index Service, Build Service, Launch Service, Debug Service; Build, Run, Debug; Executable; Compute; local source code copy synchronized with remote source code]

9 PTP/External Tools Framework (formerly Performance Tools Framework)
Goal: Reduce the Eclipse plumbing necessary to integrate tools
- Provide integration for instrumentation, measurement, and analysis for a variety of performance tools
- Dynamic Tool Definitions: Workflows & UI
  - Tools and tool workflows are specified in an XML file
  - Tools are selected and configured in the launch configuration window
  - Output is generated, managed and analyzed as specified in the workflow
  - One-click launch functionality
- Support for development tools such as TAU, PPW and others
- Adding new tools is much easier than developing a full Eclipse plug-in

10 SAX and JAXB Tool Definitions
- Prior implementations of ETFw used a simple SAX-based schema to define tool workflows
- By default, workflows now use the more powerful JAXB schema that defines PTP's resource manager
- Legacy workflows can still be loaded by selecting the SAX parser in PTP options: Window -> Preferences -> Parallel Tools -> External Tools

11 TAU: Tuning and Analysis Utilities
- TAU is a performance evaluation tool
- It supports parallel profiling and tracing
  - Profiling shows you how much (total) time was spent in each routine
  - Tracing shows you when the events take place in each process along a timeline
- TAU uses a package called PDT (Program Database Toolkit) for automatic instrumentation of the source code
- Profiling and tracing can measure time as well as hardware performance counters from your CPU (or GPU!)
- TAU can automatically instrument your source code (routines, loops, I/O, memory, phases, etc.)
- TAU runs on all HPC platforms and it is free (BSD-style license)
- TAU has instrumentation, measurement and analysis tools
  - paraprof is TAU's 3D profile browser

12 TAU Performance System Architecture

13 Direct Observation: Events
Event types
- Interval events (begin/end events)
  - Measures exclusive & inclusive durations between events
  - Metrics monotonically increase
- Atomic events (trigger with data value)
  - Used to capture performance data state
  - Shows extent of variation of triggered values (min/max/mean)
Code events
- Routines, classes, templates
- Statement-level blocks, loops
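
The difference between the two event types can be sketched in a few lines of Python. This is only an illustration and not TAU's API: the class names `AtomicEvent` and `IntervalTimer`, and the injectable `clock` parameter, are assumptions made for this example.

```python
import time


class AtomicEvent:
    """Hypothetical helper: summarizes the values passed to trigger()."""

    def __init__(self, name):
        self.name = name
        self.count = 0
        self.total = 0.0
        self.minimum = None
        self.maximum = None

    def trigger(self, value):
        # Each trigger carries one data value; only summary statistics are kept.
        self.count += 1
        self.total += value
        self.minimum = value if self.minimum is None else min(self.minimum, value)
        self.maximum = value if self.maximum is None else max(self.maximum, value)

    def mean(self):
        return self.total / self.count if self.count else 0.0


class IntervalTimer:
    """Hypothetical helper: accumulates time between paired start()/stop() calls."""

    def __init__(self, name, clock=time.perf_counter):
        self.name = name
        self.clock = clock
        self.calls = 0
        self.elapsed = 0.0
        self._started_at = None

    def start(self):
        self._started_at = self.clock()

    def stop(self):
        # The clock only moves forward, so the accumulated metric never decreases.
        self.elapsed += self.clock() - self._started_at
        self.calls += 1
        self._started_at = None
```

An interval event answers "how long did this region take in total", while an atomic event answers "what values were observed, and how much did they vary", for example the sizes of messages sent.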

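14 Inclusive and Exclusive Profiles
Performance with respect to code regions
- Exclusive measurements cover the region only
- Inclusive measurements include child regions

    void bar(void);

    int foo() {
        int a = 0;
        a = a + 1;   /* counted in foo's exclusive and inclusive duration */
        bar();       /* counted in foo's inclusive duration only */
        a = a + 1;   /* counted in foo's exclusive and inclusive duration */
        return a;
    }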
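
The relationship between the two measurements can be written down directly: a region's exclusive time is its inclusive time minus the inclusive time of its direct children. The sketch below assumes a simple call tree in which each child is called from a single parent; the function name `exclusive_times` and the timing numbers are made up for illustration.

```python
def exclusive_times(inclusive, children):
    """Return exclusive time per region.

    inclusive: dict mapping region name -> inclusive seconds
    children:  dict mapping region name -> list of direct child regions
    """
    result = {}
    for region, total in inclusive.items():
        # Time spent in direct children is not part of the region's own work.
        child_time = sum(inclusive[child] for child in children.get(region, []))
        result[region] = total - child_time
    return result


# Made-up numbers mirroring the slide: foo() calls bar().
inclusive = {"foo": 5.0, "bar": 3.0}
children = {"foo": ["bar"]}
print(exclusive_times(inclusive, children))  # {'foo': 2.0, 'bar': 3.0}
```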

15 Hardware Counters
Hardware performance counters available on most modern microprocessors can provide insight into:
1. Whole program timing
2. Cache behaviors
3. Branch behaviors
4. Memory and resource access patterns
5. Pipeline stalls
6. Floating point efficiency
7. Instructions per cycle
Hardware counter information can be obtained with:
1. Subroutine or basic block resolution
2. Process or thread attribution
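
Raw counter totals usually become useful once they are combined into derived metrics. The following is a minimal sketch, assuming counter totals have already been collected into a dictionary keyed by PAPI preset name; the function name `derived_metrics` and the output keys are assumptions for this example, and which presets are available depends on the processor.

```python
def derived_metrics(counters, seconds):
    """Compute a few derived metrics from raw counter totals.

    counters: dict keyed by PAPI preset name; metrics whose inputs are missing are skipped
    seconds:  wall-clock time of the measured region
    """
    metrics = {}
    instructions = counters.get("PAPI_TOT_INS")
    cycles = counters.get("PAPI_TOT_CYC")
    if instructions is not None and cycles:
        # Instructions per cycle.
        metrics["ipc"] = instructions / cycles
    if "PAPI_FP_INS" in counters and seconds > 0:
        # Millions of floating point instructions per second (instructions, not operations).
        metrics["mfp_ins_per_sec"] = counters["PAPI_FP_INS"] / seconds / 1e6
    if "PAPI_L1_DCM" in counters and instructions:
        # Level 1 data cache misses per thousand instructions.
        metrics["l1_dcm_per_kilo_ins"] = 1000.0 * counters["PAPI_L1_DCM"] / instructions
    return metrics
```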

16 Profiling On The Command Line

    % export TAU_MAKEFILE=<taudir>/<arch>/lib/Makefile.tau-papi-mpi-pdt
    % export TAU_OPTIONS='-optTauSelectFile=select.tau -optVerbose'
    % cat select.tau
    BEGIN_INSTRUMENT_SECTION
    loops routine="#"
    END_INSTRUMENT_SECTION
    % make F90=tau_f90.sh
    (Or edit Makefile and change F90=tau_f90.sh)
    % export TAU_METRICS=TIME:PAPI_FP_INS:PAPI_L1_DCM
    % mpirun -np 8 ./a.out
    % paraprof --pack app.ppk
    Move the app.ppk file to your desktop.
    % paraprof app.ppk

17 PTP TAU plug-ins
- TAU (Tuning and Analysis Utilities)
- First implementation of External Tools Framework (ETFw)
- Eclipse plug-ins wrap TAU functions, make them available from Eclipse
- Full GUI support for the TAU command line interface
- Performance analysis integrated with development environment

18 TAU Integration with PTP
- TAU: Tuning and Analysis Utilities
  - Performance data collection and analysis for HPC codes
  - Numerous features
  - Command line interface
- The TAU Workflow: Instrumentation, Execution, Analysis

19 Selective Instrumentation
- By default TAU provides timing data for each subroutine of your application
- Selective instrumentation allows you to include/exclude code from analysis and control additional analysis features
  - Include/exclude source files or routines
  - Add timers and phases around routines or arbitrary code
  - Instrument loops
  - Note that some instrumentation features require the PDT
- Right click on a source file to see the Selective Instrumentation context menu
- Results in creation of tau.selective
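
Outside Eclipse, a selective instrumentation file is plain text and can be generated by a script. The sketch below renders one from Python lists. The helper `render_select_file` is hypothetical; the section keywords follow the TAU selective instrumentation file format as I understand it (the instrument section matches the command-line slide), so check them against the documentation for your TAU version.

```python
def render_select_file(exclude_routines=(), exclude_files=(), loop_routines=()):
    """Render the text of a TAU selective instrumentation file (hypothetical helper).

    exclude_routines: routine signatures to leave uninstrumented
    exclude_files:    source file names or patterns to leave uninstrumented
    loop_routines:    routine patterns whose loops should be instrumented ("#" is a wildcard)
    """
    lines = []
    if exclude_routines:
        lines.append("BEGIN_EXCLUDE_LIST")
        lines.extend(exclude_routines)
        lines.append("END_EXCLUDE_LIST")
    if exclude_files:
        lines.append("BEGIN_FILE_EXCLUDE_LIST")
        lines.extend(exclude_files)
        lines.append("END_FILE_EXCLUDE_LIST")
    if loop_routines:
        lines.append("BEGIN_INSTRUMENT_SECTION")
        for pattern in loop_routines:
            lines.append('loops routine="%s"' % pattern)
        lines.append("END_INSTRUMENT_SECTION")
    return "\n".join(lines) + "\n"


# Instrument the loops of every routine.
print(render_select_file(loop_routines=["#"]), end="")
```

Writing the returned text to a file such as select.tau and naming that file in the TAU compiler options has the same effect as the file created through the context menu.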

20 Begin Profile Configuration
- The ETFw uses the same run configurations and resource managers as debugging/launching
- Click on the Run menu or the right side of the Profile button
- From the dropdown menu select Profile configurations

21 Select Configuration
- Select an existing launch configuration or create a new one
- The Resource and Application configuration tabs require little or no modification from standard PTP launch
  - Allows selection/creation of remote connection
  - PTP provides a UI for the remote resource manager, e.g. Torque
  - Includes options for configuring remote environment including modules
- Performance Analysis tab is present in the Profile Configurations dialog

22 Select Tool/Workflow
- Select the Performance Analysis tab and choose the TAU tool set in the Select Tool dropdown box
- Other tools may be available, either installed as plug-ins or loaded from workflow definition XML files
- Configuration sub-panes appear depending on the selected tool

23 Select TAU Configuration
- Choose the TAU Makefile tab:
  - All TAU configurations in remote installation are available
  - Check MPI and PDT checkboxes to filter listed makefiles
  - Make your selection in the Select Makefile: dropdown box
- TAU provides individual stub makefiles for each configuration, tailored to the programming paradigm and data being collected.

24 Choose PAPI Hardware Counters
- When a PAPI-enabled TAU configuration is selected the PAPI Counter tool becomes available
  - Select the Select PAPI Counters button to open the tool
  - Open the PRESET subtree
  - Select PAPI_L1_DCM (Data cache misses)
  - Scroll down to select PAPI_FP_INS (Floating point instructions)
  - Invalid selections are automatically excluded
  - Select OK

25 Compiler Options
- TAU Compiler Options
  - Set arguments to TAU compiler scripts
  - Control instrumentation and compilation behavior
    - Verbose shows activity of compiler wrapper
    - KeepFiles retains instrumented source
    - PreProcess handles C-type ifdefs in Fortran
  - Specify use of selective instrumentation

26 Runtime Options
- TAU Runtime options
  - Set environment variables used by TAU
  - Control data collection behavior
    - Verbose provides debugging info
    - Callpath shows call stack placement of events
    - Throttling reduces overhead
    - Tracing generates execution timelines
  - Hover help
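
The runtime options in the dialog correspond to environment variables read by TAU when the instrumented program starts. As a rough sketch of the same configuration done from a script, the function below builds such an environment. The function `tau_runtime_env` and its parameters are hypothetical; the variable names are the ones I believe TAU uses, so verify them against your TAU installation.

```python
import os


def tau_runtime_env(metrics=("TIME",), verbose=False, callpath_depth=0,
                    throttle=True, trace=False, base_env=None):
    """Build an environment dict for running a TAU-instrumented program (hypothetical helper)."""
    env = dict(os.environ if base_env is None else base_env)
    env["TAU_METRICS"] = ":".join(metrics)          # metrics are colon-separated
    env["TAU_VERBOSE"] = "1" if verbose else "0"    # debugging info
    env["TAU_THROTTLE"] = "1" if throttle else "0"  # reduce overhead of short, frequent routines
    env["TAU_TRACE"] = "1" if trace else "0"        # generate execution timelines
    if callpath_depth > 0:
        env["TAU_CALLPATH"] = "1"                   # record call stack placement of events
        env["TAU_CALLPATH_DEPTH"] = str(callpath_depth)
    else:
        env["TAU_CALLPATH"] = "0"
    return env
```

The resulting dictionary can be passed as the `env` argument of `subprocess.run` when launching the application, for instance with the `mpirun` command line used on the cluster.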

27 Working with Profiles
- Profiles are uploaded to selected database
- A text summary may be printed to the console
- Profiles may be uploaded to the TAU Portal for viewing online
  - tau.nic.uoregon.edu
- Profiles may be copied to your workspace and loaded in ParaProf from the command line.

28 Launch TAU Analysis
- Once your TAU launch is configured select Profile
- The project rebuilds on the remote system with TAU compiler commands
- The project will execute normally but TAU profiles will be generated
- TAU profiles will be processed as specified in the launch configuration.
- If you have a local profile database the run will show up in the Performance Data Management view
  - Double click the new entry to view in ParaProf
  - Right click on a function bar and select Show Source Code for source callback to Eclipse

29 ParaProf
- Use ParaProf for profile visualization to identify performance hotspots
  - Inefficient sequential computation
  - Communication overhead
  - IO/Memory bottlenecks
  - Load imbalance
  - Suboptimal cache performance
- Compare multiple trials in PerfExplorer to identify performance regressions and scaling issues

30 Callpath Profile

31 Trace of HMPP SGEMM (CAPS Entreprise)
[Trace timeline rows: Host Process, Transfer Kernel, Compute Kernel]

32 Communication Matrix Display Goal: What is the volume of inter-process communication? Along which calling path?
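
The data behind such a display is a square matrix indexed by sender and receiver rank. A minimal sketch of how it could be accumulated from a list of message records is shown below; the record layout `(sender, receiver, nbytes)` and the function name `communication_matrix` are assumptions for illustration, not a format produced by TAU.

```python
def communication_matrix(messages, num_ranks):
    """Sum the bytes sent between each pair of ranks.

    messages:  iterable of (sender, receiver, nbytes) tuples
    num_ranks: total number of processes
    Returns a num_ranks x num_ranks list of lists; entry [s][r] is bytes sent from s to r.
    """
    matrix = [[0] * num_ranks for _ in range(num_ranks)]
    for sender, receiver, nbytes in messages:
        matrix[sender][receiver] += nbytes
    return matrix


# Made-up traffic between three ranks.
messages = [(0, 1, 4096), (0, 1, 4096), (1, 2, 1024), (2, 0, 512)]
for row in communication_matrix(messages, 3):
    print(row)
```

Grouping the same records by calling path before summing would give one matrix per path, which addresses the second question on the slide.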

33 Performance Regression Testing

34 Evaluate Scalability
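
For a strong-scaling study (fixed problem size, increasing process count), the usual quantities are speedup and parallel efficiency relative to the smallest run. The sketch below computes both; the function name `scaling_table` and the timings are made up for illustration.

```python
def scaling_table(times):
    """Compute speedup and parallel efficiency for a strong-scaling study.

    times: dict mapping process count -> wall-clock seconds
    The run with the fewest processes is used as the baseline.
    Returns a list of (processes, speedup, efficiency) tuples sorted by process count.
    """
    base_procs = min(times)
    base_time = times[base_procs]
    rows = []
    for procs in sorted(times):
        speedup = base_time / times[procs]
        # Efficiency of 1.0 means the extra processes were used perfectly.
        efficiency = speedup * base_procs / procs
        rows.append((procs, speedup, efficiency))
    return rows


# Made-up timings: scaling is ideal up to 2 processes and degrades at 4.
for row in scaling_table({1: 100.0, 2: 50.0, 4: 40.0}):
    print(row)
```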

35 Roofline Analysis
- A Roofline chart represents peak hardware floating point performance and peak memory bandwidth
- The Roofline can be derived from system modeling via benchmarks or taken from system specifications
- The computational intensity of application kernels is plotted on the Roofline to illustrate the difference between observed and achievable performance
- Roofline charts can be applied to system power models as well
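
The roofline itself is the minimum of two limits: the peak floating point rate, and the peak memory bandwidth multiplied by the kernel's computational intensity (floating point operations per byte moved). A minimal sketch follows, with made-up machine numbers and hypothetical function names.

```python
def attainable_gflops(intensity, peak_gflops, peak_bandwidth_gbs):
    """Upper bound on performance for a kernel with the given computational intensity.

    intensity:          floating point operations per byte of memory traffic
    peak_gflops:        peak floating point rate in GFLOP/s
    peak_bandwidth_gbs: peak memory bandwidth in GB/s
    """
    return min(peak_gflops, peak_bandwidth_gbs * intensity)


def ridge_point(peak_gflops, peak_bandwidth_gbs):
    """Intensity above which a kernel is compute-bound rather than memory-bound."""
    return peak_gflops / peak_bandwidth_gbs


# Made-up machine: 100 GFLOP/s peak, 50 GB/s memory bandwidth.
print(attainable_gflops(0.5, 100.0, 50.0))  # 25.0  (memory-bound)
print(attainable_gflops(4.0, 100.0, 50.0))  # 100.0 (compute-bound)
print(ridge_point(100.0, 50.0))             # 2.0
```

The gap between a kernel's measured rate and `attainable_gflops` at its intensity is the difference between observed and achievable performance that the chart illustrates.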

36 Roofline Visualization Toolkit
- Eclipse integration for development platform support
- Roofline charts implemented in JavaFX. Allows for portable, standalone viewer
- Roofline data is stored in JSON files including recorded performance metrics and metadata for ease of searching and comparison between trials, systems and benchmarks.
- Preliminary remote, file-based database for roofline data storage

37 Roofline UI
- Load roofline data from remote site or local disk
- Quick selection from multiple datasets

38 Eclipse Integration
- Select application events in Eclipse source outline
- Display values from TAUdb database on Roofline chart

39 Online Information
- Information about PTP
  - PTP online help
  - Main web site for downloads, documentation, etc.
  - Wiki for designs, planning, meetings, etc.
- Information about Photran
  - Main web site for downloads, documentation, etc.
- Information about TAU

40 PTP Mailing Lists
- User Mailing Lists
  - PTP
  - Photran
  - Major announcements (new releases, etc.) - low volume
- Developer Mailing Lists
  - Developer discussions - higher volume


More information

Eliminate Memory Errors to Improve Program Stability

Eliminate Memory Errors to Improve Program Stability Introduction INTEL PARALLEL STUDIO XE EVALUATION GUIDE This guide will illustrate how Intel Parallel Studio XE memory checking capabilities can find crucial memory defects early in the development cycle.

More information

Profiling and Debugging Tools. Outline

Profiling and Debugging Tools. Outline Profiling and Debugging Tools Karl W. Schulz Texas Advanced Computing Center The University of Texas at Austin UT/Portugal Summer Institute Training Coimbra, Portugal July 17, 2008 Outline General (Analysis

More information

VAMPIR & VAMPIRTRACE INTRODUCTION AND OVERVIEW

VAMPIR & VAMPIRTRACE INTRODUCTION AND OVERVIEW VAMPIR & VAMPIRTRACE INTRODUCTION AND OVERVIEW 8th VI-HPS Tuning Workshop at RWTH Aachen September, 2011 Tobias Hilbrich and Joachim Protze Slides by: Andreas Knüpfer, Jens Doleschal, ZIH, Technische Universität

More information

A Study of High Performance Computing and the Cray SV1 Supercomputer. Michael Sullivan TJHSST Class of 2004

A Study of High Performance Computing and the Cray SV1 Supercomputer. Michael Sullivan TJHSST Class of 2004 A Study of High Performance Computing and the Cray SV1 Supercomputer Michael Sullivan TJHSST Class of 2004 June 2004 0.1 Introduction A supercomputer is a device for turning compute-bound problems into

More information

PERFORMANCE ANALYSIS AND MODELING OF PARALLEL APPLICATIONS IN THE CONTEXT OF ARCHITECTURAL ROOFLINES

PERFORMANCE ANALYSIS AND MODELING OF PARALLEL APPLICATIONS IN THE CONTEXT OF ARCHITECTURAL ROOFLINES PERFORMANCE ANALYSIS AND MODELING OF PARALLEL APPLICATIONS IN THE CONTEXT OF ARCHITECTURAL ROOFLINES by NASHID SHAILA A THESIS Presented to the Department of Computer and Information Science and the Graduate

More information

Tools and techniques for optimization and debugging. Fabio Affinito October 2015

Tools and techniques for optimization and debugging. Fabio Affinito October 2015 Tools and techniques for optimization and debugging Fabio Affinito October 2015 Profiling Why? Parallel or serial codes are usually quite complex and it is difficult to understand what is the most time

More information

DB2 for z/os Stored Procedure support in Data Server Manager

DB2 for z/os Stored Procedure support in Data Server Manager DB2 for z/os Stored Procedure support in Data Server Manager This short tutorial walks you step-by-step, through a scenario where a DB2 for z/os application developer creates a query, explains and tunes

More information

TAU Performance System Hands on session

TAU Performance System Hands on session TAU Performance System Hands on session Sameer Shende sameer@cs.uoregon.edu University of Oregon http://tau.uoregon.edu Copy the workshop tarball! Setup preferred program environment compilers! Default

More information

Agenda. Optimization Notice Copyright 2017, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.

Agenda. Optimization Notice Copyright 2017, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others. Agenda VTune Amplifier XE OpenMP* Analysis: answering on customers questions about performance in the same language a program was written in Concepts, metrics and technology inside VTune Amplifier XE OpenMP

More information

Performance Tuning VTune Performance Analyzer

Performance Tuning VTune Performance Analyzer Performance Tuning VTune Performance Analyzer Paul Petersen, Intel Sept 9, 2005 Copyright 2005 Intel Corporation Performance Tuning Overview Methodology Benchmarking Timing VTune Counter Monitor Call Graph

More information

Profiling and debugging. Carlos Rosales September 18 th 2009 Texas Advanced Computing Center The University of Texas at Austin

Profiling and debugging. Carlos Rosales September 18 th 2009 Texas Advanced Computing Center The University of Texas at Austin Profiling and debugging Carlos Rosales carlos@tacc.utexas.edu September 18 th 2009 Texas Advanced Computing Center The University of Texas at Austin Outline Debugging Profiling GDB DDT Basic use Attaching

More information

GPU Debugging Made Easy. David Lecomber CTO, Allinea Software

GPU Debugging Made Easy. David Lecomber CTO, Allinea Software GPU Debugging Made Easy David Lecomber CTO, Allinea Software david@allinea.com Allinea Software HPC development tools company Leading in HPC software tools market Wide customer base Blue-chip engineering,

More information

Profiling of Data-Parallel Processors

Profiling of Data-Parallel Processors Profiling of Data-Parallel Processors Daniel Kruck 09/02/2014 09/02/2014 Profiling Daniel Kruck 1 / 41 Outline 1 Motivation 2 Background - GPUs 3 Profiler NVIDIA Tools Lynx 4 Optimizations 5 Conclusion

More information

Profiling Applications and Creating Accelerators

Profiling Applications and Creating Accelerators Introduction Program hot-spots that are compute-intensive may be good candidates for hardware acceleration, especially when it is possible to stream data between hardware and the CPU and memory and overlap

More information

Parallel Tools Platform for Judge

Parallel Tools Platform for Judge Parallel Tools Platform for Judge Carsten Karbach, Forschungszentrum Jülich GmbH September 20, 2013 Abstract The Parallel Tools Platform (PTP) represents a development environment for parallel applications.

More information

NEW DEVELOPER TOOLS FEATURES IN CUDA 8.0. Sanjiv Satoor

NEW DEVELOPER TOOLS FEATURES IN CUDA 8.0. Sanjiv Satoor NEW DEVELOPER TOOLS FEATURES IN CUDA 8.0 Sanjiv Satoor CUDA TOOLS 2 NVIDIA NSIGHT Homogeneous application development for CPU+GPU compute platforms CUDA-Aware Editor CUDA Debugger CPU+GPU CUDA Profiler

More information

Tools and Methodology for Ensuring HPC Programs Correctness and Performance. Beau Paisley

Tools and Methodology for Ensuring HPC Programs Correctness and Performance. Beau Paisley Tools and Methodology for Ensuring HPC Programs Correctness and Performance Beau Paisley bpaisley@allinea.com About Allinea Over 15 years of business focused on parallel programming development tools Strong

More information

Portable Power/Performance Benchmarking and Analysis with WattProf

Portable Power/Performance Benchmarking and Analysis with WattProf Portable Power/Performance Benchmarking and Analysis with WattProf Amir Farzad, Boyana Norris University of Oregon Mohammad Rashti RNET Technologies, Inc. Motivation Energy efficiency is becoming increasingly

More information

MPI Performance Engineering through the Integration of MVAPICH and TAU

MPI Performance Engineering through the Integration of MVAPICH and TAU MPI Performance Engineering through the Integration of MVAPICH and TAU Allen D. Malony Department of Computer and Information Science University of Oregon Acknowledgement Research work presented in this

More information

Using Eclipse and the

Using Eclipse and the Developing Scientific Applications Using Eclipse and the Parallel l Tools Platform Greg Watson, IBM g.watson@computer.org Beth Tibbitts, IBM tibbitts@us.ibm.com Jay Alameda, NCSA jalameda@ncsa.uiuc.edu

More information

Cupid Documentation. Release 0.2 (ESMF v7) Rocky Dunlap

Cupid Documentation. Release 0.2 (ESMF v7) Rocky Dunlap Cupid Documentation Release 0.2 (ESMF v7) Rocky Dunlap July 28, 2016 Contents 1 Overview 3 1.1 What is NUOPC?............................................ 3 1.2 What is Eclipse?.............................................

More information

HPC on Windows. Visual Studio 2010 and ISV Software

HPC on Windows. Visual Studio 2010 and ISV Software HPC on Windows Visual Studio 2010 and ISV Software Christian Terboven 19.03.2012 / Aachen, Germany Stand: 16.03.2012 Version 2.3 Rechen- und Kommunikationszentrum (RZ) Agenda

More information

Score-P A Joint Performance Measurement Run-Time Infrastructure for Periscope, Scalasca, TAU, and Vampir

Score-P A Joint Performance Measurement Run-Time Infrastructure for Periscope, Scalasca, TAU, and Vampir Score-P A Joint Performance Measurement Run-Time Infrastructure for Periscope, Scalasca, TAU, and Vampir VI-HPS Team Congratulations!? If you made it this far, you successfully used Score-P to instrument

More information

[Scalasca] Tool Integrations

[Scalasca] Tool Integrations Mitglied der Helmholtz-Gemeinschaft [Scalasca] Tool Integrations Aug 2011 Bernd Mohr CScADS Performance Tools Workshop Lake Tahoe Contents Current integration of various direct measurement tools Paraver

More information

Scientific Software Development with Eclipse

Scientific Software Development with Eclipse Scientific Software Development with Eclipse A Best Practices for HPC Developers Webinar Gregory R. Watson ORNL is managed by UT-Battelle for the US Department of Energy Contents Downloading and Installing

More information

Scalasca support for Intel Xeon Phi. Brian Wylie & Wolfgang Frings Jülich Supercomputing Centre Forschungszentrum Jülich, Germany

Scalasca support for Intel Xeon Phi. Brian Wylie & Wolfgang Frings Jülich Supercomputing Centre Forschungszentrum Jülich, Germany Scalasca support for Intel Xeon Phi Brian Wylie & Wolfgang Frings Jülich Supercomputing Centre Forschungszentrum Jülich, Germany Overview Scalasca performance analysis toolset support for MPI & OpenMP

More information

Windows Embedded Compact Test Kit User Guide

Windows Embedded Compact Test Kit User Guide Windows Embedded Compact Test Kit User Guide Writers: Randy Ocheltree, John Hughes Published: October 2011 Applies To: Windows Embedded Compact 7 Abstract The Windows Embedded Compact Test Kit (CTK) is

More information

Graphics Performance Analyzer for Android

Graphics Performance Analyzer for Android Graphics Performance Analyzer for Android 1 What you will learn from this slide deck Detailed optimization workflow of Graphics Performance Analyzer Android* System Analysis Only Please see subsequent

More information

The PAPI Cross-Platform Interface to Hardware Performance Counters

The PAPI Cross-Platform Interface to Hardware Performance Counters The PAPI Cross-Platform Interface to Hardware Performance Counters Kevin London, Shirley Moore, Philip Mucci, and Keith Seymour University of Tennessee-Knoxville {london, shirley, mucci, seymour}@cs.utk.edu

More information

CUDA Tools for Debugging and Profiling. Jiri Kraus (NVIDIA)

CUDA Tools for Debugging and Profiling. Jiri Kraus (NVIDIA) Mitglied der Helmholtz-Gemeinschaft CUDA Tools for Debugging and Profiling Jiri Kraus (NVIDIA) GPU Programming with CUDA@Jülich Supercomputing Centre Jülich 25-27 April 2016 What you will learn How to

More information

Automatic trace analysis with the Scalasca Trace Tools

Automatic trace analysis with the Scalasca Trace Tools Automatic trace analysis with the Scalasca Trace Tools Ilya Zhukov Jülich Supercomputing Centre Property Automatic trace analysis Idea Automatic search for patterns of inefficient behaviour Classification

More information

COMP Superscalar. COMPSs Tracing Manual

COMP Superscalar. COMPSs Tracing Manual COMP Superscalar COMPSs Tracing Manual Version: 2.4 November 9, 2018 This manual only provides information about the COMPSs tracing system. Specifically, it illustrates how to run COMPSs applications with

More information

Analyzing the Performance of IWAVE on a Cluster using HPCToolkit

Analyzing the Performance of IWAVE on a Cluster using HPCToolkit Analyzing the Performance of IWAVE on a Cluster using HPCToolkit John Mellor-Crummey and Laksono Adhianto Department of Computer Science Rice University {johnmc,laksono}@rice.edu TRIP Meeting March 30,

More information

<Insert Picture Here>

<Insert Picture Here> The Other HPC: Profiling Enterprise-scale Applications Marty Itzkowitz Senior Principal SW Engineer, Oracle marty.itzkowitz@oracle.com Agenda HPC Applications

More information