MPI Performance Analysis Trace Analyzer and Collector
1 MPI Performance Analysis: Trace Analyzer and Collector
Berk ONAT, İTÜ Bilişim Enstitüsü, 19 June 2012
2 Outline
o MPI performance analysis
o Definitions: profiling
o Definitions: tracing
o Intel Trace Analyzer
o Lab: how to use ITAC
3 Performance Problems
o Scalability
o Productivity
o Efficiency
o Performance technology
o Application-specific and automatic performance tools
o Cluster analysis
4 Role of the Programmer
How should we write our programs, given that we have a good optimizing compiler?
Write simpler code that is:
o easy to read,
o easy to maintain, and
o easy to check for correctness.
Do:
o Select the best algorithm
o Write code that is readable and maintainable
o Eliminate optimization blockers, so that the compiler can do its job
o Focus on inner loops; use a profiler and an analyzer to find the most time-consuming ones
5 Definitions
Profiling: recording of summary information during execution
o Inclusive and exclusive time, number of calls, hardware statistics (hardware counters)
Reflects the performance behaviour of program entities
o Functions, loops, basic blocks
o User-defined semantic entities
Helps to expose performance bottlenecks and hotspots
Implemented through:
o Sampling: periodic OS interrupts or hardware counter traps
o Instrumentation: direct insertion of measurement code
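The instrumentation approach can be illustrated with a minimal Python sketch. It is not how any particular profiler is implemented: the `Profiler` class, its `instrument` decorator, and the example functions are hypothetical names chosen for illustration. The wrapper is the "inserted measurement code"; it records the number of calls, the inclusive time, and the exclusive time (inclusive time minus time spent in instrumented callees). Recursive functions would have their inclusive time counted more than once in this simplified version.

```python
import functools
import time
from collections import defaultdict


class Profiler:
    """Tiny instrumentation profiler (illustrative sketch only)."""

    def __init__(self, clock=time.perf_counter):
        self.clock = clock
        self.calls = defaultdict(int)
        self.inclusive = defaultdict(float)
        self.exclusive = defaultdict(float)
        # One slot per active instrumented call: time spent in its callees.
        # The first slot is a sentinel for top-level calls.
        self._child_time = [0.0]

    def instrument(self, func):
        """Wrap func with measurement code executed on every call."""
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = self.clock()
            self._child_time.append(0.0)
            try:
                return func(*args, **kwargs)
            finally:
                elapsed = self.clock() - start
                in_children = self._child_time.pop()
                name = func.__name__
                self.calls[name] += 1
                self.inclusive[name] += elapsed
                self.exclusive[name] += elapsed - in_children
                # Report this call's time to the caller's slot.
                self._child_time[-1] += elapsed
        return wrapper


profiler = Profiler()


@profiler.instrument
def inner_loop(n):
    return sum(i * i for i in range(n))


@profiler.instrument
def solve():
    return inner_loop(10_000) + inner_loop(20_000)


solve()
for name in profiler.calls:
    print(name, profiler.calls[name],
          profiler.inclusive[name], profiler.exclusive[name])
```

Sampling would instead interrupt the program periodically and record where it currently is, which costs less per call but yields only statistical estimates.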
6 Definitions
Profile Terminology
Routine: int main()
Inclusive time: 100 secs
Exclusive time: 100 - 20 - 50 - 20 = 10 secs
Number of calls: 1 call
Number of subroutines: child routines called = 3
Inclusive time/call: 100 secs

    int main( )
    { /* takes 100 secs */
        f1(); /* takes 20 secs */
        f2(); /* takes 50 secs */
        f1(); /* takes 20 secs */
        /* other work */
    }
    /* Time can be replaced by counts */
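The arithmetic behind these terms can be checked with a short sketch. The `Call` class is a hypothetical helper, not part of any profiling tool; the numbers are the ones from the main() example on this slide.

```python
from dataclasses import dataclass, field


@dataclass
class Call:
    """One routine invocation with its inclusive time and direct child calls."""
    name: str
    inclusive: float
    children: list = field(default_factory=list)

    def exclusive(self):
        # Exclusive time = inclusive time minus the inclusive time of all children.
        return self.inclusive - sum(child.inclusive for child in self.children)


main = Call("main", 100.0, [Call("f1", 20.0), Call("f2", 50.0), Call("f1", 20.0)])

print(main.exclusive())      # 10.0
print(len(main.children))    # 3 child routine calls
```

The same bookkeeping works when the `inclusive` field holds counts (for example hardware counter values) instead of seconds.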
7 Definitions
Tracing: recording of information about significant points (events) during program execution
o Entering and exiting code regions (function, loop, block, ...)
o Thread and process interactions (e.g., sending or receiving a message)
Information is saved in an event record:
o Timestamp
o CPU identifier, thread identifier
o Event type and event-specific information
An event trace is a time-sequenced stream of event records
o Can be used to reconstruct dynamic program behavior
o Typically requires code instrumentation
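As a rough illustration of what an event record and an event trace might look like, the sketch below defines a hypothetical `Event` record and replays a small hand-written trace to recover how long each process spent in each region. The field names, the event kinds, and the sample trace are assumptions made for this example; they do not reflect the record layout of any real tracing tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    timestamp: float   # when the event happened
    process: int       # which process (or CPU/thread) produced it
    kind: str          # "enter", "exit", "send", or "recv"
    detail: str        # region name, or peer description for messages


def region_times(trace):
    """Replay enter/exit events per process and sum inclusive time per region."""
    stacks = {}   # process -> stack of open "enter" events
    totals = {}   # (process, region) -> accumulated inclusive time
    for ev in sorted(trace, key=lambda e: e.timestamp):
        stack = stacks.setdefault(ev.process, [])
        if ev.kind == "enter":
            stack.append(ev)
        elif ev.kind == "exit":
            begin = stack.pop()
            key = (ev.process, begin.detail)
            totals[key] = totals.get(key, 0.0) + ev.timestamp - begin.timestamp
    return totals


# Hand-written sample trace: process 0 sends one message to process 1.
trace = [
    Event(0.0, 0, "enter", "main"),
    Event(0.0, 1, "enter", "main"),
    Event(0.5, 1, "enter", "MPI_Recv"),
    Event(1.0, 0, "enter", "MPI_Send"),
    Event(1.0, 0, "send", "to 1"),
    Event(1.5, 0, "exit", "MPI_Send"),
    Event(1.5, 1, "recv", "from 0"),
    Event(2.0, 1, "exit", "MPI_Recv"),
    Event(3.0, 0, "exit", "main"),
    Event(3.0, 1, "exit", "main"),
]

for (process, region), seconds in sorted(region_times(trace).items()):
    print(process, region, seconds)
```

Unlike a profile, the trace keeps the order and timing of every event, so the waiting time of process 1 inside MPI_Recv before the message arrives remains visible.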
8 Definitions
Event Tracing: Instrumentation, Monitor, Trace
9 Definitions
Event Tracing: Timeline Visualization
10 Intel Trace Analyzer and Collector
Intel Trace Analyzer and Collector provides information critical to understanding and optimizing MPI cluster performance by quickly finding performance bottlenecks in MPI communication.
o Interface and displays
o Metrics tracking
o Scalability
o Instrumentation and tracing
o Compatibility
11 Intel Trace Analyzer and Collector
Compatibility
o Intel compilers and GNU* compilers
o Intel MPI Library
o MPICH (and compatible derivatives)
o Red Hat Enterprise Linux 3.0 or 4.0
o SUSE LINUX Enterprise Server 9 or 10
o SGI Altix
12 Intel Trace Analyzer and Collector
Interface and Displays
Timeline Views and Parallelism Display
o Displays the concurrent behavior of parallel applications
o Calculates statistics for specific time intervals, processes, or functions
o Displays application activities, event source code locations, and message passing along a time axis
13 Intel Trace Analyzer and Collector
Advanced GUI
Display Scalability
Detailed and Aggregate Views
o Examines aspects of application runtime behavior, grouped by functions or processes
o Easily identifies the amount of time spent in MPI communication
o Easily shows the performance differences between two program runs
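The kind of aggregate these views report can be sketched in a few lines. This is only an illustration of the idea, not the tool's implementation: `mpi_fraction`, the tuple layout, and both sample runs are made up for this example, and the intervals are assumed to be non-overlapping per process.

```python
def mpi_fraction(intervals):
    """Return {process: share of its recorded time spent in functions named MPI_*}.

    intervals: iterable of (process, function_name, start, end) tuples,
    assumed not to overlap within one process.
    """
    total, mpi = {}, {}
    for process, name, start, end in intervals:
        duration = end - start
        total[process] = total.get(process, 0.0) + duration
        if name.startswith("MPI_"):
            mpi[process] = mpi.get(process, 0.0) + duration
    return {p: mpi.get(p, 0.0) / total[p] for p in total}


# Invented data: an imbalanced run, and a second run after rebalancing the work.
run_a = [(0, "compute", 0.0, 6.0), (0, "MPI_Barrier", 6.0, 8.0),
         (1, "compute", 0.0, 4.0), (1, "MPI_Barrier", 4.0, 8.0)]
run_b = [(0, "compute", 0.0, 7.0), (0, "MPI_Barrier", 7.0, 8.0),
         (1, "compute", 0.0, 7.0), (1, "MPI_Barrier", 7.0, 8.0)]

before, after = mpi_fraction(run_a), mpi_fraction(run_b)
for process in sorted(before):
    print(process, before[process], after[process])
```

Grouping by process, as here, exposes load imbalance; grouping by function name instead would show which MPI calls dominate.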
14 Intel Trace Analyzer and Collector
Execution Statistics
o Provides subroutine execution metrics or call-tree characteristics
Profiling Library
o Records distributed, event-based trace data
Statistics Readability
o Logs information for function calls, sent messages, and collective operations
15 Intel Trace Analyzer and Collector
Scalability
Low Overhead
o Provides the structured trace file (STF) format for scalability
o Generates trace files faster
o Allows random access to portions of a trace, making it suitable for analyzing large amounts of trace data
Filtering and Memory Handling
o Caches trace data in memory to reduce runtime overhead and memory consumption
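Why random access matters can be shown with a small sketch. It says nothing about how the STF format is actually organized; it only assumes that event records are kept sorted by timestamp, so a time window can be located by binary search instead of scanning the whole trace. The function name `window` and the sample data are hypothetical.

```python
from bisect import bisect_left, bisect_right


def window(timestamps, records, t_start, t_end):
    """Return the records whose timestamps lie in [t_start, t_end].

    timestamps must be sorted and aligned index-by-index with records.
    """
    first = bisect_left(timestamps, t_start)
    last = bisect_right(timestamps, t_end)
    return records[first:last]


timestamps = [0.0, 1.0, 2.0, 3.0, 4.0]
records = ["enter main", "enter MPI_Send", "exit MPI_Send", "enter MPI_Recv", "exit main"]

print(window(timestamps, records, 1.0, 3.0))
```

The lookup costs a logarithmic number of comparisons, which is what makes zooming into a short interval of a very long trace practical.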
16 How to Use ITAC
Log in to your UYBHM node using ssh with the -X option:
bash: $ ssh -X du??@wsl-node??.uhem.itu.edu.tr
or, on Windows, use PuTTY with X11 forwarding enabled in its SSH section.
Copy the example tar file to your directory:
bash: $ cd workshop
bash: $ cp /RS/users/bonat/workshop/YAZOKULU/ tar.
bash: $ tar -xvf tar
bash: $ cd /mpi-analyze/traceanalyzer
17 How to Use ITAC
Setting Up Environment Variables:
Add the ITAC source line to your .bashrc and/or .bash_profile:
source /RS/progs/intel/itac/7.1/bin/itacvars.sh
Use the add-ITAC-to-my-PATH.sh script:
bash: $ ./add-ITAC-to-my-PATH.sh
18 How to Use ITAC
Collecting Trace Data:
First create the object file:
bash: $ mpiicc butterfly.c -c
Link the object file with the ITAC libraries:
bash: $ mpiicc butterfly.o -lVT -ldwarf -lelf -lvtunwind -lnsl -lm -ldl -lpthread -L/RS/progs/intel/itac/7.1/lib/ -o butterfly.x
You can also use the provided ITACcompile.sh script:
bash: $ ./ITACcompile.sh butterfly.c
19 How to Use ITAC
Collecting Trace Data:
Run the program to generate the trace:
bash: $ mpirun -np 8 ./butterfly.x
# Iter: 1
# Stage = 3
0 (id): I'm at the barrier
5 (id): I'm at the barrier
7 (id): I'm at the barrier
2 (id): I'm at the barrier
## Calculation time for 1 iterations :
(id): I passed the barrier
4 (id): I passed the barrier
2 (id): I passed the barrier
0 (id): I passed the barrier
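The source of the example program is not shown in these slides. Assuming it implements the classic butterfly exchange pattern, which would be consistent with the three stages reported for eight processes, the communication schedule can be sketched as follows. The function `butterfly_partners` is hypothetical and only computes who would exchange data with whom; it performs no MPI communication.

```python
def butterfly_partners(num_procs):
    """Return, per stage, the partner rank of every rank in a butterfly exchange.

    In stage s, rank r is paired with rank r XOR 2**s, so a power-of-two
    number of processes needs log2(num_procs) stages.
    """
    if num_procs < 1 or num_procs & (num_procs - 1):
        raise ValueError("num_procs must be a power of two")
    stages = num_procs.bit_length() - 1
    return [[rank ^ (1 << stage) for rank in range(num_procs)]
            for stage in range(stages)]


schedule = butterfly_partners(8)
print(len(schedule))   # 3 stages for 8 processes
for stage, partners in enumerate(schedule):
    print(stage, partners)
```

In a trace timeline, such a pattern would appear as pairwise messages whose partner distance doubles from one stage to the next.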
20 How to Use ITAC
Analyzing Trace Data:
Check that the trace file (.stf) was written:
bash: $ ls
butterfly.c butterfly.x butterfly.x.stf
Open the trace file in Trace Analyzer:
bash: $ traceanalyzer butterfly.x.stf
21 How to Use ITAC
Analyzing Trace Data: [Screenshot panels: Event Timeline, Function Profile, Message Profile]