Comprehensive Lustre I/O Tracing with Vampir

Comprehensive Lustre I/O Tracing with Vampir
Lustre User Group 2010
Michael Kluge (michael.kluge@tu-dresden.de)
Zellescher Weg 12, WIL A 208, Tel. +49 351-463-34217

Content (Slide 2)
- Vampir introduction
- VampirTrace introduction
- Enhancing I/O analysis
- Creation of a comprehensive system view
- Illustrating examples

Vampir Introduction (Slide 3)
- Visualization of the dynamics of complex parallel processes
- Requires two components: a monitor/collector (VampirTrace) and a chart browser (Vampir)
- Available for all major platforms
- Open source (partially)
- http://www.vampir.eu
- http://www.tu-dresden.de/zih/vampirtrace

Vampir Toolset Architecture (Slide 4)
[Architecture diagram: a multi-core program is instrumented by VampirTrace and writes a trace file (OTF) that is opened directly in Vampir 7; a many-core program is instrumented by VampirTrace and writes a trace bundle that is analyzed remotely through VampirServer.]

Vampir 7: Timelines (Slide 5)

Vampir 7: Summaries (Slide 6)

I/O in HPC Environments (Slide 7)
- Thousands of clients connected to a file server farm
- Dedicated infrastructure with a large number of supporting components
- File systems often accessed through high-level libraries
- File systems typically tuned for large I/O requests
- No backup
- Besides the 'large & fast' file system, HPC systems have a slow home directory and even slower archive space
- Multi-cluster file systems

I/O Subsystem Sizes (Slide 8)
- TU Dresden HRSK system (installed 2006): 2 x 68 TB at 12 GB/s, >1300 disks, 12 file servers, 48 FC cables, 700+ nodes
- DARPA HPCS project (targeted 2010): high-productivity computing with millions of components
  - 50,000 spinning disks, 30,000 nodes, 100 PB file systems
  - millions of cores, 10 billion files per directory, 1 PB single file size
  - 40,000 file creates/sec from a single client node
  - 30 GB/s single client, 1.5 TB/s total bandwidth
  - end-to-end data integrity

How to Optimize I/O? (Slide 9)
- Instrument all available data sources
- Make the tools usable in production mode
- Create a global view of the system
- Instrument the user's application (see the sketch after this list)
- Run the user code
- Correlate and analyze the collected data
- Find an optimization strategy
- Decide at which layer to implement it
- Implement and test again
- Check whether you killed someone else's performance
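As a concrete illustration of the "instrument the user's application" step, here is a minimal sketch of a POSIX I/O workload one might trace. It assumes the VampirTrace compiler wrapper and LD_PRELOAD-based I/O interception are available; the build and run commands (e.g. "vtcc -o io_demo io_demo.c" and "VT_IOTRACE=yes ./io_demo") follow common VampirTrace usage and are not taken from this talk.

    /* io_demo.c - a small POSIX I/O workload to instrument (illustrative only).
     * When built with the VampirTrace wrapper and run with I/O tracing enabled,
     * it produces an OTF trace that Vampir can visualize. */
    #define _XOPEN_SOURCE 700
    #include <fcntl.h>
    #include <stdlib.h>
    #include <string.h>
    #include <sys/types.h>
    #include <unistd.h>

    int main(void)
    {
        char block[8192];
        memset(block, 0xAB, sizeof(block));

        int fd = open("io_demo.dat", O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd < 0)
            return EXIT_FAILURE;

        /* 1024 contiguous 8 KiB writes; each pwrite() appears as one I/O event. */
        for (off_t i = 0; i < 1024; i++)
            pwrite(fd, block, sizeof(block), i * (off_t)sizeof(block));

        close(fd);
        return EXIT_SUCCESS;
    }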

I/O Stack in HPC Environments (Slide 10)
[Diagram of the I/O software stack from the application down to the disks; the individual labels did not survive transcription.]

I/O Analysis Workflow (Slide 11)

I/O Analysis Support in VampirTrace (Slide 12)
[Diagram: data sources - POSIX I/O, MPI I/O, kernel events, network, RAID controller, and file server - are collected by VampirTrace and written to a memory-mapped trace file.]

Instrumented System (Slide 13)

External Tool Architecture (Slide 14)

Example: Local and External Counter Values (Slide 15)

Example: Local and External Counter Values (Slide 16)

Example: threader simulation with 50 processes (Slide 17)

Example: threader simulation with 400 processes (Slide 18)

Example: threader simulation with 1500 processes (Slide 19)

Comparing the Numbers (Slide 20)
[Bar chart: scaling of "threader" on Lustre; average time per call in microseconds (0-500) for the functions fread, fopen, and fclose at 50, 400, and 1500 processes.]

File Access Pattern Visualization (Slide 21)
- What the Vampir Master Timeline shows: enter/leave events per process; X axis: time, Y axis: processes
- What is needed for file access patterns: read start/end and write start/end events; X axis: byte range within a single file, Y axis: processes
- Record each read/write access for each byte range in a vector
- Merge consecutive accesses
- Write an appropriate enter/leave pair, based on the process number and the access offset/range, into a new trace file (a sketch of this step follows below)
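A minimal sketch of the merge-and-rewrite step described above; the structure and function names are invented for illustration, and the printf stands in for writing enter/leave records into the new trace file.

    /* Merge consecutive byte-range accesses of one process and emit one
     * synthetic enter/leave pair per merged range (sketch, not the tool's code). */
    #include <stdio.h>

    typedef struct {
        double start, end;        /* time of the access */
        long long offset, length; /* byte range within the file */
    } access_t;

    static void merge_accesses(int process, const access_t *acc, int n)
    {
        int i = 0;
        while (i < n) {
            access_t merged = acc[i];
            /* Merge while the next access continues directly after this one. */
            while (i + 1 < n && acc[i + 1].offset == merged.offset + merged.length) {
                merged.length += acc[i + 1].length;
                merged.end = acc[i + 1].end;
                i++;
            }
            /* Here the real implementation would write an enter/leave pair,
             * keyed by process number and byte range, into the new trace file. */
            printf("process %d: range [%lld, %lld) from %.6f to %.6f\n",
                   process, merged.offset, merged.offset + merged.length,
                   merged.start, merged.end);
            i++;
        }
    }

    int main(void)
    {
        /* Example: three contiguous 8-byte writes followed by a separate one. */
        access_t acc[] = {
            { 0.010, 0.011, 1445340, 8 },
            { 0.011, 0.012, 1445348, 8 },
            { 0.012, 0.013, 1445356, 8 },
            { 0.020, 0.021, 2000000, 8 },
        };
        merge_accesses(1, acc, 4);
        return 0;
    }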

FLASH I/O and HDF5 (Slide 22)
- Multiphysics astrophysics code
- Solves a broad range of astrophysical problems
- Uses HDF5 or PnetCDF for parallel storage access
- I/O kernel available since 2001, a revised implementation since 2006
- What has been changed? Instrument both versions and check the differences in their I/O patterns

Pattern - 2001 Version (Slide 23)
Arguments: file descriptor, memory pointer, write size, write offset

1: pwrite( 8, somewhere, 8, 1445340);
1: pwrite( 8, somewhere, 8, 1445348);
1: pwrite( 8, somewhere, 8, 1445356);
1: pwrite( 8, somewhere, 8, 1445364);
1: pwrite( 8, somewhere, 8, 1445372);
1: pwrite( 8, somewhere, 8, 1445380);
1: pwrite( 8, somewhere, 8, 1445388);
1: pwrite( 8, somewhere, 8, 1445396);
1: pwrite( 8, somewhere, 8, 1445404);
1: pwrite( 8, somewhere, 8, 1445412);
1: pwrite( 8, somewhere, 8, 1445420);
1: pwrite( 8, somewhere, 8, 1445428);
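Reading the arguments off these calls: each pwrite() transfers only 8 bytes, and the offsets advance by exactly 8 bytes per call (1445340, 1445348, 1445356, ...), so the data is written contiguously but in very small requests.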

Pattern - 2006 Version (Slide 24)

1: pwrite( 8, somewhere, 1638400, 1054266916);
1: pwrite( 8, somewhere, 1638400, 1264502308);
1: pwrite( 8, somewhere, 1638400, 1474737700);
1: pwrite( 8, somewhere, 1638400, 1684975140);
1: pwrite( 8, somewhere, 1638400, 1895210532);
1: pwrite( 8, somewhere, 1638400, 2105445924);
1: pwrite( 8, somewhere, 1638400, 2315681316);
1: pwrite( 8, somewhere, 1638400, 2525916708);
...
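For comparison: the 2006 version writes 1,638,400 bytes (about 1.6 MB) per call, 204,800 times more per request than the 8-byte writes of the 2001 version, and the offsets advance by 210,235,392 bytes between calls, so each call targets a different region of the file instead of appending tiny records.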

Access Pattern Visualization (Slide 25)

Access Pattern Visualization (Slide 26)

Time for Questions (Slide 28)