Vampir and Lustre. Understanding Boundaries in I/O Intensive Applications

Size: px

Start display at page:

Download "Vampir and Lustre. Understanding Boundaries in I/O Intensive Applications"

Asher Ball
5 years ago
Views:

1 Center for Information Services and High Performance Computing (ZIH) Vampir and Lustre Understanding Boundaries in I/O Intensive Applications Zellescher Weg 14 Treffz-Bau (HRSK-Anbau) - HRSK 151 Tel (guido.juckeland@tu-dresden.de)

2 Agenda Vampir The Tool for Understanding Application Behavior ZIH Focusing Resources around the Users Data Tracing I/O First Results and Further Steps Slide 2

3 Vampir The Tool for Understanding Application Behavior Slide 3

Vampir Server Architecture Parallel Program Monitor System File System Analysis Server Merged Traces Trace 1 Trace 2 Trace 3 Trace N Worker 1 Classic Analysis: Worker 2 Master

4 Vampir Server Architecture Parallel Program Monitor System File System Analysis Server Merged Traces Trace 1 Trace 2 Trace 3 Trace N Worker 1 Classic Analysis: Worker 2 Master Worker m Event Streams Process Visualization Client Parallel I/O Timeline with 16 Traces visible Message Passing Interne t Segment Indicator 768 Processes Thumbnail View Slide 4

5 Requires: Source Code Instrumentation int foo(void* arg){ int foo(void* arg){ enter(7); if (cond){ if (cond){ leave(7); return 1; return 1; } } leave(7); return 0; } return 0; } manually or automatically Slide 5

6 What You Get: Parallel Trace (Open Trace Format (OTF)) DEF TIMERRES P 1 ENTER 5 DEF PROCESS 1 `Master` P 1 ENTER 6 DEF PROCESS 1 `Slave` P 1 ENTER P 1 SEND TO 3 LEN P 1 LEAVE P 2 ENTER 5 DEF FUNCTION 5 `main` DEF FUNCTION 6 `foo` P 1 LEAVE P 2 ENTER 6 DEF FUNCTION 9 `bar` P 1 ENTER P 2 ENTER 13 DEF FUNCTION 12 `MPI_Send` P 1 LEAVE 9 DEF FUNCTION 13 `MPI_Recv` P 2 RECV FROM 3 LEN P 2 LEAVE P 2 LEAVE P 2 ENTER P 2 LEAVE 9... Slide 6

7 Common Event Types enter/leave of function/routine/region time stamp, process/thread, function ID send/receive of P2P message (MPI) time stamp, sender, receiver, length, tag, communicator collective communication (MPI) time stamp, process, root, communicator, # bytes hardware performance counter values time stamp, process, counter ID, value etc. Slide 7

8 Vampir: Global Timeline Slide 8

9 Vampir: Global Timeline Slide 9

10 Vampir: Global Timeline Slide 10

11 Vampir: Global Timeline Slide 11

12 Vampir: Process Timeline Slide 12

13 Vampir: Process Timeline Slide 13

14 Vampir: Process Timeline Slide 14

15 Vampir: Process Timeline Slide 15

16 Vampir: Summary Chart Slide 16

17 Vampir: Summary Timeline Slide 17

18 Vampir: Process Profile Slide 18

19 ZIH Focusing Resources around the Users Data Slide 19

Realization of the Concept for Data Intensive Computing HPC Component

Capacity Capacity 68 TiByte 51TiByte 1,8 GiB/s PetaByte Tape Archive

20 Realization of the Concept for Data Intensive Computing HPC Component Memory PC Farm 6,5 TiByte SGI Altix x Sockets mit Itanium2 Montecito Dual-Core CPUs (1.6 GHz/9MB L3 Cache) LNXI PC-Farm 8 GiB/s 4 GiB/s HPC SAN 4 GiB/s PC SAN Capacity Capacity 68 TiByte 51TiByte 1,8 GiB/s PetaByte Tape Archive Capacity 1 PiByte Slide 20 AMD Opteron X85 Dual Core Chip mit 2,6 GHz 384x Single CPU Nodes 232x Dual CPU Nodes 112x Quad CPU Nodes

21 Lustre Relevant Components HPC Component Memory PC Farm 6,5 TiByte SGI Altix x Sockets mit Itanium2 Montecito Dual-Core CPUs (1.6 GHz/9MB L3 Cache) LNXI PC-Farm 8 GiB/s 4 GiB/s HPC SAN 4 GiB/s PC SAN Capacity Capacity 68 TiByte 51TiByte 1,8 GiB/s PetaByte Tape Archive Capacity 1 PiByte Slide 21 AMD Opteron X85 Dual Core Chip mit 2,6 GHz 384x Single CPU Nodes 232x Dual CPU Nodes 112x Quad CPU Nodes

22 Lustre Setup 1TB oss001 16TB oss002 mds001 SGI TP9300 mds002 oss003 oss004 16TB oss005 I/O Infiniband oss006 core switch oss TB oss008 oss009 SGI InfiniteStorage 6700 Serving two file systems: /work and /fastfs Slide 22

23 User profile: Image Analysis in Biology (MPI-CBG) Quantitative image analysis Identify cells and cell components in images Derive statistical information about the cell components Functional genomics assays High resolution images of large number of cells Live cell imaging Low resolution images in time series Slide 23

er Plates 384 well plate Images Image Packages 384 image packages ~ 3000 image files ~18 GiB s

24 Image Analysis Workflow 1 Preparing Plates 2 Acquiring Images 3 Preprocessing Images 4 Adjusting Parameters 5 Structure Identification 6 Calculating Statistics Pa es ra m et Im ag er Plates 384 well plate Images Image Packages 384 image packages ~ 3000 image files ~18 GiB s 1 parameter file Slide 24 Structure Information ~ 3000 image files ~3 GiB Results ~ 50 global statistics

25 Tracing I/O First results and further steps Slide 25

26 Getting Data from the DDN-RAIDs Own QT-GUI Uses DDN-API for live-data every 0.1s Needs network access to DDN controllers Slide 26

27 Getting Data from the Application Catching all I/O calls from the application Adding them to the trace as performance counter data Include filenames, offsets, and request sizes as OTF comments Currently done with LD_PRELOAD library Data needs to be merged with application OTF trace Captured data: open (filename, filedescriptor), read/write (filedescriptor, size), close/dup (filedescriptor), seek (filedescriptor, position) Tracing I/O requests within the kernel To follow the path of the request to the devices Slide 27

28 Vampir I/O Stats per Process (work in progress) Slide 28

29 Vampir I/O Stats per Process (work in progress) Slide 29

30 Including DDN Statistics (work in progress) Slide 30

31 Restraints and Requirements I/O Tracing cannot modify kernel (site requirements) DDN data is sampling data DDN data is global data Kernel data is global data Therefore: Need existing FS (e.g. Lustre) to include on demand trace hooks Device mappings need to be known as well Slide 31

32 Conclusions and Further Work Data intensive computing produces new bottlenecks New HPC user groups (e.g. biologists) produce new problems Vampir platform is/was extended to help indentify I/O bottlenecks DDN data needs to be correlated with application trace Should work with any device and any FS (Lustre very good starting point) Could include file object information Need to follow I/O request form application to device Slide 32

Thanks and Contacts Vampir/OTF Introduction Andreas Knüpfer (andreas.knuepfer@tu-dresden.

mueller@tu-dresden.de) I/O Tracing Facilities Michael Kluge (michael.kluge@tu-dresden.

33 Thanks and Contacts Vampir/OTF Introduction Andreas Knüpfer Holger Brunst ZIH System Setup Dr. Matthias Müller I/O Tracing Facilities Michael Kluge Holger Mickler Visit us at and Slide 33

Introducing OTF / Vampir / VampirTrace

Center for Information Services and High Performance Computing (ZIH) Introducing OTF / Vampir / VampirTrace Zellescher Weg 12 Willers-Bau A115 Tel. +49 351-463 - 34049 (Robert.Henschel@zih.tu-dresden.de)