UMAMI: A Recipe for Generating Meaningful Metrics through Holistic I/O Performance Analysis

1 UMAMI: A Recipe for Generating Meaningful Metrics through Holistic I/O Performance Analysis. Glenn K. Lockwood, Shane Snyder, Wucherl Yoo, Kevin Harms, Zachary Nault, Suren Byna, Philip Carns, Nicholas J. Wright. October 27, 2017

2 Understanding I/O today is hard. [Diagram: Compute Nodes, I/O Nodes / Burst Buffer (BB) Nodes, Storage Servers] The storage hierarchy is getting more complicated.

3 Understanding I/O today is hard. The storage hierarchy is getting more complicated, and monitoring each component separately is currently standard practice. [Diagram: each tier emits its own output format: custom binary formats, ES, HDF5, .txt]

4 Understanding I/O today is hard. [Diagram: an I/O expert applies expert knowledge to combine the per-tier outputs (custom binary formats, ES, HDF5, .txt)] I/O expert (Phil Carns) from ATPESC: https://insidehpc.com/2017/10/hpc-io-computational-scientists/

5 Total Knowledge of I/O with holistic analysis. [Diagram: Compute Nodes, I/O Nodes / BB Nodes, Storage Servers] Can we augment expert knowledge? Using existing tools?

6 Total Knowledge of I/O with holistic analysis. Combine, index, and normalize the existing tools' metrics to provide a holistic view: Total Knowledge of I/O (TOKIO). [Diagram: TOKIO ingests the per-tier outputs (custom binary formats, ES, HDF5, .txt)] I/O expert (Phil Carns) from ATPESC: https://insidehpc.com/2017/10/hpc-io-computational-scientists/

7 What is possible with holistic I/O analysis?
- Ran four different I/O workloads every day for a month
  - Jobs scaled to achieve >80% of peak file system performance
  - Exercised file-per-process and shared-file I/O with big and small transfers
- Ran on ALCF Mira (IBM BG/Q) and NERSC Edison (Cray XC)
  - One GPFS file system on Mira (gpfs-mira)
  - Two Lustre file systems on Edison (lustre-reg and lustre-bigio)
- Used data from production monitoring tools at ALCF and NERSC
  - Darshan for application-level I/O profiling
  - GPFS- and Lustre-specific server-side monitoring tools

8 Defining performance variation. [Plot: fraction of peak performance per workload, gpfs (Mira)] "Fraction of peak performance" is relative to the maximum performance observed for that application on that file system; it normalizes out the effects of application I/O patterns and peak file system performance.
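To make the normalization concrete, here is a minimal sketch in Python; the numbers are invented for illustration and do not come from the study's data:

    import numpy as np

    def fraction_of_peak(throughputs):
        """Normalize measured throughputs for one (application, file
        system) pair by the best value observed for that pair, so that
        different applications and systems become comparable."""
        throughputs = np.asarray(throughputs, dtype=float)
        return throughputs / throughputs.max()

    # Illustrative daily write rates for one app on one file system (GiB/s)
    print(fraction_of_peak([31.2, 28.4, 14.9, 30.8]))  # best day maps to 1.0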

9 Variation due to application I/O pattern. [Plot: fraction of peak performance per workload, gpfs (Mira)] "Bad I/O patterns" can cause bad performance and bad performance variation; some application patterns are more susceptible to high amounts of variation!

10 Variation across file system architectures. [Plots: gpfs (Mira) vs. lustre-bigio (Edison)] Application I/O patterns are not the only contributor to performance variation.

11 Variation between Lustre configurations. [Plots: lustre-reg (Edison) vs. lustre-bigio (Edison)] Significant differences even on similar Lustre file systems; other factors (configuration, workload) also matter!

12 What does this tell us about variation? [Plots: gpfs (Mira), lustre-bigio (Edison), lustre-reg (Edison)] Performance variation is a function of: application I/O patterns (cf. HACC, VPIC); architecture (cf. gpfs, lustre-bigio); other factors (cf. lustre-bigio, lustre-reg).

13 What does this tell us about variation? [Plots: gpfs (Mira), lustre-bigio (Edison), lustre-reg (Edison)] File systems have their own "I/O climate" (like Berkeley vs. Argonne); understanding these "other factors" (climate) holistically is essential to understanding performance variability!

14 Exploring I/O weather and climate. [Plot: lustre-reg (Edison)] Let's look at a few cases of bad performance using a Unified Monitoring and Metrics Interface (UMAMI). What can a holistic view (climate) tell us about performance (weather)?

15 Case Study #1: HACC write performance on lustre-reg. Is this a snowy day at Argonne or a snowy day at Berkeley? Quantitatively define "bad" based on quartiles, then use UMAMI to determine which aspects of the weather were "bad".
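A minimal sketch of a quartile-based "bad day" test, assuming (as one plausible reading of the slide, not the paper's stated threshold) that "bad" means falling below the first quartile of the historical record:

    import numpy as np

    def is_bad(measurement, history):
        """Flag a run as 'bad weather' if it falls below the first
        quartile of the historical record for this application and
        file system pair."""
        return measurement < np.percentile(history, 25)

    history = [0.91, 0.88, 0.95, 0.42, 0.87, 0.90, 0.78, 0.93]  # fraction of peak
    print(is_bad(0.45, history))  # True: well below the first quartile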

16 Case Study #1: First guess: blame someone else. Coverage factor = how much of the global bandwidth was consumed by my job?
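A sketch of the metric as described on the slide (the paper's exact accounting may differ): the bandwidth coverage factor compares a job's own traffic, e.g. from Darshan, against the server-side total over the same window:

    def coverage_factor(job_bytes, fs_total_bytes):
        """CF_bw = bytes moved by this job / total bytes moved by the
        file system while the job ran. CF_bw near 1.0 means the job had
        the bandwidth to itself; a low CF_bw means it was competing
        with other workloads."""
        return job_bytes / fs_total_bytes

    # e.g., Darshan reports the job moved 80 TiB while server-side
    # counters show 100 TiB moved during the same interval:
    print(coverage_factor(80.0, 100.0))  # 0.8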

17 Case Study #1: Add coverage factor to UMAMI. Most jobs get exclusive access to Lustre bandwidth (CF_bw ≈ 1.0).

18 Case Study #1: Add coverage factor to UMAMI. Bad performance coincided with low CF: performance variation caused by bandwidth contention.

19 Case Study #2: VPIC/GPFS: when bandwidth contention isn't the issue. Bad performance did not coincide with low CF. Either use expert knowledge or statistical analysis to add more metrics.

20 Case Study #2: VPIC/GPFS: when bandwidth contention isn't the issue. Statistically "bad" levels of contention for metadata IOPS: performance loss affected by the file system implementation.
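One plausible way to automate the "statistically bad" judgment is to scan every metric in the UMAMI view and flag values beyond the usual boxplot whiskers. This is a sketch under that assumption, not the paper's published procedure; metric names and values are hypothetical:

    import numpy as np

    def anomalous_metrics(current, history):
        """Return the metrics whose current value lies more than
        1.5 * IQR outside the historical quartiles (boxplot whiskers)."""
        flagged = {}
        for name, value in current.items():
            q1, q3 = np.percentile(history[name], [25, 75])
            iqr = q3 - q1
            if not (q1 - 1.5 * iqr) <= value <= (q3 + 1.5 * iqr):
                flagged[name] = value
        return flagged

    history = {"mdt_ops": [510, 480, 530, 495, 520],
               "cf_bw": [0.97, 0.99, 0.96, 0.98, 0.97]}
    print(anomalous_metrics({"mdt_ops": 2100, "cf_bw": 0.97}, history))
    # {'mdt_ops': 2100}: only the metadata rate was statistically unusual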

21 Case Study #3: HACC/lustre-bigio: effects of "I/O climate change". Abnormally good performance revealed a long-term bad I/O climate; bandwidth contention was not the culprit.

22 Case Study #3: HACC/lustre-bigio: effects of "I/O climate change". Moderate negative correlation with OSS CPU load; strong negative correlation with file system fullness. A result of Lustre block allocation at >90% fullness.
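The correlations can be computed directly once performance and climate metrics share a time index; a minimal sketch with invented values (the real analysis used a month of production data):

    from scipy.stats import pearsonr

    # Fraction-of-peak performance for a series of runs alongside two
    # candidate climate metrics (illustrative values, not measurements):
    perf     = [0.95, 0.90, 0.82, 0.70, 0.61, 0.55, 0.50]
    fullness = [0.75, 0.80, 0.85, 0.88, 0.91, 0.93, 0.95]  # fraction full
    oss_cpu  = [0.30, 0.20, 0.55, 0.25, 0.60, 0.40, 0.65]  # OSS CPU load

    r_full, _ = pearsonr(perf, fullness)
    r_cpu, _ = pearsonr(perf, oss_cpu)
    print(f"fullness r={r_full:+.2f}, OSS CPU r={r_cpu:+.2f}")
    # fullness tracks performance closely (strong negative r); the
    # noisier CPU-load series correlates more weakly (moderate negative r)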

23 Conclusions
- Performance variability is a function of file system climate:
  - file system architecture
  - overall system workload
  - file system configuration (default striping, etc.) and health
- No single metric predicts variation universally; many factors can affect I/O weather:
  - bandwidth contention
  - metadata op contention (GPFS)
  - file system fullness (Lustre)
- A holistic view of the storage subsystem is essential to understand performance on complex I/O architectures.

24 Closer to Total Knowledge. [Diagram: TOKIO spanning compute nodes, I/O and BB nodes, and storage servers, combining the per-tier formats (custom binary, ES, HDF5, .txt)]
- Incorporate machine learning:
  - Cluster similar I/O motifs to define I/O climates
  - Infer critical metrics to remove the expert from the loop
- Join the TOKIO effort! Open source and development contributions welcome: https://github.com/nersc/pytokio/
  - Support for new component-level tools is being added regularly
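As a rough illustration of what a UMAMI view boils down to, here is a pandas sketch with hypothetical column names and numbers; this is not pytokio's actual schema or API:

    import pandas as pd

    # One row per job, one column per metric from a different
    # monitoring tool, aligned on the job's date:
    umami = pd.DataFrame(
        {
            "perf_frac_peak":     [0.95, 0.88, 0.41, 0.92],
            "coverage_factor_bw": [0.98, 0.97, 0.44, 0.99],
            "fs_fullness":        [0.81, 0.82, 0.82, 0.83],
            "oss_cpu_load":       [0.22, 0.25, 0.31, 0.24],
        },
        index=pd.to_datetime(["2017-02-14", "2017-02-15",
                              "2017-02-16", "2017-02-17"]),
    )
    # Pick out the day whose performance fell below the first quartile
    # and inspect which other metrics moved with it:
    print(umami[umami["perf_frac_peak"] < umami["perf_frac_peak"].quantile(0.25)])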

25 This material is based upon work supported by the U.S. Department of Energy, Office of Science, under contracts DE-AC02-05CH11231 and DE-AC02-06CH11357 (Project: A Framework for Holistic I/O Workload Characterization, Program manager: Dr. Lucy Nowell). This research used resources and data generated from resources of the National Energy Research Scientific Computing Center, a DOE Office of Science User Facility supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231, and the Argonne Leadership Computing Facility, a DOE Office of Science User Facility supported under Contract DE-AC02-06CH11357.
