Introduction to HPC Parallel I/O


Introduction to HPC Parallel I/O
Feiyi Wang (Ph.D.) and Sarp Oral (Ph.D.)
Technology Integration Group, Oak Ridge Leadership Computing Facility
ORNL is managed by UT-Battelle for the US Department of Energy

Outline
Parallel I/O in HPC:
- General concepts
- I/O environment
- Programming perspective
- System perspective
- Performance perspective
Goal: establish an end-to-end system perspective and set the right expectations.

HPC parallel I/O environment
The HPC I/O system is more than just the parallel file system (PFS), but the PFS is the first point of user interaction. It is also the first point to reveal any system problem.
At a very high level, a PFS is logically composed of three components:
- PFS clients (on the compute nodes, alongside the application)
- metadata services
- I/O servers
[Diagram: compute nodes running the application and PFS client issue I/O and metadata operations over the network fabric to the metadata services and the PFS I/O servers (network, memory, storage).]

Parallel file system
- Global namespace
- Designed for parallelism and scalability:
  - concurrent access from tens of thousands of clients
  - efficient N-to-1 and N-to-N access patterns
- Designed for high performance:
  - high-performance interconnect
  - high-performance protocol stack
- POSIX-compliant
A PFS is different from a desktop/laptop file system:
- a contested and shared resource
- vastly more complex
- network attached
Common PFS flavors: Lustre, GPFS, Ceph, PanasasFS, PVFS, BeeGFS.

I/O parallelization
[Diagram: the logical view of File A and File B as a sequence of data blocks, versus the physical view after striping and data placement, with the blocks distributed round-robin across disk bundles (LUNs) on Storage Target 1, 2, and 3.]
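To make the striping idea concrete, the sketch below shows the arithmetic behind simple round-robin striping: which stripe and which storage target a given file offset lands on. It is an illustration only, not any particular PFS implementation; the stripe size and target count are made-up values.

```c
#include <stdio.h>
#include <stdint.h>

/* Round-robin striping: stripe i of the file goes to target (i % NUM_TARGETS).
 * The values below are illustrative; real settings come from the PFS. */
#define STRIPE_SIZE  (1 << 20)   /* 1 MiB stripe (assumed) */
#define NUM_TARGETS  3           /* three storage targets, as in the diagram */

int main(void)
{
    uint64_t offsets[] = { 0, 512 * 1024, 3 * STRIPE_SIZE + 42, 7 * STRIPE_SIZE };
    for (size_t i = 0; i < sizeof(offsets) / sizeof(offsets[0]); i++) {
        uint64_t off    = offsets[i];
        uint64_t stripe = off / STRIPE_SIZE;           /* which logical stripe */
        int      target = (int)(stripe % NUM_TARGETS); /* which storage target */
        printf("offset %12llu -> stripe %4llu on target %d\n",
               (unsigned long long)off, (unsigned long long)stripe, target);
    }
    return 0;
}
```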

Programming perspective of parallel I/O
Goal: persist some amount of data as quickly as possible, as reliably as possible.
In HPC we have a wide array of choices, with language bindings for C, C++ and Fortran:
- POSIX I/O
- MPI-IO
- Parallel netCDF or HDF5
- Middleware such as ADIOS
[Diagram: the application data model is mapped onto the storage data model, which in turn is mapped onto the I/O hardware.]

Programming perspective: serial I/O
All processes send their data to a master process, and the master writes the data to a single file; reads happen in the reverse order. A minimal sketch follows this list.
[Diagram: processes 0-4 funnel their data to one process, which writes the file.]
Advantages:
- simple
- good performance for very small I/O sizes
Disadvantages:
- not efficient: a two-stage process
- not scalable: slow for any large number of processes or data sizes
- may not be possible if the master is memory constrained
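A minimal sketch of the serial I/O pattern, using MPI_Gather followed by a POSIX write on rank 0. The buffer size and output file name are made-up values for illustration.

```c
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

#define N 1024  /* doubles per process (assumed) */

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    double local[N];
    for (int i = 0; i < N; i++) local[i] = rank + i * 1e-6;  /* fake payload */

    /* Stage 1: every process sends its data to the master (rank 0). */
    double *all = NULL;
    if (rank == 0) all = malloc((size_t)size * N * sizeof(double));
    MPI_Gather(local, N, MPI_DOUBLE, all, N, MPI_DOUBLE, 0, MPI_COMM_WORLD);

    /* Stage 2: only the master touches the file system. */
    if (rank == 0) {
        FILE *fp = fopen("serial_io.dat", "wb");  /* hypothetical output file */
        fwrite(all, sizeof(double), (size_t)size * N, fp);
        fclose(fp);
        free(all);
    }

    MPI_Finalize();
    return 0;
}
```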

Programming perspective: parallel I/O, file per process
Each process writes its own data to a separate file (N-to-N). A minimal sketch follows this list.
[Diagram: processes 0-4 each write to their own file.]
Advantages:
- simple to program
- can be fast
Disadvantages:
- impact on PFS metadata server performance; a single shared directory is a weak spot
- can quickly accumulate many files, which are hard to manage
- difficult for archival systems, e.g. HPSS, to handle many small files
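A minimal file-per-process sketch, where each rank derives a unique file name from its rank and writes independently. The file-name pattern and buffer size are assumptions for illustration.

```c
#include <mpi.h>
#include <stdio.h>

#define N 1024  /* doubles per process (assumed) */

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    double local[N];
    for (int i = 0; i < N; i++) local[i] = rank + i * 1e-6;  /* fake payload */

    /* Each rank writes its own file: no coordination needed, but one file
     * per rank now hits the metadata server at roughly the same time. */
    char fname[64];
    snprintf(fname, sizeof(fname), "output.%06d.dat", rank);  /* hypothetical naming scheme */
    FILE *fp = fopen(fname, "wb");
    fwrite(local, sizeof(double), N, fp);
    fclose(fp);

    MPI_Finalize();
    return 0;
}
```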

Programming perspective: parallel I/O, single shared file
Each process writes its own data to the same file (N-to-1), using an interface such as MPI-IO to map process data into the shared file. A minimal sketch follows this list.
[Diagram: processes 0-4 all write into one shared file.]
Advantages:
- a single file
- manageable data
Disadvantages:
- possibly lower performance than the file-per-process mode due to contention
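A minimal single-shared-file sketch with MPI-IO, where each rank writes its block at a rank-dependent offset using a collective write. The file name and buffer size are assumptions for illustration.

```c
#include <mpi.h>

#define N 1024  /* doubles per process (assumed) */

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    double local[N];
    for (int i = 0; i < N; i++) local[i] = rank + i * 1e-6;  /* fake payload */

    /* All ranks open the same file and write at rank-dependent offsets. */
    MPI_File fh;
    MPI_File_open(MPI_COMM_WORLD, "shared_io.dat",  /* hypothetical file name */
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);

    MPI_Offset offset = (MPI_Offset)rank * N * sizeof(double);
    /* Collective write: gives the MPI-IO layer a chance to aggregate requests. */
    MPI_File_write_at_all(fh, offset, local, N, MPI_DOUBLE, MPI_STATUS_IGNORE);

    MPI_File_close(&fh);
    MPI_Finalize();
    return 0;
}
```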

System perspective: parallel I/O
A combination of software and hardware, and a much more complex end-to-end I/O path. A single write(buf, len) traverses:
- Compute nodes: 18,688 x 16
- I/O routers: 440
- External network: 1,600
- Storage servers: 288
- Disk controllers: 36
- Disks: 20,160

How does knowing more about the HPC I/O system help us understand I/O behavior and performance?

Storage design paradigm: machine exclusive
[Diagram: each system (Compute 1, Compute 2 for post-analysis, Compute 3 testbed, the data transfer cluster, and the visualization cluster) has its own dedicated storage.]
Pros: a simple and natural starting point.
Cons: wasted resources, procurement and operational cost, and data islands.

Storage design paradigm: data centric
[Diagram: Eos (Cray XC30, 11,776 cores), Rhea (commodity cluster, 3,136 cores), Titan (compute machine), the DTN cluster, and EVEREST (visualization) all share the center-wide ATLAS file system.]
Pros: eliminates data islands; data is available from every system.
Cons: mixed workloads and data availability concerns.

System perspective: observations
- HPC I/O system design is about trade-offs: performance, capacity, scalability, availability, usability, and cost.
- Complexity and control of the end-to-end I/O path.
- The impact of mixed workloads.
- I/O workload characterization is of paramount importance: writes are not necessarily dominant, even for a scratch file system.
- Performance isolation and QoS.

Performance perspective: what kind?
Access patterns: sequential read, sequential write, random read, random write, small or big files, mixed workloads.
Measurement levels: application performance, file-system-level performance, block-level performance.

Performance perspective: some observations
- The end-user experience (performance-wise, both latency and bandwidth) is less predictable in HPC.
- It is extremely challenging, if not impossible, to achieve peak performance in a production environment.
- A fraction of the whole machine can overwhelm the underlying storage system: the perils of hero runs.
- The XYZ dilemma: "I am allocated X CPU cores, you said the system is capable of performance Y, so why am I only getting Z?"

Some useful links
- NERSC I/O in Practice: http://www.nersc.gov/assets/training/pio-in-practice-sc12.pdf
- Livermore Computing I/O guide: https://computing.llnl.gov/lcdocs/ioguide/
- Darshan HPC I/O characterization tool:
  http://www.mcs.anl.gov/research/projects/darshan/
  http://www.nersc.gov/users/software/performance-and-debugging-tools/darshan/
- OLCF resources:
  https://www.olcf.ornl.gov/kb_articles/spider-the-center-wide-lustre-file-system/
  https://www.olcf.ornl.gov/kb_articles/lustre-basics/
  https://www.olcf.ornl.gov/kb_articles/darshan-basics/

Useful I/O libraries and middleware
- ADIOS: https://www.olcf.ornl.gov/center-projects/adios/
- HDF5: https://www.hdfgroup.org/hdf5/
- NetCDF: http://www.unidata.ucar.edu/software/netcdf/
- Parallel NetCDF: https://trac.mcs.anl.gov/projects/parallel-netcdf

Summary
- The trend of scalable storage in HPC: more hierarchical and more heterogeneous.
- Mapping the application data model onto storage hardware efficiently is increasingly complex.
- Set the right performance expectations.
- Understand and characterize your application's I/O requirements.
- Think about data transformation and pick the right I/O interface for the job.
- Measure and know the bottleneck.
- Develop a deeper understanding of the storage architecture.
- Think about portability.