h5perf_serial, a Serial File System Benchmarking Tool

The HDF Group, April 2009

HDF5 users have reported the need to perform serial benchmarking on systems without an MPI environment. The parallel benchmarking tool, h5perf, cannot be used for this purpose because its code depends on MPI functions and names. In addition, desired features such as extendable datasets and selectable file drivers are not available in h5perf. These considerations motivated the development of a new benchmarking tool, h5perf_serial.

Requirements

Although h5perf_serial is still under development, the following initial requirements have been implemented:

1. Use of POSIX I/O calls
   a. Write an entire file using a single I/O operation.
   b. Write a file using several I/O operations.
2. Use of HDF5 I/O calls
   a. Write an entire dataset using a single I/O operation.
   b. Write a dataset using several I/O operations with hyperslabs (see the sketch below).
   c. Select contiguous or chunked storage.
   d. Select fixed or extendable dimension sizes for the test dataset.
   e. Select file drivers.
3. Support for datasets and buffers with multiple dimensions.

Most of the design and options are taken from h5perf; for example, the datatype of each array element is char. The options and parameters of the tool, dataset organization examples, and API features are described next.
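To make requirement 2b concrete, the following is a minimal sketch of writing an HDF5 dataset in several I/O operations using hyperslab selections. It is illustrative only and not taken from the h5perf_serial sources; the file name, dataset name, and sizes are assumptions, and error checking is omitted.

    #include "hdf5.h"
    #include <string.h>

    int main(void)
    {
        /* Illustrative sizes: a 16-byte 1D dataset written in four 4-byte pieces. */
        hsize_t dset_dims[1] = {16};
        hsize_t buf_dims[1]  = {4};
        char    buf[4];

        hid_t file   = H5Fcreate("sketch.h5", H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);
        hid_t fspace = H5Screate_simple(1, dset_dims, NULL);
        hid_t mspace = H5Screate_simple(1, buf_dims, NULL);
        hid_t dset   = H5Dcreate2(file, "data", H5T_NATIVE_CHAR, fspace,
                                  H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);

        memset(buf, 'x', sizeof(buf));

        /* Write the dataset in several operations, one hyperslab per transfer buffer. */
        for (hsize_t offset = 0; offset < dset_dims[0]; offset += buf_dims[0]) {
            hsize_t start[1] = {offset};
            H5Sselect_hyperslab(fspace, H5S_SELECT_SET, start, NULL, buf_dims, NULL);
            H5Dwrite(dset, H5T_NATIVE_CHAR, mspace, fspace, H5P_DEFAULT, buf);
        }

        H5Dclose(dset);
        H5Sclose(mspace);
        H5Sclose(fspace);
        H5Fclose(file);
        return 0;
    }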

Options and Parameters

-A api_list
    Specifies which APIs to test. api_list is a comma-separated list with the following valid values: hdf5, posix. (Default: All APIs)

-e dataset-dimension-size-list
    Specifies the sizes of the dataset dimensions in a comma-separated list. The dataset dimensionality is inferred from the list size. For example, a 3D dataset of dimensions 20 × 30 × 40 can be specified by -e 20,30,40. (Default: 1D dataset of 16M, i.e., -e 16M)

-x buffer-dimension-size-list
    Specifies the sizes of the transfer buffer dimensions in a comma-separated list. The buffer dimensionality is inferred from the list size. For instance, a 3D buffer of dimensions 2 × 3 × 4 can be specified by -x 2,3,4. (Default: 1D buffer of 256K, i.e., -x 256K)

-r dimension-access-order-list
    Specifies the dimension access order in a comma-separated list. h5perf_serial starts accessing the dataset at the cardinal origin and then traverses it contiguously in the specified order. For example, -r 2,3,1 causes the tool to traverse dimension 2 first, then dimension 3, and finally dimension 1. (Default: -r 1)

-c chunk-dimension-size-list
    Creates HDF5 datasets with a chunked layout and specifies the sizes of the chunk dimensions in a comma-separated list. The chunk dimensionality is inferred from the list size. For instance, a 3D chunk of dimensions 2 × 3 × 4 can be specified by -c 2,3,4. (Default: Off)

-t
    Uses an HDF5 dataset with extendable dimensions. (Default: Off, i.e., fixed dimensions)

-v file-driver
    Specifies which file driver to test with HDF5. Valid values include: sec2, stdio, core, split, multi, family, and direct. (Default: sec2)

-i iterations
    Sets the number of iterations to perform. (Default: 1)

-w
    Performs only write tests, not read tests. (Default: Read and write tests)

Data Organization Examples

An execution of the following command

    h5perf_serial -e 16,16 -x 2,4 -r 2,1

defines a dataset of 16 × 16 bytes, a transfer buffer of 2 × 4 bytes, and dimension access order 2,1. Figure 1 shows the state of the dataset after seven write operations.

Figure 1: Data pattern for access order 2,1
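As an illustration of how the access order determines the write offsets, the following sketch enumerates the hyperslab offsets for the example above, assuming the traversal described for -r (with -r 2,1, dimension 2 varies fastest). It is not code from h5perf_serial.

    #include <stdio.h>

    /* Illustrative only: prints the offsets that the traversal described above
     * would visit for -e 16,16 -x 2,4 -r 2,1 (dimension 2 varies fastest). */
    int main(void)
    {
        const unsigned long dset[2] = {16, 16};  /* dataset dimensions      */
        const unsigned long buf[2]  = {2, 4};    /* transfer buffer dims    */
        int count = 0;

        /* Access order 2,1: dimension 2 is traversed first, then dimension 1. */
        for (unsigned long d1 = 0; d1 < dset[0]; d1 += buf[0])      /* slower: dim 1 */
            for (unsigned long d2 = 0; d2 < dset[1]; d2 += buf[1])  /* faster: dim 2 */
                printf("write %2d at offset (%lu,%lu)\n", ++count, d1, d2);

        return 0;
    }

The first seven offsets printed should correspond to the seven writes shown in Figure 1.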

A different buffer size and access order can be specified. The following command

    h5perf_serial -e 16,16 -x 4,2 -r 1,2

defines a dataset of 16 × 16 bytes, a transfer buffer of 4 × 2 bytes, and dimension access order 1,2. Figure 2 shows the state of the dataset after seven write operations.

Figure 2: Data pattern for access order 1,2

h5perf_serial checks that the size of each dataset dimension is a multiple of the size of the same dimension in the transfer buffer. Also, the dataset, transfer buffer, and chunks must all have the same dimensionality.
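The following short sketch shows the kind of consistency check just described and the resulting number of write operations; it is illustrative only, using the ranks and sizes from the first example, and is not taken from the tool's sources.

    #include <stdio.h>

    /* Illustrative check: every dataset dimension must be a multiple of the
     * corresponding transfer-buffer dimension, and both must have the same rank.
     * With -e 16,16 -x 2,4 this yields (16/2) * (16/4) = 32 write operations. */
    int main(void)
    {
        const int rank = 2;
        const unsigned long dset[2] = {16, 16};
        const unsigned long buf[2]  = {2, 4};
        unsigned long nwrites = 1;

        for (int i = 0; i < rank; i++) {
            if (dset[i] % buf[i] != 0) {
                fprintf(stderr, "dimension %d: %lu is not a multiple of %lu\n",
                        i + 1, dset[i], buf[i]);
                return 1;
            }
            nwrites *= dset[i] / buf[i];
        }
        printf("total write operations: %lu\n", nwrites);  /* prints 32 */
        return 0;
    }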

API Features

The available APIs for testing are POSIX and HDF5. In both cases, h5perf_serial can write the dataset in one or several I/O operations by setting appropriate sizes for the dataset and transfer buffer; for example, a dataset can be written in a single operation by using a transfer buffer that matches the dataset dimensions.

The HDF5 API allows the selection of chunked storage with the option -c. When chunked storage is selected, h5perf_serial offers the additional option -t, which extends the dataset while it is being written. During each I/O access the dataset is extended minimally, if needed, to accommodate the write of the transfer buffer. The extension process continues until the dataset reaches its final size. Following this scheme, the datasets in Figures 1 and 2 would have been extended four times, assuming that the initial dataset size was that of the transfer buffer.

An additional feature of h5perf_serial under HDF5 is the ability to select a particular file driver, making it possible to measure the variation in I/O performance under different file access implementations.
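The following is a minimal sketch, under assumed names and sizes, of the HDF5 pattern that the -c and -t options exercise: a chunked dataset created with unlimited dimensions and grown with H5Dset_extent before each partial write. It is illustrative only and not taken from the h5perf_serial sources; error checking is omitted.

    #include "hdf5.h"
    #include <string.h>

    int main(void)
    {
        /* Illustrative sizes: 16 x 16 dataset written with a 2 x 4 transfer buffer. */
        hsize_t final_dims[2] = {16, 16};
        hsize_t buf_dims[2]   = {2, 4};
        hsize_t maxdims[2]    = {H5S_UNLIMITED, H5S_UNLIMITED};
        hsize_t cur_dims[2]   = {2, 4};        /* start at the transfer buffer size */
        char    buf[2][4];

        hid_t file = H5Fcreate("extend.h5", H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);

        /* Chunked layout is required for extendable datasets (option -c). */
        hid_t dcpl = H5Pcreate(H5P_DATASET_CREATE);
        H5Pset_chunk(dcpl, 2, buf_dims);

        hid_t fspace = H5Screate_simple(2, cur_dims, maxdims);
        hid_t mspace = H5Screate_simple(2, buf_dims, NULL);
        hid_t dset   = H5Dcreate2(file, "data", H5T_NATIVE_CHAR, fspace,
                                  H5P_DEFAULT, dcpl, H5P_DEFAULT);
        memset(buf, 'x', sizeof(buf));

        /* Traverse dimension 2 first, then dimension 1 (access order 2,1). */
        for (hsize_t d1 = 0; d1 < final_dims[0]; d1 += buf_dims[0]) {
            for (hsize_t d2 = 0; d2 < final_dims[1]; d2 += buf_dims[1]) {
                hsize_t start[2] = {d1, d2};

                /* Extend the dataset minimally, if needed, to cover this write. */
                int grow = 0;
                for (int i = 0; i < 2; i++) {
                    hsize_t need = start[i] + buf_dims[i];
                    if (need > cur_dims[i]) { cur_dims[i] = need; grow = 1; }
                }
                if (grow)
                    H5Dset_extent(dset, cur_dims);

                /* Select the destination hyperslab in an up-to-date file dataspace. */
                hid_t wspace = H5Dget_space(dset);
                H5Sselect_hyperslab(wspace, H5S_SELECT_SET, start, NULL, buf_dims, NULL);
                H5Dwrite(dset, H5T_NATIVE_CHAR, mspace, wspace, H5P_DEFAULT, buf);
                H5Sclose(wspace);
            }
        }

        H5Dclose(dset);
        H5Sclose(mspace);
        H5Sclose(fspace);
        H5Pclose(dcpl);
        H5Fclose(file);
        return 0;
    }

Under this scheme the dataset of Figure 1 would, as noted above, be extended four times by the seventh write. Selecting a file driver (option -v) would additionally involve a file access property list passed to H5Fcreate, for example via H5Pset_fapl_stdio; that detail is omitted from the sketch.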

Tool Output

The resulting output is similar to that of h5perf: throughput statistics are displayed by API and by write/read operation. The use of multiple iterations provides maximum, average, and minimum throughput values.

    ./h5perf_serial -e 4K,4K -x 512,256 -c 4K,256 -i 3

    HDF5 Library: Version 1.9.39
    ==== Parameters ====
    IO API=posix hdf5
    Number of iterations=3
    Dataset size=4KB 4KB
    Transfer buffer size=512 256
    Dimension access order=1 2
    HDF5 data storage method=chunked
    HDF5 chunk size=4KB 256
    HDF5 dataset dimensions=fixed
    HDF5 file driver=sec2
    Env HDF5_PREFIX=not set
    ==== End of Parameters ====

    Transfer Buffer Size (bytes): 131072
    File Size(MB): 16.00

    IO API = POSIX
    Write (3 iteration(s)):
        Maximum Throughput: 52.78 MB/s
        Average Throughput: 52.03 MB/s
        Minimum Throughput: 50.68 MB/s
    Write Open-Close (3 iteration(s)):
        Maximum Throughput: 35.53 MB/s
        Average Throughput: 34.05 MB/s
        Minimum Throughput: 31.52 MB/s
    Read (3 iteration(s)):
        Maximum Throughput: 62.50 MB/s
        Average Throughput: 62.24 MB/s
        Minimum Throughput: 61.87 MB/s
    Read Open-Close (3 iteration(s)):
        Maximum Throughput: 62.44 MB/s
        Average Throughput: 62.17 MB/s
        Minimum Throughput: 61.80 MB/s

    IO API = HDF5
    Write (3 iteration(s)):
        Maximum Throughput: 533.30 MB/s
        Average Throughput: 523.39 MB/s
        Minimum Throughput: 505.82 MB/s
    Write Open-Close (3 iteration(s)):
        Maximum Throughput: 89.24 MB/s
        Average Throughput: 59.46 MB/s
        Minimum Throughput: 35.76 MB/s
    Read (3 iteration(s)):
        Maximum Throughput: 607.49 MB/s
        Average Throughput: 605.97 MB/s
        Minimum Throughput: 603.66 MB/s
    Read Open-Close (3 iteration(s)):
        Maximum Throughput: 585.72 MB/s
        Average Throughput: 581.98 MB/s
        Minimum Throughput: 577.45 MB/s