What NetCDF users should know about HDF5?

Elena Pourmal, The HDF Group
July 20, 2007

Outline
- The HDF Group and HDF software
- HDF5 data model
- Using HDF5 tools to work with NetCDF-4 programs and files
- Performance issues: chunking, variable-length datatypes, parallel performance
- Crash proofing in HDF5

The HDF Group
- Not-for-profit company with a mission to sustain and develop HDF technology, affiliated with the University of Illinois
- Spun off from NCSA, University of Illinois, in July 2006
- Located at the U of I Campus South Research Park
- 17 team members, 5 graduate and undergraduate students
- Owns the IP for the HDF file format and software
- Funded by NASA, DOE, and others

HDF5 file format and I/O library
- General: simple data model
- Flexible: stores data of diverse origins, sizes, and types; supports complex data structures
- Portable: available for many operating systems and machines
- Scalable: works in high-end computing environments; accommodates data of any size or multiplicity
- Efficient: fast access, including parallel I/O; stores big data efficiently

HDF5 file format and I/O library
- File format is complex: object headers, raw data, B-trees, local and global heaps, etc.
- C library with 500+ APIs
- C++, Fortran90, and Java wrappers
- High-level APIs (images, tables, packets)

The HDF5 software stack (diagram)
- Apps: simulation, visualization, remote sensing. Examples: thermonuclear simulations, product modeling, data mining tools, visualization tools, climate models.
- Common application-specific data models and APIs: NetCDF-4 (Unidata), SAF (LLNL, SNL), hdf5mesh (LANL), IDL (COTS), HDF-EOS (NASA), grids.
- These build on the HDF5 data model & API and the HDF5 serial & parallel I/O layers, which sit on the HDF5 virtual file layer (I/O drivers: sec2, split files, MPI I/O, custom, stream).
- Storage in HDF5 format: a single file, split metadata and raw data files, a file on a parallel file system, across the network, or to/from another device, application, or library.

HDF5 file format and I/O library
For NetCDF-4 users, HDF5's complexity is hidden behind the NetCDF-4 APIs.

HDF5 Tools
Command-line utilities (http://www.hdfgroup.org/hdf5tools.html):
- Readers: h5dump, h5ls
- Writers: h5repack, h5copy, h5import
- Miscellaneous: h5diff, h5repart, h5mkgrp, h5stat, h5debug, h5jam/h5unjam
- Converters: h52gif, gif2h5, h4toh5, h5toh4
HDFView (Java browser and editor)

Other HDF5 Tools
- HDF Explorer (Windows only, works with NetCDF-4 files): http://www.space-research.org/
- PyTables
- IDL, Matlab, Labview, Mathematica
- See http://www.hdfgroup.org/tools5app.html

HDF Information
- HDF Information Center: http://hdfgroup.org
- HDF help email address: help@hdfgroup.org
- HDF users mailing lists: news@hdfgroup.org, hdf-forum@hdfgroup.org

NetCDF and HDF5 terminology

NetCDF                 HDF5
Dataset                HDF5 file
Dimensions             Dataspace
Attribute              Attribute
Variable               Dataset
Coordinate variable    Dimension scale

Mesh example, in HDFView (screenshot)

HDF5 Data Model

HDF5 data model
- HDF5 file: a container for scientific data
- Primary objects (NetCDF-4 builds from these parts): groups, datasets
- Additional ways to organize data: attributes, sharable objects, storage and access properties

HDF5 Dataset
- Metadata:
  - Dataspace: rank 3, dimensions Dim_1 = 4, Dim_2 = 5, Dim_3 = 7
  - Datatype: IEEE 32-bit float
  - Storage info: chunked, compressed, checksum
  - Attributes: time = 32.4, pressure = 987, temp = 56
- Data
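As a quick arithmetic check of how dataspace and datatype combine, the slide's example dataset can be sized in plain C (no HDF5 calls; the dimensions and element size are copied from the slide):

```c
#include <stddef.h>

/* The dataset above: rank 3, dimensions 4 x 5 x 7, IEEE 32-bit float.
 * The dataspace (shape) and datatype (element size) together determine
 * the raw data size. */
enum { RANK = 3 };
static const size_t dims[RANK] = {4, 5, 7};
static const size_t elem_size = 4; /* IEEE 32-bit float */

size_t num_elements(void) {
    size_t n = 1;
    for (int i = 0; i < RANK; i++)
        n *= dims[i];
    return n;
}

size_t raw_data_bytes(void) { return num_elements() * elem_size; }
```

Here the 4 x 5 x 7 dataspace holds 140 elements, so the raw data occupies 560 bytes before any chunking or compression.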

Datatypes
- HDF5 atomic types:
  - normal integer and float
  - user-definable (e.g. 13-bit integer)
  - variable-length types (e.g. strings, ragged arrays)
  - pointers: references to objects/dataset regions
  - enumeration: names mapped to integers
  - array
  - opaque
- HDF5 compound types:
  - comparable to C structs
  - members can be atomic or compound types
  - no restriction on complexity
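The "comparable to C structs" point can be sketched in plain C: a struct like the hypothetical particle below would map onto an HDF5 compound type member by member, with offsetof supplying each member's offset. This is an illustration only; no HDF5 calls are made, and the struct name and members are invented:

```c
#include <stddef.h>
#include <stdint.h>

/* A C struct of the kind that maps naturally onto an HDF5 compound
 * datatype: each member would be registered by name, offset (via
 * offsetof), and an HDF5 atomic or array type. The struct defines the
 * in-memory layout; HDF5 handles the on-disk layout and conversion. */
struct particle {
    int32_t id;          /* would map to an integer atomic type */
    float   position[3]; /* would map to a 1-D array type of float */
    double  mass;        /* would map to a double atomic type */
};

/* Offsets that a compound-type definition would use. */
size_t particle_id_offset(void)   { return offsetof(struct particle, id); }
size_t particle_pos_offset(void)  { return offsetof(struct particle, position); }
size_t particle_mass_offset(void) { return offsetof(struct particle, mass); }
```

Because offsets come from offsetof, the same definition stays correct even when the compiler inserts padding between members.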

HDF5 dataset: array of records
(Diagram: a dataset of dimensionality 5 x 3, where each element is a record with a compound datatype: an int8, an int4, an int16, and a 2x3x2 array of float32.)

Groups
- A mechanism for collections of related objects
- Every file starts with a root group /
- Similar to UNIX directories
- Can have attributes
- Objects are identified by a path, e.g. /d/b, /t/a

Attributes
- Attribute: data of the form name = value, attached to an object (group, dataset, named datatype)
- Operations are scaled-down versions of dataset operations:
  - not extendible
  - no compression
  - no partial I/O
- Optional
- Can be overwritten, deleted, or added during the life of a dataset
- Size under 64K in releases before HDF5 1.8.0

Using HDF5 tools with NetCDF-4 programs and files

Example
- Create NetCDF files in /Users/epourmal/Working/_NetCDF-4:
  - s.c creates simple_xy.nc (a NetCDF-3 file)
  - sh5.c creates simple_xy_h5.nc (a NetCDF-4 file)
- Use the h5cc script to compile both examples
- See the contents of simple_xy_h5.nc with ncdump and h5dump
- Useful flags:
  - -h  print the help menu
  - -b  export data to a binary file
  - -H  display metadata information only
- HDF Explorer

NetCDF view: ncdump output

% ncdump -h simple_xy_h5.nc
netcdf simple_xy_h5 {
dimensions:
        x = 6 ;
        y = 12 ;
variables:
        int data(x, y) ;
data:
}

% h5dump -H simple_xy.nc
h5dump error: unable to open file "simple_xy.nc"

simple_xy.nc is a NetCDF-3 file, so h5dump will not work.

HDF5 view: h5dump output

% h5dump -H simple_xy_h5.nc
HDF5 "simple_xy_h5.nc" {
GROUP "/" {
   DATASET "data" {
      DATATYPE H5T_STD_I32LE
      DATASPACE SIMPLE { ( 6, 12 ) / ( 6, 12 ) }
      ATTRIBUTE "DIMENSION_LIST" {
         DATATYPE H5T_VLEN { H5T_REFERENCE }
         DATASPACE SIMPLE { ( 2 ) / ( 2 ) }
      }
   }
   DATASET "x" {
      DATATYPE H5T_IEEE_F32BE
      DATASPACE SIMPLE { ( 6 ) / ( 6 ) }
      ...
}

HDF Explorer (screenshot)

HDF Explorer (screenshot)

Performance issues

Performance issues
Choose appropriate HDF5 library features to organize and access data in HDF5 files. Three examples:
- Collective vs. independent access in the parallel HDF5 library
- Chunking
- Variable-length data

Layers: parallel example
In a NetCDF-4 application on a parallel computing system (e.g. a Linux cluster), I/O flows through many layers from application to disk: the compute nodes run the application, which calls the I/O library (HDF5), which calls the parallel I/O library (MPI-I/O), which goes through the switch network and I/O servers to the parallel file system (GPFS), and finally to the disk architecture and the layout of data on disk.

h5perf
An I/O performance measurement tool. It tests 3 file I/O APIs:
- POSIX I/O (open/write/read/close)
- MPI-IO (MPI_File_{open,write,read,close})
- PHDF5:
  - H5Pset_fapl_mpio (using MPI-IO)
  - H5Pset_fapl_mpiposix (using POSIX I/O)

h5perf: some features
- Check (-c): verify data correctness
- 2-D chunk patterns added in v1.8

My PHDF5 application I/O inhales
If my application's I/O performance is bad, what can I do?
- Use larger I/O data sizes
- Independent vs. collective I/O
- Specific I/O system hints
- Parallel file system limits

Independent vs. collective access
- A user reported that independent data transfer was much slower than collective mode
- The data array was tall and thin: 230,000 rows by 6 columns

Independent vs. collective write (6 processes, IBM p-690, AIX, GPFS)

# of Rows   Data Size (MB)   Independent (s)   Collective (s)
16384       0.25             8.26              1.72
32768       0.50             65.12             1.80
65536       1.00             108.20            2.68
122918      1.88             276.57            3.11
150000      2.29             528.15            3.63
180300      2.75             881.39            4.12
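A quick calculation over the table above shows how large the gap is (plain C, no HDF5 calls; the times are copied from the table):

```c
/* Independent vs. collective write times from the table above, in
 * seconds, one entry per row count (16384 ... 180300 rows).
 * speedup(i) is how much faster collective mode was for entry i. */
static const double independent_s[] = {8.26, 65.12, 108.20, 276.57, 528.15, 881.39};
static const double collective_s[]  = {1.72, 1.80, 2.68, 3.11, 3.63, 4.12};

double speedup(int i) { return independent_s[i] / collective_s[i]; }
```

The speedup grows with data size: roughly 5x at 0.25 MB and over 200x at 2.75 MB, which is why the tall-and-thin transfer was so much faster in collective mode.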

Independent vs. collective write (6 processes, IBM p-690, AIX, GPFS)
(Plot: time in seconds versus data-space size in MB for the non-contiguous pattern. Independent-mode time climbs toward 900 s at 2.75 MB, while collective mode stays below 5 s throughout.)

Some performance results
Comparison of:
1. A parallel version of NetCDF-3 from ANL/Northwestern University/University of Chicago (PnetCDF)
2. HDF5 parallel library 1.6.5
3. NetCDF-4 beta1
For more details see http://www.hdfgroup.uiuc.edu/papers/papers/parallelperformance.pdf

HDF5 and PnetCDF performance comparison
- Flash I/O website: http://flash.uchicago.edu/~zingale/flash_benchmark_io/
- Robb Ross et al., "Parallel NetCDF: A Scientific High-Performance I/O Interface"

HDF5 and PnetCDF performance comparison
(Two plots of the Flash I/O benchmark, checkpoint files: bandwidth in MB/s versus number of processors, comparing PnetCDF with HDF5 independent mode. Left: Bluesky, Power 4, 10 to 160 processors, bandwidth axis up to 60 MB/s. Right: up, Power 5, 10 to 310 processors, bandwidth axis up to 2500 MB/s.)

HDF5 and PnetCDF performance comparison
(Two plots of the Flash I/O benchmark, checkpoint files: bandwidth in MB/s versus number of processors, comparing PnetCDF, HDF5 collective, and HDF5 independent modes. Left: Bluesky, Power 4, 10 to 160 processors. Right: up, Power 5, 10 to 310 processors.)

Parallel NetCDF-4 and PnetCDF
(Plot: bandwidth in MB/s versus number of processors, 0 to 144, comparing PnetCDF from ANL with NetCDF-4; fixed problem size = 995 MB.) The performance of parallel NetCDF-4 is close to that of PnetCDF.

HDF5 chunked dataset
- The dataset is partitioned into fixed-size chunks
- Data can be added along any dimension
- Compression is applied to each chunk
- Datatype conversion is applied to each chunk
- Chunked storage creates additional overhead in a file
- Do not use small chunks

Writing a chunked dataset
- Each chunk is written as a contiguous blob
- Chunks may be scattered all over the file
- Compression is performed when a chunk is evicted from the chunk cache
- Other filters are applied when data goes through the filter pipeline (e.g. encryption)
(Diagram: chunks A, B, C of the dataset pass through the chunk cache and the filter pipeline on their way to the file.)

Writing chunked datasets
- The default chunk cache size is 1 MB
- The size of the chunk cache is set per file
- Each chunked dataset has its own chunk cache
- A chunk may be too big to fit into the cache
- Memory may grow if the application keeps opening datasets
(Diagram: application memory holds the metadata cache, with dataset headers and chunking B-tree nodes, plus a chunk cache per chunked dataset.)

Partial I/O for a chunked dataset
Build a list of chunks and loop through the list. For each chunk:
1. Bring the chunk into memory
2. Map the selection in memory to a selection in the file
3. Gather elements into the conversion buffer and perform datatype conversion
4. Scatter elements back to the chunk
Filters (compression) are applied when the chunk is flushed from the chunk cache. For each element, 3 memcpys are performed.
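Building the chunk list in step 1 boils down to chunk-index arithmetic. A plain-C sketch for a 2-D dataset (illustrative only; the real library locates chunks through its chunking B-tree):

```c
/* For one dimension: which chunk indices does the selection
 * [start, start + count) touch, given a fixed chunk size? */
static int first_chunk(int start, int chunk) { return start / chunk; }
static int last_chunk(int start, int count, int chunk) {
    return (start + count - 1) / chunk;
}

/* The number of chunks a 2-D hyperslab touches is the product of the
 * per-dimension chunk counts; each of these chunks must be brought
 * into memory during partial I/O. */
int chunks_touched_2d(int start0, int count0, int chunk0,
                      int start1, int count1, int chunk1) {
    int n0 = last_chunk(start0, count0, chunk0) - first_chunk(start0, chunk0) + 1;
    int n1 = last_chunk(start1, count1, chunk1) - first_chunk(start1, chunk1) + 1;
    return n0 * n1;
}
```

For example, a 10x10 selection at the origin of a dataset with 10x10 chunks touches exactly one chunk, while the same selection shifted by (5,5) touches four, quadrupling the chunks that must pass through the cache.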

Partial I/O for a chunked dataset
(Diagram: elements participating in the I/O are gathered, via memcpy, from the application buffer into the corresponding chunk in application memory.)

Partial I/O for a chunked dataset
(Diagram: data is gathered from the chunk in the chunk cache into the conversion buffer, converted, and scattered back. On eviction from the cache, the chunk is compressed and written to the file.)

Chunking and selections
- Great performance: the selection coincides with a chunk
- Poor performance: the selection spans all chunks

Things to remember about HDF5 chunking
- Use appropriate chunk sizes
- Make sure that the cache is big enough to contain chunks for partial I/O
- Use hyperslab selections that are aligned with chunks
- Memory may grow when an application opens and modifies a lot of chunked datasets
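The second point, checking that a chunk fits in the (default 1 MB) chunk cache, is simple arithmetic. A plain-C sketch with illustrative sizes; the cache-size constant matches the default quoted earlier in the deck:

```c
#include <stddef.h>

/* Default per-dataset chunk cache size quoted on the "Writing chunked
 * datasets" slide: 1 MB. */
#define DEFAULT_CHUNK_CACHE_BYTES ((size_t)1024 * 1024)

/* Size in bytes of one 2-D chunk. */
size_t chunk_bytes(size_t dim0, size_t dim1, size_t elem_size) {
    return dim0 * dim1 * elem_size;
}

/* A chunk larger than the cache is re-read and re-compressed on every
 * partial access, so this check is worth doing before picking sizes. */
int chunk_fits_in_cache(size_t dim0, size_t dim1, size_t elem_size) {
    return chunk_bytes(dim0, dim1, elem_size) <= DEFAULT_CHUNK_CACHE_BYTES;
}
```

For instance, a 256x256 chunk of 4-byte floats (256 KB) fits comfortably, while a 1024x1024 chunk of the same type (4 MB) does not, and would need a larger cache.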

Variable-length datasets and I/O
Examples of variable-length data:
- Strings:
  - A[0] = "the first string we want to write"
  - ...
  - A[N-1] = "the N-th string we want to write"
- Records of variable length:
  - A[0] = (1,1,0,0,0,5,6,7,8,9): the length of the first record is 10
  - A[1] = (0,0,110,2005)
  - ...
  - A[N] = (1,2,3,4,5,6,7,8,9,10,11,12,...,M): the length of the (N+1)-th record is M

Variable-length datasets and I/O
Variable-length data is described in an HDF5 application as:

typedef struct {
    size_t length;
    void  *p;
} hvl_t;

- The base type can be any HDF5 type: H5Tvlen_create(base_type)
- ~20 bytes of overhead for each element
- Raw data cannot be compressed
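A plain-C illustration of the layout above: vl_elem mirrors hvl_t (renamed here so the sketch does not clash with the real HDF5 header), and a ragged integer array is described by length/pointer pairs before it would be handed to the library:

```c
#include <stddef.h>

/* Same layout as HDF5's hvl_t from the slide above: each element
 * records its length and a pointer to the actual data. (Plain-C
 * sketch; no HDF5 calls are made.) */
typedef struct {
    size_t length; /* number of base-type elements in this record */
    void  *p;      /* pointer to the record's data */
} vl_elem;

/* Total number of base-type elements across all records, e.g. to size
 * a flat buffer when converting to fixed-length storage. */
size_t total_elements(const vl_elem *v, size_t n) {
    size_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += v[i].length;
    return total;
}
```

With the ~20-byte per-element overhead noted above, many short records cost disproportionately more than one long one, which motivates the fixed-length-plus-compression hint later in this section.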

Variable-length datasets and I/O
(Diagram: elements in the application buffer point to global heaps in the file, where the actual raw data is stored.)

VL chunked dataset in a file
(Diagram: the dataset header and chunking B-tree locate the dataset chunks in the file; the variable-length raw data itself is stored elsewhere in the file.)

Variable-length datasets and I/O: hints
- Avoid closing and reopening a file while writing VL datasets:
  - global heap information is lost
  - global heaps may have unused space
- Avoid writing VL datasets interchangeably: data from different datasets will be written to the same heap
- If the maximum length of the record is known, use fixed-length records and compression

Crash-proofing

Why crash proofing?
- HDF5 applications tend to run for long times (sometimes until the system crashes)
- An application crash may leave the HDF5 file in a corrupted state
- Currently there is no way to recover the data
- This is one of the main obstacles for production codes that use NetCDF-3 to move to NetCDF-4
- Funded by the ASC project
- A prototype release is scheduled for the end of 2007

HDF5 solution
- Journaling:
  - Modifications to HDF5 metadata are stored in an external journal file
  - HDF5 will use asynchronous writes to the journal file for efficiency
- Recovering after a crash:
  - An HDF5 recovery tool will replay the journal and apply all metadata writes, bringing the HDF5 file to a consistent state
  - Raw data will consist of the data that made it to disk
- The solution will be applicable to both sequential and parallel modes

Thank you! Questions?