SDS: A Framework for Scientific Data Services

SDS: A Framework for Scientific Data Services
Bin Dong, Suren Byna*, John Wu
Scientific Data Management Group, Lawrence Berkeley National Laboratory

Finding Newspaper Articles of Interest
- In the old era of print newspapers, finding news articles of interest meant turning pages and reading headlines; searching for specific news was time consuming
- Online newspapers added a search box, but each search is limited to one newspaper
- Customized news goes further: articles are organized based on the reader's interests, and numerous online newspapers are searched at once -> better organization, faster access to news

Scientific Discovery Needs Fast Data Analysis
- Modern scientific discoveries are driven by massive data: simulations and experiments in many domains produce terabytes to petabytes (LSST image simulator; LHC Higgs boson search; light source experiments at LCLS, ALS, SNS, etc.; NCAR's CESM data)
- Fast data analysis is an important requirement for discovery
- Data is typically stored as files on disks managed by file systems, so high read performance from storage is critical
Image credits: http://www.science.tamu.edu/articles/911, http://www.lsst.org/lsst/gallery/data, http://nsidc.org/icelights/category/research-2/

A Generic Data Analysis Use Case
- Simulation site: an exascale simulation machine (with some analysis on the machine itself, e.g. in situ pattern identification) writing to parallel storage and an archive
- Experiment/observation site: instruments feed a processing machine, (parallel) storage, and an archive
- Analysis sites: analysis machines with shared storage
- Data must be reduced and prepared at the producing sites for further exploratory analysis (e.g., data mining): move only TBs from the simulation sites, not the EBs and PBs produced

Current Practice: Immutable Data in File Systems
- Data stored on file systems is immutable: after the producers (simulations and experiments) write it, the file format and organization stay unchanged over time
- Data lives in files that the system treats as a sequence of bytes; user code (and libraries) has to impose meaning on the file layout and conduct the analyses
- The burden of optimizing read performance therefore falls on the application developer, and the optimizations are complex because the parallel I/O software stack has several layers:
  Applications -> High-level I/O libraries and formats -> I/O optimization layers -> Parallel file system -> Storage hardware

Changing the Layout of Data on File Systems Is Helpful
- Reorganization separates the logical data organization from the physical one
- Example: access all records where 1.15 < Energy < 1.4
- On the original layout, a full scan or random accesses to non-contiguous data locations result in poor performance:

    Energy      X          Y          Z
    1.1692   165.834   -28.0448   -2.76863
    2.5364   166.249   -27.9321   -2.87165
    1.4364   166.538   -27.9735   -3.16969
    1.0862   166.663   -27.9769   -2.79716
    1.2862   166.621   -27.9046   -2.78382
    1.5862   167.001   -27.9366   -3.09605
    1.3173   167.222   -27.9551   -3.22406

- After sorting on Energy, the same query reads one contiguous chunk of data, which performs well (illustrated below):

    Energy      X          Y          Z
    1.0862   166.663   -27.9769   -2.79716
    1.1692   165.834   -28.0448   -2.76863
    1.2862   166.621   -27.9046   -2.78382
    1.3173   167.222   -27.9551   -3.22406
    1.4364   166.538   -27.9735   -3.16969
    1.5862   167.001   -27.9366   -3.09605
    2.5364   166.249   -27.9321   -2.87165

- Several studies show that reorganization helps; some are listed in the related-work section
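As a minimal illustration of why the sorted layout wins, the sketch below (plain C, simplified; not SDS code) contrasts a full scan with a binary search for the lower bound of 1.15 < Energy < 1.4 on the sorted Energy column, after which all matches form one contiguous block:

    #include <stdio.h>

    /* Full scan: touches every element regardless of selectivity. */
    static size_t full_scan(const double *e, size_t n, double lo, double hi)
    {
        size_t count = 0;
        for (size_t i = 0; i < n; i++)
            if (e[i] > lo && e[i] < hi)
                count++;
        return count;
    }

    /* Sorted data: binary-search the first element > lo; the matches
     * then form a single contiguous run -- one sequential read. */
    static size_t lower_bound(const double *e, size_t n, double lo)
    {
        size_t l = 0, r = n;
        while (l < r) {
            size_t m = l + (r - l) / 2;
            if (e[m] <= lo) l = m + 1; else r = m;
        }
        return l;
    }

    int main(void)
    {
        double energy[] = {1.0862, 1.1692, 1.2862, 1.3173,
                           1.4364, 1.5862, 2.5364};  /* sorted column */
        size_t n = sizeof energy / sizeof energy[0];

        size_t first = lower_bound(energy, n, 1.15), count = 0;
        while (first + count < n && energy[first + count] < 1.4)
            count++;

        printf("scan hits=%zu, sorted hits=rows %zu..%zu\n",
               full_scan(energy, n, 1.15, 1.4), first, first + count - 1);
        return 0;
    }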

Database Systems for Scientific Data Management
- Database management systems perform various optimizations without user involvement
- Many efforts from the database community target scientific data: Objectivity/DB, ArrayQL, SciQL, SciDB, etc.
- SciDB, a database system for scientific applications; key features: array-oriented data model, append-only storage, first-class support for user-defined functions, massively parallel computation

More DB Optimizations, but...
- Database systems apply various optimization tricks automatically: caching frequently accessed data, using indexes, storing materialized views
- However, preparing and loading scientific data to fit database schemas is cumbersome
- Scientific data management requirements differ from those of a DBMS: established data formats (HDF5, NetCDF, ROOT, FITS, ADIOS-BP, etc.), arrays as first-class citizens, highly concurrent applications

Our Solution: Scientific Data Services (SDS)
- Preserve scientific data formats while adding database-style optimizations as services
- A framework for bringing the merits of database technologies to scientific data
- Example services: indexing, sorting, reorganization to improve data locality, compression, remote and selective data movement

First Step: Automatic Data Reorganization
- Finding an optimal data organization strategy: capture common read patterns, estimate the cost of various organization strategies, rank the organizations for each pattern, and select the best one (sketched below)
- Performing data reorganization: as in database management systems, reorganize transparently
- Resolving permissions to the original data: preserve the original permissions when data is reorganized
- Using an optimally reorganized dataset transparently: based on a read pattern, recognize the best available organization, redirect the read to it, and perform any needed post-processing, such as decompression or transposition
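A schematic of the ranking step in C, with a made-up cost model (the talk does not give the real estimator); each candidate organization is costed for the captured read pattern and the cheapest is selected:

    #include <stdio.h>
    #include <float.h>

    /* Candidate organizations -- illustrative, not SDS's actual list. */
    enum org { ORG_ORIGINAL, ORG_SORTED, ORG_TRANSPOSED, ORG_COUNT };
    static const char *org_name[] = { "original", "sorted", "transposed" };

    /* Toy estimator: cost ~ bytes touched; a real model would also
     * account for seeks, stripe alignment, and index lookups. */
    static double est_cost(enum org o, double selectivity, double bytes)
    {
        switch (o) {
        case ORG_ORIGINAL:   return bytes;               /* full scan  */
        case ORG_SORTED:     return bytes * selectivity; /* range read */
        case ORG_TRANSPOSED: return bytes * 0.5;         /* one column */
        default:             return DBL_MAX;
        }
    }

    int main(void)
    {
        double best_cost = DBL_MAX;
        enum org best = ORG_ORIGINAL;
        for (int o = 0; o < ORG_COUNT; o++) {       /* rank, pick best */
            double c = est_cost((enum org)o, 0.046, 1e12);
            if (c < best_cost) { best_cost = c; best = (enum org)o; }
        }
        printf("recommended organization: %s\n", org_name[best]);
        return 0;
    }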

The Design of the SDS Framework
- Initial implementation uses a client-server approach: an SDS Client library in each MPI process, and an SDS Server that finds the best dataset among the available reorganized ones
- Supports the HDF5 read API; an SDS Query API was added for answering range queries
[Figure: each MPI process runs the application over the HDF5 API and SDS Query API on top of the SDS Client; the clients talk over the network to the SDS Server, which uses the parallel file system, a database, and the batch system]

SDS Client Implementation
[Figure: the SDS Client sits under the application's HDF5 API and SDS Query Interface calls and comprises a Parser, Server Connector, Data Reader, and Post Processor; the Server Connector exchanges the file handle and final query string with the SDS Server, and the Data Reader fetches original- or reorganized-file data from the parallel file system]
- HDF5 API: the HDF5 Virtual Object Layer (VOL) feature allows capturing HDF5 calls; we developed a VOL plugin for SDS that captures the file open, read, and close functions (the interception idea is sketched below)
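The VOL connector code itself is not shown in the talk; the sketch below only mimics the interception idea with plain wrapper functions (in the real client, the HDF5 library invokes the registered VOL plugin, not explicit calls like these). All sds_* names are hypothetical:

    #include <hdf5.h>

    /* Hypothetical redirection hooks; a real VOL plugin is called by
     * HDF5 itself when the application uses the ordinary HDF5 API. */
    hid_t sds_file_open(const char *name, unsigned flags)
    {
        /* Ask the SDS Server for a better-organized replica (stubbed:
         * assume the server answered "use the original file"). */
        const char *redirect = name;
        return H5Fopen(redirect, flags, H5P_DEFAULT);
    }

    herr_t sds_dataset_read(hid_t dset, hid_t memtype, hid_t memspace,
                            hid_t filespace, void *buf)
    {
        /* A real plugin could swap `filespace` for a selection on the
         * reorganized dataset and post-process the result afterwards. */
        return H5Dread(dset, memtype, memspace, filespace,
                       H5P_DEFAULT, buf);
    }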

- SDS Query Interface: an interface for performing SQL-style queries on arrays; a function-based API that can be used from C/C++ applications (hypothetical usage sketched below)
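The talk does not reproduce the interface's actual function names, so the program below is a hypothetical rendering of "SQL-style queries on arrays" from C, with a stub standing in for the real client library:

    #include <stdio.h>
    #include <stdlib.h>

    /* Hypothetical SDS Query API -- names and types are illustrative. */
    typedef struct { size_t nhits; } sds_result;

    /* Stub standing in for the real client: parse the predicate,
     * contact the SDS Server, read the best replica, return hits. */
    static sds_result *sds_query(const char *file, const char *dataset,
                                 const char *predicate)
    {
        (void)file; (void)dataset; (void)predicate;
        sds_result *r = malloc(sizeof *r);
        r->nhits = 0;  /* real code would fill this via the Data Reader */
        return r;
    }

    int main(void)
    {
        sds_result *r = sds_query("particles.h5", "/Step#0/Energy",
                                  "Energy > 1.15 && Energy < 1.4");
        printf("matches: %zu\n", r->nhits);
        free(r);
        return 0;
    }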

- Parser: checks the conditions in a query and verifies the validity of the files

- Server Connector: packages a query or HDF5 read-call information and sends it to the SDS Server, using protocol buffers for communication; MPI rank 0 communicates with the server and then informs the remaining MPI processes (see the sketch below)
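The rank-0 pattern can be sketched as follows; the server exchange is stubbed out, and the reply format is invented for illustration (the real connector serializes with protocol buffers over the network):

    #include <mpi.h>
    #include <string.h>

    /* Rank 0 talks to the SDS Server (stubbed); everyone else waits
     * for the answer, avoiding N simultaneous server connections. */
    static void ask_server(const char *query, char *reply, int len)
    {
        (void)query;  /* real code sends the serialized request here */
        strncpy(reply, "/scratch/reorg/particles_sorted.h5", len - 1);
        reply[len - 1] = '\0';
    }

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);
        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        char location[256] = {0};
        if (rank == 0)
            ask_server("Energy > 1.15 && Energy < 1.4",
                       location, sizeof location);

        /* One broadcast distributes the server's answer to all ranks. */
        MPI_Bcast(location, sizeof location, MPI_CHAR, 0, MPI_COMM_WORLD);

        /* ... every rank now opens `location` and reads its share ... */
        MPI_Finalize();
        return 0;
    }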

- Data Reader: reads data from the dataset location returned by the SDS Server, using the native HDF5 read calls (example below)
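A minimal native-HDF5 read of a contiguous row range, the kind of call the Reader can issue once the server has pointed it at a sorted replica (file and dataset names are made up; error checks omitted for brevity):

    #include <hdf5.h>
    #include <stdlib.h>

    /* Read rows [first, first+count) of a 1-D double dataset via a
     * hyperslab selection -- one contiguous region on sorted data. */
    static double *read_rows(const char *file, const char *dset_name,
                             hsize_t first, hsize_t count)
    {
        hid_t file_id = H5Fopen(file, H5F_ACC_RDONLY, H5P_DEFAULT);
        hid_t dset    = H5Dopen2(file_id, dset_name, H5P_DEFAULT);
        hid_t fspace  = H5Dget_space(dset);

        H5Sselect_hyperslab(fspace, H5S_SELECT_SET, &first, NULL,
                            &count, NULL);
        hid_t mspace = H5Screate_simple(1, &count, NULL);

        double *buf = malloc(count * sizeof *buf);
        H5Dread(dset, H5T_NATIVE_DOUBLE, mspace, fspace,
                H5P_DEFAULT, buf);

        H5Sclose(mspace); H5Sclose(fspace);
        H5Dclose(dset);   H5Fclose(file_id);
        return buf;
    }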

- Post Processor: performs any post-processing needed before returning the data to application memory, e.g. decompression or transposition (a decompression sketch follows)
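For instance, if a replica were stored zlib-compressed, the post-processing step would amount to an uncompress call before the data reaches application memory (a sketch; the codecs SDS actually uses are not specified in the talk):

    #include <zlib.h>

    /* Decompress a zlib-compressed replica chunk into the caller's
     * buffer; `out_len` is the expected uncompressed size. */
    static int sds_post_decompress(const unsigned char *in,
                                   unsigned long in_len,
                                   unsigned char *out,
                                   unsigned long out_len)
    {
        uLongf dest_len = out_len;
        int rc = uncompress(out, &dest_len, in, in_len);
        return (rc == Z_OK && dest_len == out_len) ? 0 : -1;
    }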

SDS Server Implementation
[Figure: the SDS Server comprises a Request Dispatcher, Query Evaluator, Reorganization Evaluator, Data Organization Recommender, Data Reorganizer, Read Statistics Manager, and SDS Metadata store, plus an SDS Admin interface; the Data Reorganizer submits reorganization job scripts to the batch system, and reorganized files live alongside the originals on the parallel file system]

- Request Dispatcher: receives requests from SDS Clients and from the SDS Admin interface (which issues reorganization commands), and passes each request on to the Query Evaluator or the Reorganization Evaluator

- Query Evaluator: looks up the SDS Metadata to find the available reorganized datasets, and their locations, for a given dataset; the SDS Metadata records the file name, HDF5 dataset info, and permissions

- Reorganization Evaluator: decides whether to reorganize based on the frequency of read accesses, takes commands from the Admin interface, and instructs the Data Reorganizer to create a reorganization job script

- Data Organization Recommender: identifies the optimal data reorganization and informs the Reorganization Evaluator of the selected strategy

- Data Reorganizer: locates the reorganization code (e.g., sorting or indexing algorithms), decides how many cores to use, prepares a batch job script (sketched below), monitors the job's execution, and, after reorganization, stores the new data location with the SDS Metadata Manager
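A sketch of the job-script step, assuming a PBS-style batch system like the one on Hopper; the queue name, script path, and the sds_reorganize tool are all hypothetical:

    #include <stdio.h>
    #include <stdlib.h>

    /* Emit and submit a batch script that sorts one dataset; the real
     * Data Reorganizer also picks core counts and tracks job state. */
    static int submit_reorg_job(const char *file, const char *dset,
                                int cores)
    {
        FILE *f = fopen("/tmp/sds_reorg.pbs", "w");
        if (!f) return -1;
        fprintf(f,
            "#PBS -q regular\n"
            "#PBS -l mppwidth=%d\n"
            "aprun -n %d sds_reorganize --sort %s %s\n",
            cores, cores, file, dset);
        fclose(f);
        return system("qsub /tmp/sds_reorg.pbs");
    }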

Performance Evaluation: Setup
- NERSC Hopper supercomputer: 6,384 compute nodes, each with two twelve-core 2.1 GHz AMD 'Magny-Cours' processors and 32 GB of DDR3-1333 memory; Gemini interconnect with a 3D torus topology; Lustre parallel file system with 156 OSTs at a peak bandwidth of 35 GB/s
- Software: Cray's MPI library; the HDF5 Virtual Object Layer code base
- Data: particle data from a plasma physics simulation (VPIC) of the magnetic reconnection phenomenon in space weather

Performance Evaluation: Overhead of SDS
- The SDS Metadata Manager uses Berkeley DB to store the metadata of reorganized data (see the sketch below)
- We measured the response time of multiple concurrent SDS Clients requesting metadata from one SDS Server
- Overhead is low: under 0.5 seconds with 240 concurrent clients
[Figure: time (sec) vs. number of concurrent clients (40-320), comparing HDF5 open+close against the SDS metadata read]
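Berkeley DB's C API makes the metadata store a simple key-value lookup; a sketch of such a lookup follows, with a key/value layout invented for illustration (the actual SDS schema is not given in the talk):

    #include <db.h>
    #include <string.h>

    /* Look up the reorganized-replica record for an original file
     * path in an already-opened Berkeley DB handle. */
    static int lookup_replica(DB *db, const char *orig_path,
                              char *out, unsigned int out_len)
    {
        DBT key, val;
        memset(&key, 0, sizeof key);
        memset(&val, 0, sizeof val);
        key.data  = (void *)orig_path;
        key.size  = (unsigned int)strlen(orig_path) + 1;
        val.data  = out;
        val.ulen  = out_len;
        val.flags = DB_DBT_USERMEM;
        return db->get(db, NULL, &key, &val, 0);  /* 0 on a hit */
    }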

Performance Evaluation: Benefits of SDS
- We ran various queries with conditions on particle energy, comparing a full data scan against reading the reorganized data; the benefit varies with the selectivity of the data
- 20X to 50X speedup when data selectivity is below 5%

    Query      Selectivity   Speedup over full scan
    E > 1.10      79%            1.2X
    E > 1.15      32%            2.3X
    E > 1.20      11%            8.1X
    E > 1.25      4.6%          18.9X
    E > 1.30      2.2%          25.7X
    E > 1.35      1.2%          36.3X
    E > 1.40      0.75%         42.0X
    E > 1.45      0.5%          44.8X
    E > 1.50      0.33%         51.2X

Conclusion and Future Work
- File systems on current supercomputers are analogous to print newspapers
- The SDS framework is a vehicle for applying various optimizations: live customization of data organization based on data usage (in progress), while working with existing scientific data formats
- Status: the platform for performing various optimizations is ready, with low overhead and high performance benefits
- Ongoing and future work: dynamic recognition of read patterns, model-driven data organization optimizations, in-memory query execution and query plan optimization, FastBit indexing to support querying, support for more data formats

Thanks! Contact: Suren Byna [SByna@lbl.gov]