JULEA: A Flexible Storage Framework for HPC

Similar documents
NEXTGenIO Performance Tools for In-Memory I/O

A New Key-value Data Store For Heterogeneous Storage Architecture Intel APAC R&D Ltd.

IME (Infinite Memory Engine) Extreme Application Acceleration & Highly Efficient I/O Provisioning

FastForward I/O and Storage: ACG 8.6 Demonstration

Write a technical report Present your results Write a workshop/conference paper (optional) Could be a real system, simulation and/or theoretical

Introduction to HPC Parallel I/O

Structuring PLFS for Extensibility

Next-Generation NVMe-Native Parallel Filesystem for Accelerating HPC Workloads

Lustre overview and roadmap to Exascale computing

Challenges in HPC I/O

Lustre File System. Proseminar 2013 Ein-/Ausgabe - Stand der Wissenschaft Universität Hamburg. Paul Bienkowski Author. Michael Kuhn Supervisor

An Exploration into Object Storage for Exascale Supercomputers. Raghu Chandrasekar

Storage-Based Convergence Between HPC and Big Data

Parallel I/O Libraries and Techniques

Performance Benefits of Running RocksDB on Samsung NVMe SSDs

Crossing the Chasm: Sneaking a parallel file system into Hadoop

Enosis: Bridging the Semantic Gap between

Crossing the Chasm: Sneaking a parallel file system into Hadoop

Understanding Write Behaviors of Storage Backends in Ceph Object Store

CRFS: A Lightweight User-Level Filesystem for Generic Checkpoint/Restart

I/O Profiling Towards the Exascale

LS-DYNA Best-Practices: Networking, MPI and Parallel File System Effect on LS-DYNA Performance

Linux multi-core scalability

VoltDB vs. Redis Benchmark

Data Analytics and Storage System (DASS) Mixing POSIX and Hadoop Architectures. 13 November 2016

Harp-DAAL for High Performance Big Data Computing

Application Performance on IME

Analyzing I/O Performance on a NEXTGenIO Class System

INTRODUCTION TO CEPH. Orit Wasserman Red Hat August Penguin 2017

Fast Forward Storage & I/O. Jeff Layton (Eric Barton)

UK LUG 10 th July Lustre at Exascale. Eric Barton. CTO Whamcloud, Inc Whamcloud, Inc.

BeoLink.org. Design and build an inexpensive DFS. Fabrizio Manfredi Furuholmen. FrOSCon August 2008

Decentralized Distributed Storage System for Big Data

EIOW Exa-scale I/O workgroup (exascale10)

Exploiting Weather & Climate Data at Scale (WP4)

COSC 6397 Big Data Analytics. Distributed File Systems (II) Edgar Gabriel Fall HDFS Basics

An Evolutionary Path to Object Storage Access

The State and Needs of IO Performance Tools

Two-Choice Randomized Dynamic I/O Scheduler for Object Storage Systems. Dong Dai, Yong Chen, Dries Kimpe, and Robert Ross

Strata: A Cross Media File System. Youngjin Kwon, Henrique Fingler, Tyler Hunt, Simon Peter, Emmett Witchel, Thomas Anderson

Parallelizing Inline Data Reduction Operations for Primary Storage Systems

Storage Virtualization. Eric Yen Academia Sinica Grid Computing Centre (ASGC) Taiwan

ASPERA HIGH-SPEED TRANSFER. Moving the world s data at maximum speed

I/O at the German Climate Computing Center (DKRZ)

Advanced Data Placement via Ad-hoc File Systems at Extreme Scales (ADA-FS)

Using DDN IME for Harmonie

DataMods. Programmable File System Services. Noah Watkins

Can Parallel Replication Benefit Hadoop Distributed File System for High Performance Interconnects?

Analyzing the High Performance Parallel I/O on LRZ HPC systems. Sandra Méndez. HPC Group, LRZ. June 23, 2016

Outline 1 Motivation 2 Theory of a non-blocking benchmark 3 The benchmark and results 4 Future work

Data Processing at the Speed of 100 Gbps using Apache Crail. Patrick Stuedi IBM Research

Introduction to High Performance Parallel I/O

Overview of Tianhe-2

EsgynDB Enterprise 2.0 Platform Reference Architecture

SR-IOV Support for Virtualization on InfiniBand Clusters: Early Experience

A Plugin-based Approach to Exploit RDMA Benefits for Apache and Enterprise HDFS

Quobyte The Data Center File System QUOBYTE INC.

New User Seminar: Part 2 (best practices)

MOHA: Many-Task Computing Framework on Hadoop

SGI Overview. HPC User Forum Dearborn, Michigan September 17 th, 2012

Tracing and Visualization of Energy Related Metrics

Effective Use of CSAIL Storage

Aerie: Flexible File-System Interfaces to Storage-Class Memory [Eurosys 2014] Operating System Design Yongju Song

DDN s Vision for the Future of Lustre LUG2015 Robert Triendl

-Presented By : Rajeshwari Chatterjee Professor-Andrey Shevel Course: Computing Clusters Grid and Clouds ITMO University, St.

Fast Forward I/O & Storage

Thrill : High-Performance Algorithmic

Exploring cloud storage for scien3fic research

Dell EMC Ready Bundle for HPC Digital Manufacturing Dassault Systѐmes Simulia Abaqus Performance

Kinetic Open Storage Platform: Enabling Break-through Economics in Scale-out Object Storage PRESENTATION TITLE GOES HERE Ali Fenn & James Hughes

Coordinating Parallel HSM in Object-based Cluster Filesystems

A Breakthrough in Non-Volatile Memory Technology FUJITSU LIMITED

Optimizing Local File Accesses for FUSE-Based Distributed Storage

Open Source Storage. Ric Wheeler Architect & Senior Manager April 30, 2012

HIGH-PERFORMANCE STORAGE FOR DISCOVERY THAT SOARS

FUJITSU PHI Turnkey Solution

Chelsio Communications. Meeting Today s Datacenter Challenges. Produced by Tabor Custom Publishing in conjunction with: CUSTOM PUBLISHING

Compute Node Linux: Overview, Progress to Date & Roadmap

Red Hat Roadmap for Containers and DevOps

Online Monitoring of I/O

Data Processing at the Speed of 100 Gbps using Apache Crail. Patrick Stuedi IBM Research

Jessica Popp Director of Engineering, High Performance Data Division

an Object-Based File System for Large-Scale Federated IT Infrastructures

Percipient StorAGe for Exascale Data Centric Computing Computing for the Exascale

Why software defined storage matters? Sergey Goncharov Solution Architect, Red Hat

Hyper-converged infrastructure with Proxmox VE virtualization platform and integrated Ceph Storage.

Feedback on BeeGFS. A Parallel File System for High Performance Computing

New Oracle NoSQL Database APIs that Speed Insertion and Retrieval

DataMods: Programmable File System Services

SUSE. High Performance Computing. Eduardo Diaz. Alberto Esteban. PreSales SUSE Linux Enterprise

This module reviews the core syntax and features of the C# programming language. It also provides an introduction to the Visual Studio 2012 debugger.

High Performance Data Analytics for Numerical Simulations. Bruno Raffin DataMove

TOOLS FOR IMPROVING CROSS-PLATFORM SOFTWARE DEVELOPMENT

Programming in C# Course: Course Details ABOUT THIS COURSE AUDIENCE PROFILE. Síguenos en:

THE AFS NAMESPACE AND CONTAINERS

Accelerating MPI Message Matching and Reduction Collectives For Multi-/Many-core Architectures Mohammadreza Bayatpour, Hari Subramoni, D. K.

Accelerating MPI Message Matching and Reduction Collectives For Multi-/Many-core Architectures

System that permanently stores data Usually layered on top of a lower-level physical storage medium Divided into logical units called files

Technologies for High Performance Data Analytics

SDP Design for Cloudy Regions

Transcription:

JULEA: A Flexible Storage Framework for HPC Workshop on Performance and Scalability of Storage Systems Michael Kuhn Research Group Scientific Computing Department of Informatics Universität Hamburg 2017-06-22 Michael Kuhn JULEA: A Flexible Storage Framework for HPC 1 / 21

About Us: Scientific Computing Analysis of parallel I/O I/O & energy tracing tools Middleware optimization Alternative I/O interfaces Data reduction techniques Cost & energy efficiency We are an Intel Parallel Computing Center for Lustre ( Enhanced Adaptive Compression in Lustre ) Michael Kuhn JULEA: A Flexible Storage Framework for HPC 2 / 21

1 Introduction and Motivation 2 Flexible Storage Framework for HPC 3 Performance Evaluation 4 Future Work and Summary Michael Kuhn JULEA: A Flexible Storage Framework for HPC 3 / 21

Motivation Hard to try new file system approaches Changes to many different components required File systems are typically monolithic in design Single interface, set of semantics and storage backend Portability is an important factor Two majors problems: 1 Many specialized solutions for particular problems Often based on existing file systems, seldom contributed back 2 Necessary to have complete understanding of the file systems Unnecessary hurdle for young researchers and students Michael Kuhn JULEA: A Flexible Storage Framework for HPC 4 / 21

Motivation... Applications rely on high-level I/O libraries Exchangeability of data is a primary concern Self-describing data formats such as NetCDF and HDF5 Multiple projects investigate integrating I/O libraries and file systems more closely (DAOS, ESiWACE etc.) Hard to achieve with current file systems Requires extensive changes Related research HPC and big data convergence Alternative file system interfaces Michael Kuhn JULEA: A Flexible Storage Framework for HPC 5 / 21

Motivation... Many projects implement basic functionality from scratch Communication, distribution, backends etc. Possible solution is a flexible storage framework Rapid prototyping of new ideas Plugins for interface, storage backend and semantics JULEA is such a framework Supports plugins that are configurable at runtime Provides a convenient framework for research and teaching Existing solutions have different focuses Michael Kuhn JULEA: A Flexible Storage Framework for HPC 6 / 21

Overview Parallel Application NetCDF User Space Kernel Space HDF5 MPI-IO ADIO Lustre ldiskfs Block Storage Parallel Application NetCDF HDF5 JULEA Data and Metadata Stores Block Storage User Space Kernel Space (a) I/O stack commonly found in HPC (b) Proposed I/O stack with JULEA JULEA runs completely in user space High-level libraries and applications can use it directly Michael Kuhn JULEA: A Flexible Storage Framework for HPC 7 / 21

Overview... Possible to offer arbitrary interfaces to applications Traditional file system interfaces and completely new ones Servers are able to use a many existing storage technologies Support for multiple backends to foster experimentation Both clients and backends are easy to integrate and exchange Can be changed at runtime through configuration file Dynamically adaptable semantics for all I/O operations For example, POSIX and MPI-IO on a per-operation basis Michael Kuhn JULEA: A Flexible Storage Framework for HPC 8 / 21

Overview... Metadata Server Server Process Client Application I/O Library JULEA Client Data Server Server Process Applications can use one or more JULEA clients Clients can be used either directly by applications or by adapting I/O libraries to make use of them Servers are split into data and metadata servers Allows tuning the servers for their respective access patterns Michael Kuhn JULEA: A Flexible Storage Framework for HPC 9 / 21

Clients File systems typically offer a single interface Interwoven with the rest of the file system architecture Clients are completely unrestricted regarding their interfaces User space, therefore arbitrary interfaces can be provided Typically problematic for kernel space file systems due to VFS Useful for both applications and I/O libraries For instance, HDF5 directly on top of JULEA Michael Kuhn JULEA: A Flexible Storage Framework for HPC 10 / 21

Backends Separated into data and metadata backends Additionally, client and server backends Data backends manage objects Influenced by file systems (Lustre and OrangeFS), object stores (Ceph s RADOS) and I/O interfaces (MPI-IO) Metadata backends manage key-value pairs Influenced by database (SQLite and MongoDB) and key-value (LevelDB and LMDB) solutions Backends support namespaces Allows multiple clients to co-exist and not interfere Michael Kuhn JULEA: A Flexible Storage Framework for HPC 11 / 21

Semantics Adapt file system to application instead of other way around Operations semantics can be changed at runtime Different categories: atomicity, concurrency, consistency, ordering, persistency and safety Possible to mix the settings for each of these categories Not all combinations might produce reasonable results Templates to emulate existing semantics such as POSIX Clients can fix appropriate semantics or give control to users Michael Kuhn JULEA: A Flexible Storage Framework for HPC 12 / 21

Implementation Modern C11 code Automatic cleanup of variables etc. Open source (LGPL 3.0 or later) 1 Only two mandatory dependencies GLib for data structures, libbson for (de)serialization Clients are provided in the form of shared libraries Allow applications to use multiple clients at the same time Server can function as both a data and metadata server Integrated support for tracing, unit tests etc. 1 https://github.com/wr-hamburg/julea Michael Kuhn JULEA: A Flexible Storage Framework for HPC 13 / 21

Implementation... Client Application Application libjulea-item.so libjulea-item.so libjulea.so libjulea.so libmongodb.so Server mongod Server julea-server libjulea.so libleveldb.so Server julea-server libjulea.so libposix.so Michael Kuhn JULEA: A Flexible Storage Framework for HPC 14 / 21

Implementation... object: direct access to JULEA s data store Able to access arbitrary namespaces Provides abstractions for other clients kv: direct access to JULEA s metadata store Able to access arbitrary namespaces Provides abstractions for other clients item: cloud-like interface Collections and items with flat hierarchy posix: POSIX file system using FUSE Michael Kuhn JULEA: A Flexible Storage Framework for HPC 15 / 21

Implementation... posix: compatibility with existing POSIX file systems, certain functionalities are duplicated gio: uses the GIO library that supports multiple backends of its own (including POSIX, FTP and SSH) lexos: uses LEXOS to provide a light-weight data store null: intended for performance measurements of the overall I/O stack, discards all incoming data leveldb: uses LevelDB for metadata storage mongodb: uses MongoDB, maps key-value pairs to documents Michael Kuhn JULEA: A Flexible Storage Framework for HPC 16 / 21

Evaluation Setup Performance depends on the used data and metadata backends Focus on some general performance aspects Two configurations: local: desktop machine (Intel Xeon E3-1225v3) with a consumer SSD (Samsung SSD 840 EVO) remote: two dual-socket nodes (Intel Xeon X5650) with an HDD (Seagate Barracuda 7200.12), connected via Gbit Ethernet Michael Kuhn JULEA: A Flexible Storage Framework for HPC 17 / 21

Performance Results Storage Backend Operation Perf. (local) Perf. (remote) Create 19,500 ops/s 3,600 ops/s POSIX Delete 29,500 ops/s 4,300 ops/s Data Status 39,500 ops/s 4,500 ops/s Create 49,000 ops/s 5,500 ops/s NULL Delete 49,500 ops/s 5,000 ops/s Status 49,500 ops/s 4,900 ops/s Metadata LevelDB Put 41,500 ops/s 4,300 ops/s Delete 43,000 ops/s 4,300 ops/s MongoDB Put 7,500 ops/s 1,400 ops/s Delete 8,000 ops/s 1,500 ops/s Both: MongoDB much slower than LevelDB Remote: performance mainly limited by network Michael Kuhn JULEA: A Flexible Storage Framework for HPC 18 / 21

Performance Results... Safety Operation Perf. (local) Perf. (remote) None Put 225,000 ops/s 62,000 ops/s Delete 197,000 ops/s 59,500 ops/s Network Put 41,500 ops/s 4,300 ops/s Delete 43,000 ops/s 4,300 ops/s Storage Put 29,500 ops/s 3,800 ops/s Delete 31,000 ops/s 3,900 ops/s Network: operations require one round trip None: performance much higher due to pipelining Michael Kuhn JULEA: A Flexible Storage Framework for HPC 19 / 21

Future Work Basic storage framework and some initial backends finished Implement an HDF5 VOL plugin Map data to objects and metadata to key-value pairs Further extend JULEA s backend support Data backend for Ceph s RADOS, metadata backend for LMDB Further improvements to JULEA s backend interface Should remain stable in the foreseeable future Provide a reliable base for third-party plugins Michael Kuhn JULEA: A Flexible Storage Framework for HPC 20 / 21

Summary JULEA provides a flexible storage framework Contains necessary building blocks for storage systems Facilitates rapid prototyping and evaluation Few dependencies and can be used without system-level access Easy to use on clusters Runs completely in user space Easy to debug and develop Michael Kuhn JULEA: A Flexible Storage Framework for HPC 21 / 21