Improved Solutions for I/O Provisioning and Application Acceleration

Similar documents
IME (Infinite Memory Engine) Extreme Application Acceleration & Highly Efficient I/O Provisioning

Store Process Analyze Collaborate Archive Cloud The HPC Storage Leader Invent Discover Compete

DDN About Us Solving Large Enterprise and Web Scale Challenges

Leveraging Software-Defined Storage to Meet Today and Tomorrow s Infrastructure Demands

HPC Storage Use Cases & Future Trends

DDN. DDN Updates. Data DirectNeworks Japan, Inc Shuichi Ihara. DDN Storage 2017 DDN Storage

朱义普. Resolving High Performance Computing and Big Data Application Bottlenecks with Application-Defined Flash Acceleration. Director, North Asia, HPC

IME Infinite Memory Engine Technical Overview

DDN. DDN Updates. DataDirect Neworks Japan, Inc Nobu Hashizume. DDN Storage 2018 DDN Storage 1

DDN and Flash GRIDScaler, Flashscale Infinite Memory Engine

Infinite Memory Engine Freedom from Filesystem Foibles

Next-Generation NVMe-Native Parallel Filesystem for Accelerating HPC Workloads

Data Management. Parallel Filesystems. Dr David Henty HPC Training and Support

Using DDN IME for Harmonie

DDN Annual High Performance Computing Trends Survey Reveals Rising Deployment of Flash Tiers & Private/Hybrid Clouds vs.

Accelerating Spectrum Scale with a Intelligent IO Manager

IBM FlashSystem. IBM FLiP Tool Wie viel schneller kann Ihr IBM i Power Server mit IBM FlashSystem 900 / V9000 Storage sein?

Smart Trading with Cray Systems: Making Smarter Models + Better Decisions in Algorithmic Trading

Optimizing Tier-1 Enterprise Storage for Solid State Memory

Analyzing the High Performance Parallel I/O on LRZ HPC systems. Sandra Méndez. HPC Group, LRZ. June 23, 2016

Session 201-B: Accelerating Enterprise Applications with Flash Memory

PCIe Storage Beyond SSDs

Accelerating Real-Time Big Data. Breaking the limitations of captive NVMe storage

LustreFS and its ongoing Evolution for High Performance Computing and Data Analysis Solutions

Data Movement & Tiering with DMF 7

BIG DATA AND HADOOP ON THE ZFS STORAGE APPLIANCE

Applying DDN to Machine Learning

Optimizing the Data Center with an End to End Solutions Approach

The Fusion Distributed File System

Design Considerations for Using Flash Memory for Caching

Tech Talk on HPC. Ken Claffey. VP, Cloud Systems. May 2016

Designing Next Generation FS for NVMe and NVMe-oF

Virtual Storage Tier and Beyond

Tuning I/O Performance for Data Intensive Computing. Nicholas J. Wright. lbl.gov

IBM Spectrum Scale IO performance

Roadmap for Enterprise System SSD Adoption

Hewlett Packard Enterprise HPE GEN10 PERSISTENT MEMORY PERFORMANCE THROUGH PERSISTENCE

NVM Express over Fabrics Storage Solutions for Real-time Analytics

NVMFS: A New File System Designed Specifically to Take Advantage of Nonvolatile Memory

Accelerate Big Data Insights

Copyright 2012, Oracle and/or its affiliates. All rights reserved.

Fusion iomemory PCIe Solutions from SanDisk and Sqrll make Accumulo Hypersonic

Non-Volatile Memory Cache Enhancements: Turbo-Charging Client Platform Performance

Introduction to High Performance Parallel I/O

Nimble Storage vs HPE 3PAR: A Comparison Snapshot

Harmonia: An Interference-Aware Dynamic I/O Scheduler for Shared Non-Volatile Burst Buffers

DDN s Vision for the Future of Lustre LUG2015 Robert Triendl

Universal Storage. Innovation to Break Decades of Tradeoffs VASTDATA.COM

Storage for HPC, HPDA and Machine Learning (ML)

Computational Storage: Acceleration Through Intelligence & Agility

Computer Science Section. Computational and Information Systems Laboratory National Center for Atmospheric Research

Automatic Identification of Application I/O Signatures from Noisy Server-Side Traces. Yang Liu Raghul Gunasekaran Xiaosong Ma Sudharshan S.

2013 AWS Worldwide Public Sector Summit Washington, D.C.

Optimizing Apache Spark with Memory1. July Page 1 of 14

MySQL Performance Optimization and Troubleshooting with PMM. Peter Zaitsev, CEO, Percona

Developing Low Latency NVMe Systems for HyperscaleData Centers. Prepared by Engling Yeo Santa Clara, CA Date: 08/04/2017

Application Performance on IME

Quobyte The Data Center File System QUOBYTE INC.

A Talari Networks White Paper. Turbo Charging WAN Optimization with WAN Virtualization. A Talari White Paper

Analyzing I/O Performance on a NEXTGenIO Class System

Data Analytics and Storage System (DASS) Mixing POSIX and Hadoop Architectures. 13 November 2016

Lustre* is designed to achieve the maximum performance and scalability for POSIX applications that need outstanding streamed I/O.

The Role of Database Aware Flash Technologies in Accelerating Mission- Critical Databases

Lustre2.5 Performance Evaluation: Performance Improvements with Large I/O Patches, Metadata Improvements, and Metadata Scaling with DNE

Introduction to Parallel I/O

Blue Waters I/O Performance

CS 318 Principles of Operating Systems

I/O: State of the art and Future developments

libhio: Optimizing IO on Cray XC Systems With DataWarp

2012 HPC Advisory Council

NetApp: Solving I/O Challenges. Jeff Baxter February 2013

Persistent Memory. High Speed and Low Latency. White Paper M-WP006

InfiniBand Networked Flash Storage

DELL EMC ISILON F800 AND H600 I/O PERFORMANCE

CS6453. Data-Intensive Systems: Rachit Agarwal. Technology trends, Emerging challenges & opportuni=es

PRESERVE DATABASE PERFORMANCE WHEN RUNNING MIXED WORKLOADS

Nimble Storage Adaptive Flash

Parallel File Systems. John White Lawrence Berkeley National Lab

The Intersection of Cloud & Solid State Storage

Solving the I/O bottleneck with Flash

A New Metric for Analyzing Storage System Performance Under Varied Workloads

A Breakthrough in Non-Volatile Memory Technology FUJITSU LIMITED

I/O Profiling Towards the Exascale

Do You Know What Your I/O Is Doing? (and how to fix it?) William Gropp

SFA12KX and Lustre Update

An Exploration into Object Storage for Exascale Supercomputers. Raghu Chandrasekar

Optimization of non-contiguous MPI-I/O operations

A Gentle Introduction to Ceph

Increasing Performance of Existing Oracle RAC up to 10X

New Approach to Unstructured Data

Lessons from Post-processing Climate Data on Modern Flash-based HPC Systems

Evolution of Rack Scale Architecture Storage

All-Flash Storage Solution for SAP HANA:

SGI Overview. HPC User Forum Dearborn, Michigan September 17 th, 2012

VMware Virtual SAN. High Performance Scalable Storage Architecture VMware Inc. All rights reserved.

Solid Access Technologies, LLC

Let s Make Parallel File System More Parallel

I/O CANNOT BE IGNORED

Early Evaluation of the "Infinite Memory Engine" Burst Buffer Solution

The Impact of SSDs on Software Defined Storage

Transcription:

1 Improved Solutions for I/O Provisioning and Application Acceleration August 11, 2015 Jeff Sisilli Sr. Director Product Marketing jsisilli@ddn.com

2 Why Burst Buffer? The Supercomputing Tug-of-War A supercomputer is a device for turning compute-bound problems into I/O bound problems DDN s storage mission is to eliminate I/O-bound problems and revert them back to compute-bound ones Ken Batcher, Emeritus Professor of Computer Science, Kent State University The Divide Driving Exascale Innovation Alex Bouzari, CEO & Founder, DDN COMPUTE ACCELERATION STORAGE ACCELERATION

3 PAIN: Problem I/O Bound Applications Longstanding PFS I/O Bottlenecks Must Be Eliminated TAKEAWAY 1. Problem applications are a huge source of known pain in HPC HOW A BURST BUFFER HELPS 1. Accelerates applications and returns time available for computation by orders of magnitude

4 Polling HPC TOP500 Sites RFP Mindshare of Various Flash-based I/O Acceleration Technologies

5 Current Motivation for I/O Accelerators Faster Time to Results vs. Gaining New Efficiencies

6 The Next I/O Provisioning Revolution: Decoupling Physical Storage from Compute Resources! View Slide in Full Screen BEFORE EVERYTHING IS OVERPROVISIONED Too Many: COMPUTE NODES DISKS, NETWORKING NODES, ARRAYS, ADMIN, H/W AFTER A LOT MORE SPEED TO THE APPLICATION & A LOT LESS COMPONENTS Much Fewer: COMPUTE NODES DISKS, NETWORKING NODES, ARRAYS, ADMIN, H/W

7 Even Building the World s Fastest PFS... Will NOT Fix These I/O Challenges PFS Locking Storage Latency Fragmented I/O Patterns Out of Core Data Limited Power PFS are not designed for today s mixed I/O & ensembles HDD seek times & network traversing add latency Mal-aligned apps slow down the PFS & entire cluster Many datasets are too big for expensive DRAM Exascale or next scale needs more space & power No matter how many HDDs you add to a PFS, you can t break I/O bottlenecks without a burst buffer

8 Burst Buffer & Beyond: IME

9 The New I/O Acceleration Architecture TM AIM - Acceleration IN Memory View Slide in Full Screen Introducing AIM, An Active I/O Tier, inserted right between compute and your PFS ACTIVE I/O TIER ARCHIVE STORAGE DISK/TAPE TIER Intelligent IME software virtualizes disparate NVMe SSDs into a single pool of shared memory that accelerates I/O, PFS & Applications IME I/O APPLIANCES OBJECT STORAGE & TAPE LIBRARIES

10 Take AIM & Target I/O Bottlenecks With IME Eliminate Overprovisioning & Storage Sprawl Slow PFS I/O Bound Applications Long Checkpoints POSIX Locking HDD Latency Big Footprints Large Datasets Low Capacity HDDs Latency High Power

11 Introducing IME Key Components and Operations View Slide in Full Screen A thin IME Client resides on compute nodes or I/O nodes within the cluster. Client overloads (intercepts) I/O calls Traditional Lustre (or GPFS ) Client model with MDS (or NSD) interface The Parallel File System (PFS) is unaware that IME may or may not be an intermediary POSIX MPI-IO Application interfaces can include: POSIX, MPI-IO, ROMIO, HDF5, NetCDF, etc. Application is unaware that it it accessing PFS or burst buffer. Application does not require modification of any type Low-level communications key value pair based protocol; is not POSIX. This allows for breakthrough performance and scalability IME Server software does the heavy lifting of managing the state

12 IME: A Burst Buffer & Way, Way, Way Beyond Game Changing, Enabling Technology Cache is only the beginning. Right out of the box, IME does so much more... BURST BUFFER PFS ACCELERATOR APP OPTIMIZER DRAM EXTENDER + + + Most cost & space efficient way to provision peak performance Finally breaks POSIX locking bottleneck with instant open/close Dynamically aligns mal-formed I/O into striped writes without code mods No dataset is too big with TBs or even PBs of fast, cost efficient NVMe

13 New Considerations for Architecting I/O Performance

14 Game Changing Bandwidth IME Disrupts How Performance is Provisioned View Slide in Full Screen IME introduces a more efficient way to provision performance than just storage arrays alone BEFORE IME: PFS Systems were designed to handle the entire performance load This required lots of storage controllers, enclosures and drives to deliver full bandwidth STORAGE BANDWIDTH UTILIZATION which was rarely OF A used MAJOR... HPC PRODUCTION STORAGE SYSTEM 99% of the time < 33% of max 70% of the time < 5% of max TODAY, WITH IME: IME s BURST BUFFER Absorbs the Peak Load PARALLEL FILE SYSTEM Handles the Sustained Load IME enables peak performance to be provisioned with much less hardware, power, space

15 IME Accelerates I/O in Several Ways Problem Application Case Study: S3D 1) MITIGATES POOR PFS PERFORMANCE caused by PFS locking, small I/O, and mal-aligned, fragmented I/O patterns. IME makes bad apps run well and also prevents a poor-behaving app from impacting the entire supercomputer. This is especially valuable to diverse workload environments and ISV applications. 25 MB/s 50 GB/s 4 GB/s 10x View Slide in Full Screen 2) PROVIDES HIGHER PERFORMANCE I/O (bandwidth and latency) to the application. Providing additional bandwidth here is relatively inexpensive. Configuring 10x more bandwidth compared to PFS is typical. 3) IME DRIVES I/O MORE EFFICIENTLY TO THE PFS by re-aligning and coalescing data within the non-volatile storage. At SC14, we demonstrated 1000x speed-up on mal-formed I/O when using non-posix low-level communications. 1000x S3D Turbulent Flow Model At SC14, we demonstrated 100x speed-up due to this efficiency. IOR benchmarks show a 3x 20x speedup on I/Os <32KB. 100x

16 How Does IME Help? Disrupts the I/O Provisioning Paradigm & Reduces the Total Cost of Storage 1. IME enables organizations to separate the provisioning of peak & sustained performance requirements with greater operational efficiency and cost savings than utilizing exclusively disk-based parallel file systems STORAGE BANDWIDTH UTILIZATION OF A MAJOR HPC PRODUCTION STORAGE SYSTEM 99% of the time < 33% of max 70% of the time< 5% of max IME Reduces Storage Hardware up to 70% Fewer systems to buy, power manage, maintain

17 How Does IME Help? Increases I/O & Application Performance 2. IME Accelerates applications, especially those with small or mal-aligned I/O for faster time to results & insight

18 Thank You! Keep in touch with us sales@ddn.com @ddn_limitless 2929 Patrick Henry Drive Santa Clara, CA 95054 1.800.837.2298 1.818.700.4000 company/datadirect-networks