Leveraging Flash in Scalable Environments: A Systems Perspective on How FLASH Storage is Displacing Disk Storage

Roark Hilomen, Engineering Fellow, Systems & Software Solutions
May 3, 2016

Forward-Looking Statements

During our meeting today, we may make forward-looking statements. Any statement that refers to expectations, projections, or other characterizations of future events or circumstances is a forward-looking statement, including those related to product performance, cost, capacity, various use cases, expectations that flash will displace traditional media and HDD, and deployment of flash with Hadoop. Risks that may cause these forward-looking statements to be inaccurate include, among others: products may not perform as expected, at the cost expected, in the capacity expected, use cases may not apply as expected, flash may not displace traditional media and HDD as expected, and deployment of flash with Hadoop may not continue as expected or at all; or the other risks detailed from time to time in our Securities and Exchange Commission filings and reports, including, but not limited to, our annual report on Form 10-K for the year ended January 3, 2016. This release contains statements from third parties. We undertake no obligation to update these forward-looking statements, which speak only as of the date hereof or the date of issuance by a third party, as applicable.

Speeds and Feeds

Technology      Throughput                 Latency (µs)            AFR (%)  Notes
1GbE Ethernet   80 MB/s                    400+ (40+ with RDMA)    -        IP-based solutions
10GbE Ethernet  800 MB/s                   400+ (40+ with RDMA)    -        IP-based solutions
25GbE Ethernet  ~2 GB/s                    400+ (40+ with RDMA)    -        IP-based solutions
40GbE           ~3+ GB/s                   400+ (40+ with RDMA)    -        IP-based solutions
HDD Enterprise  ~200 MB/s                  4-6 ms / 200-300 IOPS   0.73     Published AFR 0.73% (varies by vendor)
HDD Cloud       ~200 MB/s                  4-6 ms / 100 IOPS       0.73
6G SAS SSD      550+ MB/s                  80-800                  0.15     Dual-port capable *
12G SAS SSD     700 MB/s-1 GB/s per port   80-600                  0.15     Dual-port capable *
PCIe/NVMe SSD   800 MB/s-2+ GB/s           50-200                  0.15
6G SATA SSD     500+ MB/s                  300-800                 0.15     Consumer grade

* Latency is controller bound and GC impacted.

Speeds and Feeds Comparison

Media 1          Media 2          Ratio   Fill time @ 4TB
6G SAS/SATA SSD  HDD              ~2-3x   2 hrs vs 5 ½ hrs
12G SAS SSD      HDD              ~5x     1 hr vs 5 ½ hrs
12G SAS SSD      6G SAS/SATA SSD  ~2x     1 hr vs 2 hrs
PCIe/NVMe        HDD              10x     ½ hr vs 5 ½ hrs
PCIe/NVMe        6G SAS/SATA SSD  ~4x     ½ hr vs 2 hrs
PCIe/NVMe        12G SAS SSD      ~2x     ½ hr vs 1 hr

Transport   Fill time @ 2TB   HDD   6G SAS/SATA   12G SAS   PCIe/NVMe
1GbE IP     ~7 hrs            0.4   0.15          0.08      0.04
10GbE IP    ~40 min           4     1 ½           ~1        0.4
25GbE IP    ~15 min           10    ~3 ½          ~2        ~1
40GbE IP    ~10 min           15    ~5 ½          ~3        ~1 ½
100GbE IP   ~4 min            40    ~14 ½         8         4
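The fill-time and transport figures on this slide appear to follow from dividing capacity by sustained throughput, and from dividing link throughput by device throughput. The sketch below works through that arithmetic in Python. The function names are hypothetical, the throughput values are single representative points picked from the ranges on the "Speeds and Feeds" slide (an assumption, since the slides quote ranges), and decimal units (1 TB = 1,000,000 MB) are assumed.

```python
# Representative sustained throughput in MB/s. These single values are
# assumptions picked from the ranges quoted on the "Speeds and Feeds" slide.
DEVICE_MB_S = {
    "HDD": 200,
    "6G SAS/SATA SSD": 550,
    "12G SAS SSD": 1000,
    "PCIe/NVMe SSD": 2000,
}
LINK_MB_S = {"1GbE": 80, "10GbE": 800, "25GbE": 2000, "40GbE": 3000}


def fill_time_hours(capacity_tb: float, throughput_mb_s: float) -> float:
    """Hours needed to stream capacity_tb (decimal TB) at a sustained rate."""
    seconds = capacity_tb * 1_000_000 / throughput_mb_s
    return seconds / 3600


def devices_per_link(link_mb_s: float, device_mb_s: float) -> float:
    """How many devices of a given speed one network link can keep busy."""
    return link_mb_s / device_mb_s


if __name__ == "__main__":
    # Time to fill a 4 TB device of each media type.
    for name, rate in DEVICE_MB_S.items():
        print(f"{name}: {fill_time_hours(4, rate):.1f} h to fill 4 TB")
    # Time to move 2 TB over each link, and HDDs that link can feed.
    for name, rate in LINK_MB_S.items():
        hdds = devices_per_link(rate, DEVICE_MB_S["HDD"])
        print(f"{name}: {fill_time_hours(2, rate):.2f} h per 2 TB, feeds {hdds:.1f} HDDs")
```

Read this way, a value below 1 in the transport table means the network link, not the device, is the bottleneck.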

Is it the Economics, or is it Really about Value?

Flash Fragmenting into Extremes
  Lowest cost: consumer drives, ~42c/GB street; prosumer devices
  Highest capacity, balanced cost/performance: 4TB / 8TB flash devices
  Highest performance, high cost, limited capacity

Application Filter is a Modifier to Value
  Real-time application ignores cost for performance
  Big Data requires capacity and cost at performance penalty
  DB requires middle ground of capacity, cost, and performance

(Diagram: Capacity, Cost, Performance -> Application Filter -> Value)

Real Time/Near Real Time

Characteristics
  Latency is king (the lower the better)
  High-value results (time to results of utmost importance)
  Completion measured in ms
  DRAM is expensive and data sets are in 10s or 100s of TB
  PCIe/NVMe cache/intermediary store to offset cost/performance

Application
  In-memory hypercubes
  In-memory DB (NoSQL)
  Edge sensor analytics

Values
  High rating: Performance (low latency, high IO)
  High rating: Capacity in single TB xx
  Moderate rating: Cost

(Diagram: Capacity, Cost, Performance -> Application Filter -> Value)

Batch Analytics

Characteristics
  Large data sets (100s of TB -> 100s of PB)
  Completion measured in seconds/minutes/hours
  Bandwidth is king, was network bottlenecked

Application
  Search
  Commerce/eCommerce
  Close-to-core sensor analytics
  Fraud detection during transaction

Value
  High rating: Bandwidth/aggregate IOPS
  High rating: Capacity
  High rating: Cost
  Moderate rating: Performance (not sub-ms latency bound)

(Diagram: Capacity, Cost, Performance -> Application Filter -> Value)

Data Lakes? Or is it really Ponds, Lakes, Rivers, and Oceans

WAN limitations pushing need for edge-to-core Big Data/data reduction
Common factor: lack of real estate and power constraints
As ponds feed lakes, lakes feed ocean
  Commerce: Retail -> Regional DC -> Source of Truth
  Sensor analytics: Critical Infrastructure -> Geolocal DC -> Source of Truth
LAN getting faster: 1 -> 10 -> 25 -> 100 -> 400GbE within striking distance
  Hyperscale 10GbE -> 40GbE, transceiver change
  25GbE -> 100GbE, not a heavy lift
WAN still a bottleneck

Big Data Use Cases Where Flash is Displacing Traditional Media

InfiniFlash System - Purpose-built for Big Data
  64 x 8TB flash cards (500TB)
  2+ MIOPS 4k read (aggregate)
  500+ KIOPS 4k write (aggregate)
  15GB/s chassis-level bandwidth (full duplex)
  1/10 AFR of HDD (20 HDD failures for every InfiniFlash card)
  Chassis-level idle at less than 200W, active at 400-500W (comparative HDD at 700W avg)
  Endurance (multi-exabyte at chassis level)
  3U

Big Data Use Cases Where Flash is Displacing Traditional Media: Perspective

Active Archive (Rack) - Heavy read, batch/periodic writes (90/10)
  15 IF100, 2 servers
  Capacity 7.5PB (6.8PB usable) vs. 8TB HDD usable @ 4+PB
  Bandwidth 255 GB/s vs. HDD @ 30GB/s
  Power 3KW idle, 7KW all active vs. HDD @ 9+KW

Data Lake Model (Rack) - Heavy read, moderate writes (70/30)
  8-12 IF100, 8-12 servers
  Capacity 4-6PB
  Bandwidth up to 180GB/s
  Power 6KW active

Active Analytics Model (Rack) (50/50)
  4-8 IF100, 8-24 servers
  Capacity 2-4PB
  Bandwidth up to 120GB/s
  Power 10KW very active
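The rack-level capacity and bandwidth figures can be roughly reproduced by multiplying the per-chassis numbers from the InfiniFlash slide (500TB and 15GB/s) by the chassis count. The Python sketch below does that; the function names are hypothetical and the linear scaling is an assumption, not something the slides state. Note that this simple product gives 225 GB/s for 15 chassis, so the 255 GB/s quoted for the Active Archive rack evidently rests on something the sketch does not model.

```python
# Per-chassis figures quoted on the InfiniFlash slide.
CHASSIS_TB = 500      # 64 x 8TB cards, quoted as 500TB
CHASSIS_GB_S = 15     # chassis-level bandwidth


def rack_capacity_pb(chassis_count: int) -> float:
    """Raw rack capacity in PB, assuming capacity scales linearly."""
    return chassis_count * CHASSIS_TB / 1000


def rack_bandwidth_gb_s(chassis_count: int) -> int:
    """Upper bound on rack bandwidth in GB/s, assuming linear scaling."""
    return chassis_count * CHASSIS_GB_S


if __name__ == "__main__":
    models = {
        "Active Archive": (15, 15),
        "Data Lake": (8, 12),
        "Active Analytics": (4, 8),
    }
    for name, (low, high) in models.items():
        print(
            f"{name}: {rack_capacity_pb(low)}-{rack_capacity_pb(high)} PB, "
            f"up to {rack_bandwidth_gb_s(high)} GB/s"
        )
```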

InfiniFlash Acquisition vs. Operations Ratio

3-year TCO comparison
(Chart: TCA and 3-year TCO for Traditional Hadoop, S3, and InfiniFlash EC)

Sensor Analytics / Video Surveillance

Characteristics
  Data set size: high def 100KB/s each; 4K 10s of MB/s each
  Source: 10,000s or more data generators/sec
  Capacity and ingest bandwidth key
  Constrained by rack space/power

(Chart: Value over Time, with events marked)
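To see why capacity and ingest bandwidth dominate here, the per-stream rates can be multiplied out. The Python sketch below is a back-of-the-envelope calculation only: the function names are hypothetical, 10,000 generators and 10 MB/s per 4K stream are taken as the low end of the "10,000s" and "10s MB/s" on the slide, and decimal units are assumed.

```python
SECONDS_PER_DAY = 86_400


def aggregate_ingest_gb_s(generators: int, per_stream_kb_s: float) -> float:
    """Total ingest rate in GB/s for identical streams (decimal units)."""
    return generators * per_stream_kb_s / 1_000_000


def daily_capacity_tb(ingest_gb_s: float) -> float:
    """TB written per day at a sustained ingest rate."""
    return ingest_gb_s * SECONDS_PER_DAY / 1000


if __name__ == "__main__":
    # Assumed low-end values from the slide: 10,000 generators,
    # 100 KB/s for high def and 10 MB/s (10,000 KB/s) for 4K.
    for label, kb_s in (("high def", 100), ("4K", 10_000)):
        rate = aggregate_ingest_gb_s(10_000, kb_s)
        print(f"{label}: {rate:g} GB/s ingest, {daily_capacity_tb(rate):g} TB/day")
```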

Federal/State, Commerce, eCommerce, Banking

Large data set x millions of individuals
Seasonality
  Data latent until needed
  Constant data adjustment
  Fraud analytics
  Buying patterns
  Taxes
Resources highly inefficient
  Procured for high watermark
  10% utilized on non-peak
All data important, but very long tail
Access pattern varies

(Chart: Value by quarter: Q4, Q1, Q2, Q3)

Edge To Core

Sample use cases
  Retail/eCommerce
  Critical Infrastructure
  Banking/Fraud Detection
  Oil & Gas

Commonality
  Unwanted data; fidelity reduction from edge to core
  High value to analyze at edge, regional, and core
  High value to get to core, WAN limitations

(Diagram: Edge Location -> Regional Data Center -> Source of Truth)

Key Technologies for Flash Displacement of HDD

Thicker network pipes (data pump)
Erasure coding (thesis: flash will never get below the price of HDD)
  Deep archive for HDD (1.5x cost multiplier, fail-in-place enablement)
  Equivalent to hybrid-based storage arrays on flash (1.15x cost multiplier)
  With heavy compute requirements for HDD and less so for flash, all-flash system prices are within 20% or less, with better TCO and still better performance
Advent of Big Data flash (devices of 4-16TB change endurance and IO dynamics)
  Throughput-optimized flash vs. general-purpose flash. Cost savings passed to customer.
Advent of in-memory applications and fast network
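One way to read the 1.5x and 1.15x cost multipliers is as the raw-to-usable capacity ratio of an erasure-coded layout, (k + m) / k for k data and m parity fragments. That reading is an assumption; the slide does not say how the multipliers were derived. The Python sketch below shows the arithmetic, with a hypothetical function name and hypothetical 8+4 and 20+3 layouts chosen only because they happen to produce those ratios.

```python
def ec_raw_multiplier(data_fragments: int, parity_fragments: int) -> float:
    """Raw capacity needed per unit of usable capacity for a k+m layout."""
    if data_fragments <= 0 or parity_fragments < 0:
        raise ValueError("need k > 0 and m >= 0")
    return (data_fragments + parity_fragments) / data_fragments


if __name__ == "__main__":
    # Hypothetical layouts, not taken from the slides.
    layouts = {"8+4": (8, 4), "20+3": (20, 3)}
    for name, (k, m) in layouts.items():
        mult = ec_raw_multiplier(k, m)
        print(f"{name}: {mult:.2f}x raw capacity, survives {m} fragment losses")
    # For comparison, triple replication stores every byte three times.
    print("3x replication: 3.00x raw capacity, survives 2 copy losses")
```

A wider stripe lowers the multiplier but touches more devices per rebuild, which may be part of why the slide ties the lower multiplier to flash.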

Thank You!

@BigDataFlash #bigdataflash http://bigdataflash.sandisk.com

© 2016 SanDisk Corporation. All rights reserved. SanDisk is a trademark of SanDisk Corporation, registered in the United States and other countries. InfiniFlash is a trademark of SanDisk Corporation. All other product and company names are used for identification purposes and may be trademarks of their respective holder(s).

Big Data on Flash: a Performance Perspective

BSC paper: https://www.bscmsrc.eu/sites/default/files/bsc-msr aloja.pdf
SNDK: http://www.sandisk.com/assets/docs/increasing-hadoop-performancewith-sandisk-ssds-whitepaper.pdf
