SoftFlash: Programmable Storage in Future Data Centers Jae Do Researcher, Microsoft Research

Similar documents
N V M e o v e r F a b r i c s -

Maximizing Data Center and Enterprise Storage Efficiency

Toward a Memory-centric Architecture

Implementing SQL Server 2016 with Microsoft Storage Spaces Direct on Dell EMC PowerEdge R730xd

Designing elastic storage architectures leveraging distributed NVMe. Your network becomes your storage!

Was ist dran an einer spezialisierten Data Warehousing platform?

New Approach to Unstructured Data

Technology Advancement in SSDs and Related Ecosystem Changes

Apache Spark Graph Performance with Memory1. February Page 1 of 13

Approaching the Petabyte Analytic Database: What I learned

Maximizing heterogeneous system performance with ARM interconnect and CCIX

FlashGrid Software Enables Converged and Hyper-Converged Appliances for Oracle* RAC

Innovator, Disruptor or Laggard, Where will your storage applications live? Next generation storage

Gen-Z Memory-Driven Computing

Recent Innovations in Data Storage Technologies Dr Roger MacNicol Software Architect

SUPERMICRO, VEXATA AND INTEL ENABLING NEW LEVELS PERFORMANCE AND EFFICIENCY FOR REAL-TIME DATA ANALYTICS FOR SQL DATA WAREHOUSE DEPLOYMENTS

Application Advantages of NVMe over Fabrics RDMA and Fibre Channel

Evolution of Rack Scale Architecture Storage

Caribou: Intelligent Distributed Storage

Evaluation Report: HP StoreFabric SN1000E 16Gb Fibre Channel HBA

Performance Benefits of Running RocksDB on Samsung NVMe SSDs

Colin Cunningham, Intel Kumaran Siva, Intel Sandeep Mahajan, Oracle 03-Oct :45 p.m. - 5:30 p.m. Moscone West - Room 3020

Hewlett Packard Enterprise HPE GEN10 PERSISTENT MEMORY PERFORMANCE THROUGH PERSISTENCE

Computational Storage: Acceleration Through Intelligence & Agility

IBM Emulex 16Gb Fibre Channel HBA Evaluation

33% 148% 2. at 4 years. Silo d applications & data pockets. Slow Deployment of new services. Security exploits growing. Network bottlenecks

NVMFS: A New File System Designed Specifically to Take Advantage of Nonvolatile Memory

Next Generation Enterprise Solutions from ARM

Oracle Exadata: Strategy and Roadmap

Accelerating Business Analytics with Flash Storage and FPGAs

The Impact of SSD Selection on SQL Server Performance. Solution Brief. Understanding the differences in NVMe and SATA SSD throughput

Solid State Storage Technologies. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

Oracle Exadata X7. Uwe Kirchhoff Oracle ACS - Delivery Senior Principal Service Delivery Engineer

SOFTWARE-DEFINED BLOCK STORAGE FOR HYPERSCALE APPLICATIONS

stec Host Cache Solution

Enabling NVMe I/O Scale

Highly Scalable, Non-RDMA NVMe Fabric. Bob Hansen,, VP System Architecture

HPE ProLiant Gen10. Franz Weberberger Presales Consultant Server

Hardware NVMe implementation on cache and storage systems

Warehouse- Scale Computing and the BDAS Stack

DataON and Intel Select Hyper-Converged Infrastructure (HCI) Maximizes IOPS Performance for Windows Server Software-Defined Storage

Summarizer: Trading Communication with Computing Near Storage

ACCELERATE YOUR ANALYTICS GAME WITH ORACLE SOLUTIONS ON PURE STORAGE

Lenovo Database Configuration for Microsoft SQL Server TB

By John Kim, Chair SNIA Ethernet Storage Forum. Several technology changes are collectively driving the need for faster networking speeds.

POWER9 Announcement. Martin Bušek IBM Server Solution Sales Specialist

Accelerating Real-Time Big Data. Breaking the limitations of captive NVMe storage

NVMe: The Protocol for Future SSDs

Building dense NVMe storage

Hypervisors at Hyperscale

How to Network Flash Storage Efficiently at Hyperscale. Flash Memory Summit 2017 Santa Clara, CA 1

FC-NVMe. NVMe over Fabrics. Fibre Channel the most trusted fabric can transport NVMe natively. White Paper

Big Data Systems on Future Hardware. Bingsheng He NUS Computing

Hardened Security in the Cloud Bob Doud, Sr. Director Marketing March, 2018

Benefits of 25, 40, and 50GbE Networks for Ceph and Hyper- Converged Infrastructure John F. Kim Mellanox Technologies

IBM Power Systems Update. David Spurway IBM Power Systems Product Manager STG, UK and Ireland

DDN. DDN Updates. Data DirectNeworks Japan, Inc Shuichi Ihara. DDN Storage 2017 DDN Storage

Preface. Fig. 1 Solid-State-Drive block diagram

Altera SDK for OpenCL

Accelerating Hadoop Applications with the MapR Distribution Using Flash Storage and High-Speed Ethernet

IBM Power 9 надежная платформа для развертывания облаков. Ташкент. Юрий Кондратенко Cross-Brand Sales Specialist

QLE10000 Series Adapter Provides Application Benefits Through I/O Caching

NVM Express Awakening a New Storage and Networking Titan Shaun Walsh G2M Research

Hitachi Virtual Storage Platform Family

White Paper: Smashing Big Data Costs with GZIP Hardware. Abstract. Juan D. Deaton, Ph.D. Research and Development

Evaluation of the Chelsio T580-CR iscsi Offload adapter

Pivot3 Acuity with Microsoft SQL Server Reference Architecture

PSAM, NEC PCIe SSD Appliance for Microsoft SQL Server (Reference Architecture) September 4 th, 2014 NEC Corporation

Persistent Memory. High Speed and Low Latency. White Paper M-WP006

IBM Power AC922 Server

Facilitating IP Development for the OpenCAPI Memory Interface Kevin McIlvain, Memory Development Engineer IBM. Join the Conversation #OpenPOWERSummit

DDN. DDN Updates. DataDirect Neworks Japan, Inc Nobu Hashizume. DDN Storage 2018 DDN Storage 1

Accelerating Data Centers Using NVMe and CUDA

SD Express Cards with PCIe and NVMeTM Interfaces

When, Where & Why to Use NoSQL?

Extending the NVMHCI Standard to Enterprise

LATEST INTEL TECHNOLOGIES POWER NEW PERFORMANCE LEVELS ON VMWARE VSAN

A Breakthrough in Non-Volatile Memory Technology FUJITSU LIMITED

Enyx soft-hardware design services and development framework for FPGA & SoC

IBM FlashSystem. IBM FLiP Tool Wie viel schneller kann Ihr IBM i Power Server mit IBM FlashSystem 900 / V9000 Storage sein?

Achieving Memory Level Performance: Secrets Beyond Shared Flash

Looking ahead with IBM i. 10+ year roadmap

STORAGE NETWORKING TECHNOLOGY STEPS UP TO PERFORMANCE CHALLENGES

Upgrade to Microsoft SQL Server 2016 with Dell EMC Infrastructure

Low-Overhead Flash Disaggregation via NVMe-over-Fabrics Vijay Balakrishnan Memory Solutions Lab. Samsung Semiconductor, Inc.

GPU ACCELERATED DATABASE MANAGEMENT SYSTEMS

Dell EMC PowerEdge R740xd as a Dedicated Milestone Server, Using Nvidia GPU Hardware Acceleration

Lenovo Database Configuration

Flash Storage Complementing a Data Lake for Real-Time Insight

Performance Innovations with Oracle Database In-Memory

Rack-scale Data Processing System

Interface Trends for the Enterprise I/O Highway

It s Time for Mass Scale VDI Adoption

Delivering High-Throughput SAN with Brocade Gen 6 Fibre Channel. Friedrich Hartmann SE Manager DACH & BeNeLux

What is Gluent? The Gluent Data Platform

Windows 10 IoT Overview. Microsoft Corporation

Arm Processor Technology Update and Roadmap

NGD Systems: Introduction to Computational Storage

Optimizing Apache Spark with Memory1. July Page 1 of 14

ADVANCED IN-MEMORY COMPUTING USING SUPERMICRO MEMX SOLUTION

Transcription:

SoftFlash: Programmable Storage in Future Data Centers Jae Do Researcher, Microsoft Research 1

The world s most valuable resource Data is everywhere! May. 2017 Values from Data! Need infrastructures for Volume and Velocity of data! 2

Case Study: OCP storage server (1/2) PCIe Switch DRAM CPU Root complex PCIe Switch Needs to scale up due to data growth PCIe Switch High cost of moving data 3

Case Study: OCP storage server (2/2) 0 PCIe Switch 0 1 DRAM CPU 16 lanes of PCIe = ~ 16 GB/s Root complex 1 PCIe Switch Throughput gap of 66x 15 PCIe Switch 31 32 channels X ~500 MB/s = ~16 GB/s 1 64 s X ~16 GB/s/SSD = ~ Flash 1TB/sSSD 0 31 4

Programmable components in data center Software-defined control and management is an inevitable trend that is already touching other parts of the data center infrastructure Don t Leave Storage Behind! Software-Defined Networking (SDN) making network switches and server NIC cards increasingly programmable for enhanced networkwide functionality Programmable GPGPUs and FGPAs leveraged by new generations of applications like deep learning Rapidly-changing requirements can be supported on-the-fly once DC infrastructures become dynamically programmable 5

Next up: Storage (1/2) Unfortunately, the lack of such programmable capabilities in storage results in a major disconnect in terms of the speed of innovation between application/os and storage infrastructures! While application/os is patched with new/improved functionality every few weeks at cloud speed while storage devices are off limits for such sustained innovation during their hardware life cycle of 3-5 years in data centers. 6

Next up: Storage (2/2) A fully programmable storage gives opportunities to better bridge the gap between application/os needs and storage capabilities/limitations, while allowing us to innovate in-house at cloud speed. s can be programmable? 7

Today s NAND (PCIe) Host Server Controller PCIe gen3 (4 lanes) Hardware b/w = 4 GB/s Achievable b/w = 2-2.5 GB/s Throughput gap of 2x - 8x 16-32 channels @ 500MB/s Total = 8-16 GB/sec 8

Past efforts Smart SSD (1/2) Query Processing on Smart SSDs: Opportunities and Challenges, SIGMOD 2013 Goal: Exploring the opportunities and challenges associated with running selected database operations inside an SSD Used Samsung SAS SSD Modified Microsoft SQL Server 2012 to offload database operations onto a Samsung Smart SSD Simple selection and aggregation operators were hard-coded and compiled into the firmware of the SSD 9

Past efforts Smart SSD (2/2) Base Case Smart SSD TPC-H Q6: Result: Compared to the base case, 70% better query response time 2X energy efficiency Device became a performance bottleneck! The dev. environment was not ready! 10

Past efforts YourSQL (1/2) YourSQL: A High-Performance Database System Leveraging In-Storage Computing, VLDB16 Goal: Accelerating data-intensive queries with the help of hardware pattern matcher Used Samsung PCIe SSD Modified a variation of MySQL to realize early filtering of data by offloading data scanning of a query to programmable SSDs Developed a framework (called Biscuit) that follows a data-flow model 11

Past efforts YourSQL (2/2) Result: Compared to the base case, 11X Speed up Computing resource is not powerful enough! Existing applications need to be redesigned! 12

Challenges of past efforts Not enough spare processing power H/W architecture limitations Programming tools are not application-developer friendly Prototype devices are not accessible to application developers 13

Disruptive trend that enable SoftFlash (CPU #cores/clock speed, hardware offload, DRAM) Abundant resources inside SSD Frugal resources inside SSD Today s SSD Program -mable SSD Embedded CPU, proprietary firmware OS General purpose CPU, server-like OS (Linux) (Ease of programmability inside SSD) 14

The SoftFlash project Goal: Embrace flash SSDs as a first-class programmable platform in the cloud data center Add custom capabilities to storage over time Better bridge the gap between application needs and flash media capabilities/limitations Innovate in-house at cloud speed Powerful and flexible prototype board with enterprise-grade capabilities and resources Linux, SDK, user/kernel libraries for the on-chip H/W accelerators, built-in FTL Moving compute closer to storage, flexible storage interface, secure computation 15

Hardware prototype (1/3) Dragon fire card DragonFire Card (DFC): A programmable SSD device designed by DellEMC and NXP (acquired by Qualcomm) Developed for research purposes Can be equipped with various forms of NVM (RAM, Flash, etc) Composed of two types of boards: main and storage boards An open-source community (http://github.com/dfc- OpenSource) is organized around this hardware platform with teams working on a wide range of issues 16

Hardware prototype (2/3) Main board 4 X 10Gbps SFP+ - RoCE protocol 16GB DRAM 8-Core ARMv8 (1.8GHz) Hardware Accelerators Compression (10Gbps) Encryption (10Gbps) PatternMatching (20Gbps) 1 X 4-Lane PCIe 3.0 2 X 4-Lane PCIe 3.0 17

Hardware Prototype (3/3) Storage Board 4 X DIMM Slots FPGA (storage controller) 2 X M.2 SSDs 18

Revisit: OCP storage server 0 PCIe Switch 0 1 16 lanes of PCIe = ~ 16 GB/s CPU DRAM Root complex 1 Throughput gap PCIe of 66x Switch 2.3GHz/core X 20 cores = 46GHz Compute capability 15 gap of 20x PCIe Switch 64 s X ~16 31 GB/s/SSD = ~ Flash 1TB/sSSD 0 1.8GHz/core X 8 cores X 64 SSDs = 921.6GHz + DRAM, H/W accelerators! 1 31 19

Software framework Host NVMe Driver PCIe RPC Driver SDK/Library PCIe (NVMe) SSD NVMe Driver Command Proxy Block Device Interface Flash Translation Layer Regular SSD path RPC Driver App1 App2 App3 Container Container Container OS: Linux Programmable SSD path Flash Memory Controller H/W Accelerator Drivers Device Firmware 20

Application 1: Move compute closer to data Reduced data movement across storage/network/memory/cpu for compute Programable SSD Host Server Example: Stream analytics over data logs Moving compute to inside SSD to leverage low latency, high bandwidth access to data Select/Project/Aggregation for relational database/data warehouse services 21

Application 2: Agile, flexible storage Custom SSD capabilities to better meet application needs Programable SSD Host Server Example: Multi-streamed writes for cloud storage platform Read I/O priority for latency-sensitive services Atomic writes for transactional services Agile, flexible storage interface leveraging programmability within SSD 22

Example) Stream-Aware Flash Data Stream #1 Data Stream #2 Data Stream #1 Data Stream #2 Incoming Writes Tag writes with stream ID Flash pages Erase Block A Erase Block B Erase Block A Erase Block B Regular (stream-unaware) SSD Stream-aware Programmable SSD (2x device lifetime, 1.5x read throughput) 23

Application 3: Secure computation in cloud SSD provides a trusted domain for secure computation on encrypted data, without cleartext leaving the device Programable SSD Trusted domain for secure computation; cleartext not allowed to egress this boundary Example: Existing and new scenarios in a trusted cloud setting -- user stores encrypted data in the cloud and needs to do compute over it 24

Value propositions Moving compute close data Programmable SSD Agile, flexible Storage Secure Computation 25

Big Data Analysis (1/6) Today, big data analytics fetches huge volumes of data from storage and processes it in host server Programmable SSDs enable data analytics inside storage Exploit higher bandwidth inside SSD (vs. SSD external interface) Leverage ARM cores + hardware offload engines inside SSD 26

Big Data Analysis (2/6) Efficient use of heterogeneous hardware in the data center for higher performance @lower power Free up expensive host server CPU + memory resources, opportunities to increase service density Reduced energy footprint due to significantly less data movement + low power compute inside SSD 27

Big Data Analysis (3/6) Traditional Arch. Compute Cluster Compute Node Compute Node D C A Fetch compressed data from storage cluster B B Decompress data Storage Cluster Storage Node Storage Node Storage Node A C D Decode data Do required computations (filtering, aggregation, etc) 28

Big Data Analysis (4/6) - programmable SSDs Compute Cluster Compute Node Compute Node D A Return only results to Storage Cluster B Decompress data Storage Cluster Storage Node Storage Node Storage Node A C D Decode data Do required computations (filtering, aggregation, etc) D C B D D C C B B 29

Big Data Analysis (5/6) Apache Hive Hive is a data warehouse infrastructure built on top of Hadoop Designed to enable easy data summarization, ad-hoc querying and analysis of large volumes of data Encoding based on data type Run-length encoding for integer Dictionary encoding for string Compression using a codec Zlib or Snappy 30

Big Data Analysis (6/6) Preiminary Result Scanning a ZLIB-compressed, integer dataset (1 Billion rows, ~10GB) on a X86 server or inside the programmable SSD Note that only a single core was used! What if we used +60 programmable SSDs? 31

Conclusion The SoftFlash project proposes to create a software-defined storage substrate of flash SSDs in the data center that is as programmable, agile, and flexible as the applications and operating systems accessing it from servers. This is made possible by recent disruptive trends in the flash storage industry towards increased easy of programmability and abundance of resources in side the SSD We are still in an early stage! 32

Thank you! jaedo@microsoft.com 33