Bringing Intelligence to Enterprise Storage Drives

Neil Werdmuller, Director Storage Solutions, Arm

Who am I?
- 28 years' experience in embedded
- Lead the storage solutions team
- Work closely with the industry's top storage suppliers
- Previously in wireless at Texas Instruments
- BSc in Computer Science from Portsmouth University (UK)
- I enjoy brewing beer at home!

What will we cover today?
- What benefit in-storage compute brings
- What is needed for in-storage compute
- Ecosystem support available
- Machine Learning in-storage

Arm computing is everywhere
- #1 shipping processor in storage devices
- >5Bn people using Arm-based mobile phones
- 21Bn Arm-based chips shipped in 2017
- 120Bn Arm-based chips shipped to date

Why computation is moving to storage
Bandwidth, power, cost, latency, reliability, security

Moving data to compute
1. Compute waits for data: it takes time to move data across the fabric
2. Latency is added: multiple layers of interfaces and protocols, data copied many times, and bottlenecks often exist
3. Bandwidth and power are consumed: moving data is expensive, and data copies increase system DRAM requirements
Flow: request data from storage, move data to compute, compute, move results back to storage

In-storage compute
1. Compute happens on the data: it is moved from flash to in-drive DRAM and processed there
2. Lowest possible latency: no additional protocols, just flash to DRAM
3. Minimum bandwidth and power: data remains on the drive, only results are delivered
4. Data-centric processing: workloads specific to the computation are deployed to the drive
5. Security: unencrypted data does not leave the drive
Flow: request operation, compute on the drive, return result
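
To make the bandwidth and latency argument concrete, here is a small, purely illustrative C++ sketch (the Drive struct and its methods are invented for this example, not an Arm or NVMe interface): a host-side scan must copy every block across the interface, while the in-storage version runs the same predicate next to the data and returns only an 8-byte count.

```cpp
#include <cstdint>
#include <vector>
#include <iostream>

// Hypothetical drive model: 'blocks' stands in for data held in NAND flash.
struct Drive {
    std::vector<std::vector<uint8_t>> blocks;

    // Host-side path: every block is copied across the interface (PCIe/fabric).
    std::vector<uint8_t> read_block(size_t i) const { return blocks[i]; }

    // In-storage path: the predicate runs next to the data; only a count returns.
    uint64_t count_matching(uint8_t needle) const {
        uint64_t hits = 0;
        for (const auto& blk : blocks)
            for (uint8_t b : blk)
                if (b == needle) ++hits;
        return hits;
    }
};

int main() {
    Drive drive{{std::vector<uint8_t>(4096, 7), std::vector<uint8_t>(4096, 9)}};

    // Host-side compute: bytes moved = the entire dataset.
    uint64_t host_hits = 0, bytes_moved = 0;
    for (size_t i = 0; i < drive.blocks.size(); ++i) {
        auto blk = drive.read_block(i);          // data crosses the interface
        bytes_moved += blk.size();
        for (uint8_t b : blk) if (b == 7) ++host_hits;
    }

    // In-storage compute: bytes moved = the size of the result only.
    uint64_t drive_hits = drive.count_matching(7);

    std::cout << "host:  " << host_hits  << " hits, " << bytes_moved << " bytes moved\n";
    std::cout << "drive: " << drive_hits << " hits, " << sizeof(drive_hits) << " bytes moved\n";
}
```

The arithmetic is identical in both paths; what changes is how many bytes cross the PCIe or fabric interface.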

Compute in SSD controllers
- Compute: frontend (host I/F + Flash Translation Layer) on Cortex-R or Cortex-A series; flash management on Cortex-R or Cortex-M series
- Accelerators: encryption, LDPC, Arm NEON, ML, FPGA
- DRAM: ~1GB for each 1TB of flash
- Storage: 256GB to 64TB of flash
- Interfaces: PCIe/SATA/SAS
[Block diagram: SSD SoC functionality, with PCIe/SATA/SAS host control, NVMe command queues, DRAM cache and FTL tables, frontend CPU(s) running the flash translation layer and cache/buffer management, flash system management (garbage collection, ECC/LDPC error correction/detection, read scrubbing), block management (wear levelling, bad-block mapping), host/flash address translation, encryption and compression engines, and the NAND flash channels]

What is needed for in-storage compute?
- An application processor that runs a high-level OS through a memory management unit
- Linux for open-source software stacks: all major Linux distributions run on Arm
  - Networking protocol stacks: Ethernet, TCP/IP, RDMA
  - Linux workloads: NVMe-oF, databases, file systems, SDS, custom applications
  - Containerization for workload deployment and portability
- Accelerators for specific workloads or for Machine Learning (ML)
  - Potentially combined with additional accelerators: custom hardware, ML, FPGA, GPU, DSP
- Custom workloads can be run without an apps processor, but they are complex to develop and deploy

In-storage compute evolution
1. Separate Cortex-A series processor: enables any SSD (or HDD) to run Linux; wide performance range from Cortex-A5 to Cortex-A76
2. Single SoC for cost/latency reduction: lower latency by removing the internal (PCIe) interface; separation of the apps processor and the SSD processing; shared DRAM and other SoC resources
3. Combined frontend/apps processor: a hypervisor separates the SSD frontend from Linux; lowest cost and tightest integration; lowest possible latency; highest internal bandwidth
[Diagram: the three stages, each showing the network interface, applications processor, DRAM, and frontend controller, progressing from separate chips to a single combined SoC]

The benefits of in-storage compute
- Scalability of compute: from a single, low-power core to multiple clusters of high-performance cores
- Flexibility: one SSD SoC that is suitable for in-storage compute, Edge SSD, NVMe-oF, ...
- Security: TrustZone isolates Linux from the SSD functionality; processing of the data is all done on the drive; decrypted data remains on the drive
[Diagram: in-storage compute SSD with network interface, applications processor, DRAM, and frontend controller, shown both as separate blocks and as a combined applications + frontend SoC]

Linux ecosystem on Arm: www.linaro.org

A few Works on Arm partners: www.worksonarm.com

Arm enables ML everywhere
90% of the AI-enabled units shipped today are based on Arm (source: IDC WW Embedded and Intelligent Systems Forecast, 2017-2022, and Arm forecast)
[Chart: ML use cases plotted by increasing performance (ops/second) against increasing power and cost (silicon), from keyword detection, pattern training, and object detection, through voice and image recognition and image enhancement, up to autonomous driving and the data center]

Machine Learning training
[Diagram: training data feeding a neural network model]
- For each piece of data used to train the model, millions of model parameters are adjusted
- The process is repeated many times until the model delivers satisfactory performance
- In-storage compute can perform this directly on the data, without moving it
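
As a purely illustrative sketch of the loop described above (a toy linear model fitted by gradient descent, not Arm code or a real neural network), the example below adjusts its parameters for every training example and repeats the pass until the fit is good; the training data never needs to leave the array it sits in.

```cpp
#include <array>
#include <cstdio>
#include <utility>

int main() {
    // Training data: y = 2x + 1, held "in place" for the whole run.
    const std::array<std::pair<float, float>, 4> data{{{0.f, 1.f}, {1.f, 3.f}, {2.f, 5.f}, {3.f, 7.f}}};

    float w = 0.f, b = 0.f;          // model parameters (a real network has millions)
    const float lr = 0.05f;          // learning rate

    for (int epoch = 0; epoch < 500; ++epoch) {   // repeat the pass many times
        for (const auto& [x, y] : data) {         // for each piece of training data...
            const float err = (w * x + b) - y;    // prediction error
            w -= lr * err * x;                    // ...adjust the parameters
            b -= lr * err;
        }
    }
    std::printf("learned w=%.2f b=%.2f (target 2.00, 1.00)\n", w, b);
}
```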

Machine Learning inference
[Diagram: input fed through the trained neural network model, producing outputs such as "97.4% confidence" and "96.4% confidence"]
- When new data is presented to the trained model, large numbers of multiply-add operations are performed using the new data and the model parameters
- The process is performed once per input
- In-storage compute can perform this directly on the data, returning only the results
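
The multiply-add workload is easy to see in a toy forward pass; the sketch below (a single fully connected layer with made-up weights, not a real trained model) accumulates weight-times-input products and converts the scores to confidence values with a softmax.

```cpp
#include <cmath>
#include <cstdio>

int main() {
    const float input[4]      = {0.5f, -1.2f, 3.3f, 0.7f};     // new data
    const float weights[2][4] = {{ 0.2f, 0.1f, 0.4f, -0.3f},   // stored (trained) parameters
                                 {-0.5f, 0.3f, 0.1f,  0.8f}};
    const float bias[2]       = {0.1f, -0.2f};

    float scores[2];
    for (int o = 0; o < 2; ++o) {
        scores[o] = bias[o];
        for (int i = 0; i < 4; ++i)
            scores[o] += weights[o][i] * input[i];              // multiply-add
    }

    // Softmax turns the raw scores into confidence values.
    const float denom = std::exp(scores[0]) + std::exp(scores[1]);
    for (int o = 0; o < 2; ++o)
        std::printf("class %d: %.1f%% confidence\n", o, 100.f * std::exp(scores[o]) / denom);
}
```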

Project Trillium: Arm's ML computing platform
- Ecosystem: AI/ML applications, algorithms and frameworks, including Android NNAPI
- Software products: software libraries optimized for Arm hardware, including Arm NN, the Compute Library, CMSIS-NN, and object detection libraries
- Hardware products: Arm hardware IP for AI/ML, spanning CPU (Armv8 SVE), GPU, and the Machine Learning (ML) and Object Detection (OD) processors
- Partner IP: DSPs, FPGAs, accelerators

Arm Compute Library
- Enables faster deployment of ML on any Arm processor with NEON and on Arm GPUs (OpenCL)
- Significant performance uplift compared to OSS alternatives (up to 15x)
- Optimized low-level functions for Arm: the most popular ML and CV functions, over 80 functions in all
- Supports common ML frameworks; quarterly releases
- CMSIS-NN separately targets Cortex-M
- Publicly available now (no fee, MIT license): developer.arm.com/technologies/compute-library
Key function categories: neural network, convolutions, colour manipulation, feature detection, basic arithmetic, GEMM, pyramids, filters, image reshaping, mathematical functions
[Diagram: an application calls TensorFlow/Caffe etc. through Arm NN into the Compute Library, which targets Cortex-A, Mali, and the Arm ML processor]
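
For a flavour of the API, the sketch below configures and runs one of those functions (NEGEMM, the NEON GEMM) in the style of the library's public SGEMM example; treat the header paths and the configure() arguments as assumptions to be verified against the Compute Library release you build against.

```cpp
#include "arm_compute/core/Types.h"
#include "arm_compute/runtime/NEON/NEFunctions.h"
#include "arm_compute/runtime/Tensor.h"

using namespace arm_compute;

int main()
{
    // C = alpha * A * B; TensorShape is given as (width, height).
    Tensor a{}, b{}, c{};
    a.allocator()->init(TensorInfo(TensorShape(3U, 2U), 1, DataType::F32)); // 2x3
    b.allocator()->init(TensorInfo(TensorShape(2U, 3U), 1, DataType::F32)); // 3x2
    c.allocator()->init(TensorInfo(TensorShape(2U, 2U), 1, DataType::F32)); // 2x2

    NEGEMM gemm;
    gemm.configure(&a, &b, nullptr, &c, 1.0f, 0.0f); // no bias tensor, beta = 0

    a.allocator()->allocate();
    b.allocator()->allocate();
    c.allocator()->allocate();

    // ... fill a and b through their buffer() pointers before running ...

    gemm.run(); // executes the NEON-optimized kernels
    return 0;
}
```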

Bringing Intelligence to SSDs
- Enterprise SSDs already have considerable compute performance; Cortex-A series processors have already been adopted by some Arm partners
- In-storage compute delivers low cost, low power, and the lowest latency
- Machine Learning use cases are growing rapidly
- In-storage compute and Edge SSD open up many possibilities
- Please download this presentation: COMP-301-1, Bringing Intelligence to Enterprise Storage Drives
- If you missed my first talk on Tuesday, please download that presentation too: ARCH-102-1, Transforming an SSD into a Cost-Effective Edge Server

To learn more
I'll be here all week! For more information, visit storage.arm.com.
neil.werdmuller@arm.com
linkedin.com/nwerdmuller

Thank You! Danke! Merci! 谢谢! ありがとう! Gracias! Kiitos! 감사합니다! धन्यवाद!
© 2018 Arm Limited

The Arm trademarks featured in this presentation are registered trademarks or trademarks of Arm Limited (or its subsidiaries) in the US and/or elsewhere. All rights reserved. All other marks featured may be trademarks of their respective owners. www.arm.com/company/policies/trademarks