Architectural Principles for Networked Solid State Storage Access

Similar documents
Extending RDMA for Persistent Memory over Fabrics. Live Webcast October 25, 2018

Everything You Wanted To Know About Storage: Part Teal The Buffering Pod

Mark Rogov, Dell EMC Chris Conniff, Dell EMC. Feb 14, 2018

EVERYTHING YOU WANTED TO KNOW ABOUT STORAGE, BUT WERE TOO PROUD TO ASK Where Does My Data Go? August 1, :00 am PT

RDMA Requirements for High Availability in the NVM Programming Model

Use Cases for iscsi and FCoE: Where Each Makes Sense

Planning For Persistent Memory In The Data Center. Sarah Jelinek/Intel Corporation

SNIA Tutorial 3 EVERYTHING YOU WANTED TO KNOW ABOUT STORAGE: Part Teal Queues, Caches and Buffers

Adrian Proctor Vice President, Marketing Viking Technology

Virtualization Practices:

Apples to Apples, Pears to Pears in SSS performance Benchmarking. Esther Spanjer, SMART Modular

SNIA NVM Programming Model Workgroup Update. #OFADevWorkshop

EVERYTHING YOU WANTED TO KNOW ABOUT STORAGE, BUT WERE TOO PROUD TO ASK Part Cyan Storage Management. September 28, :00 am PT

Multi-Cloud Storage: Addressing the Need for Portability and Interoperability

Overview and Current Topics in Solid State Storage

Tiered File System without Tiers. Laura Shepard, Isilon

Overview and Current Topics in Solid State Storage

SNIA Tutorial 1 A CASE FOR FLASH STORAGE HOW TO CHOOSE FLASH STORAGE FOR YOUR APPLICATIONS

LEVERAGING FLASH MEMORY in ENTERPRISE STORAGE

Virtualization Practices: Providing a Complete Virtual Solution in a Box

PRESENTATION TITLE GOES HERE. Understanding Architectural Trade-offs in Object Storage Technologies

LTFS Bulk Transfer Standard PRESENTATION TITLE GOES HERE

SAS: Today s Fast and Flexible Storage Fabric

Facing an SSS Decision? SNIA Efforts to Evaluate SSS Performance. Ray Lucchesi Silverton Consulting, Inc.

The Benefits of Solid State in Enterprise Storage Systems. David Dale, NetApp

SAS: Today s Fast and Flexible Storage Fabric. Rick Kutcipal President, SCSI Trade Association Product Planning and Architecture, Broadcom Limited

Fibre Channel vs. iscsi. January 31, 2018

Trends in Worldwide Media and Entertainment Storage

The Long-Term Future of Solid State Storage Jim Handy Objective Analysis

Tutorial. A New Standard for IP Based Drive Management. Mark Carlson SNIA Technical Council Co-Chair

Jeff Dodson / Avago Technologies

Expert Insights: Expanding the Data Center with FCoE

WHAT HAPPENS WHEN THE FLASH INDUSTRY GOES TO TLC? Luanne M. Dauber, Pure Storage

Notes & Lessons Learned from a Field Engineer. Robert M. Smith, Microsoft

SNIA s SSSI Solid State Storage Initiative. Jim Pappas Vice-Char, SNIA

Ron Emerick, Oracle Corporation

Storage Performance Management Overview. Brett Allison, IntelliMagic, Inc.

Everything You Wanted To Know About Storage (But Were Too Proud To Ask) The Basics

Application Access to Persistent Memory The State of the Nation(s)!

Introducing NVDIMM-X: Designed to be the World s Fastest NAND-Based SSD Architecture and a Platform for the Next Generation of New Media SSDs

Scaling Data Center Application Infrastructure. Gary Orenstein, Gear6

Mobile and Secure Healthcare: Encrypted Objects and Access Control Delegation

REMOTE PERSISTENT MEMORY ACCESS WORKLOAD SCENARIOS AND RDMA SEMANTICS

Trends in Data Protection and Restoration Technologies. Mike Fishman, EMC 2 Corporation

Felix Xavier CloudByte Inc.

Analysis and Optimization. Carl Waldspurger Irfan Ahmad CloudPhysics, Inc.

Persistent Memory over Fabrics

Data Deduplication Methods for Achieving Data Efficiency

Remote Persistent Memory SNIA Nonvolatile Memory Programming TWG

Interoperable Cloud Storage with the CDMI Standard. Mark Carlson, SNIA TC and Oracle Chair, SNIA Cloud Storage TWG

Felix Xavier CloudByte Inc.

NVMe SSDs with Persistent Memory Regions

Memories of Tomorrow

ADVANCED DEDUPLICATION CONCEPTS. Thomas Rivera, BlueArc Gene Nagle, Exar

How to create a synthetic workload test. Eden Kim, CEO Calypso Systems, Inc.

The SNIA NVM Programming Model. #OFADevWorkshop

ASPECTS OF DEDUPLICATION. Dominic Kay, Oracle Mark Maybee, Oracle

Performance and Innovation of Storage. Advances through SCSI Express

A Promise Kept: Understanding the Monetary and Technical Benefits of STaaS Implementation. Mark Kaufman, Iron Mountain

Load-Sto-Meter: Generating Workloads for Persistent Memory Damini Chopra, Doug Voigt Hewlett Packard (Enterprise)

ADVANCED DATA REDUCTION CONCEPTS

The Role of WAN Optimization in Cloud Infrastructures. Josh Tseng, Riverbed

RoCE vs. iwarp A Great Storage Debate. Live Webcast August 22, :00 am PT

Paving the Way to the Non-Volatile Memory Frontier. PRESENTATION TITLE GOES HERE Doug Voigt HP

Opportunities from our Compute, Network, and Storage Inflection Points

Interoperable Cloud Storage with the CDMI Standard. Mark Carlson, SNIA TC and Oracle Co-Chair, SNIA Cloud Storage TWG

Persistent Memory over Fabric (PMoF) Adding RDMA to Persistent Memory Pawel Szymanski Intel Corporation

Storage in combined service/product data infrastructures. Craig Dunwoody CTO, GraphStream Incorporated

Design and Implementations of FCoE for the DataCenter. Mike Frase, Cisco Systems

Deploying Public, Private, and Hybrid. Storage Cloud Environments

A Crash Course In Wide Area Data Replication. Jacob Farmer, CTO, Cambridge Computer

An Introduction to Key Management for Secure Storage. Walt Hubis, LSI Corporation

NVDIMM-N Cookbook: A Soup-to-Nuts Primer on Using NVDIMM-Ns to Improve Your Storage Performance

SRM: Can You Get What You Want? John Webster, Evaluator Group.

SRM: Can You Get What You Want? John Webster Principal IT Advisor, Illuminata

Disclaimer This presentation may contain product features that are currently under development. This overview of new technology represents no commitme

Annual Update on Flash Memory for Non-Technologists

OpenStack Manila An Overview of Manila Liberty & Mitaka

ETHERNET ENHANCEMENTS FOR STORAGE. Sunil Ahluwalia, Intel Corporation

PCIe Storage Beyond SSDs

What NVMe /TCP Means for Networked Storage. Live Webcast January 22, 2019

High Availability Using Fault Tolerance in the SAN. Wendy Betts, IBM Mark Fleming, IBM

Application Recovery. Andreas Schwegmann / HP

The File Systems Evolution. Christian Bandulet, Sun Microsystems

How Next Generation NV Technology Affects Storage Stacks and Architectures

File vs. Block vs. Object Storage

Blockchain Beyond Bitcoin. Mark O Connell

Trends in Data Protection and Restoration Technologies. Jason Iehl, NetApp

Windows Support for PM. Tom Talpey, Microsoft

Remote Persistent Memory With Nothing But Net Tom Talpey Microsoft

STORAGE NETWORKING TECHNOLOGY STEPS UP TO PERFORMANCE CHALLENGES

Cloud Archive and Long Term Preservation Challenges and Best Practices

NVMe: The Protocol for Future SSDs

Evolution of Fibre Channel. Mark Jones FCIA / Emulex Corporation

Windows Support for PM. Tom Talpey, Microsoft

Computer Organization

Why Composable Infrastructure? Live Webcast February 13, :00 am PT

Testing. Michael Ault, IBM Oracle FlashSystem Consulting Manager

IP-Based Object Drives Now Have a Management Standard

Persistent Memory, NVM Programming Model, and NVDIMMs. Presented at Storage Field Day June 15, 2017

Transcription:

Architectural Principles for Networked Solid State Storage Access

SNIA Legal Notice! The material contained in this tutorial is copyrighted by the SNIA unless otherwise noted.! Member companies and individual members may use this material in presentations and literature under the following conditions:! Any slide or slides used must be reproduced in their entirety without modification! The SNIA must be acknowledged as the source of any material used in the body of any document containing material from these presentations.! This presentation is a project of the SNIA Education Committee.! Neither the author nor the presenter is an attorney and nothing in this presentation is intended to be, or should be construed as legal advice or an opinion of counsel. If you need legal advice or a legal opinion please contact your attorney.! The information presented herein represents the author's personal opinion and current understanding of the relevant issues involved. The author, the presenter, and the SNIA do not assume any responsibility or liability for damages arising out of any reliance on or use of this information. NO WARRANTIES, EXPRESS OR IMPLIED. USE AT YOUR OWN RISK.

Who We Are Doug Voigt Chair, NVM Programming Model, SNIA Technical Council Distinguished Technologist, HPE J Metz SNIA Board of Directors R&D Engineer Cisco 3

Focus on Principles! Technical and market dynamics are creating complexity! Permutations of technologies and positioning! Intricacies of new technology integration! Foundational principles have not changed! Application views of memory and storage! The role of data access time (latency) in system architecture! Principles transcend details! This is not a best practices presentation! This is presentation does not report benchmark results 4

Times are changing! Storage access times are shrinking! Emerging persistent memory (PM) technologies! Faster than flash! Interconnects are getting faster! Bandwidth! Latency! Creating challenges for software! Software stacks are starting to dominate latency! Trigger for a fundamental architecture shift 5

Principles we will cover today! Application view I/O vs. Load/Store (Ld/St)! Disruption caused by latency! Persistence domains! Latency impact of remote persistence 6

Application View

Input/Output (IO)! is read or written using RAM buffers! Software has control over how to wait! Poll! Context switch! Status is explicitly checked by software 8

What is IO? Software RAM Controller Media Create RAM buffer Buffer IO copies data between RAM and Disk (HDD: Hard Disk Drive, SSD: Solid State Disk) SSD/ HDD 9

What is IO? Software RAM Controller Media Create RAM buffer Put command in RAM (NVMe, SATA, SCSI) Send command Storage stack software creates a command and sends it to a disk Buffer Command Receive Command SSD/ HDD 10

What is IO? Software RAM Controller Media Create RAM buffer Put command in RAM (NVMe, SCSI) Buffer Send command Command Wait for completion (Context Switch or Poll) Poll: thread stays busy -or- Context Switch: thread releases processor Receive Command Copy SSD/ HDD 11

What is IO? Software RAM Controller Media Create RAM buffer Put command in RAM (NVMe, SCSI) Send command Disk drive sends status message upon completion Check status in RAM Buffer Command Status Receive Command Copy Send Status For more on latency and throughput see: https://www.brighttalk.com/webcast/663/164323 SSD/ HDD 12

What is IO? Software RAM Controller Media 2uS 100 ms Create RAM buffer Put command in RAM (NVMe, SCSI) Send command Check status in RAM Buffer Command Range of IO latency Lower bound: single threaded SSD Upper bound: multi-threaded HDD Status Receive Command Copy Send Status For more on latency and throughput see: https://www.brighttalk.com/webcast/663/164323 SSD/ HDD 13

What is IO? Software RAM Controller Media Create RAM buffer Put command in RAM (NVMe, SCSI) Buffer Send command Command 2uS 100 ms Wait for completion (Context Switch or Poll) Check status in RAM Status Receive Command Copy Send Status SSD/ HDD 14

Enter Persistent Memory! Much faster than flash alone! Available today in the form of an NVDIMM! Can be used to store application data structures! Software uses Pmalloc (Persistent Memory Allocation) to allocate a region of persistent memory! Accessed by processor instructions like any other memory 15

Load/Store (Ld/St)! Load and Store are processor instructions! Generated by compliers to access memory! Proxy for any processor instruction that accesses memory! is loaded into or stored from processor registers! Software may be forced wait for data during an instruction! The thread is busy! The processor won t let the thread do anything else! No status checking errors generate exceptions 16

What is Ld/St? Software Pmalloc memory Processor Registers Processor Cache Media Pmalloc returns persistent memory (PM) where software data structures can be permanently stored. 17

What is Ld/St? < 200 ns Software Pmalloc memory Load Processor Registers Address Processor Cache A load (Ld) instruction fetches data from RAM into a processor register and may keep a copy in cache. This normally takes less than 200 ns. & Ack Media 18

What is Ld/St? Software Pmalloc memory Load Processor Registers Address Processor Cache Media Manipulate Registers & Ack Store Applications manipulate registers and use store (St) instructions to send data into processor cache. 19

What is Ld/St? Software Pmalloc memory Load Processor Registers Address Processor Cache Media Manipulate Registers & Ack Store Eventually the processor will flush data out of its cache. Address & 20

What is Ld/St? Software Pmalloc memory Load Processor Registers Address Processor Cache Media Manipulate Registers & Ack Store Sync/Flush Flush If the application cares about persistence it must force a flush Address & Ack 21

What is Ld/St? < 200 ns Software Pmalloc memory Load Manipulate Registers Processor Registers Address Processor Cache & Ack Media Store Sync/Flush Flush Address & Ack 22

Compare and Contrast Application View! IO! is read or written using RAM buffers! Software has control over how to wait (context switch or poll)! Status is explicitly checked by software! Ld/St! is loaded into or stored from processor registers! Software is forced by processor to wait for data during instruction! No status checking errors generate exceptions 23

Ld/St and memory mapping! Memory mapped IO! PCIe card registers are given memory addresses! Ld/St is used to access those registers! Designed for use to control IO (or other) card functions! Memory mapped files! stored in files (in PM) is given memory addresses! Ld/St is used to manipulate user data! Must flush after St to assure persistence 24

Latency Is The Disrupter

Compare and Contrast Technology View Key Question: Is it OK to force the CPU memory access pipeline to wait for a given storage or memory access to complete before proceeding?! May pause a thread in the middle of executing an instruction! May block reads and writes to all cores 26

When does latency fit into a Ld/St Budget? Key Question: Is it OK to force the CPU memory access pipeline to wait for a given storage or memory access to complete before proceeding?! May pause a thread in the middle of executing an instruction! May block reads and writes to all cores Technology DDR Connect Mem Mapped PCIe PCIe I/O Network (IB/Enet/FC) RAM Yes Maybe No No Battery+RAM Yes Maybe No No PM Yes Maybe No No Flash Sometimes Sometimes No No Rotating NA NA No No 27

Sometimes??!!! Flash write latency generally does not fit in Ld/St Budget! Write process takes much longer than a St! Long latency tail due to erase cycle, garbage collection! Flash read latency interference from controller! Erase cycle/wear leveling forces block granularity indexing! Index traversal at SSD scale would stress Ld/St latency budget! Flash with cache! Makes latency unpredictable! Is cache non-volatile? Hold that thought 28

Latency Thresholds Cause Disruption Latency (Log) 2 us 200 ns Min, Max Latencies For Example Technologies Context Switch NUMA * IO Ld/St HDD SATA SSD NVMe Flash Persistent Memory * Non Uniform Memory Access 29

Persistence Domains

What is a persistence domain?! A place that retains data during power loss! Because it is inherently non-volatile! Because it is flushed during or after power loss! Examples:! A group of flash or PM devices! An NVDIMM! A group of RAM devices that are powered by a battery! (part of) A controller that flushes caches and RAM to an SSD! With battery power! With capacitance 31

Examples of Persistence Domains RAM CPU NVDIMM PCIe SSD/ HDD! Requires Flush-On-Demand! Explicit flush by application or library! Disrupts CPU Caches Disk Array Controller RAM CPU PCIe SSD/ HDD! Enables Flush-On-Fail! Everything is saved after power loss! We re gonna need a bigger battery 32

Interconnect Latency

What about remote persistence domains? Application flush( ); RAM CPU PCIe Network Adapter Network Network Adapter PCIe RAM CPU! Network Latency Disruption SSD/ HDD NVDIMM! Most of today s networks do not approach Ld/St timescales.! The fastest port to port times for IB and projected Omnipath technologies might fit within a Ld/St budget! RDMA helps by eliminating round trips and software overhead! BUT For more on RDMA see: https://www.brighttalk.com/webcast/663/185909 34

What about remote persistence domains? Application flush( ); RAM CPU PCIe Network Adapter Network Network Adapter PCIe RAM CPU! Network Latency Disruption! There is no hardware flush action over PCI SSD/ HDD NVDIMM! AND there is no RDMA completion that indicates persistence! SO software is involved at the remote node 35

So How About This? Application write( ); RAM CPU PCIe Network Adapter Network Network Adapter PCIe RAM CPU! OK, but its not Ld/St! Avoids remote processor! Minimal remote latency for IO SSD/ HDD NVDIMM 36

Or This? Application flush( ); RAM CPU PCIe Network Adapter Network Network Adapter PCIe RAM CPU! Remote Flush on Fail SSD/ HDD! This is how inter-controller links on disk arrays work.! But it is still very difficult to get sync anywhere near Ld/St latency! especially across blades or chassis! or with routing! AND remember, the remote write is on flush, not St. 37

Current Industry Challenge Bring remote flush latency much closer to Ld/St time! Effects all of the elements in the system! Processor! NIC! Persistence Domain! Lots of activity is happening so stay tuned 38

NVM Programming Model TWG work One important use case is described by SNIA s new NVM PM Remote Access for High Availability paper! Taxonomy of local and remote use cases! Application recoverability requirements! Model and requirements for remote implementations of optimized flush as described in the SNIA NVM Programming Model 39

After This Webcast! Please rate this Webcast and provide us with feedback! This Webcast and a PDF of the slides will be posted to the SNIA Ethernet Storage Forum (ESF) website and available on-demand! http://www.snia.org/forums/esf/knowledge/webcasts! A full Q&A from this webcast, including answers to questions we couldn't get to today, will be posted to the SNIA-ESF blog! http://sniaesfblog.org/! Follow us on Twitter @ SNIAESF 40

Architectural Principles for Networked Solid State Storage Access