Near- Data Computa.on: It s Not (Just) About Performance

Size: px
Start display at page:

Download "Near- Data Computa.on: It s Not (Just) About Performance"

Transcription

1 Near- Data Computa.on: It s Not (Just) About Performance Steven Swanson Non- Vola0le Systems Laboratory Computer Science and Engineering University of California, San Diego 1

2 Solid State Memories NAND flash Ubiquitous, cheap Sort of slow, idiosyncra0c Phase change, Spin torque MRAMs, etc. On the horizon DRAM- like speed DRAM or flash- like density 2

3 Bandwidth Rela.ve to disk x à 2.4x/yr PCIe- Flash (2012) PCIe- PCM (2010) PCIe- PCM (2015?) DDR Fast NVM (2016?) Hard Drives (2006) PCIe- Flash (2007) 7200x à 2.4x/yr /Latency Rela.ve To Disk 3

4 Programmability in High-Speed Peripherals As hardware gets more sophis0cated, programmability emerges GPUs fixed func0on à Programmable shaders à full- blown programmability NICs Fixed func0on à MAC offloading à TCP offload Storage has been leb behind It hasn t goden any faster in 40 years 4

5 Why Near-data Processing? Well- studied Data Dependent Accesses Internal bandwidth interconnect bandwidth Moving data takes energy Trusted Computa0on NDP compute can be energy efficient Worth a Closer Look Storage latencies Interconnect Latency Storage latencies CPU latency Seman0c Flexibility NDP compute can be trusted Improved Real- 0me capabili0es 5

6 Diverse SSD Semantics File system offload/os bypass [Caulfield, ASPLOS 2012] [Zhang, FAST 2012] Caching support [Bhaskaran, INFLOW 2013] [Saxena, EuroSYS 2012] Database transac0ons support [Coburn SOSP 2013] [Ouyang HPCA 2011] [Prabhakaran OSDI 2008] NoSQL offload Samsung s SmartSSD [De FCCM 2013] Sparse storage address space FusionIO DFS Lessons Learned Our Goal: Make Implementa0on costs are high Programming an SSD Easy and Flexible. Fit with applica0ons is uncertain 6

7 The Sobware Defined SSD 7

8 The Software-defined SSD SDSSD Host Machine Application Generic SSD- APP interface PCIe Ctrl PCIe Bridge Memory Memory Memory Controller SSD CPU Memory Memory 3GB/s 3GB/s 3GB/s Memory Controller SSD CPU Memory Memory Memory Controller SSD CPU 4 GB/s 8

9 SDSSD Programming Model The host communicates with SDSSD processors via a general RPC interface. Host- CPU Send and receive RPC requests SSD- CPU Send and receive RPC requests Manage access to NV storage banks SDSSD CPU RPC Host CPU PCIe Ring Data streamer Memory 9

10 The SDSSD Prototype FPGA- based implementa0on DDR2 DRAM emulates PCM Configurable memory latency 48 ns reads, 150 ns writes 64GB across 8 controllers PCIe: 2 GB/s, full duplex 10

11 SDSSD Case Studies Basic IO Caching Transac0on processing NoSQL Databases 11

12 Basic IO 12

13 Normal IO Operations: Base-IO Base- IO App provides read/write func0ons. Just like a normal SSD. Applica0ons make system calls to the kernel The kernel block driver issues RPCs user accesses 1. SysCall 2. Read/Write RPCs kernel module SDSSD CPU NVM 13

14 Faster IO: Direct access IO OS installs access permissions on behalf of a process. Applica0ons issue RPCs directly to HW The App checks permission user accesses 1. userspace channel kernel module 2. kernel installs perms 3. userspace channel direct IO accesses SDSSD CPU NVM 14

15 Preliminary Performance Comparison 1800 Read 1800 Write Bandwidth (MB/s) Direct- IO Direct- IO- HW Base- IO Direct- IO Direct- IO- HW Base- IO

16 Caching 16

17 Caching NVMs will be caches before they are storage Applica.on file file Cache hits should be as fast as Direct- IO Caching App Tracks dirty data Tracks usage sta0s0cs Userspace Kernel File System system reads/writes Cache Library Cache Miss Cache Manager Device Driver Cache Hit Disk SDSSD Caching 17

18 4KB Cache Hit Latency 18

19 Transac0on Processing 19

20 Atomic Writes Write- ahead logging guarantees atomicity New interface for atomic opera0ons LogWrite(TID, file, file_off, log, log_off) Transac0on commit occurs at the memory interface à full memory bandwidth Logging Module inside SDSSD Transaction Table TID 15 PENDING TID 24 FREE TID 37 COMMITTED " " X " " Metadata File Log File New D New B New A New C Data File Old D Old B Old A Old C 20

21 Atomic Writes SDSSD atomic- writes outperform pure sobware approach by nearly 2.0x SDSSD atomic- writes achieve comparable performance to a pure hardware implementa0on AtomicWrite Bandwidth (GB/s) SDSSD Tx Opt SDSSD Tx Moneta Tx Soft atomic Request Size (KB) 21

22 Key Value Store 22

23 Key-value store Key- value store is a fundamental building block for many enterprise applica0ons Supports simple opera0ons o o o Get to retrieve the value corresponding to a key Put to insert or update a key- value pair Delete to remove a key- value pair Open- chaining hash table implementa0on. 23

24 Direct-IO based Key-Value Store Get K3 Key K3 Hash(K3) 0 1 K1 K4 V1 V4 K2 V2 K3 V3 Compare Keys M- 1 Index Structure I/O K7 V7 K8 V8 K9 V9 Key- value data Mismatch Match Host SSD 24

25 SDSSD Key-Value Store App Get K3 K1 V1 K2 V2 K3 V3 0 1 K4 V4 Key K3 Hash(K3) Hash Idx RPC(Get) M- 1 Index Structure Hash Idx Key K3 K7 V7 K8 V8 K9 V9 Mismatch Match Compare Keys Key- value data K3 V3 DMA Host SDSSD SPU 25

26 MemcacheDB performance 0.6 Throughput (in Million ops/s) Direct- IO Key- Value- HW Key- Value 0 Get Put Workload- A Workload- B Throughput (Million ops/s) Get Avg. chain length Direct- IO Key- Value Key- Value- HW Throughput (Million ops/s) Put Avg. chain length Direct- IO Key- Value Key- Value- HW 26

27 Ease-of-Use 27

28 Conclusion New memories argue for near- data processing But not just for bandwidth! Trust, low- latency, and flexibility are cri0cal too! Seman0c flexibility is valuable and viable 15% reduc0on in performance compared to fixed- func0on implementa0on 6-30x reduc0on in implementa0on 0me Every applica0on could have a custom SSD 28

29 Thanks! 29

30 Ques0ons? 30

Redrawing the Boundary Between So3ware and Storage for Fast Non- Vola;le Memories

Redrawing the Boundary Between So3ware and Storage for Fast Non- Vola;le Memories Redrawing the Boundary Between So3ware and Storage for Fast Non- Vola;le Memories Steven Swanson Director, Non- Vola;le System Laboratory Computer Science and Engineering University of California, San

More information

Willow: A User- Programmable SSD

Willow: A User- Programmable SSD Willow: A User- Programmable SSD Sudharsan Seshadri, Mark Gahagan, Sundaram Bhaskaran, Trevor Bunker, Arup De, Yanqin Jin, Yang Liu, and Steven Swanson Non- VolaDle Systems Laboratory Computer Science

More information

Moneta: A High-Performance Storage Architecture for Next-generation, Non-volatile Memories

Moneta: A High-Performance Storage Architecture for Next-generation, Non-volatile Memories Moneta: A High-Performance Storage Architecture for Next-generation, Non-volatile Memories Adrian M. Caulfield Arup De, Joel Coburn, Todor I. Mollov, Rajesh K. Gupta, Steven Swanson Non-Volatile Systems

More information

Moneta: A High-performance Storage Array Architecture for Nextgeneration, Micro 2010

Moneta: A High-performance Storage Array Architecture for Nextgeneration, Micro 2010 Moneta: A High-performance Storage Array Architecture for Nextgeneration, Non-volatile Memories Micro 2010 NVM-based SSD NVMs are replacing spinning-disks Performance of disks has lagged NAND flash showed

More information

Onyx: A Prototype Phase-Change Memory Storage Array

Onyx: A Prototype Phase-Change Memory Storage Array Onyx: A Prototype Phase-Change Memory Storage Array Ameen Akel * Adrian Caulfield, Todor Mollov, Rajesh Gupta, Steven Swanson Non-Volatile Systems Laboratory, Department of Computer Science and Engineering

More information

Implica(ons of Non Vola(le Memory on So5ware Architectures. Nisha Talagala Lead Architect, Fusion- io

Implica(ons of Non Vola(le Memory on So5ware Architectures. Nisha Talagala Lead Architect, Fusion- io Implica(ons of Non Vola(le Memory on So5ware Architectures Nisha Talagala Lead Architect, Fusion- io Overview Non Vola;le Memory Technology NVM in the Datacenter Op;mizing sobware for the iomemory Tier

More information

New Abstractions for Fast Non-Volatile Storage

New Abstractions for Fast Non-Volatile Storage New Abstractions for Fast Non-Volatile Storage Joel Coburn, Adrian Caulfield, Laura Grupp, Ameen Akel, Steven Swanson Non-volatile Systems Laboratory Department of Computer Science and Engineering University

More information

Aerie: Flexible File-System Interfaces to Storage-Class Memory [Eurosys 2014] Operating System Design Yongju Song

Aerie: Flexible File-System Interfaces to Storage-Class Memory [Eurosys 2014] Operating System Design Yongju Song Aerie: Flexible File-System Interfaces to Storage-Class Memory [Eurosys 2014] Operating System Design Yongju Song Outline 1. Storage-Class Memory (SCM) 2. Motivation 3. Design of Aerie 4. File System Features

More information

Farewell to Servers: Resource Disaggregation

Farewell to Servers: Resource Disaggregation Farewell to Servers: Hardware, Software, and Network Approaches towards Datacenter Resource Disaggregation Yiying Zhang 2 Monolithic Computer OS / Hypervisor 3 Can monolithic Application Hardware servers

More information

Onyx: A Protoype Phase Change Memory Storage Array

Onyx: A Protoype Phase Change Memory Storage Array Onyx: A Protoype Phase Change ory Storage Array Ameen Akel Adrian M. Caulfield Todor I. Mollov Rajesh K. Gupta Steven Swanson Computer Science and Engineering University of California, San Diego Abstract

More information

PCIe Storage Beyond SSDs

PCIe Storage Beyond SSDs PCIe Storage Beyond SSDs Fabian Trumper NVM Solutions Group PMC-Sierra Santa Clara, CA 1 Classic Memory / Storage Hierarchy FAST, VOLATILE CPU Cache DRAM Performance Gap Performance Tier (SSDs) SLOW, NON-VOLATILE

More information

A Disseminated Distributed OS for Hardware Resource Disaggregation Yizhou Shan

A Disseminated Distributed OS for Hardware Resource Disaggregation Yizhou Shan LegoOS A Disseminated Distributed OS for Hardware Resource Disaggregation Yizhou Shan, Yutong Huang, Yilun Chen, and Yiying Zhang Y 4 1 2 Monolithic Server OS / Hypervisor 3 Problems? 4 cpu mem Resource

More information

Designing a True Direct-Access File System with DevFS

Designing a True Direct-Access File System with DevFS Designing a True Direct-Access File System with DevFS Sudarsun Kannan, Andrea Arpaci-Dusseau, Remzi Arpaci-Dusseau University of Wisconsin-Madison Yuangang Wang, Jun Xu, Gopinath Palani Huawei Technologies

More information

DCS-ctrl: A Fast and Flexible Device-Control Mechanism for Device-Centric Server Architecture

DCS-ctrl: A Fast and Flexible Device-Control Mechanism for Device-Centric Server Architecture DCS-ctrl: A Fast and Flexible ice-control Mechanism for ice-centric Server Architecture Dongup Kwon 1, Jaehyung Ahn 2, Dongju Chae 2, Mohammadamin Ajdari 2, Jaewon Lee 1, Suheon Bae 1, Youngsok Kim 1,

More information

Providing Safe, User Space Access to Fast, Solid State Disks

Providing Safe, User Space Access to Fast, Solid State Disks Providing Safe, User Space Access to Fast, Solid State Disks Adrian M. Caulfield Todor I. Mollov Louis Alex Eisner Arup De Joel Coburn Steven Swanson Computer Science and Engineering Department University

More information

Strata: A Cross Media File System. Youngjin Kwon, Henrique Fingler, Tyler Hunt, Simon Peter, Emmett Witchel, Thomas Anderson

Strata: A Cross Media File System. Youngjin Kwon, Henrique Fingler, Tyler Hunt, Simon Peter, Emmett Witchel, Thomas Anderson A Cross Media File System Youngjin Kwon, Henrique Fingler, Tyler Hunt, Simon Peter, Emmett Witchel, Thomas Anderson 1 Let s build a fast server NoSQL store, Database, File server, Mail server Requirements

More information

RDMA and Hardware Support

RDMA and Hardware Support RDMA and Hardware Support SIGCOMM Topic Preview 2018 Yibo Zhu Microsoft Research 1 The (Traditional) Journey of Data How app developers see the network Under the hood This architecture had been working

More information

Solid State Storage Technologies. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

Solid State Storage Technologies. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University Solid State Storage Technologies Jin-Soo Kim (jinsookim@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu NVMe (1) NVM Express (NVMe) For accessing PCIe-based SSDs Bypass

More information

Getting Real: Lessons in Transitioning Research Simulations into Hardware Systems

Getting Real: Lessons in Transitioning Research Simulations into Hardware Systems Getting Real: Lessons in Transitioning Research Simulations into Hardware Systems Mohit Saxena, Yiying Zhang Michael Swift, Andrea Arpaci-Dusseau and Remzi Arpaci-Dusseau Flash Storage Stack Research SSD

More information

Persistent Memory over Fabrics

Persistent Memory over Fabrics Persistent Memory over Fabrics Rob Davis, Mellanox Technologies Chet Douglas, Intel Paul Grun, Cray, Inc Tom Talpey, Microsoft Santa Clara, CA 1 Agenda The Promise of Persistent Memory over Fabrics Driving

More information

SoftFlash: Programmable Storage in Future Data Centers Jae Do Researcher, Microsoft Research

SoftFlash: Programmable Storage in Future Data Centers Jae Do Researcher, Microsoft Research SoftFlash: Programmable Storage in Future Data Centers Jae Do Researcher, Microsoft Research 1 The world s most valuable resource Data is everywhere! May. 2017 Values from Data! Need infrastructures for

More information

LITE Kernel RDMA. Support for Datacenter Applications. Shin-Yeh Tsai, Yiying Zhang

LITE Kernel RDMA. Support for Datacenter Applications. Shin-Yeh Tsai, Yiying Zhang LITE Kernel RDMA Support for Datacenter Applications Shin-Yeh Tsai, Yiying Zhang Time 2 Berkeley Socket Userspace Kernel Hardware Time 1983 2 Berkeley Socket TCP Offload engine Arrakis & mtcp IX Userspace

More information

Summarizer: Trading Communication with Computing Near Storage

Summarizer: Trading Communication with Computing Near Storage Summarizer: Trading Communication with Computing Near Storage Gunjae Koo*, Kiran Kumar Matam*, Te I, H.V. Krishina Giri Nara*, Jing Li, Hung-Wei Tseng, Steven Swanson, Murali Annavaram* *University of

More information

STORAGE LATENCY x. RAMAC 350 (600 ms) NAND SSD (60 us)

STORAGE LATENCY x. RAMAC 350 (600 ms) NAND SSD (60 us) 1 STORAGE LATENCY 2 RAMAC 350 (600 ms) 1956 10 5 x NAND SSD (60 us) 2016 COMPUTE LATENCY 3 RAMAC 305 (100 Hz) 1956 10 8 x 1000x CORE I7 (1 GHZ) 2016 NON-VOLATILE MEMORY 1000x faster than NAND 3D XPOINT

More information

Farewell to Servers: Hardware, Software, and Network Approaches towards Datacenter Resource Disaggregation

Farewell to Servers: Hardware, Software, and Network Approaches towards Datacenter Resource Disaggregation Farewell to Servers: Hardware, Software, and Network Approaches towards Datacenter Resource Disaggregation Yiying Zhang Datacenter 3 Monolithic Computer OS / Hypervisor 4 Can monolithic Application Hardware

More information

Solid State Storage Technologies. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

Solid State Storage Technologies. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University Solid State Storage Technologies Jin-Soo Kim (jinsookim@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu NVMe (1) The industry standard interface for high-performance NVM

More information

Extending RDMA for Persistent Memory over Fabrics. Live Webcast October 25, 2018

Extending RDMA for Persistent Memory over Fabrics. Live Webcast October 25, 2018 Extending RDMA for Persistent Memory over Fabrics Live Webcast October 25, 2018 Today s Presenters John Kim SNIA NSF Chair Mellanox Tony Hurson Intel Rob Davis Mellanox SNIA-At-A-Glance 3 SNIA Legal Notice

More information

MQSim: A Framework for Enabling Realistic Studies of Modern Multi-Queue SSD Devices

MQSim: A Framework for Enabling Realistic Studies of Modern Multi-Queue SSD Devices MQSim: A Framework for Enabling Realistic Studies of Modern Multi-Queue SSD Devices Arash Tavakkol, Juan Gómez-Luna, Mohammad Sadrosadati, Saugata Ghose, Onur Mutlu February 13, 2018 Executive Summary

More information

Toward a Memory-centric Architecture

Toward a Memory-centric Architecture Toward a Memory-centric Architecture Martin Fink EVP & Chief Technology Officer Western Digital Corporation August 8, 2017 1 SAFE HARBOR DISCLAIMERS Forward-Looking Statements This presentation contains

More information

Mohsen Imani. University of California San Diego. System Energy Efficiency Lab seelab.ucsd.edu

Mohsen Imani. University of California San Diego. System Energy Efficiency Lab seelab.ucsd.edu Mohsen Imani University of California San Diego Winter 2016 Technology Trend for IoT http://www.flashmemorysummit.com/english/collaterals/proceedi ngs/2014/20140807_304c_hill.pdf 2 Motivation IoT significantly

More information

PASTE: A Networking API for Non-Volatile Main Memory

PASTE: A Networking API for Non-Volatile Main Memory PASTE: A Networking API for Non-Volatile Main Memory Michio Honda (NEC Laboratories Europe) Lars Eggert (NetApp) Douglas Santry (NetApp) TSVAREA@IETF 99, Prague May 22th 2017 More details at our HotNets

More information

Computer Systems Laboratory Sungkyunkwan University

Computer Systems Laboratory Sungkyunkwan University I/O System Jin-Soo Kim (jinsookim@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu Introduction (1) I/O devices can be characterized by Behavior: input, output, storage

More information

Flash Controller Solutions in Programmable Technology

Flash Controller Solutions in Programmable Technology Flash Controller Solutions in Programmable Technology David McIntyre Senior Business Unit Manager Computer and Storage Business Unit Altera Corp. dmcintyr@altera.com Flash Memory Summit 2012 Santa Clara,

More information

The Non-Volatile Memory Verbs Provider (NVP): Using the OFED Framework to access solid state storage

The Non-Volatile Memory Verbs Provider (NVP): Using the OFED Framework to access solid state storage The Non-Volatile Memory Verbs Provider (NVP): Using the OFED Framework to access solid state storage Bernard Metzler 1, Animesh Trivedi 1, Lars Schneidenbach 2, Michele Franceschini 2, Patrick Stuedi 1,

More information

XPU A Programmable FPGA Accelerator for Diverse Workloads

XPU A Programmable FPGA Accelerator for Diverse Workloads XPU A Programmable FPGA Accelerator for Diverse Workloads Jian Ouyang, 1 (ouyangjian@baidu.com) Ephrem Wu, 2 Jing Wang, 1 Yupeng Li, 1 Hanlin Xie 1 1 Baidu, Inc. 2 Xilinx Outlines Background - FPGA for

More information

The Many Flavors of NAND and More to Come

The Many Flavors of NAND and More to Come The Many Flavors of NAND and More to Come Brian Shirley VP Micron Memory Product Group 1 NAND Market Growth Drivers Top 10 Applications by Units Shipped 4000 # of Units per Application 3500 Millions of

More information

Unblinding the OS to Optimize User-Perceived Flash SSD Latency

Unblinding the OS to Optimize User-Perceived Flash SSD Latency Unblinding the OS to Optimize User-Perceived Flash SSD Latency Woong Shin *, Jaehyun Park **, Heon Y. Yeom * * Seoul National University ** Arizona State University USENIX HotStorage 2016 Jun. 21, 2016

More information

D E N A L I S T O R A G E I N T E R F A C E. Laura Caulfield Senior Software Engineer. Arie van der Hoeven Principal Program Manager

D E N A L I S T O R A G E I N T E R F A C E. Laura Caulfield Senior Software Engineer. Arie van der Hoeven Principal Program Manager 1 T HE D E N A L I N E X T - G E N E R A T I O N H I G H - D E N S I T Y S T O R A G E I N T E R F A C E Laura Caulfield Senior Software Engineer Arie van der Hoeven Principal Program Manager Outline Technology

More information

Comparing Performance of Solid State Devices and Mechanical Disks

Comparing Performance of Solid State Devices and Mechanical Disks Comparing Performance of Solid State Devices and Mechanical Disks Jiri Simsa Milo Polte, Garth Gibson PARALLEL DATA LABORATORY Carnegie Mellon University Motivation Performance gap [Pugh71] technology

More information

Solros: A Data-Centric Operating System Architecture for Heterogeneous Computing

Solros: A Data-Centric Operating System Architecture for Heterogeneous Computing Solros: A Data-Centric Operating System Architecture for Heterogeneous Computing Changwoo Min, Woonhak Kang, Mohan Kumar, Sanidhya Kashyap, Steffen Maass, Heeseung Jo, Taesoo Kim Virginia Tech, ebay, Georgia

More information

Evolving Solid State Storage. Bob Beauchamp EMC Distinguished Engineer

Evolving Solid State Storage. Bob Beauchamp EMC Distinguished Engineer Evolving Solid State Storage Bob Beauchamp EMC Distinguished Engineer Macro trends Scale-out and multi-core processors Flash hitting its stride but at a cliff? Looming DRAM challenge The rise of low-latency

More information

MANAGING MULTI-TIERED NON-VOLATILE MEMORY SYSTEMS FOR COST AND PERFORMANCE 8/9/16

MANAGING MULTI-TIERED NON-VOLATILE MEMORY SYSTEMS FOR COST AND PERFORMANCE 8/9/16 MANAGING MULTI-TIERED NON-VOLATILE MEMORY SYSTEMS FOR COST AND PERFORMANCE 8/9/16 THE DATA CHALLENGE Performance Improvement (RelaLve) 4.4 ZB Total data created, replicated, and consumed in a single year

More information

Efficient Memory Mapped File I/O for In-Memory File Systems. Jungsik Choi, Jiwon Kim, Hwansoo Han

Efficient Memory Mapped File I/O for In-Memory File Systems. Jungsik Choi, Jiwon Kim, Hwansoo Han Efficient Memory Mapped File I/O for In-Memory File Systems Jungsik Choi, Jiwon Kim, Hwansoo Han Operations Per Second Storage Latency Close to DRAM SATA/SAS Flash SSD (~00μs) PCIe Flash SSD (~60 μs) D-XPoint

More information

The Long-Term Future of Solid State Storage Jim Handy Objective Analysis

The Long-Term Future of Solid State Storage Jim Handy Objective Analysis The Long-Term Future of Solid State Storage Jim Handy Objective Analysis Agenda How did we get here? Why it s suboptimal How we move ahead Why now? DRAM speed scaling Changing role of NVM in computing

More information

Chapter 6. Storage and Other I/O Topics

Chapter 6. Storage and Other I/O Topics Chapter 6 Storage and Other I/O Topics Introduction I/O devices can be characterized by Behaviour: input, output, storage Partner: human or machine Data rate: bytes/sec, transfers/sec I/O bus connections

More information

Using MRAM to Create Intelligent SSDs

Using MRAM to Create Intelligent SSDs Using MRAM to Create Intelligent SSDs Jérôme Gaysse Senior Technology&Market Analyst jerome.gaysse@silinnov-consulting.com Santa Clara, CA 1 Study context Analysis of system & application Performance modeling

More information

Extreme Storage Performance with exflash DIMM and AMPS

Extreme Storage Performance with exflash DIMM and AMPS Extreme Storage Performance with exflash DIMM and AMPS 214 by 6East Technologies, Inc. and Lenovo Corporation All trademarks or registered trademarks mentioned here are the property of their respective

More information

NVMe over Universal RDMA Fabrics

NVMe over Universal RDMA Fabrics NVMe over Universal RDMA Fabrics Build a Flexible Scale-Out NVMe Fabric with Concurrent RoCE and iwarp Acceleration Broad spectrum Ethernet connectivity Universal RDMA NVMe Direct End-to-end solutions

More information

Overcoming System Memory Challenges with Persistent Memory and NVDIMM-P

Overcoming System Memory Challenges with Persistent Memory and NVDIMM-P Overcoming System Memory Challenges with Persistent Memory and NVDIMM-P JEDEC Server Forum 2017 Bill Gervasi, Discobolus Designs Copyright 2017 Jonathan Hinkle, Lenovo Datacenter Research and Technology

More information

Beyond Block I/O: Rethinking

Beyond Block I/O: Rethinking Beyond Block I/O: Rethinking Traditional Storage Primitives Xiangyong Ouyang *, David Nellans, Robert Wipfel, David idflynn, D. K. Panda * * The Ohio State University Fusion io Agenda Introduction and

More information

N V M e o v e r F a b r i c s -

N V M e o v e r F a b r i c s - N V M e o v e r F a b r i c s - H i g h p e r f o r m a n c e S S D s n e t w o r k e d f o r c o m p o s a b l e i n f r a s t r u c t u r e Rob Davis, VP Storage Technology, Mellanox OCP Evolution Server

More information

CS3600 SYSTEMS AND NETWORKS

CS3600 SYSTEMS AND NETWORKS CS3600 SYSTEMS AND NETWORKS NORTHEASTERN UNIVERSITY Lecture 11: File System Implementation Prof. Alan Mislove (amislove@ccs.neu.edu) File-System Structure File structure Logical storage unit Collection

More information

CSE 120: Principles of Operating Systems. Lecture 10. File Systems. November 6, Prof. Joe Pasquale

CSE 120: Principles of Operating Systems. Lecture 10. File Systems. November 6, Prof. Joe Pasquale CSE 120: Principles of Operating Systems Lecture 10 File Systems November 6, 2003 Prof. Joe Pasquale Department of Computer Science and Engineering University of California, San Diego 2003 by Joseph Pasquale

More information

Exploiting the benefits of native programming access to NVM devices

Exploiting the benefits of native programming access to NVM devices Exploiting the benefits of native programming access to NVM devices Ashish Batwara Principal Storage Architect Fusion-io Traditional Storage Stack User space Application Kernel space Filesystem LBA Block

More information

Using Transparent Compression to Improve SSD-based I/O Caches

Using Transparent Compression to Improve SSD-based I/O Caches Using Transparent Compression to Improve SSD-based I/O Caches Thanos Makatos, Yannis Klonatos, Manolis Marazakis, Michail D. Flouris, and Angelos Bilas {mcatos,klonatos,maraz,flouris,bilas}@ics.forth.gr

More information

Computer Architecture Computer Science & Engineering. Chapter 6. Storage and Other I/O Topics BK TP.HCM

Computer Architecture Computer Science & Engineering. Chapter 6. Storage and Other I/O Topics BK TP.HCM Computer Architecture Computer Science & Engineering Chapter 6 Storage and Other I/O Topics Introduction I/O devices can be characterized by Behaviour: input, output, storage Partner: human or machine

More information

Accessing NVM Locally and over RDMA Challenges and Opportunities

Accessing NVM Locally and over RDMA Challenges and Opportunities Accessing NVM Locally and over RDMA Challenges and Opportunities Wendy Elsasser Megan Grodowitz William Wang MSST - May 2018 Emerging NVM A wide variety of technologies with varied characteristics Address

More information

Key Points. Rotational delay vs seek delay Disks are slow. Techniques for making disks faster. Flash and SSDs

Key Points. Rotational delay vs seek delay Disks are slow. Techniques for making disks faster. Flash and SSDs IO 1 Today IO 2 Key Points CPU interface and interaction with IO IO devices The basic structure of the IO system (north bridge, south bridge, etc.) The key advantages of high speed serial lines. The benefits

More information

How Might Recently Formed System Interconnect Consortia Affect PM? Doug Voigt, SNIA TC

How Might Recently Formed System Interconnect Consortia Affect PM? Doug Voigt, SNIA TC How Might Recently Formed System Interconnect Consortia Affect PM? Doug Voigt, SNIA TC Three Consortia Formed in Oct 2016 Gen-Z Open CAPI CCIX complex to rack scale memory fabric Cache coherent accelerator

More information

Disks: Structure and Scheduling

Disks: Structure and Scheduling Disks: Structure and Scheduling COMS W4118 References: Opera;ng Systems Concepts (9e), Linux Kernel Development, previous W4118s Copyright no2ce: care has been taken to use only those web images deemed

More information

Memory Hierarchy Y. K. Malaiya

Memory Hierarchy Y. K. Malaiya Memory Hierarchy Y. K. Malaiya Acknowledgements Computer Architecture, Quantitative Approach - Hennessy, Patterson Vishwani D. Agrawal Review: Major Components of a Computer Processor Control Datapath

More information

Accelerating Data Centers Using NVMe and CUDA

Accelerating Data Centers Using NVMe and CUDA Accelerating Data Centers Using NVMe and CUDA Stephen Bates, PhD Technical Director, CSTO, PMC-Sierra Santa Clara, CA 1 Project Donard @ PMC-Sierra Donard is a PMC CTO project that leverages NVM Express

More information

A Row Buffer Locality-Aware Caching Policy for Hybrid Memories. HanBin Yoon Justin Meza Rachata Ausavarungnirun Rachael Harding Onur Mutlu

A Row Buffer Locality-Aware Caching Policy for Hybrid Memories. HanBin Yoon Justin Meza Rachata Ausavarungnirun Rachael Harding Onur Mutlu A Row Buffer Locality-Aware Caching Policy for Hybrid Memories HanBin Yoon Justin Meza Rachata Ausavarungnirun Rachael Harding Onur Mutlu Overview Emerging memories such as PCM offer higher density than

More information

Data Processing at the Speed of 100 Gbps using Apache Crail. Patrick Stuedi IBM Research

Data Processing at the Speed of 100 Gbps using Apache Crail. Patrick Stuedi IBM Research Data Processing at the Speed of 100 Gbps using Apache Crail Patrick Stuedi IBM Research The CRAIL Project: Overview Data Processing Framework (e.g., Spark, TensorFlow, λ Compute) Spark-IO Albis Pocket

More information

Virtual Memory. Reading. Sections 5.4, 5.5, 5.6, 5.8, 5.10 (2) Lecture notes from MKP and S. Yalamanchili

Virtual Memory. Reading. Sections 5.4, 5.5, 5.6, 5.8, 5.10 (2) Lecture notes from MKP and S. Yalamanchili Virtual Memory Lecture notes from MKP and S. Yalamanchili Sections 5.4, 5.5, 5.6, 5.8, 5.10 Reading (2) 1 The Memory Hierarchy ALU registers Cache Memory Memory Memory Managed by the compiler Memory Managed

More information

Evaluating Phase Change Memory for Enterprise Storage Systems

Evaluating Phase Change Memory for Enterprise Storage Systems Hyojun Kim Evaluating Phase Change Memory for Enterprise Storage Systems IBM Almaden Research Micron provided a prototype SSD built with 45 nm 1 Gbit Phase Change Memory Measurement study Performance Characteris?cs

More information

Enabling NVMe I/O Scale

Enabling NVMe I/O Scale Enabling NVMe I/O Determinism @ Scale Chris Petersen, Hardware System Technologist Wei Zhang, Software Engineer Alexei Naberezhnov, Software Engineer Facebook Facebook @ Scale 800 Million 1.3 Billion 2.2

More information

Overview and Current Topics in Solid State Storage

Overview and Current Topics in Solid State Storage Overview and Current Topics in Solid State Storage Presenter name, company affiliation Presenter Rob name, Peglar company affiliation Xiotech Corporation SNIA Legal Notice The material contained in this

More information

January 28-29, 2014 San Jose

January 28-29, 2014 San Jose January 28-29, 2014 San Jose Flash for the Future Software Optimizations for Non Volatile Memory Nisha Talagala, Lead Architect, Fusion-io Gary Orenstein, Chief Marketing Officer, Fusion-io @garyorenstein

More information

Emerging NVM Memory Technologies

Emerging NVM Memory Technologies Emerging NVM Memory Technologies Yuan Xie Associate Professor The Pennsylvania State University Department of Computer Science & Engineering www.cse.psu.edu/~yuanxie yuanxie@cse.psu.edu Position Statement

More information

3D Xpoint Status and Forecast 2017

3D Xpoint Status and Forecast 2017 3D Xpoint Status and Forecast 2017 Mark Webb MKW 1 Ventures Consulting, LLC Memory Technologies Latency Density Cost HVM ready DRAM ***** *** *** ***** NAND * ***** ***** ***** MRAM ***** * * *** 3DXP

More information

Generic System Calls for GPUs

Generic System Calls for GPUs Generic System Calls for GPUs Ján Veselý*, Arkaprava Basu, Abhishek Bhattacharjee*, Gabriel H. Loh, Mark Oskin, Steven K. Reinhardt *Rutgers University, Indian Institute of Science, Advanced Micro Devices

More information

CSE 451: Operating Systems Spring Module 12 Secondary Storage. Steve Gribble

CSE 451: Operating Systems Spring Module 12 Secondary Storage. Steve Gribble CSE 451: Operating Systems Spring 2009 Module 12 Secondary Storage Steve Gribble Secondary storage Secondary storage typically: is anything that is outside of primary memory does not permit direct execution

More information

CBM: A Cooperative Buffer Management for SSD

CBM: A Cooperative Buffer Management for SSD 3 th International Conference on Massive Storage Systems and Technology (MSST 4) : A Cooperative Buffer Management for SSD Qingsong Wei, Cheng Chen, Jun Yang Data Storage Institute, A-STAR, Singapore June

More information

VMware vsphere Virtualization of PMEM (PM) Richard A. Brunner, VMware

VMware vsphere Virtualization of PMEM (PM) Richard A. Brunner, VMware VMware vsphere Virtualization of PMEM (PM) Richard A. Brunner, VMware Disclaimer This presentation may contain product features that are currently under development. This overview of new technology represents

More information

WORT: Write Optimal Radix Tree for Persistent Memory Storage Systems

WORT: Write Optimal Radix Tree for Persistent Memory Storage Systems WORT: Write Optimal Radix Tree for Persistent Memory Storage Systems Se Kwon Lee K. Hyun Lim 1, Hyunsub Song, Beomseok Nam, Sam H. Noh UNIST 1 Hongik University Persistent Memory (PM) Persistent memory

More information

NVM PCIe Networked Flash Storage

NVM PCIe Networked Flash Storage NVM PCIe Networked Flash Storage Peter Onufryk Microsemi Corporation Santa Clara, CA 1 PCI Express (PCIe) Mid-range/High-end Specification defined by PCI-SIG Software compatible with PCI and PCI-X Reliable,

More information

Adrian Proctor Vice President, Marketing Viking Technology

Adrian Proctor Vice President, Marketing Viking Technology Storage PRESENTATION in the TITLE DIMM GOES HERE Socket Adrian Proctor Vice President, Marketing Viking Technology SNIA Legal Notice The material contained in this tutorial is copyrighted by the SNIA unless

More information

SAY-Go: Towards Transparent and Seamless Storage-As-You-Go with Persistent Memory

SAY-Go: Towards Transparent and Seamless Storage-As-You-Go with Persistent Memory SAY-Go: Towards Transparent and Seamless Storage-As-You-Go with Persistent Memory Hyeonho Song, Sam H. Noh UNIST HotStorage 2018 Contents Persistent Memory Motivation SAY-Go Design Implementation Evaluation

More information

HADP Talk BlueDBM: An appliance for Big Data Analytics

HADP Talk BlueDBM: An appliance for Big Data Analytics HADP Talk BlueDBM: An appliance for Big Data Analytics Sang-Woo Jun* Ming Liu* Sungjin Lee* Jamey Hicks+ John Ankcorn+ Myron King+ Shuotao Xu* Arvind* *MIT Computer Science and Artificial Intelligence

More information

SUPERMICRO, VEXATA AND INTEL ENABLING NEW LEVELS PERFORMANCE AND EFFICIENCY FOR REAL-TIME DATA ANALYTICS FOR SQL DATA WAREHOUSE DEPLOYMENTS

SUPERMICRO, VEXATA AND INTEL ENABLING NEW LEVELS PERFORMANCE AND EFFICIENCY FOR REAL-TIME DATA ANALYTICS FOR SQL DATA WAREHOUSE DEPLOYMENTS TABLE OF CONTENTS 2 THE AGE OF INFORMATION ACCELERATION Vexata Provides the Missing Piece in The Information Acceleration Puzzle The Vexata - Supermicro Partnership 4 CREATING ULTRA HIGH-PERFORMANCE DATA

More information

Dell PowerEdge R730xd Servers with Samsung SM1715 NVMe Drives Powers the Aerospike Fraud Prevention Benchmark

Dell PowerEdge R730xd Servers with Samsung SM1715 NVMe Drives Powers the Aerospike Fraud Prevention Benchmark Dell PowerEdge R730xd Servers with Samsung SM1715 NVMe Drives Powers the Aerospike Fraud Prevention Benchmark Testing validation report prepared under contract with Dell Introduction As innovation drives

More information

Non-Volatile Memory Cache Enhancements: Turbo-Charging Client Platform Performance

Non-Volatile Memory Cache Enhancements: Turbo-Charging Client Platform Performance Non-Volatile Memory Cache Enhancements: Turbo-Charging Client Platform Performance By Robert E Larsen NVM Cache Product Line Manager Intel Corporation August 2008 1 Legal Disclaimer INFORMATION IN THIS

More information

Flash Trends: Challenges and Future

Flash Trends: Challenges and Future Flash Trends: Challenges and Future John D. Davis work done at Microsoft Researcher- Silicon Valley in collaboration with Laura Caulfield*, Steve Swanson*, UCSD* 1 My Research Areas of Interest Flash characteristics

More information

Overview and Current Topics in Solid State Storage

Overview and Current Topics in Solid State Storage Overview and Current Topics in Solid State Storage Presenter name, company affiliation Presenter Rob name, Peglar company affiliation Xiotech Corporation SNIA Legal Notice The material contained in this

More information

SOFTWARE-DEFINED MEMORY HIERARCHIES: SCALABILITY AND QOS IN THOUSAND-CORE SYSTEMS

SOFTWARE-DEFINED MEMORY HIERARCHIES: SCALABILITY AND QOS IN THOUSAND-CORE SYSTEMS SOFTWARE-DEFINED MEMORY HIERARCHIES: SCALABILITY AND QOS IN THOUSAND-CORE SYSTEMS DANIEL SANCHEZ MIT CSAIL IAP MEETING MAY 21, 2013 Research Agenda Lack of technology progress Moore s Law still alive Power

More information

NVMe SSDs with Persistent Memory Regions

NVMe SSDs with Persistent Memory Regions NVMe SSDs with Persistent Memory Regions Chander Chadha Sr. Manager Product Marketing, Toshiba Memory America, Inc. 2018 Toshiba Memory America, Inc. August 2018 1 Agenda q Why Persistent Memory is needed

More information

Low-Overhead Flash Disaggregation via NVMe-over-Fabrics Vijay Balakrishnan Memory Solutions Lab. Samsung Semiconductor, Inc.

Low-Overhead Flash Disaggregation via NVMe-over-Fabrics Vijay Balakrishnan Memory Solutions Lab. Samsung Semiconductor, Inc. Low-Overhead Flash Disaggregation via NVMe-over-Fabrics Vijay Balakrishnan Memory Solutions Lab. Samsung Semiconductor, Inc. 1 DISCLAIMER This presentation and/or accompanying oral statements by Samsung

More information

Emulex LPe16000B 16Gb Fibre Channel HBA Evaluation

Emulex LPe16000B 16Gb Fibre Channel HBA Evaluation Demartek Emulex LPe16000B 16Gb Fibre Channel HBA Evaluation Evaluation report prepared under contract with Emulex Executive Summary The computing industry is experiencing an increasing demand for storage

More information

Developing Low Latency NVMe Systems for HyperscaleData Centers. Prepared by Engling Yeo Santa Clara, CA Date: 08/04/2017

Developing Low Latency NVMe Systems for HyperscaleData Centers. Prepared by Engling Yeo Santa Clara, CA Date: 08/04/2017 Developing Low Latency NVMe Systems for HyperscaleData Centers Prepared by Engling Yeo Santa Clara, CA 95054 Date: 08/04/2017 Quality of Service IOPS, Throughput, Latency Short predictable read latencies

More information

Programmable Solutions for Data Center Applications

Programmable Solutions for Data Center Applications Programmable Solutions for Data Center Applications DS McIntyre Consulting dmm961@gmail.com 1 Topics Data Center Trends o Storage, Compute, Networking Technology Options FPGA Examples 2 Data Center Macro

More information

NOVA: The Fastest File System for NVDIMMs. Steven Swanson, UC San Diego

NOVA: The Fastest File System for NVDIMMs. Steven Swanson, UC San Diego NOVA: The Fastest File System for NVDIMMs Steven Swanson, UC San Diego XFS F2FS NILFS EXT4 BTRFS Disk-based file systems are inadequate for NVMM Disk-based file systems cannot exploit NVMM performance

More information

5 Solutions. Solution a. no solution provided. b. no solution provided

5 Solutions. Solution a. no solution provided. b. no solution provided 5 Solutions Solution 5.1 5.1.1 5.1.2 5.1.3 5.1.4 5.1.5 5.1.6 S2 Chapter 5 Solutions Solution 5.2 5.2.1 4 5.2.2 a. I, J b. B[I][0] 5.2.3 a. A[I][J] b. A[J][I] 5.2.4 a. 3596 = 8 800/4 2 8 8/4 + 8000/4 b.

More information

Introduction I/O 1. I/O devices can be characterized by Behavior: input, output, storage Partner: human or machine Data rate: bytes/sec, transfers/sec

Introduction I/O 1. I/O devices can be characterized by Behavior: input, output, storage Partner: human or machine Data rate: bytes/sec, transfers/sec Introduction I/O 1 I/O devices can be characterized by Behavior: input, output, storage Partner: human or machine Data rate: bytes/sec, transfers/sec I/O bus connections I/O Device Summary I/O 2 I/O System

More information

MegaPipe: A New Programming Interface for Scalable Network I/O

MegaPipe: A New Programming Interface for Scalable Network I/O MegaPipe: A New Programming Interface for Scalable Network I/O Sangjin Han in collabora=on with Sco? Marshall Byung- Gon Chun Sylvia Ratnasamy University of California, Berkeley Yahoo! Research tl;dr?

More information

Performance Characterization, Prediction, and Optimization for Heterogeneous Systems with Multi-Level Memory Interference

Performance Characterization, Prediction, and Optimization for Heterogeneous Systems with Multi-Level Memory Interference The 2017 IEEE International Symposium on Workload Characterization Performance Characterization, Prediction, and Optimization for Heterogeneous Systems with Multi-Level Memory Interference Shin-Ying Lee

More information

Eleos: Exit-Less OS Services for SGX Enclaves

Eleos: Exit-Less OS Services for SGX Enclaves Eleos: Exit-Less OS Services for SGX Enclaves Meni Orenbach Marina Minkin Pavel Lifshits Mark Silberstein Accelerated Computing Systems Lab Haifa, Israel What do we do? Improve performance: I/O intensive

More information

Next Generation Architecture for NVM Express SSD

Next Generation Architecture for NVM Express SSD Next Generation Architecture for NVM Express SSD Dan Mahoney CEO Fastor Systems Copyright 2014, PCI-SIG, All Rights Reserved 1 NVMExpress Key Characteristics Highest performance, lowest latency SSD interface

More information

CIT 668: System Architecture. Computer Systems Architecture

CIT 668: System Architecture. Computer Systems Architecture CIT 668: System Architecture Computer Systems Architecture 1. System Components Topics 2. Bandwidth and Latency 3. Processor 4. Memory 5. Storage 6. Network 7. Operating System 8. Performance Implications

More information

memory VT-PM8 & VT-PM16 EVALUATION WHITEPAPER Persistent Memory Dual Port Persistent Memory with Unlimited DWPD Endurance

memory VT-PM8 & VT-PM16 EVALUATION WHITEPAPER Persistent Memory Dual Port Persistent Memory with Unlimited DWPD Endurance memory WHITEPAPER Persistent Memory VT-PM8 & VT-PM16 EVALUATION VT-PM drives, part of Viking s persistent memory technology family of products, are 2.5 U.2 NVMe PCIe Gen3 drives optimized with Radian Memory

More information