Unblinding the OS to Optimize User-Perceived Flash SSD Latency
|
|
- Duane Hamilton
- 6 years ago
- Views:
Transcription
1 Unblinding the OS to Optimize User-Perceived Flash SSD Latency Woong Shin *, Jaehyun Park **, Heon Y. Yeom * * Seoul National University ** Arizona State University USENIX HotStorage 2016 Jun. 21, 2016
2 OS I/O Path Optimizations: Reducing S/W Overheads Application Simplified I/O Path Storage Device OS I/O Path Issue Side Completion Side Overhead from 13 us down to 0.2 ~ 2.5 us Under 20us Technology Memory technology with ultra low (nanoseconds), predictable latencies. Block CPU (poll) while waiting for the I/O to complete
3 To Block the CPU (sync) or to Yield the CPU (async) Issue Side Completion Side OS I/O Path CPU Deferred Processing Hard IRQ (i.e., read: 20us ~ 150us) Higher latency & High Variability Cannot use polling: Wastes CPU cycles Harms system parallelism H/W context switch := 4 us ~ 5 us OS involved switch := 7 us ~ 8 us Impact of Modern SSDs More contexts on a CPU core Larger scheduling delays
4 Impact of Modern SSDs 700,000 IOPS NVM-e SSD Bandwidth: 4kB x 700 kiops = 2.8 GB/s Latency: 1 sec / 700 kiops = 1.42 µs?!?
5 Impact of Modern SSDs 700,000 IOPS NVM-e SSD Single NAND die: apprx. 14,285 8kB IOPS (i.e, 70us read latency) Requires more than 49 NAND dies to achieve 700,000 IOPS Large DRAM Powerful Controllers Multi-channel Multi-way High die count NAND array
6 Impact of Modern SSDs 700,000 IOPS NVM-e SSD Single NAND die: apprx. 14,285 8kB IOPS (i.e, 70us read latency) Requires more than 49 NAND dies to achieve 700,000 IOPS High count of I/O contexts (threads, state machines) Large Powerful required DRAM for Controllers high IOPS Multi-channel Multi-way High die count NAND array
7 Higher Context Multiplexing Cost Scheduling Delays Context To Context CPU Core Scheduler Time sharing a CPU core (Sync. I/O) CPU CPU NAND die count >>> CPU core count i.e., Four PCI-e 3.0 4lane slots in a chassis, more SSDs Redundancy, more capacity... Context To Context CPU Core Worker thread With multiple I/O contexts (Async. I/O) Large DRAM Powerful Controllers Multi-channel Multi-way High die count NAND array
8 The OS is Blind: Conservative Strategies Issue Side Completion Side OS I/O Path DRAM SSD Controller
9 The OS is Blind: Conservative Strategies Issue Side Completion Side OS I/O Path Deferred Processing Hard IRQ 20us ~ 10ms Higher latency, Higher Variance Externally Experienced Latency R W R R W R R R R E W E R SSD Controller DRAM NAND dies Physical Destination within the SSD
10 This Work: Unblinding the OS Issue Side Completion Side OS I/O Path Deferred Processing Hard IRQ SSD Latency H/W context switch := 4 us ~ 5 us OS involved switch := 7 us ~ 8 us More contexts on a CPU core (higher)
11 This Work: Unblinding the OS Issue Side Completion Side OS I/O Path Deferred Processing Hard IRQ SSD Latency OS SSD Latency I/O destination
12 This Work: Unblinding the OS Issue Side Completion Side OS I/O Path Deferred Processing Hard IRQ Latency Exposed Knowledge A simplified model A B C Unpredictable I/O destination x I/O type DRAM NAND NAND SmallR/W Small READ Else SSD Latency OS Better Interaction Extended Interface SSD Latency Internal Knowledge A B C Unpredictable DRAM NAND SmallR/W Small READ I/O destination x I/O type NAND Else
13 Latency OS I/O Path Issue Side This Work: Unblinding the OS SSD Latency Completion Side Exploiting SSD Internal Information Exposed Knowledge A simplified model SmallR/W Small READ Towards a SSD A B C Unpredictable I/O destination x I/O type Else OS Better Interaction Extended Interface SSD Deferred Processing Hard IRQ New Optimization Opportunities Latency Internal Knowledge A B C Unpredictable DRAM NAND SmallR/W Small READ I/O destination x I/O type NAND Else
14 Exploiting SSD Internal Information OS SSD The Unpredictable DRAM SSD SSD Controller Multi-channel Multi-way High die count NAND array
15 Exploiting SSD Internal Information Decomposition & Classification I/O request Classification OS Destination Decomposition DRAM NAND Small R Small W Large R Large W Cache hit Buffer hit N/A N/A Cache miss Buffer miss Interleaved Interleaved reads writes Unpredictable No benefit No benefit SSD DRAM SSD Controller Multi-channel Multi-way High die count NAND array
16 Exploiting SSD Internal Information Decomposition & Classification I/O request Classification OS Small R Small W SSD Destination Decomposition DRAM NAND Cache hit Cache miss Buffer hit Buffer miss Unpredictable DRAM SSD Controller Multi-channel Multi-way High die count NAND array
17 Exploiting SSD Internal Information Decomposition & Classification I/O request Classification OS Small R Small W SSD Destination Decomposition DRAM NAND Cache hit Cache miss Buffer hit Buffer miss Unpredictable DRAM SSD Controller Multi-channel Multi-way High die count NAND array
18 Mitigating the Impact of Scheduling Delays Issue Side Completion Side OS I/O Path Deferred Processing Hard IRQ NAND Read I/O request Classification OS Small R Small W SSD Destination Decomposition DRAM NAND Cache hit Cache miss Buffer hit Buffer miss Unpredictable
19 Mitigating the Impact of Scheduling Delays Issue Side Completion Side Issue Side Completion Side OS I/O Path Deferred Processing Hard IRQ NAND Read Buffer Hits I/O request Classification OS Small R Small W SSD Destination Decomposition DRAM NAND Cache hit Cache miss Buffer hit Buffer miss Unpredictable
20 Accurate Latency Prediction: Remaining I/O Time Issue Side OS I/O Path Classification Small Read Latency Predictor I/O processing & Queuing delays Flash I/O + ECC + DMA transfer Only for NAND reads Remaining I/O time Total SSD I/O time
21 Precompletions: Overlapping I/O & Scheduling Delay Issue Side Completion Side Busy wait for flag (waiting on L1) OS I/O Path Classification Small Read Deferred Processing Hard IRQ Latency Predictor Precompletion Wait period Precompletion IRQ Precompletion window Actual Completion (Flag update: No IRQ) I/O processing & Queuing delays Flash I/O + ECC + DMA transfer Only for NAND reads Remaining I/O time Total SSD I/O time
22 OS & SSD Interaction: Simple Behavioral Models I/O request Classification OS Small R Small W SSD Destination Decomposition DRAM NAND Cache hit Cache miss Buffer hit Buffer miss Unpredictable
23 In-band Communication Channel Piggybacked Information Previous I/O completion! I/O request OS I/O Path I/O Classifier Current Blocks Blocks to consume Buffer Monitor Buffer Monitor
24 Implementation & Evaluation Host system Application FIO DDR3 DRAM 512MB x 2 OS: Linux Custom Block Driver NVM-e like I/F FTL NAND Controller 128GB NAND Module x 1 (4ch, 4way each) External PCI-e Gen 2 four lane connection (Operating in Gen2 one lane) MSI IRQ PCI-e external adaptor Zync-7000 FPGA + ARM Cortex A9 Dual core
25 Implementation & Evaluation Limitation Single I/O Depth Host system Application Unoptimized FPGA FIONAND Controller (Higher Latencies) OS: Linux Fixed latency Custom Slow DMA transfers Block (low freq. bus) PCI-e Gen2 one lane Driver External PCI-e Gen 2 four lane connection (Operating in Gen2 one lane) PCI-e external adaptor DDR3 DRAM 512MB x 2 128GB NAND NVM-e like Module x 1 I/F Latency FTL Prediction (4ch, 4way each) NAND Controller MSI IRQ Impact of precompletion Zync-7000 FPGA + ARM Cortex A9 Dual core
26 Predicting SSD Latency (Small NAND Read) Flash: NAND I/O + ECC Prediction: three value moving average DMA: device to host transfer (4kB) Low variance Low Error
27 The Impact of Precompletion fio I/O thread vs background threads Non-NAND Avg. latency: Measured AVG latency 352us (NAND latency)
28 The Impact of Precompletion fio I/O thread vs background threads I/O vs CPU requires Priority boost for Polling (priority degradataion) Polling damages system parallelism Non-NAND Avg. latency: Measured AVG latency 352us (NAND latency)
29 The Impact of Precompletion fio I/O thread vs background threads Overshoot harms parallelism (busy wait) Undershoot will expose scheduling delays Non-NAND Avg. latency: Measured AVG latency 352us (NAND latency) Precompletion requires an adequate precompletion window
30 The Impact of Precompletion fio I/O thread vs background threads Approx. 20% degradation of system parallelism IRQ vs Precompletion vscpu: 7.16 us gain vsio: 7.52 us gain With no degradation in system parallelism Non-NAND Avg. latency: Measured AVG latency 352us (NAND latency)
31 Summary Unblinding the OS - Cross layer optimization Achieved a partially predictable SSD decomposition / classification Exploit SSD internal information - Remaining I/O time Protecting SSD proprietary internals Abstracted behavioral models Mitigating scheduling delays Exploiting predictability of certain I/O requests Pre-completion - Projection (1 I/O depth vs other threads) IRQ Polling This work Latency Bad Good Good Parallelism Good Bad Good
32 Future Work Future Implementation & Evaluation Full blown SSD Projection (1 I/O depth this work) à Simulation (varying tech latency & etc) à Real implementation Cross layer optimization More models More use cases More backend technologies rather than flash
33
Linux Storage System Bottleneck Exploration
Linux Storage System Bottleneck Exploration Bean Huo / Zoltan Szubbocsev Beanhuo@micron.com / zszubbocsev@micron.com 215 Micron Technology, Inc. All rights reserved. Information, products, and/or specifications
More informationDongjun Shin Samsung Electronics
2014.10.31. Dongjun Shin Samsung Electronics Contents 2 Background Understanding CPU behavior Experiments Improvement idea Revisiting Linux I/O stack Conclusion Background Definition 3 CPU bound A computer
More informationHardware NVMe implementation on cache and storage systems
Hardware NVMe implementation on cache and storage systems Jerome Gaysse, IP-Maker Santa Clara, CA 1 Agenda Hardware architecture NVMe for storage NVMe for cache/application accelerator NVMe for new NVM
More informationEfficient Memory Mapped File I/O for In-Memory File Systems. Jungsik Choi, Jiwon Kim, Hwansoo Han
Efficient Memory Mapped File I/O for In-Memory File Systems Jungsik Choi, Jiwon Kim, Hwansoo Han Operations Per Second Storage Latency Close to DRAM SATA/SAS Flash SSD (~00μs) PCIe Flash SSD (~60 μs) D-XPoint
More informationExploring System Challenges of Ultra-Low Latency Solid State Drives
Exploring System Challenges of Ultra-Low Latency Solid State Drives Sungjoon Koh Changrim Lee, Miryeong Kwon, and Myoungsoo Jung Computer Architecture and Memory systems Lab Executive Summary Motivation.
More informationReliability, Availability, Serviceability (RAS) and Management for Non-Volatile Memory Storage
Reliability, Availability, Serviceability (RAS) and Management for Non-Volatile Memory Storage Mohan J. Kumar, Intel Corp Sammy Nachimuthu, Intel Corp Dimitris Ziakas, Intel Corp August 2015 1 Agenda NVDIMM
More informationMoneta: A High-performance Storage Array Architecture for Nextgeneration, Micro 2010
Moneta: A High-performance Storage Array Architecture for Nextgeneration, Non-volatile Memories Micro 2010 NVM-based SSD NVMs are replacing spinning-disks Performance of disks has lagged NAND flash showed
More informationReplacing the FTL with Cooperative Flash Management
Replacing the FTL with Cooperative Flash Management Mike Jadon Radian Memory Systems www.radianmemory.com Flash Memory Summit 2015 Santa Clara, CA 1 Data Center Primary Storage WORM General Purpose RDBMS
More informationNext Generation Architecture for NVM Express SSD
Next Generation Architecture for NVM Express SSD Dan Mahoney CEO Fastor Systems Copyright 2014, PCI-SIG, All Rights Reserved 1 NVMExpress Key Characteristics Highest performance, lowest latency SSD interface
More informationHigh Performance SSD & Benefit for Server Application
High Performance SSD & Benefit for Server Application AUG 12 th, 2008 Tony Park Marketing INDILINX Co., Ltd. 2008-08-20 1 HDD SATA 3Gbps Memory PCI-e 10G Eth 120MB/s 300MB/s 8GB/s 2GB/s 1GB/s SSD SATA
More informationPresented by: Nafiseh Mahmoudi Spring 2017
Presented by: Nafiseh Mahmoudi Spring 2017 Authors: Publication: Type: ACM Transactions on Storage (TOS), 2016 Research Paper 2 High speed data processing demands high storage I/O performance. Flash memory
More informationHigh-Speed NAND Flash
High-Speed NAND Flash Design Considerations to Maximize Performance Presented by: Robert Pierce Sr. Director, NAND Flash Denali Software, Inc. History of NAND Bandwidth Trend MB/s 20 60 80 100 200 The
More informationNAND Interleaving & Performance
NAND Interleaving & Performance What You Need to Know Presented by: Keith Garvin Product Architect, Datalight August 2008 1 Overview What is interleaving, why do it? Bus Level Interleaving Interleaving
More informationStorage. Hwansoo Han
Storage Hwansoo Han I/O Devices I/O devices can be characterized by Behavior: input, out, storage Partner: human or machine Data rate: bytes/sec, transfers/sec I/O bus connections 2 I/O System Characteristics
More informationFive Key Steps to High-Speed NAND Flash Performance and Reliability
Five Key Steps to High-Speed Flash Performance and Reliability Presenter Bob Pierce Flash Memory Summit 2010 Santa Clara, CA 1 NVM Performance Trend ONFi 2 PCM Toggle ONFi 2 DDR SLC Toggle Performance
More informationLinux Kernel Abstractions for Open-Channel SSDs
Linux Kernel Abstractions for Open-Channel SSDs Matias Bjørling Javier González, Jesper Madsen, and Philippe Bonnet 2015/03/01 1 Market Specific FTLs SSDs on the market with embedded FTLs targeted at specific
More informationNear-Data Processing for Differentiable Machine Learning Models
Near-Data Processing for Differentiable Machine Learning Models Hyeokjun Choe 1, Seil Lee 1, Hyunha Nam 1, Seongsik Park 1, Seijoon Kim 1, Eui-Young Chung 2 and Sungroh Yoon 1,3 1 Electrical and Computer
More informationLecture 13. Storage, Network and Other Peripherals
Lecture 13 Storage, Network and Other Peripherals 1 I/O Systems Processor interrupts Cache Processor & I/O Communication Memory - I/O Bus Main Memory I/O Controller I/O Controller I/O Controller Disk Disk
More informationPCIe Storage Beyond SSDs
PCIe Storage Beyond SSDs Fabian Trumper NVM Solutions Group PMC-Sierra Santa Clara, CA 1 Classic Memory / Storage Hierarchy FAST, VOLATILE CPU Cache DRAM Performance Gap Performance Tier (SSDs) SLOW, NON-VOLATILE
More informationCOS 318: Operating Systems. Storage Devices. Vivek Pai Computer Science Department Princeton University
COS 318: Operating Systems Storage Devices Vivek Pai Computer Science Department Princeton University http://www.cs.princeton.edu/courses/archive/fall11/cos318/ Today s Topics Magnetic disks Magnetic disk
More informationSPDK China Summit Ziye Yang. Senior Software Engineer. Network Platforms Group, Intel Corporation
SPDK China Summit 2018 Ziye Yang Senior Software Engineer Network Platforms Group, Intel Corporation Agenda SPDK programming framework Accelerated NVMe-oF via SPDK Conclusion 2 Agenda SPDK programming
More information저작권법에따른이용자의권리는위의내용에의하여영향을받지않습니다.
저작자표시 - 비영리 - 변경금지 2.0 대한민국 이용자는아래의조건을따르는경우에한하여자유롭게 이저작물을복제, 배포, 전송, 전시, 공연및방송할수있습니다. 다음과같은조건을따라야합니다 : 저작자표시. 귀하는원저작자를표시하여야합니다. 비영리. 귀하는이저작물을영리목적으로이용할수없습니다. 변경금지. 귀하는이저작물을개작, 변형또는가공할수없습니다. 귀하는, 이저작물의재이용이나배포의경우,
More informationMQSim: A Framework for Enabling Realistic Studies of Modern Multi-Queue SSD Devices
MQSim: A Framework for Enabling Realistic Studies of Modern Multi-Queue SSD Devices Arash Tavakkol, Juan Gómez-Luna, Mohammad Sadrosadati, Saugata Ghose, Onur Mutlu February 13, 2018 Executive Summary
More informationIntroduction I/O 1. I/O devices can be characterized by Behavior: input, output, storage Partner: human or machine Data rate: bytes/sec, transfers/sec
Introduction I/O 1 I/O devices can be characterized by Behavior: input, output, storage Partner: human or machine Data rate: bytes/sec, transfers/sec I/O bus connections I/O Device Summary I/O 2 I/O System
More informationMANAGING MULTI-TIERED NON-VOLATILE MEMORY SYSTEMS FOR COST AND PERFORMANCE 8/9/16
MANAGING MULTI-TIERED NON-VOLATILE MEMORY SYSTEMS FOR COST AND PERFORMANCE 8/9/16 THE DATA CHALLENGE Performance Improvement (RelaLve) 4.4 ZB Total data created, replicated, and consumed in a single year
More informationToward a Memory-centric Architecture
Toward a Memory-centric Architecture Martin Fink EVP & Chief Technology Officer Western Digital Corporation August 8, 2017 1 SAFE HARBOR DISCLAIMERS Forward-Looking Statements This presentation contains
More informationD E N A L I S T O R A G E I N T E R F A C E. Laura Caulfield Senior Software Engineer. Arie van der Hoeven Principal Program Manager
1 T HE D E N A L I N E X T - G E N E R A T I O N H I G H - D E N S I T Y S T O R A G E I N T E R F A C E Laura Caulfield Senior Software Engineer Arie van der Hoeven Principal Program Manager Outline Technology
More informationLinux Storage System Analysis for e.mmc With Command Queuing
Linux Storage System Analysis for e.mmc With Command Queuing Linux is a widely used embedded OS that also manages block devices such as e.mmc, UFS and SSD. Traditionally, advanced embedded systems have
More informationOnyx: A Prototype Phase-Change Memory Storage Array
Onyx: A Prototype Phase-Change Memory Storage Array Ameen Akel * Adrian Caulfield, Todor Mollov, Rajesh Gupta, Steven Swanson Non-Volatile Systems Laboratory, Department of Computer Science and Engineering
More informationThe Non-Volatile Memory Verbs Provider (NVP): Using the OFED Framework to access solid state storage
The Non-Volatile Memory Verbs Provider (NVP): Using the OFED Framework to access solid state storage Bernard Metzler 1, Animesh Trivedi 1, Lars Schneidenbach 2, Michele Franceschini 2, Patrick Stuedi 1,
More informationComparing Memory Systems for Chip Multiprocessors
Comparing Memory Systems for Chip Multiprocessors Jacob Leverich Hideho Arakida, Alex Solomatnikov, Amin Firoozshahian, Mark Horowitz, Christos Kozyrakis Computer Systems Laboratory Stanford University
More informationMoneta: A High-Performance Storage Architecture for Next-generation, Non-volatile Memories
Moneta: A High-Performance Storage Architecture for Next-generation, Non-volatile Memories Adrian M. Caulfield Arup De, Joel Coburn, Todor I. Mollov, Rajesh K. Gupta, Steven Swanson Non-Volatile Systems
More informationComputer System Overview OPERATING SYSTEM TOP-LEVEL COMPONENTS. Simplified view: Operating Systems. Slide 1. Slide /S2. Slide 2.
BASIC ELEMENTS Simplified view: Processor Slide 1 Computer System Overview Operating Systems Slide 3 Main Memory referred to as real memory or primary memory volatile modules 2004/S2 secondary memory devices
More informationPersistent Memory. High Speed and Low Latency. White Paper M-WP006
Persistent Memory High Speed and Low Latency White Paper M-WP6 Corporate Headquarters: 3987 Eureka Dr., Newark, CA 9456, USA Tel: (51) 623-1231 Fax: (51) 623-1434 E-mail: info@smartm.com Customer Service:
More informationCOS 318: Operating Systems. Storage Devices. Jaswinder Pal Singh Computer Science Department Princeton University
COS 318: Operating Systems Storage Devices Jaswinder Pal Singh Computer Science Department Princeton University http://www.cs.princeton.edu/courses/archive/fall13/cos318/ Today s Topics Magnetic disks
More informationComputer System Overview
Computer System Overview Operating Systems 2005/S2 1 What are the objectives of an Operating System? 2 What are the objectives of an Operating System? convenience & abstraction the OS should facilitate
More informationCBM: A Cooperative Buffer Management for SSD
3 th International Conference on Massive Storage Systems and Technology (MSST 4) : A Cooperative Buffer Management for SSD Qingsong Wei, Cheng Chen, Jun Yang Data Storage Institute, A-STAR, Singapore June
More informationComputer Architecture Computer Science & Engineering. Chapter 6. Storage and Other I/O Topics BK TP.HCM
Computer Architecture Computer Science & Engineering Chapter 6 Storage and Other I/O Topics Introduction I/O devices can be characterized by Behaviour: input, output, storage Partner: human or machine
More informationECE 550D Fundamentals of Computer Systems and Engineering. Fall 2017
ECE 550D Fundamentals of Computer Systems and Engineering Fall 2017 Input/Output (IO) Prof. John Board Duke University Slides are derived from work by Profs. Tyler Bletsch and Andrew Hilton (Duke) IO:
More informationmemory VT-PM8 & VT-PM16 EVALUATION WHITEPAPER Persistent Memory Dual Port Persistent Memory with Unlimited DWPD Endurance
memory WHITEPAPER Persistent Memory VT-PM8 & VT-PM16 EVALUATION VT-PM drives, part of Viking s persistent memory technology family of products, are 2.5 U.2 NVMe PCIe Gen3 drives optimized with Radian Memory
More informationFlash Controller Solutions in Programmable Technology
Flash Controller Solutions in Programmable Technology David McIntyre Senior Business Unit Manager Computer and Storage Business Unit Altera Corp. dmcintyr@altera.com Flash Memory Summit 2012 Santa Clara,
More informationBeyond Block I/O: Rethinking
Beyond Block I/O: Rethinking Traditional Storage Primitives Xiangyong Ouyang *, David Nellans, Robert Wipfel, David idflynn, D. K. Panda * * The Ohio State University Fusion io Agenda Introduction and
More informationSecondary storage. CS 537 Lecture 11 Secondary Storage. Disk trends. Another trip down memory lane
Secondary storage CS 537 Lecture 11 Secondary Storage Michael Swift Secondary storage typically: is anything that is outside of primary memory does not permit direct execution of instructions or data retrieval
More informationComputer Organization and Structure. Bing-Yu Chen National Taiwan University
Computer Organization and Structure Bing-Yu Chen National Taiwan University Storage and Other I/O Topics I/O Performance Measures Types and Characteristics of I/O Devices Buses Interfacing I/O Devices
More informationLinux Kernel Extensions for Open-Channel SSDs
Linux Kernel Extensions for Open-Channel SSDs Matias Bjørling Member of Technical Staff Santa Clara, CA 1 The Future of device FTLs? Dealing with flash chip constrains is a necessity No way around the
More informationComparing UFS and NVMe Storage Stack and System-Level Performance in Embedded Systems
Comparing UFS and NVMe Storage Stack and System-Level Performance in Embedded Systems Bean Huo, Blair Pan, Peter Pan, Zoltan Szubbocsev Micron Technology Introduction Embedded storage systems have experienced
More informationMemory Systems DRAM, etc.
Memory Systems DRAM, etc. Prof. Bruce Jacob Keystone Professor & Director of Computer Engineering Program Electrical & Computer Engineering University of Maryland at College Park Today s Story DRAM (the
More informationEnabling Cost-effective Data Processing with Smart SSD
Enabling Cost-effective Data Processing with Smart SSD Yangwook Kang, UC Santa Cruz Yang-suk Kee, Samsung Semiconductor Ethan L. Miller, UC Santa Cruz Chanik Park, Samsung Electronics Efficient Use of
More informationNVMe SSD s. NVMe is displacing SATA in applications which require performance. NVMe has excellent programing model for host software
NVMe SSD s NVMe is displacing SATA in applications which require performance NVMe has excellent programing model for host software Latency is becoming the key driving force for system performance, although
More informationWebscale, All Flash, Distributed File Systems. Avraham Meir Elastifile
Webscale, All Flash, Distributed File Systems Avraham Meir Elastifile 1 Outline The way to all FLASH The way to distributed storage Scale-out storage management Conclusion 2 Storage Technology Trend NAND
More informationParaFS: A Log-Structured File System to Exploit the Internal Parallelism of Flash Devices
ParaFS: A Log-Structured File System to Exploit the Internal Parallelism of Devices Jiacheng Zhang, Jiwu Shu, Youyou Lu Tsinghua University 1 Outline Background and Motivation ParaFS Design Evaluation
More informationFlash Trends: Challenges and Future
Flash Trends: Challenges and Future John D. Davis work done at Microsoft Researcher- Silicon Valley in collaboration with Laura Caulfield*, Steve Swanson*, UCSD* 1 My Research Areas of Interest Flash characteristics
More informationA Next Generation Home Access Point and Router
A Next Generation Home Access Point and Router Product Marketing Manager Network Communication Technology and Application of the New Generation Points of Discussion Why Do We Need a Next Gen Home Router?
More informationAccelerating Real-Time Big Data. Breaking the limitations of captive NVMe storage
Accelerating Real-Time Big Data Breaking the limitations of captive NVMe storage 18M IOPs in 2u Agenda Everything related to storage is changing! The 3rd Platform NVM Express architected for solid state
More informationThe Transition to PCI Express* for Client SSDs
The Transition to PCI Express* for Client SSDs Amber Huffman Senior Principal Engineer Intel Santa Clara, CA 1 *Other names and brands may be claimed as the property of others. Legal Notices and Disclaimers
More informationAn NVMe-based Offload Engine for Storage Acceleration Sean Gibb, Eideticom Stephen Bates, Raithlin
An NVMe-based Offload Engine for Storage Acceleration Sean Gibb, Eideticom Stephen Bates, Raithlin 1 Overview Acceleration for Storage NVMe for Acceleration How are we using (abusing ;-)) NVMe to support
More informationDisclaimer This presentation may contain product features that are currently under development. This overview of new technology represents no commitme
SER2734BU Extreme Performance Series: Byte-Addressable Nonvolatile Memory in vsphere VMworld 2017 Content: Not for publication Qasim Ali and Praveen Yedlapalli #VMworld #SER2734BU Disclaimer This presentation
More informationNVMe : Redefining the Hardware/Software Architecture
NVMe : Redefining the Hardware/Software Architecture Jérôme Gaysse, IP-Maker Santa Clara, CA 1 NVMe Protocol How to implement the NVMe protocol? SW, HW/SW or HW? 2- NVMe command ready CPU 1-Host driver
More informationI/O Handling. ECE 650 Systems Programming & Engineering Duke University, Spring Based on Operating Systems Concepts, Silberschatz Chapter 13
I/O Handling ECE 650 Systems Programming & Engineering Duke University, Spring 2018 Based on Operating Systems Concepts, Silberschatz Chapter 13 Input/Output (I/O) Typical application flow consists of
More informationMemory-Based Cloud Architectures
Memory-Based Cloud Architectures ( Or: Technical Challenges for OnDemand Business Software) Jan Schaffner Enterprise Platform and Integration Concepts Group Example: Enterprise Benchmarking -) *%'+,#$)
More informationReducing CPU and network overhead for small I/O requests in network storage protocols over raw Ethernet
Reducing CPU and network overhead for small I/O requests in network storage protocols over raw Ethernet Pilar González-Férez and Angelos Bilas 31 th International Conference on Massive Storage Systems
More informationDesign Choices for FPGA-based SoCs When Adding a SATA Storage }
U4 U7 U7 Q D U5 Q D Design Choices for FPGA-based SoCs When Adding a SATA Storage } Lorenz Kolb & Endric Schubert, Missing Link Electronics Rudolf Usselmann, ASICS World Services Motivation for SATA Storage
More informationSamsung PM1725a NVMe SSD
Samsung PM1725a NVMe SSD Exceptionally fast speeds and ultra-low latency for enterprise application Brochure 1 Extreme performance from an SSD technology leader Maximize data transfer with the high-performance,
More informationToward SLO Complying SSDs Through OPS Isolation
Toward SLO Complying SSDs Through OPS Isolation October 23, 2015 Hongik University UNIST (Ulsan National Institute of Science & Technology) Sam H. Noh 1 Outline Part 1: FAST 2015 Part 2: Beyond FAST 2
More informationExtending the NVMHCI Standard to Enterprise
Extending the NVMHCI Standard to Enterprise Amber Huffman Principal Engineer Intel Corporation August 2009 1 Outline Remember: What is NVMHCI PCIe SSDs Coming with Challenges Enterprise Extensions to NVMHCI
More informationKey Points. Rotational delay vs seek delay Disks are slow. Techniques for making disks faster. Flash and SSDs
IO 1 Today IO 2 Key Points CPU interface and interaction with IO IO devices The basic structure of the IO system (north bridge, south bridge, etc.) The key advantages of high speed serial lines. The benefits
More informationReFlex: Remote Flash Local Flash
ReFlex: Remote Flash Local Flash Ana Klimovic Heiner Litz Christos Kozyrakis NVMW 18 Memorable Paper Award Finalist 1 Flash in Datacenters Flash provides 1000 higher throughput and 100 lower latency than
More informationu Covered: l Management of CPU & concurrency l Management of main memory & virtual memory u Currently --- Management of I/O devices
Where Are We? COS 318: Operating Systems Storage Devices Jaswinder Pal Singh Computer Science Department Princeton University (http://www.cs.princeton.edu/courses/cos318/) u Covered: l Management of CPU
More informationFalcon: Scaling IO Performance in Multi-SSD Volumes. The George Washington University
Falcon: Scaling IO Performance in Multi-SSD Volumes Pradeep Kumar H Howie Huang The George Washington University SSDs in Big Data Applications Recent trends advocate using many SSDs for higher throughput
More information100% PACKET CAPTURE. Intelligent FPGA-based Host CPU Offload NIC s & Scalable Platforms. Up to 200Gbps
100% PACKET CAPTURE Intelligent FPGA-based Host CPU Offload NIC s & Scalable Platforms Up to 200Gbps Dual Port 100 GigE ANIC-200KFlex (QSFP28) The ANIC-200KFlex FPGA-based PCIe adapter/nic features dual
More informationEnd-to-End Adaptive Packet Aggregation for High-Throughput I/O Bus Network Using Ethernet
Hot Interconnects 2014 End-to-End Adaptive Packet Aggregation for High-Throughput I/O Bus Network Using Ethernet Green Platform Research Laboratories, NEC, Japan J. Suzuki, Y. Hayashi, M. Kan, S. Miyakawa,
More informationLECTURE 5: MEMORY HIERARCHY DESIGN
LECTURE 5: MEMORY HIERARCHY DESIGN Abridged version of Hennessy & Patterson (2012):Ch.2 Introduction Programmers want unlimited amounts of memory with low latency Fast memory technology is more expensive
More informationReadings. Storage Hierarchy III: I/O System. I/O (Disk) Performance. I/O Device Characteristics. often boring, but still quite important
Storage Hierarchy III: I/O System Readings reg I$ D$ L2 L3 memory disk (swap) often boring, but still quite important ostensibly about general I/O, mainly about disks performance: latency & throughput
More informationFlashShare: Punching Through Server Storage Stack from Kernel to Firmware for Ultra-Low Latency SSDs
FlashShare: Punching Through Server Storage Stack from Kernel to Firmware for Ultra-Low Latency SSDs Jie Zhang, Miryeong Kwon, Donghyun Gouk, Sungjoon Koh, Changlim Lee, Mohammad Alian, Myoungjun Chun,
More informationVirtual Memory. Reading. Sections 5.4, 5.5, 5.6, 5.8, 5.10 (2) Lecture notes from MKP and S. Yalamanchili
Virtual Memory Lecture notes from MKP and S. Yalamanchili Sections 5.4, 5.5, 5.6, 5.8, 5.10 Reading (2) 1 The Memory Hierarchy ALU registers Cache Memory Memory Memory Managed by the compiler Memory Managed
More informationZiye Yang. NPG, DCG, Intel
Ziye Yang NPG, DCG, Intel Agenda What is SPDK? Accelerated NVMe-oF via SPDK Conclusion 2 Agenda What is SPDK? Accelerated NVMe-oF via SPDK Conclusion 3 Storage Performance Development Kit Scalable and
More informationFacilitating IP Development for the OpenCAPI Memory Interface Kevin McIlvain, Memory Development Engineer IBM. Join the Conversation #OpenPOWERSummit
Facilitating IP Development for the OpenCAPI Memory Interface Kevin McIlvain, Memory Development Engineer IBM Join the Conversation #OpenPOWERSummit Moral of the Story OpenPOWER is the best platform to
More informationSolid State Storage Technologies. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University
Solid State Storage Technologies Jin-Soo Kim (jinsookim@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu NVMe (1) The industry standard interface for high-performance NVM
More informationThe Long-Term Future of Solid State Storage Jim Handy Objective Analysis
The Long-Term Future of Solid State Storage Jim Handy Objective Analysis Agenda How did we get here? Why it s suboptimal How we move ahead Why now? DRAM speed scaling Changing role of NVM in computing
More informationBlueDBM: An Appliance for Big Data Analytics*
BlueDBM: An Appliance for Big Data Analytics* Arvind *[ISCA, 2015] Sang-Woo Jun, Ming Liu, Sungjin Lee, Shuotao Xu, Arvind (MIT) and Jamey Hicks, John Ankcorn, Myron King(Quanta) BigData@CSAIL Annual Meeting
More informationRobert Gottstein, Ilia Petrov, Guillermo G. Almeida, Todor Ivanov, Alex Buchmann
Using Flash SSDs as Pi Primary Database Storage Robert Gottstein, Ilia Petrov, Guillermo G. Almeida, Todor Ivanov, Alex Buchmann {lastname}@dvs.tu-darmstadt.de Fachgebiet DVS Ilia Petrov 1 Flash SSDs,
More informationMemory Hierarchy Computing Systems & Performance MSc Informatics Eng. Memory Hierarchy (most slides are borrowed)
Computing Systems & Performance Memory Hierarchy MSc Informatics Eng. 2011/12 A.J.Proença Memory Hierarchy (most slides are borrowed) AJProença, Computer Systems & Performance, MEI, UMinho, 2011/12 1 2
More informationv02.54 (C) Copyright , American Megatrends, Inc.
1 Main Advanced H/W Monitor Boot Security Exit System Overview System Time System Date [ 14:00:09] [Tue 02/21/2006] BIOS Version : P4i65G BIOS P1.00 Processor Type : Intel (R) Pentium (R) 4 CPU 2.40 GHz
More informationSamsung Z-SSD SZ985. Ultra-low Latency SSD for Enterprise and Data Centers. Brochure
Samsung Z-SSD SZ985 Ultra-low Latency SSD for Enterprise and Data Centers Brochure 1 A high-speed storage device from the SSD technology leader Samsung Z-SSD SZ985 offers more capacity than PRAM-based
More informationOptimizing Fsync Performance with Dynamic Queue Depth Adaptation
JOURNAL OF SEMICONDUCTOR TECHNOLOGY AND SCIENCE, VOL.15, NO.5, OCTOBER, 2015 ISSN(Print) 1598-1657 http://dx.doi.org/10.5573/jsts.2015.15.5.570 ISSN(Online) 2233-4866 Optimizing Fsync Performance with
More informationCapabilities and System Benefits Enabled by NVDIMM-N
Capabilities and System Benefits Enabled by NVDIMM-N Bob Frey Arthur Sainio SMART Modular Technologies August 7, 2018 Santa Clara, CA 1 NVDIMM-N Maturity and Evolution If there's one takeaway you should
More informationNVM PCIe Networked Flash Storage
NVM PCIe Networked Flash Storage Peter Onufryk Microsemi Corporation Santa Clara, CA 1 PCI Express (PCIe) Mid-range/High-end Specification defined by PCI-SIG Software compatible with PCI and PCI-X Reliable,
More informationv02.54 (C) Copyright , American Megatrends, Inc.
1 Main Advanced H/W Monitor Boot Security Exit System Overview System Time System Date BIOS Version Processor Type Processor Speed Cache Size [ 14:00:09] [Fri 05/19/2006] : ConRoe865PE BIOS P1.00 : Intel
More informationA Prototype Storage Subsystem based on PCM
PSS A Prototype Storage Subsystem based on IBM Research Zurich Ioannis Koltsidas, Roman Pletka, Peter Mueller, Thomas Weigold, Evangelos Eleftheriou University of Patras Maria Varsamou, Athina Ntalla,
More informationMemory Hierarchy Computing Systems & Performance MSc Informatics Eng. Memory Hierarchy (most slides are borrowed)
Computing Systems & Performance Memory Hierarchy MSc Informatics Eng. 2012/13 A.J.Proença Memory Hierarchy (most slides are borrowed) AJProença, Computer Systems & Performance, MEI, UMinho, 2012/13 1 2
More informationCOS 318: Operating Systems. Storage Devices. Kai Li Computer Science Department Princeton University
COS 318: Operating Systems Storage Devices Kai Li Computer Science Department Princeton University http://www.cs.princeton.edu/courses/archive/fall11/cos318/ Today s Topics Magnetic disks Magnetic disk
More informationSFS: Random Write Considered Harmful in Solid State Drives
SFS: Random Write Considered Harmful in Solid State Drives Changwoo Min 1, 2, Kangnyeon Kim 1, Hyunjin Cho 2, Sang-Won Lee 1, Young Ik Eom 1 1 Sungkyunkwan University, Korea 2 Samsung Electronics, Korea
More informationUNIVERSITY OF MASSACHUSETTS Dept. of Electrical & Computer Engineering. Computer Architecture ECE 568
UNIVERSITY OF MASSACHUSETTS Dept. of Electrical & Computer Engineering Computer Architecture ECE 568 Part 6 Input/Output Israel Koren ECE568/Koren Part.6. CPU performance keeps increasing 26 72-core Xeon
More informationChapter 6. Storage and Other I/O Topics
Chapter 6 Storage and Other I/O Topics Introduction I/O devices can be characterized by Behaviour: input, output, storage Partner: human or machine Data rate: bytes/sec, transfers/sec I/O bus connections
More informationHard Disk Drives (HDDs) Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University
Hard Disk Drives (HDDs) Jin-Soo Kim (jinsookim@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu Virtualization Virtual CPUs Virtual memory Concurrency Threads Synchronization
More informationCopyright 2012, Elsevier Inc. All rights reserved.
Computer Architecture A Quantitative Approach, Fifth Edition Chapter 2 Memory Hierarchy Design 1 Introduction Programmers want unlimited amounts of memory with low latency Fast memory technology is more
More informationComputer Systems Laboratory Sungkyunkwan University
I/O System Jin-Soo Kim (jinsookim@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu Introduction (1) I/O devices can be characterized by Behavior: input, output, storage
More informationAccelerating NVMe-oF* for VMs with the Storage Performance Development Kit
Accelerating NVMe-oF* for VMs with the Storage Performance Development Kit Jim Harris Principal Software Engineer Intel Data Center Group Santa Clara, CA August 2017 1 Notices and Disclaimers Intel technologies
More informationArchitectural Differences nc. DRAM devices are accessed with a multiplexed address scheme. Each unit of data is accessed by first selecting its row ad
nc. Application Note AN1801 Rev. 0.2, 11/2003 Performance Differences between MPC8240 and the Tsi106 Host Bridge Top Changwatchai Roy Jenevein risc10@email.sps.mot.com CPD Applications This paper discusses
More informationHP SSD EX920 M.2. 2TB Sustained sequential read: Up to 3200 MB/s Sustained sequential write: Up to 1600 MB/s
HP SSD EX920 M.2 Product Specification Capacity: 256GB, 512GB, 1TB, 2TB Components: 3D NAND/ DRAM Cache Read and Write IOPS (Iometer* Queue Depth 32) 256 GB Random 4 KB reads: Up to 180K IOPS Random 4
More information