AutoStream: Automatic Stream Management for Multi-stream SSDs


AutoStream: Automatic Stream Management for Multi-stream SSDs
Jingpei Yang, PhD; Rajinikanth Pandurangan; Changho Choi, PhD; Vijay Balakrishnan
Memory Solutions Lab, Samsung Semiconductor

Agenda
- SSD NAND flash characteristics
- Multi-stream
- AutoStream: automatic stream management
  - Multi-Q
  - SFR
- Performance enhancement
- Summary

SSD NAND Flash Characteristics
- Different I/O units: read and program operate on a page; erase operates on a block (a multiple of pages)
- Erase-before-program, hence out-of-place updates
- Unavoidable GC overhead: the higher the GC overhead, the larger the write amplification, and the lower the endurance
- Limited number of program/erase cycles
- WAF (Write Amplification Factor) = amount of data written to NAND flash / amount of data written by host
- To maximize SSD lifetime, write amplification must be minimized
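For concreteness, the WAF definition above can be computed directly; the gigabyte figures in this sketch are made up for illustration only.

```python
# Illustrative WAF (Write Amplification Factor) computation.
# The GB figures below are hypothetical, not measurements.

def waf(nand_bytes: float, host_bytes: float) -> float:
    """WAF = data written to NAND flash / data written by host."""
    return nand_bytes / host_bytes

# The host writes 100 GB; garbage collection relocates another 60 GB of
# still-valid pages, so the flash absorbs 160 GB in total.
print(waf(160, 100))  # 1.6 -> WAF > 1 means extra NAND wear, lower endurance
```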

Multi-stream: Minimize Write Amplification
- Store data with similar lifetimes in the same erase block, reducing write amplification (GC overhead)
- Provides better endurance and improved performance
- The host associates each write operation with a stream
- All data associated with a stream is expected to be invalidated at the same time (e.g., updated, trimmed, unmapped, deallocated)
- NAND block allocation is aligned with application data characteristics (e.g., update frequency)

AutoStream: Automatic Stream Management
- Multi-stream shows good benefit but requires application and system modification
  - Even more challenging in multi-application, multi-tenant environments (e.g., VMs or Docker)
- AutoStream:
  - Makes stream detection independent of applications (e.g., implemented in the device driver)
  - Clusters data into streams according to update frequency, recency, and sequentiality
  - Minimizes stream management overhead in applications and systems
- (Comparison diagram) With multi-stream, applications manage streams themselves: application and kernel modification is required, and stream synchronization overhead spans the application, file system/block layer, and device driver. With AutoStream, stream management is automatic, based on data characteristics, and no application or kernel modification is required.

AutoStream I/O Processing with Minimal Overhead
- Read I/O bypasses AutoStream entirely
- Write I/O requires just one table lookup
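That dispatch rule fits in a few lines. In this sketch, `stream_table`, `submit`, and the chunk granularity are stand-ins of mine, not AutoStream's actual interfaces.

```python
# Sketch of the AutoStream dispatch rule: reads bypass the module,
# writes cost a single stream-table lookup. All names are illustrative.

CHUNK_SECTORS = 2048  # 1 MB chunks at 512-byte sectors (assumed granularity)

def dispatch(op, slba, sz, stream_table, submit):
    """Route one I/O: READ bypasses AutoStream; WRITE does one lookup."""
    if op == 'read':
        return submit(op, slba, sz, sid=None)          # READ: bypass
    sid = stream_table.get(slba // CHUNK_SECTORS, 0)   # WRITE: one lookup
    return submit(op, slba, sz, sid=sid)

# Example: a write to LBA 4096 falls in chunk 2, which maps to stream 3.
issued = []
dispatch('write', 4096, 8, {2: 3}, lambda op, slba, sz, sid: issued.append(sid))
print(issued)  # [3]
```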

AutoStream Implementation
(Architecture diagram) The AutoStream module sits in the device driver, below the application, file system, and block layer: (1) the application issues a write <sLBA, sz>, which reaches the device driver; (2) the AutoStream controller looks up the stream ID <sid> for the sLBA; (3) the write is submitted to the multi-stream SSD as <sLBA, sz, sid> via the submission queue; (4) the SSD places the data by stream. The sLBA is also queued for background Multi-Q queue updates and SFR table updates.

Multi-Q Algorithm Basics
- Divide the whole SSD address space into equal-size chunks
  - e.g., a 480 GB SSD with a 1 MB chunk size yields 480,000 chunks
- Track statistics for each chunk: access time, access count, expiry time, etc.
- Expiry time:
  - hottest chunk's lifetime := current time − its last access time
  - other chunks' expiry time := current time + hottest chunk's lifetime

Example (from the slide):

chunk id:      c   c   x   y   z   u   w   c   d
access time:   4   5   6   7   8   9  10  11  12
access count:  1   2   1   1   1   1   1   3   1

At access time 11 the hottest chunk is c, so c's lifetime = 11 − 5 = 6. At access time 12, chunk d's expiry time = 12 + 6 = 18.
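The bookkeeping above can be sketched in a few lines of Python; the class and field names are my own, and I treat any re-accessed chunk as the current hottest chunk, which reproduces the slide's example.

```python
# Sketch of Multi-Q per-chunk statistics, replaying the slide's trace.
# Names and the "re-access = hottest" simplification are mine.

class MultiQStats:
    def __init__(self):
        self.last_access = {}     # chunk id -> last access time
        self.access_count = {}    # chunk id -> access count
        self.expiry = {}          # chunk id -> expiry time
        self.hot_lifetime = None  # hottest chunk's lifetime seen so far

    def access(self, chunk, now):
        if chunk in self.last_access:
            # Hottest chunk's lifetime := current time - its last access time.
            self.hot_lifetime = now - self.last_access[chunk]
        self.access_count[chunk] = self.access_count.get(chunk, 0) + 1
        self.last_access[chunk] = now
        if self.hot_lifetime is not None:
            # Other chunks' expiry := current time + hottest chunk's lifetime.
            self.expiry[chunk] = now + self.hot_lifetime

# Replay the access trace from the slide.
q = MultiQStats()
trace = [(4, 'c'), (5, 'c'), (6, 'x'), (7, 'y'), (8, 'z'),
         (9, 'u'), (10, 'w'), (11, 'c'), (12, 'd')]
for t, chunk in trace:
    q.access(chunk, t)

print(q.hot_lifetime)   # 6  (chunk c: 11 - 5)
print(q.expiry['d'])    # 18 (12 + 6)
```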

Multi-Q Update (Promotion & Demotion)
- The Multi-Q thread processes each entry in the submission queue, head to tail
- Chunks are organized into queues Q1 (cold) through Q8 (hot)
- Promotion: a chunk whose access count exceeds its queue's access-count threshold moves toward the hot end (frequency)
- Demotion: a chunk whose expiry time has passed moves toward the cold end (recency, i.e., time since the last update)
- Example from the slide: chunk a's access count is bigger than Q1's threshold, so a is promoted; chunk e's expiry time has passed, so e is demoted
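A hedged sketch of the promotion/demotion rules above; the per-queue thresholds are assumed values, since the slide does not give them.

```python
# Sketch of Multi-Q promotion (frequency) and demotion (recency).
# Queue count follows the slide (Q1 cold .. Q8 hot); thresholds are assumed.

class MultiQ:
    """Index 0 is Q1 (coldest), index 7 is Q8 (hottest)."""
    def __init__(self, thresholds):
        self.thresholds = thresholds  # access-count threshold per queue
        self.level = {}               # chunk id -> queue index

    def insert(self, chunk):
        self.level.setdefault(chunk, 0)

    def promote_if_hot(self, chunk, access_count):
        # Promotion: access count exceeds the current queue's threshold.
        lvl = self.level[chunk]
        while lvl < len(self.thresholds) - 1 and access_count > self.thresholds[lvl]:
            lvl += 1
        self.level[chunk] = lvl

    def demote_if_expired(self, chunk, now, expiry_time):
        # Demotion: the chunk's expiry time has passed.
        if now > expiry_time and self.level[chunk] > 0:
            self.level[chunk] -= 1

mq = MultiQ(thresholds=[2, 4, 8, 16, 32, 64, 128, 256])
mq.insert('a')
mq.promote_if_hot('a', access_count=5)       # exceeds Q1's and Q2's thresholds
print(mq.level['a'])  # 2
mq.demote_if_expired('a', now=20, expiry_time=18)
print(mq.level['a'])  # 1
```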

SFR: Sequentiality-Frequency-Recency Algorithm
- The SFR thread processes each entry <sLBA, sz> in the submission queue
- Sequentiality: if the write is sequential, reuse the previous stream ID (sid := prev_sid); otherwise, get the sid from the stream table
- Frequency and recency (stream table update):
  - Increase access_cnt
  - recency_weight := pow(2, (curr_time − last_access_time) / decay_period)
  - access_cnt := access_cnt / recency_weight
  - sid := log(access_cnt)
- Update prev_sid and put the sLBA in the submission queue
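The flow above can be sketched as follows. The decay period, the per-LBA table keying, and the clamping of cold data to stream 0 are assumptions of mine, not details given on the slide.

```python
import math

DECAY_PERIOD = 100.0  # decay period (assumed; the slide does not give a value)

class SFRTable:
    """Sketch of SFR stream-ID computation; names and granularity are mine."""
    def __init__(self):
        self.access_cnt = {}
        self.last_access = {}
        self.prev_sid = 0
        self.prev_end = None  # end LBA of the previous write

    def stream_id(self, slba, sz, now):
        # Sequentiality: a write continuing the previous one reuses its stream.
        if self.prev_end == slba:
            self.prev_end = slba + sz
            return self.prev_sid
        # Frequency + recency: bump the access count, then decay it by
        # recency_weight = 2^((curr_time - last_access_time) / decay_period).
        cnt = self.access_cnt.get(slba, 0) + 1
        weight = 2 ** ((now - self.last_access.get(slba, now)) / DECAY_PERIOD)
        cnt /= weight
        self.access_cnt[slba] = cnt
        self.last_access[slba] = now
        # sid := log(access_cnt), clamped to stream 0 for cold data (assumed).
        sid = int(math.log2(cnt)) if cnt >= 1 else 0
        self.prev_sid = sid
        self.prev_end = slba + sz
        return sid

t = SFRTable()
print(t.stream_id(0, 8, now=0))  # 0 (first access: access_cnt = 1)
print(t.stream_id(8, 8, now=1))  # 0 (sequential: reuses prev_sid)
```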

Docker Environment Performance Measurement
- Workload: 2 MySQL and 2 Cassandra instances running simultaneously
  - MySQL: TPC-C, 800-warehouse database, 30 connections
  - Cassandra: cassandra-stress, 1 KB records, 100 million entries, 50/50 read/write
- Results versus the legacy (no stream management) baseline:
  - WAF reduced by 42% (SFR) and 39% (Multi-Q)
  - MySQL average tpmC improved by 6% (SFR) and 2% (Multi-Q)
  - Cassandra average TPS improved by 144% (SFR) and 129% (Multi-Q)

Summary
- AutoStream improves SSD lifetime and performance with no application or system modification
- AutoStream adds only minimal overhead
- Works well under different workloads, for diverse applications, on various system environments
  - Up to 60% WAF reduction
  - Up to 237% performance improvement
- Future work: optimize resource utilization and performance to fit into devices