Application-Managed Flash


Application-Managed Flash. Sungjin Lee*, Ming Liu, Sangwoo Jun, Shuotao Xu, Jihong Kim, and Arvind. *Inha University, Massachusetts Institute of Technology, and Seoul National University. Operating System Support for Next Generation Large Scale NVRAM (NVRAMOS), October 20-21, 2016 (originally presented at USENIX FAST '16).

NAND Flash and FTL

NAND flash SSDs have become the preferred storage devices in consumer electronics and datacenters, and the FTL plays an important role in flash management. The principal virtue of the FTL is providing interoperability with the existing block I/O abstraction.

[Figure: the host system (databases, file systems, KV stores) talks to the flash device through the block I/O layer; inside the device, the Flash Translation Layer (FTL) hides the constraints of NAND flash: the overwriting restriction, asymmetric I/O operations, limited P/E cycles, and bad blocks.]

FTL is a Complex Piece of Software

The FTL runs complicated firmware algorithms (address remapping, garbage collection, I/O scheduling, and wear-leveling and bad-block management) to avoid in-place updates and to manage unreliable NAND substrates.

But the FTL is a root of evil in terms of hardware resources and performance:
- It requires significant hardware resources (e.g., 4 CPUs and 1-4 GB of DRAM).
- It incurs extra I/Os for flash management (e.g., GC).
- It badly affects the behavior of host applications.

Existing Approaches

Many efforts have been made to put more functions into flash storage devices:

1. Improving the FTL itself: better logical-to-physical address mapping and garbage collection algorithms. Optimization is limited by the FTL's lack of system-level information.
2. Optimizing the FTL with custom interfaces: delivering system-level information to the FTL for better optimization (e.g., file access patterns, hints on when to trigger GC, and stream IDs). These special interfaces are hard to standardize and add yet more functions to the FTL.
3. Offloading host functions into the FTL: moving part of a file system into the FTL (e.g., nameless writes and object-based flash storage). This costs more hardware resources and greater storage design complexity.

However, the Functionality of the FTL is Mostly Useless

Many host applications manage the underlying storage in a log-like manner, mostly avoiding in-place updates.

[Figure: log-structured host applications (databases, file systems, KV stores) already perform object-to-storage remapping, versioning and cleaning, and I/O scheduling above the block I/O layer, while the FTL below duplicates that work with its own address remapping, garbage collection, I/O scheduling, and wear-leveling and bad-block management.]

This duplicate management not only (1) incurs serious performance penalties but also (2) wastes hardware resources.

Which Applications?

Log-structured designs are everywhere: file systems (Sprite LFS, F2FS, WAFL, Btrfs, NILFS), key-value stores and LSM-trees (LevelDB, RocksDB), storage virtualization (FlexVol, BlueSky, HDFS), and databases (RethinkDB, Cassandra, BigTable, LogBase, MongoDB, Hyder).

Question: What if we removed the FTL from storage devices and allowed host applications to manage NAND flash directly?

Application-Managed Flash (AMF)

(1) The device runs only the essential device-management algorithms: a lightweight FTL that manages unreliable NAND flash and hides the internal storage architecture.
(2) The host runs almost all of the complicated algorithms, reusing the algorithms that log-structured applications already have (object-to-storage remapping, versioning and cleaning, I/O scheduling).
(3) A new AMF block I/O abstraction (AMF I/O) makes it possible to separate the roles of the host and the device.

AMF Block I/O Abstraction (AMF I/O)

AMF I/O is similar to a conventional block I/O interface: a linear array of fixed-size sectors (e.g., 4 KB) accessed with the existing I/O primitives (e.g., READ and WRITE). This minimizes the changes needed in existing host applications.

[Figure: the logical layout exposed to applications, a linear array of 4 KB sectors read and written through the AMF block I/O layer.]

Append-only Segments

A segment is a group of 4 KB sectors (e.g., several MB in total) and is the unit of free-space allocation and free-space reclamation. Segments are append-only: overwriting data is prohibited. New data is appended to a segment with WRITE, and a whole segment is reclaimed at once with TRIM.

With only sequential writes and no in-place updates reaching the device, the functionality required of the FTL is minimized.
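To make the abstraction concrete, the following is a minimal host-side sketch of AMF I/O. The slides do not define a concrete API, so every name here (struct amf_segment, amf_append, amf_trim_segment, issue_write) is invented for illustration:

```c
#include <stdint.h>
#include <errno.h>

#define SECTOR_SIZE     4096          /* fixed-size 4 KB sectors          */
#define SECTORS_PER_SEG (4 * 1024)    /* e.g., a 16 MB segment (assumed)  */

/* Host-side view of one append-only segment: writes are legal only
 * at the current append point, never behind it. */
struct amf_segment {
    uint64_t base_sector;   /* first sector of the segment      */
    uint32_t write_ptr;     /* next append offset, in sectors   */
};

/* READ behaves exactly like a conventional block read. */
int amf_read(uint64_t sector, void *buf, uint32_t nsectors);

/* Stand-in for the real block-layer write submission. */
int issue_write(uint64_t sector, const void *buf, uint32_t nsectors);

/* TRIM reclaims a whole segment at once; afterwards it may be
 * appended to again from the beginning. */
int amf_trim_segment(struct amf_segment *seg);

/* WRITE is valid only at the segment's append point. */
int amf_append(struct amf_segment *seg, const void *buf, uint32_t nsectors)
{
    if (seg->write_ptr + nsectors > SECTORS_PER_SEG)
        return -ENOSPC;                  /* segment is full            */
    uint64_t target = seg->base_sector + seg->write_ptr;
    int err = issue_write(target, buf, nsectors);
    if (!err)
        seg->write_ptr += nsectors;      /* advance the append point   */
    return err;
}
```

The only departure from a conventional block device is that every WRITE must land exactly at a segment's append point, which is what lets the device-side FTL stay so small.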

Case Study with AMF

[The same cloud of log-structured applications as above; for the case study we pick the file-system branch.]

Case Study with a File System

[Figure: on the host, the AMF log-structured file system (ALFS, based on F2FS) performs object-to-storage remapping, versioning and cleaning, and I/O scheduling; beneath the AMF block I/O layer, the AMF flash translation layer (AFTL) in the device performs only segment-level address remapping and wear-leveling and bad-block management on the NAND flash.]

AMF Log-structured File System (ALFS)

ALFS is based on the F2FS file system. How did we modify F2FS for ALFS?
- Eliminate in-place updates: F2FS overwrites check-point and inode-map blocks; ALFS appends new versions instead (see the sketch below).
- Change the TRIM policy: F2FS issues TRIM to individual sectors; ALFS issues TRIM to whole segments.

How much new code was added? About 1,300 lines. [Figure: bar chart comparing the source-code line counts of F2FS and ALFS on a 0-16,000 scale.]
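As a sketch of the first modification, the check-point write path changes from overwriting a fixed location to appending a new version, with the newest version found again at mount time. The helper names and on-disk layout below are assumptions built on the AMF I/O sketch above, not actual F2FS code:

```c
#include <stdint.h>

/* Reuses SECTOR_SIZE, struct amf_segment, and amf_append() from the
 * AMF I/O sketch. The check-point is padded to exactly one sector so
 * a single append writes one 4 KB unit. */
struct checkpoint {
    uint64_t version;              /* the newest version wins at mount */
    uint8_t  payload[4096 - 8];    /* inode-map roots, etc. (sketch)   */
};

static uint64_t latest_cp_sector;  /* rebuilt at mount by scanning the
                                    * log for the highest version      */

int write_checkpoint(struct amf_segment *log_tail, struct checkpoint *cp)
{
    cp->version++;
    uint64_t sector = log_tail->base_sector + log_tail->write_ptr;
    int err = amf_append(log_tail, cp, 1);  /* append, never overwrite */
    if (!err)
        latest_cp_sector = sector;          /* in-memory pointer only  */
    return err;
}
```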

How Conventional LFS (F2FS) Works

[Figure: the LFS address space contains check-point, inode-map, and data segments 0-2; underneath, a page-level FTL (PFTL) maps them onto flash blocks of 2 pages each.]

How Conventional LFS (F2FS) Works

Check-point and inode-map blocks are overwritten. [Figure: the LFS writes CP, IM #0, and data pages A-G; overwriting CP, IM #0, and B invalidates the earlier copies.]

How Conventional LFS (F2FS) Works

The FTL appends the incoming data to NAND flash. [Figure: the PFTL packs the repeatedly written CP and IM pages together with data pages A-G into flash blocks, leaving invalid pages scattered behind.]

How Conventional LFS (F2FS) Works

The FTL triggers garbage collection: 4 page copies and 4 block erasures. [Figure: the PFTL copies the live pages A, C, D, and E out of the victim blocks and erases them.]

How Conventional LFS (F2FS) Works

The LFS then triggers its own garbage collection: 3 page copies. [Figure: the LFS copies the live pages A, C, and D to a new segment and TRIMs the old one.]

How ALFS Works

[Figure: the ALFS address space contains check-point, inode-map, and data segments 0-2; underneath, the AFTL maps each segment onto 2 flash blocks.]

How ALFS Works

There are no in-place updates: ALFS appends new check-point and inode-map versions to the log. [Figure: the AFTL lays out the appended pages exactly as the file system wrote them.] Because the device sees no overwrites, it holds no obsolete pages, and FTL-level GC is not necessary.

How ALFS Works

ALFS triggers garbage collection: 3 page copies and 2 block erasures. [Figure: ALFS copies the live pages A, C, and D, then TRIMs the victim segment, and the AFTL erases its 2 flash blocks.]

Comparison of F2FS and AMF

              F2FS (duplicate management)   AMF
File system   3 page copies                 3 copies + 2 erasures
PFTL          4 copies + 4 erasures         -
Total         7 copies + 4 erasures         3 copies + 2 erasures
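The table works out this way because in AMF the host does all of the cleaning. A minimal sketch of such host-side segment cleaning, again with invented helper names (pick_victim, current_log_tail, is_live, move_sector) on top of the AMF I/O sketch:

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical helpers: pick_victim() chooses the segment with the
 * fewest live sectors, move_sector() reads a live sector, re-appends
 * it at the log tail, and updates the file-system metadata. */
struct amf_segment *pick_victim(void);
struct amf_segment *current_log_tail(void);
bool is_live(uint64_t sector);
int  move_sector(uint64_t sector, struct amf_segment *dest);
int  amf_trim_segment(struct amf_segment *seg);  /* from the sketch above */

int clean_one_segment(void)
{
    struct amf_segment *victim = pick_victim();
    struct amf_segment *dest   = current_log_tail();

    /* Copy the live data out; these are the "3 page copies" that the
     * file-system row of the table accounts for. */
    for (uint32_t off = 0; off < victim->write_ptr; off++) {
        uint64_t sector = victim->base_sector + off;
        if (is_live(sector))
            move_sector(sector, dest);
    }

    /* One TRIM reclaims the whole segment; the AFTL simply erases its
     * flash blocks, so no device-side GC (and no second round of
     * copies) ever happens. */
    return amf_trim_segment(victim);
}
```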

Experimental Setup

We implemented ALFS and AFTL in the Linux kernel 3.13 and compared AMF with two other file systems, EXT4 and F2FS, each running on a page-level FTL (PFTL). All of them ran on our in-house SSD platform, BlueDBM, developed at MIT.

Performance with FIO

For random writes, AMF shows better throughput; F2FS is badly affected by the duplicate-management problem.

Performance with Databases

AMF outperforms EXT4 even when the underlying FTL uses more advanced GC policies; F2FS shows the worst performance.

Erasure Counts

On average, AMF achieves 6% and 37% better lifetime than EXT4 and F2FS, respectively.

Resources (DRAM and CPU)

FTL mapping table size:

SSD capacity   Block-level FTL   Hybrid FTL   Page-level FTL   AMF
512 GB         4 MB              96 MB        512 MB           4 MB
1 TB           8 MB              186 MB       1 GB             8 MB

[Figure: host CPU usage.]
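The page-level figures follow directly from the mapping granularity: one entry per 4 KB page. A back-of-the-envelope check, assuming 4-byte mapping entries (the slide does not state the entry size):

```c
#include <stdio.h>
#include <stdint.h>

/* Rough mapping-table arithmetic, assuming 4-byte entries. A
 * page-level FTL keeps one entry per 4 KB page; a segment-level
 * scheme like AMF keeps one entry per multi-MB segment, which is
 * why its table is orders of magnitude smaller. */
int main(void)
{
    uint64_t capacity = 512ULL << 30;                    /* 512 GB SSD  */
    uint64_t entry    = 4;                               /* bytes/entry */

    uint64_t page_tbl = capacity / 4096 * entry;
    printf("page-level FTL: %llu MB\n",                  /* -> 512 MB   */
           (unsigned long long)(page_tbl >> 20));

    uint64_t seg_tbl = capacity / (16ULL << 20) * entry; /* 16 MB segs  */
    printf("segment-level:  %llu KB\n",                  /* -> 128 KB   */
           (unsigned long long)(seg_tbl >> 10));
    return 0;
}
```

The 512 MB and 1 GB page-level numbers in the table match this arithmetic exactly; the 4 MB and 8 MB AMF figures are larger than a bare segment map, presumably covering additional per-segment metadata (our guess, not stated on the slide).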

Conclusion

We proposed the Application-Managed Flash (AMF) architecture. AMF is based on a new block I/O interface, called AMF I/O, which exposes flash storage as append-only segments. On top of AMF I/O, we implemented a new FTL scheme (AFTL) and a new file system (ALFS) in the Linux kernel and evaluated them using our in-house SSD prototype. Our results showed that the DRAM in the flash controller was reduced by 128x and performance was improved by 80%.

Future Work: we are doing case studies with key-value stores, database systems, and storage virtualization platforms.

Discussion

- Hardware implementation of AFTL
- Smaller segment sizes
- Open-Channel SSDs vs. AMF

Hardware Implementation of AFTL

We implemented a purely hardware-based FTL in an FPGA that supports the basic functions of AFTL:
- Exposing the block I/O interface to the host
- Segment-level remapping
- Dynamic wear-leveling
- Bad-block management

It is still a proof-of-concept prototype, but it strongly suggests that CPU-less and DRAM-less flash storage could be a promising design choice.
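To show how small the device-side logic can be, here is a sketch of segment-level remapping with dynamic wear-leveling and bad-block handling folded into one allocation decision. All names and the policy are invented for illustration; a real FPGA implementation would keep these tables in on-chip RAM rather than C structures:

```c
#include <stdint.h>

#define NUM_SEGMENTS 1024            /* illustrative pool size */

struct phys_seg {
    uint32_t erase_count;            /* input to dynamic wear-leveling */
    uint8_t  bad;                    /* set on program/erase failure   */
    uint8_t  in_use;
};

static struct phys_seg pool[NUM_SEGMENTS];
static uint32_t map[NUM_SEGMENTS];   /* logical -> physical segment    */

/* When a logical segment is (re)allocated after TRIM, map it to the
 * least-worn healthy physical segment. Wear-leveling and bad-block
 * management collapse into this one small loop. */
int remap_segment(uint32_t logical)
{
    int best = -1;
    for (int i = 0; i < NUM_SEGMENTS; i++) {
        if (pool[i].bad || pool[i].in_use)
            continue;
        if (best < 0 || pool[i].erase_count < pool[best].erase_count)
            best = i;
    }
    if (best < 0)
        return -1;                   /* no healthy free segment        */
    pool[best].in_use = 1;
    map[logical] = (uint32_t)best;
    return best;
}
```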

Smaller Segments

ALFS shows good performance even with smaller segments. Here F2FS and ALFS(small) both use 2 MB segments, whereas the segment size of ALFS normally increases in proportion to the number of channels and ways. [Figure: comparison of EXT4, F2FS(FTL), F2FS(FS), ALFS, and ALFS(small).]

Open-Channel SSDs vs. AMF

The two approaches are based on similar ideas; the main difference is the level of abstraction. AMF still maintains the block I/O abstraction:
- AMF leaves the management of unreliable NAND to the device's FTL.
- AMF allows SSD vendors to hide the details of their SSDs.
- AMF requires only small modifications on the host kernel side.
- AMF exhibits better data persistence and reliability.

Source Code

All of the software and hardware is being developed under the GPL license. Please refer to our Git repositories:
- Hardware platform: https://github.com/sangwoojun/bluedbm.git
- FTL: https://github.com/chamdoo/bdbm_drv.git
- File system: https://bitbucket.org/chamdoo/risa-f2fs

Thank you!