A Semi-Preemptive Garbage Collector for Solid State Drives. Junghee Lee, Youngjae Kim, Galen M. Shipman, Sarp Oral, Feiyi Wang, and Jongman Kim

A Semi-Preemptive Garbage Collector for Solid State Drives
Junghee Lee, Youngjae Kim, Galen M. Shipman, Sarp Oral, Feiyi Wang, and Jongman Kim
Presented by Junghee Lee

High-Performance Storage Systems
- Server-centric services: file, web & media servers, transaction-processing servers
- Enterprise-scale storage systems: information technology focused on the storage, protection, and retrieval of data in large-scale environments
- High performance: Google's massive server farms
- The storage unit: the hard disk drive

Spider: A Large-Scale Storage System
- Jaguar: petascale computing machine, 25,000 nodes with 250,000 cores and over 300 TB of memory
- Spider storage system: the largest center-wide Lustre-based file system
  - Over 10.7 PB of RAID 6 formatted capacity
  - 13,400 x 1 TB HDDs
  - 192 Lustre I/O servers
  - Over 3 TB of memory (on the Lustre I/O servers)

Emergence of NAND Flash-based SSDs
NAND flash vs. hard disk drives
- Pros:
  - Semiconductor technology, no mechanical parts
  - Lower access latencies: microseconds for SSDs vs. milliseconds for HDDs
  - Lower power consumption
  - Higher robustness to vibration and temperature
- Cons:
  - Limited lifetime: 10K - 1M erases per block
  - High cost: about 8x more expensive than current hard disks
  - Performance variability

Outline
- Introduction
- Background and Motivation
  - NAND Flash and SSD
  - Garbage Collection
  - Pathological Behavior of SSDs
- Semi-Preemptive Garbage Collection
- Evaluation
- Conclusion

NAND Flash-based SSD
[Diagram: the write path from application to flash. A process issues fwrite(file, data); the file system (FAT, Ext2, NTFS, ...) turns it into a block write (LBA, size) via the OS block device driver; the request crosses the block interface (SATA, SCSI, etc.) to the SSD, where the CPU running the FTL and the device memory direct a page write (bank, block, page) to one of the flash chips.]

NAND Flash Organization
[Diagram: a package contains dies (Die 0, Die 1); each die contains planes (Plane 0 - Plane 3); each plane has a register and blocks (Block 0 - Block 2047); each block contains pages (Page 0 - Page 63). Operation latencies: read 0.025 ms, write 0.200 ms, erase 1.500 ms.]

Out-of-Place Write
Logical-to-physical address mapping table: LPN0 -> PPN1, LPN1 -> PPN4, LPN2 -> PPN2, LPN3 -> PPN5.
Physical pages P0-P7 each hold one state: invalid (I), valid (V), or erased (E).
A write to LPN2: invalidate PPN2, write the new data to an erased page (PPN3), and update the table so LPN2 -> PPN3.
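The out-of-place write fits in a few lines of code. Below is a minimal sketch of a page-mapping FTL in Python; the class name, the state encoding, and the sequential free-page allocator are assumptions for illustration, not the actual device firmware.

```python
# Minimal sketch of an out-of-place write in a page-mapping FTL.
# PageMapFTL and its fields are illustrative, not from the paper.
ERASED, VALID, INVALID = "E", "V", "I"

class PageMapFTL:
    def __init__(self, num_pages):
        self.l2p = {}                       # logical page -> physical page
        self.state = [ERASED] * num_pages   # per-physical-page state
        self.next_free = 0                  # next erased page to allocate

    def write(self, lpn):
        old = self.l2p.get(lpn)
        if old is not None:
            self.state[old] = INVALID       # invalidate the old copy
        ppn = self.next_free                # allocate an erased page
        self.next_free += 1
        self.state[ppn] = VALID             # program the new copy
        self.l2p[lpn] = ppn                 # update the mapping table
        return ppn

# The slide's example: a write to LPN2 invalidates PPN2 and lands on PPN3.
ftl = PageMapFTL(8)
ftl.l2p = {0: 1, 1: 4, 2: 2, 3: 5}
ftl.state = list("IVVEVVEE")                # P0..P7 before the write
ftl.next_free = 3
assert ftl.write(2) == 3 and ftl.state[2] == INVALID
```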

Garbage Collection
Three steps: select a victim block, move its valid pages, erase the victim block.
[Diagram: physical pages P0-P7 before and after GC; the victim's valid pages are copied to erased pages and the victim block is erased.]
For a victim with two valid pages: 2 reads + 2 writes + 1 erase = 2 x 0.025 + 2 x 0.200 + 1.5 = 1.950 ms!!
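The slide's arithmetic packages neatly into a small cost model, sketched below with the latencies from the NAND organization slide. The greedy victim-selection policy mentioned in the comment is an assumption; the slide only says "select victim block".

```python
# Sketch of the GC cost using the slide's latencies: read 0.025 ms,
# write 0.200 ms, erase 1.500 ms. GC proceeds in three steps:
#   1. select a victim block (e.g. greedily, fewest valid pages -- assumed)
#   2. move each valid page (one read + one write per page)
#   3. erase the victim block
T_READ, T_WRITE, T_ERASE = 0.025, 0.200, 1.500   # ms

def gc_cost_ms(valid_pages):
    return valid_pages * (T_READ + T_WRITE) + T_ERASE

# The slide's example: a victim block holding 2 valid pages.
assert abs(gc_cost_ms(2) - 1.950) < 1e-9
```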

Pathological Behavior of SSDs
Does GC have an impact on foreground operations? If so, we can observe sudden bandwidth drops:
- more drop with more write requests
- more drop with more bursty workloads
Experimental setup:
- SSD devices: Intel (SLC) 64 GB SSD, SuperTalent (MLC) 120 GB SSD
- I/O generator: used the libaio asynchronous I/O library for block-level testing

Bandwidth Drop for Write-Dominant Workloads
Experiments: measured bandwidth for 1 MB sequential requests while varying the read/write ratio.
[Plots: bandwidth (MB/s) vs. time (0-60 s) on the Intel SLC SSD and the SuperTalent MLC SSD, for write/read mixes of 80/20, 60/40, 40/60, and 20/80.]
Performance variability increases as we increase the write percentage of the workload.

Performance Variability for Bursty Workloads
Experiments: measured SSD write bandwidth for queue depths (qd) of 8 and 64; I/O bandwidth normalized with a Z distribution.
[Plots: Intel SLC SSD and SuperTalent MLC SSD.]
Performance variability increases as we increase the arrival rate of requests (bursty workloads).

Lessons Learned
From the empirical study, we learned:
- Performance variability increases as the percentage of writes in the workload increases.
- Performance variability increases with the arrival rate of write requests.
This is because GC is not preemptive: any request that arrives during GC must wait until the ongoing GC ends.

Outline
- Introduction
- Background and Motivation
- Semi-Preemptive Garbage Collection
  - Semi-Preemption
  - Further Optimization
  - Level of Allowed Preemption
- Evaluation
- Conclusion

Technique #1: Semi-Preemption
[Timeline: GC performs R_x, W_x, R_y, W_y, E. A request W_z arrives mid-GC. With non-preemptive GC, W_z waits until after the erase; with semi-preemptive GC, W_z is serviced at the next operation boundary. Legend: R_x = read page x + data transfer; W_x = write page x + metadata update; E = erase a block.]
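The idea can be sketched as a GC loop with explicit preemption points between its fine-grained operations; everything below (the queue, the callback-style flash operations, the toy demonstration) is a hypothetical stand-in, not the simulator's code.

```python
# Sketch of semi-preemptive GC: collecting one victim block is broken
# into page reads, page writes, and the final erase, and pending
# foreground requests may be serviced at each boundary. The individual
# read/write/erase operations are never interrupted (hence "semi").
from collections import deque

pending = deque()                      # foreground requests arriving mid-GC

def service_pending():
    while pending:                     # preemption point
        op = pending.popleft()
        op()                           # execute the foreground request

def semi_preemptive_gc(valid_pages, read, write, erase, free_page):
    for page in valid_pages:
        service_pending()              # W_z may slip in before R_x
        data = read(page)
        service_pending()              # ... or between R_x and W_x
        write(free_page(), data)
    service_pending()                  # ... or just before the erase
    erase()

# Toy demonstration with print-based stand-ins for flash operations.
pending.append(lambda: print("serviced W_z mid-GC"))
semi_preemptive_gc(
    valid_pages=[0, 5],
    read=lambda p: f"page{p}",
    write=lambda ppn, d: print(f"moved {d} -> {ppn}"),
    erase=lambda: print("erased victim"),
    free_page=lambda: "free",
)
```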

Technique #2: Merge
[Timeline: an incoming request R_y targets page y, which GC is about to read anyway; the foreground read is merged with GC's own R_y instead of being issued as a separate operation. Legend as before.]
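The merge can be sketched on top of the same loop: if a foreground request targets the very page GC is moving, it is folded into GC's own read and write rather than costing an extra flash operation. The request dictionaries below are illustrative assumptions.

```python
# Sketch of the merge optimization: while moving page y, a foreground
# read of y is served from the buffer GC just filled, and a foreground
# write of y supplies the data GC programs, so neither request incurs
# a separate flash operation. All names are illustrative.
def move_page_with_merge(lpn, data, waiting_requests):
    """Move one valid page; fold in requests that target the same page."""
    for req in [r for r in waiting_requests if r["lpn"] == lpn]:
        if req["op"] == "read":
            req["result"] = data        # serve the read from GC's buffer
        else:
            data = req["data"]          # GC programs the fresh data instead
        waiting_requests.remove(req)
    return data                         # what GC actually writes

# Toy demonstration: GC moves LPN 7 while a write to LPN 7 is waiting.
queue = [{"lpn": 7, "op": "write", "data": "new"}]
assert move_page_with_merge(7, "old", queue) == "new" and not queue
```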

Technique #3: Pipeline
[Timeline: an incoming read R_z is pipelined with GC's ongoing operations, overlapping its data transfer on the bus with GC's work in the flash array. Legend as before.]
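Pipelining is hard to show without real hardware, but the overlap can be sketched with two threads standing in for the two resources (the bus and the flash array); the functions and sleep times below are toy stand-ins, not a device model.

```python
# Sketch of the pipeline optimization: while the flash array is busy
# with GC's program of page x, the bus is idle, so the data transfer
# of an incoming request R_z can proceed in parallel. The sleeps are
# toy stand-ins for flash-array and bus occupancy (values in seconds,
# taken from the slide's millisecond latencies).
import threading, time

def flash_array_program(page):
    time.sleep(0.200e-3)                # GC's W_x occupies the flash array

def bus_transfer(req):
    time.sleep(0.025e-3)                # R_z's data moves over the idle bus

gc_write = threading.Thread(target=flash_array_program, args=("x",))
gc_write.start()
bus_transfer("R_z")                     # overlapped with the GC write
gc_write.join()
```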

Level of Allowed Preemption
Drawback of PGC: the completion time of GC is delayed, which may cause a lack of free blocks, so preemption must sometimes be prohibited.
States of PGC (O = allowed, X = not allowed):

                      State 0   State 1   State 2   State 3
  Garbage collection     X         O         O         O
  Read requests          O         O         O         X
  Write requests         O         O         X         X
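The state table can be expressed directly as a lookup plus a transition rule driven by the number of free blocks; the thresholds below are hypothetical, since the slide only says that preemption must sometimes be prohibited.

```python
# Sketch of the four PGC states from the table: which request types may
# preempt GC in each state. The free-block thresholds are assumptions.
ALLOWED = {
    0: {"read": True,  "write": True},   # no GC running: everything proceeds
    1: {"read": True,  "write": True},   # GC running, full preemption
    2: {"read": True,  "write": False},  # free blocks low: reads only
    3: {"read": False, "write": False},  # free blocks critical: no preemption
}

def pgc_state(gc_running, free_blocks, soft=64, hard=16):
    if not gc_running:
        return 0
    if free_blocks > soft:
        return 1
    return 2 if free_blocks > hard else 3

assert ALLOWED[pgc_state(True, 100)]["write"]      # plenty of free blocks
assert not ALLOWED[pgc_state(True, 8)]["read"]     # nearly exhausted
```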

Outline
- Introduction
- Background and Motivation
- Semi-Preemptive Garbage Collection
- Evaluation
  - Setup
  - Synthetic Workloads
  - Realistic Workloads
- Conclusion

Setup
Simulator: MSR's SSD simulator based on DiskSim.
Workloads:
- Synthetic workloads: used the synthetic workload generator in DiskSim.
- Realistic workloads:

  Workload    Type            Avg. request size (KB)   Read ratio (%)   Arrival rate (IOP/s)
  Financial   Write-dominant  7.09                     18.92            47.19
  Cello       Write-dominant  7.06                     19.63            74.24
  TPC-H       Read-dominant   31.62                    91.80            172.73
  OpenMail    Read-dominant   9.49                     63.30            846.62

Performance Improvements for Synthetic Workloads
Varied four parameters: request size, inter-arrival time, sequentiality, and read/write ratio, varying one at a time while fixing the others.
[Plot: average response time (ms) and standard deviation vs. request size (8, 16, 32, 64 KB) for NPGC and PGC.]

Performance Improvement for Synthetic Workloads (cont'd)
[Plots: average response time and standard deviation for NPGC and PGC vs. inter-arrival time (10, 5, 3, 1 ms; bursty), probability of sequential access (0.8-0.2; random-dominant), and probability of read access (0.8-0.2; write-dominant).]

Performance Improvement for Realistic Workloads
[Plots: normalized average response time and normalized standard deviation of response time for Financial, Cello, TPC-H, and OpenMail.]
Average response time improves by 6.5% and 66.6% for Financial and Cello; the variance of response time improves by 49.8% and 83.3% for Financial and Cello.

Conclusions
- Solid state drives offer fast access speeds but suffer performance variation due to garbage collection.
- Semi-preemptive garbage collection services incoming requests during GC.
- Average response time and performance variation are reduced by up to 66.6% and 83.3%, respectively.

Questions?
Contact info: Junghee Lee, junghee.lee@gatech.edu
Electrical and Computer Engineering, Georgia Institute of Technology

Thank you!