Differential RAID: Rethinking RAID for SSD Reliability

Mahesh Balakrishnan, Asim Kadav (1), Vijayan Prabhakaran, Dahlia Malkhi
Microsoft Research Silicon Valley
(1) The University of Wisconsin-Madison

Solid State Storage
- Flash storage is now mainstream
  - Commodity SSDs
  - Thousands of IOPS
  - Low power consumption
- How do we protect data on SSDs?
  - Device-level redundancy: RAID levels

Primer on SSDs
- Smallest unit of read/write is a page (e.g., 4 KB)
- Pages must be erased before they are overwritten
  - More writes -> more erasures
- MTTF, Bit Error Rate (BER)
  - More erasures -> higher BER
  - More writes -> higher BER

The BER Curve
- Can we use RAID to protect data on SSDs?

RAID-5 and SSDs
[Figure: RAID-5 layout across Drive 1, Drive 2, Drive 3, Drive 4]
- Due to random writes, parity blocks are updated more often than data blocks
- In RAID-5, parity blocks are evenly distributed across all drives
- In RAID-5, all drives age at the same rate
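
The even-aging claim can be checked with a small simulation. This is a minimal sketch, not the authors' simulator: it assumes rotating parity placement (`stripe % num_drives`) and uniformly random single-block writes, each of which updates one data block and the stripe's parity block. The function name and parameters are hypothetical.

```python
import random

def simulate_raid5_writes(num_drives=4, num_stripes=1000, num_writes=100_000, seed=0):
    """Count block writes per drive for random single-block writes on RAID-5.

    Assumption: parity for a stripe lives on drive (stripe % num_drives),
    so parity is spread evenly when num_stripes is a multiple of num_drives.
    """
    rng = random.Random(seed)
    writes = [0] * num_drives
    for _ in range(num_writes):
        stripe = rng.randrange(num_stripes)
        parity_drive = stripe % num_drives
        # The data block being written sits on one of the non-parity drives.
        data_drive = rng.choice([d for d in range(num_drives) if d != parity_drive])
        writes[data_drive] += 1    # the updated data block
        writes[parity_drive] += 1  # the stripe's parity block is always rewritten
    return writes

if __name__ == "__main__":
    counts = simulate_raid5_writes()
    total = sum(counts)
    for drive, count in enumerate(counts, start=1):
        print(f"Drive {drive}: {count / total:.3f} of all block writes")
```

Under these assumptions every drive receives close to the same share of writes, which is why all drives approach their erasure limit together. Within a stripe, the single parity block is rewritten on every write while each data block is hit only occasionally, which matches the first bullet.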

Correlated Failures in RAID-5
[Figure: used erasures (0% to 100%) and parity distribution for Drive 1 through Drive 4]
- Conventional RAID can induce correlated failures with SSDs

RAID-5 Reliability
- Data Loss Probability (DLP): when a drive fails, the probability that it cannot be completely reconstructed
- Drives are replaced when they reach the erasure limit simultaneously
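
One simple way to make this metric concrete is sketched below. It is an illustrative model, not the formula used in the talk: it assumes reconstruction must read every sector on every surviving drive, that sector failures are independent, and that each surviving drive has a known per-sector probability of an uncorrectable error. The function name and all numeric inputs are hypothetical.

```python
import math

def data_loss_probability(sector_failure_probs, sectors_per_drive):
    """Probability that at least one sector needed for reconstruction is unreadable.

    sector_failure_probs: per-sector uncorrectable-error probability for each
    surviving drive (assumed independent; each value must be below 1).
    """
    # Work in log space so that tiny probabilities are not rounded away.
    log_all_ok = sum(sectors_per_drive * math.log1p(-q) for q in sector_failure_probs)
    return -math.expm1(log_all_ok)

if __name__ == "__main__":
    sectors = 1_000_000  # hypothetical drive size in sectors
    # Hypothetical values: three equally worn drives vs. one worn and two fresher drives.
    print(data_loss_probability([1e-9, 1e-9, 1e-9], sectors))
    print(data_loss_probability([1e-9, 1e-11, 1e-11], sectors))
```

In this model the loss probability is dominated by the most worn surviving drives, so an array whose drives all sit near the erasure limit at once is the worst case.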

Solution: Differential RAID
- Goal: age SSDs at different rates
- Uneven Assignment
  - Example: (70, 10, 10, 10): 70% of parity on Device 1
  - Any possible configuration between RAID-4 and RAID-5
- Drive Replacement

Uneven Distribution
[Figure: parity layout across Drive 1, Drive 2, Drive 3, Drive 4]
- Drive 1: more parity -> more writes -> ages 2x faster than other drives
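
The 2x figure can be reproduced with a short calculation. This sketch assumes a purely random write workload in which every logical write updates one data block and one parity block, and that data blocks of a stripe are spread evenly over the drives not holding that stripe's parity. The function name is hypothetical.

```python
def aging_rates(parity_shares):
    """Expected block writes per logical write landing on each drive.

    parity_shares: relative share of parity held by each drive,
    e.g. (70, 10, 10, 10). Shares are normalized internally.
    """
    n = len(parity_shares)
    total = sum(parity_shares)
    rates = []
    for share in parity_shares:
        p = share / total
        parity_writes = p                  # this drive holds the stripe's parity
        data_writes = (1 - p) / (n - 1)    # otherwise it is one of n-1 data drives
        rates.append(parity_writes + data_writes)
    return rates

if __name__ == "__main__":
    rates = aging_rates((70, 10, 10, 10))
    print("per-drive rates:", [round(r, 3) for r in rates])
    print("Drive 1 vs. Drive 2:", round(rates[0] / rates[1], 3))
```

For (70, 10, 10, 10) the first drive's rate comes out at roughly twice that of each other drive, in line with the slide. Equal shares give identical rates (RAID-5), and (100, 0, 0, 0) gives the RAID-4 extreme.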

Uneven Distribution
[Figure: used erasures (0% to 100%) and parity distribution for Drive 1 through Drive 4]
- Aging SSDs at different rates helps

Drive Replacement
[Figure: used erasures (0% to 100%) and parity distribution for Drive 1 through Drive 5]
- Naïve drive replacement can still lead to correlated failures!

Convergence
[Figure: used erasures (0% to 100%) and parity distribution for Drive 1 through Drive 4]
- Age distribution provably converges for any parity assignment!
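
The convergence behaviour can be explored numerically. The sketch below rests on assumptions that the slide does not spell out: writes are purely random (one data and one parity block per write), a drive is replaced with a fresh one when it reaches the erasure limit, and at each replacement the parity shares are reassigned so that the oldest surviving drive holds the largest share. All names are hypothetical.

```python
def aging_rates(parity_fractions):
    """Block writes per logical write for each drive (random-write assumption)."""
    n = len(parity_fractions)
    return [p + (1 - p) / (n - 1) for p in parity_fractions]

def simulate_replacements(parity_fractions, rounds, limit=100.0):
    """Return the ages (% of erasure limit) of surviving drives at each replacement."""
    n = len(parity_fractions)
    shares = sorted(parity_fractions, reverse=True)
    ages = [0.0] * n
    history = []
    for _ in range(rounds):
        # Assumed policy: oldest drive gets the largest parity share, and so on.
        order = sorted(range(n), key=lambda d: ages[d], reverse=True)
        assigned = [0.0] * n
        for share, drive in zip(shares, order):
            assigned[drive] = share
        rates = aging_rates(assigned)
        # Advance time until the first drive reaches the erasure limit.
        dt = min((limit - ages[d]) / rates[d] for d in range(n))
        ages = [ages[d] + rates[d] * dt for d in range(n)]
        worn = max(range(n), key=lambda d: ages[d])
        history.append(sorted((ages[d] for d in range(n) if d != worn), reverse=True))
        ages[worn] = 0.0  # swap in a brand-new drive
    return history

if __name__ == "__main__":
    for step, survivors in enumerate(simulate_replacements([0.7, 0.1, 0.1, 0.1], 12), 1):
        print(step, [round(a, 1) for a in survivors])
```

With (70, 10, 10, 10) the ages of the surviving drives at replacement time settle toward a fixed spread instead of bunching up near the limit, which is the property the slide describes.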

Evaluation
- Simulation for reliability
  - Real BER data for 12 chips from two studies [1, 2]
  - 5-10K erase cycles
  - Assumed 4-bit ECC per 512-byte sector
  - Metric: Data Loss Probability (DLP)
- Implementation for performance

[1] N. Mielke et al., Bit Error Rate in NAND Flash Memories. International Reliability Physics Symposium, 2008.
[2] L. M. Grupp et al., Characterizing Flash Memory: Anomalies, Observations, and Applications. MICRO 2009.
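
To see how a raw BER and an ECC strength combine into a per-sector failure probability, a binomial tail is one plausible model. This is a sketch under stated assumptions, not the simulator's actual code: bit errors are treated as independent, the sector is counted as 512 x 8 data bits (ECC bits are ignored), and a sector is uncorrectable when it has more bit errors than the ECC can fix. The function name is hypothetical.

```python
import math

def sector_uncorrectable_prob(raw_ber, ecc_bits=4, sector_bytes=512):
    """P(more than ecc_bits bit errors in one sector), for 0 < raw_ber < 1."""
    n = sector_bytes * 8
    log_p = math.log(raw_ber)
    log_q = math.log1p(-raw_ber)
    total = 0.0
    # Sum the binomial tail directly; log space avoids overflow in C(n, k).
    for k in range(ecc_bits + 1, n + 1):
        log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
        total += math.exp(log_choose + k * log_p + (n - k) * log_q)
    return total

if __name__ == "__main__":
    for ber in (1e-7, 1e-6, 1e-5):  # hypothetical raw BER values
        print(ber, sector_uncorrectable_prob(ber, ecc_bits=4))
```

Because the tail grows roughly with the (ecc_bits + 1)-th power of the raw BER, a modest rise in BER late in a drive's life produces a much larger rise in uncorrectable sectors.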

Evaluation
- Simulation for reliability
- Implementation for performance
  - 5 x Intel X25-M MLC SSDs, 80 GB each
  - Random write: 3.3K IOPS; random read: 35K IOPS
  - Sequential write: 70 MB/s; sequential read: 250 MB/s

Diff-RAID Reliability
[Figure: RAID-5 vs. Diff-RAID across device replacements]

Diff-RAID Reliability
- Diff-RAID is more reliable than RAID-5 by at least 10x for 7/12 chips

Reliability vs Throughput
- Reliability decreases as parity is less concentrated
- Throughput increases as parity is less concentrated
[Figure: parity assignments ranging from (80,5,5,5,5) to (20,20,20,20,20), i.e., RAID-5]
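
The throughput side of this trade-off can be approximated with a back-of-the-envelope bottleneck model. This is an assumption for illustration, not the measured result from the talk: random-write throughput is taken to be limited by the most heavily written drive, each logical write costs one data and one parity block write, and reads are ignored. The function name is hypothetical.

```python
def relative_write_throughput(parity_fractions):
    """Random-write throughput relative to RAID-5 under a busiest-drive model."""
    n = len(parity_fractions)

    def busiest_drive_load(fractions):
        # Block writes per logical write on the most heavily loaded drive.
        return max(p + (1 - p) / (n - 1) for p in fractions)

    raid5 = [1 / n] * n
    return busiest_drive_load(raid5) / busiest_drive_load(parity_fractions)

if __name__ == "__main__":
    for assignment in ([0.8, 0.05, 0.05, 0.05, 0.05],
                       [0.6, 0.1, 0.1, 0.1, 0.1],
                       [0.2, 0.2, 0.2, 0.2, 0.2]):
        print(assignment, round(relative_write_throughput(assignment), 3))
```

In this model the drive holding most of the parity saturates first, so concentrating parity lowers throughput while, per the slide, improving reliability.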

Conclusion
- RAID can cause correlated failures with Flash
  - Not just RAID-5; not just SSDs
- Differential RAID
  - Key idea: age SSDs at different rates
  - Same space overhead as RAID-5
  - Trade-off between reliability and throughput

Other Results
- Diff-RAID works on real workloads
- Diff-RAID can be used to extend SSD lifetime
  - Replace drives at 13K cycles: DLP < 0.001
  - Replace drives at 15K cycles: DLP < 0.01
- Diff-RAID can lower ECC requirements
  - Diff-RAID on 3-bit ECC == RAID-5 on 5-bit ECC

SSDs and RAID
- SSDs are not hard disks!
- RAID-5 is a bad idea for hard disks:
  - Performance: slow random writes
  - Cost: storage is cheap -> just use RAID-1/10
  - Reliability: high probability of data loss in large arrays
- RAID-5 is a great idea for SSDs!
  - Performance: fast random writes (5 SSDs = 14K/sec)
  - Cost: storage is expensive -> can't use RAID-1/10
  - Reliability: correlated failures! (Not just RAID-5: RAID-1, RAID-4, RAID-10, RAID-6)

Other Benefits
- Diff-RAID allows SSDs to be used past the erasure limit:
  - 30% more lifetime with DLP 0.001
  - 50% more lifetime with DLP 0.01
- Diff-RAID allows SSDs to be used with less ECC:
  - Diff-RAID with 3-bit ECC is equal to RAID-5 with 5-bit ECC

Real Workloads
- RAID stripe size matters: larger stripe size -> more random writes
- Reliability versus throughput on real workloads
- Larger stripe -> more reliable

Reliability
- Random device fails: Diff-RAID is at least 2x more reliable for 9/12 chips.
- Oldest device fails: Diff-RAID is at least 10x more reliable for 7/12 chips.
[Figure: x-axis: flash chips]