Flash Reliability in Produc4on: The Importance of Measurement and Analysis in Improving System Reliability

Size: px
Start display at page:

Download "Flash Reliability in Produc4on: The Importance of Measurement and Analysis in Improving System Reliability"

Transcription

1 Flash Reliability in Produc4on: The Importance of Measurement and Analysis in Improving System Reliability Bianca Schroeder University of Toronto (Currently on sabbatical at Microsoft Research Redmond)

2 System reliability Why and how do systems fail in the wild?

3 Data from a large number of large- scale produc4on systems at different organiza4ons:

4 Different hardware failure events Hardware replacements Correctable and uncorrectable errors in DRAM Server outages Hard disk drive failures Sector errors in hard disk drives Data corrup4on in storage systems Failures/errors in solid state drives Job logs Google, OpenCloud (Hadoop cluster at CMU), Yahoo! Hadoop trace Observa4ons oten different from expecta4ons Surprising to operators as well as manufacturers 4

5 Why flash reliability? More and more data is living on flash => data reliability depends on flash reliability Worry about flash wear- out For a long 4me only lab studies using accelerated tes4ng Recently, some field studies: Sigmetrics 15 (Facebook) Meza et al. FAST 16 (Google) Schroeder et al. Systor 17 (MicrosoT) Narayanan et al. 5

6 Data on repairs, replacements, bad blocks & bad chips 6 years of data Data on workload and variety of error types Google fleet 10 drive models MLC, SLC, emlc 4 chip vendors Custom drives based on commodity chips (but custom firmware and FTL) Drives are reporting counters many times per day 6

7 Percentage(%) Percentage of drives replaced annually due to suspected hardware problems over the first 4 years in the field: Consistent with [Narayanan 17] MLC- A MLC- B MLC- C MLC- D SLC- A SLC- B SLC- C SLC- D Average annual replacement rates for hard disks (2-20%) Good news: Only 1-2% of drives replaced annually - - much lower than hard disks! Drives benefiaed from ability to tolerate chip failure % of drives developed bad chips per year 7

8 Much worse than for hard disk drives (3.5% experiencing sector errors)! These errors are insideous as they are latent. Bad news: Uncorrectable errors common 26-60% of drives see uncorrectable errors in their life (Google) 2-6 out of 1,000 drive days experience uncorrectable errors % of drives at Facebook [Meza et al. 2015] Rates at MicrosoT 10X higher than target rate [Narayanan et al. 2016] 8

9 Wear- out (limited program erase cycles) Technology (MLC, SLC) Lithography Age Workload Temperature Other factors? What reliability metric to use? Raw bit error rate (RBER) Assump4on: as raw bit errors accumulate they turn uncorrectable Probability of uncorrectable errors Why not UBER we will see 9

10 Common expecta4on: Exponen4al increase of RBER with PE cycles or maybe polynomial or other super- linear? Exponential growth RBER PE cycles 10

11 Big differences across models (all drives use same ECC & FTL, so differences are not due to ECC) Linear increase (for range of PE cycles in our data) No sudden increase after PE cycle limit 11

12 Common expecta4on: Lower error rates under SLC ($$$) than MLC 12

13 Red lines are SLC drives RBER is lower for SLC drives than MLC Uncorrectable errors are not lower for SLC drives (all drives use same ECC, FTL, etc. so differences are not due to ECC) SLC drives don t have lower rate of repairs or replacement 13

14 Common expecta4on: Higher error rates for smaller feature size 14

15 43nm versus 50nm 34nm versus 50nm Smaller lithography => higher RBER Lithography has less impact on uncorrectable errors 15

16 Common expecta4on: Exponen4al increase in hardware failures with temperature (Arrhenius equa4on) Exponential growth RBER Temperature 16

17 Increase Decrease! Little effect Uncorrectable errors might increase, decrease or not be affected by temperature. Drive- internal mechanisms protect against temperature, e.g. through throttling. Other effects might dominate 17

18 Lab studies find workload induced error modes Read & program disturb errors, incomplete erases Metrics from lab studies do not always make sense for field data. Field data: no correla4on between read / write / erase opera4ons versus errors (Google, Facebook) Possibly because data at per- drive level too coarse Consequence: For field studies UBER is not a meaningful metric. #Uncorrectable bits UBER= Total # bits read 18

19 Different RBER for same model in different clusters Other factors at play 19

20 The main purpose of RBER is as a metric for overall drive reliability Allows for projec4ons on uncorrectable errors [Mielke2008] 20

21 Drive models with higher RBER don t have higher frequency of uncorrectable errors 21

22 Drives (or drive days) with higher RBER don t have higher frequency of uncorrectable errors RBER is not a good predictor of field reliability Uncorrectable errors caused by other mechanisms than corr. errors? 22

23 Prior errors highly predictive of later uncorrectable errors Can we predict uncorrectable errors? 23

24 Drives report many opera4onal sta4s4cs e.g. through S.M.A.R.T: Workload, temperature, power- on- hours, prior errors, etc. SMART1 SMART2 SMART254? Time now Based on data from interval n, will there be uncorrectable errors in interval n+1? 24

25 Common machine learning techniques for classifica4on problems: Classifica4on and regression trees Random forests Logis4c regression Support vector machines Neural networks Increasing complexity How does predic4on accuracy compare? How can we use predic4ons in prac4ce? 25

26 Betterr Trained models for Google SSDs & Backblaze HDD data Metrics: False posi4ve rate (rate of false alarms) False nega4ve rate (frac4on of errors not caught)

27 Seagate HDS722020ALA330 Catch 90% of errors at 2% false alarm rate. A simple classifier performs best: random forests Accuracy is very good Catch 90% of errors at false alarm rate of 2-8%. Catch 50% of errors at false alarm rate of 1%.

28 Accuracy lower than for HDDs. Could be due to differences in hardware or error repor4ng. Will see that it s s4ll good enough for applica4ons we consider.

29 Seagate error predictions using Hitachi-trained model What if only lisle data is available? Accuracy nearly iden4cal. What if only data for a different model available? Accuracy slightly lower, but s4ll very good.

30 Use case: Op4mize disk scrubbing Why scrubbing? Periodic disk scan to detect and correct errors early. Standard: Scrub con4nuously at a fixed speed (e.g. once per week). Expensive, in par4cular with increasing capaci4es and impact on foreground workload Idea: Increase scrub rate when error is predicted => reduce mean 4me to error detec4on (MTTD) 30

31 Approach: When error is predicted for next interval, increase scrub speed by factor X for this interval. Benefits: Errors that were predicted will be caught X- 4mes faster, on average. Cost: Every 4me an error is predicted, X- 4mes more scrub opera4ons will be issued. Cost & benefit trade- off can be controlled by choosing X and tuning FPR of predictor. 31

32 32

33 Nearly 2X improvement for 2% time spent accelerated. Nearly 1.5X improvement for 1% time spent accelerated. Significant reduc4on in mean 4me to error detec4on at limited overheads. Effec4ve even for the SSD model with the weakest predictor. 33

34 Results when using Hitachi predictor on Seagate drive: Even classifier trained on different model is useful. 34

35 Dynamically adapt ECC strength of SSDs. Proac4vely re4re blocks or chips. Tuning the cache policy of SSD caches, switching from write- back to write- through when chance of errors is high. Tuning file system protec4on mechanisms. 35

36 Predic4ng whole SSD failures [Narayanan 16] Problem: false posi4ves much more expensive (e.g. unnecessary drive replacements) False posi4ve rates of 13% too high. More work let to be done 36

37 Many aspects different from expecta4ons Linear increase with PE cycles RBER not predic4ve of uncorrectable errors SLC not generally more reliable than MLC UBER not a great metric in the field Importance of collec4ng field data Errors are oten predictable using simple ML techniques Can be used to improve storage system reliability Need to think about what data to report/collect We have shown some use cases, probably many more S4ll many open ques4ons 37

38 For more informa4on: [Sigmetrics 15] (Facebook) A large- scale study of flash memory failures in the field, J. Meza, Q. Wu, S. Kumar, O. Mutlu. [FAST 16] (Google) Flash reliability in produc<on: The expected and the unexpected, B. Schroeder, R. Lagisesy, A. Merchant. [Systor 17] (MicrosoT) SSD failures in data centers: The what, when and why?, I. Narayanan, D. Wang, M. Jeon, B. Sharma, L. Caulfield, A. Sivasubramaniam, B. Cutler, J. Liu, B. Khessib, K. Vaid. [IEEE Proceedings 17] (Survey) Reliability of NAND- based SSD: What field studies tell us, B. Schroeder, R. Lagisesy, A. Merchant. 38

Case studies from the real world: The importance of measurement and analysis in system building and design

Case studies from the real world: The importance of measurement and analysis in system building and design Case studies from the real world: The importance of measurement and analysis in system building and design Bianca Schroeder University of Toronto Main interest: system reliability Why and how do systems

More information

SSD Failures in Datacenters: What? When? And Why?

SSD Failures in Datacenters: What? When? And Why? SSD Failures in Datacenters: What? When? And Why? Iyswarya Narayanan, Di Wang, Myeongjae Jeon, Bikash Sharma, Laura Caulfield, Anand Sivasubramaniam, Ben Cutler, Jie Liu, Badriddine Khessib, Kushagra Vaid

More information

Read Disturb Errors in MLC NAND Flash Memory: Characterization, Mitigation, and Recovery

Read Disturb Errors in MLC NAND Flash Memory: Characterization, Mitigation, and Recovery Read Disturb Errors in MLC NAND Flash Memory: Characterization, Mitigation, and Recovery Yu Cai, Yixin Luo, Saugata Ghose, Erich F. Haratsch*, Ken Mai, Onur Mutlu Carnegie Mellon University, *Seagate Technology

More information

Machine Learning Crash Course: Part I

Machine Learning Crash Course: Part I Machine Learning Crash Course: Part I Ariel Kleiner August 21, 2012 Machine learning exists at the intersec

More information

Using MLC Flash to Reduce System Cost in Industrial Applications

Using MLC Flash to Reduce System Cost in Industrial Applications Using MLC Flash to Reduce System Cost in Industrial Applications Chris Budd SMART High Reliability Solutions Santa Clara, CA 1 Introduction Component selection: cost versus quality Use same component to

More information

Nutanix s RF2 Configuration for Flash Storage: Myth and Fact

Nutanix s RF2 Configuration for Flash Storage: Myth and Fact Nutanix s RF2 Configuration for Flash Storage: Myth and Fact 385 Moffett Park Dr. Sunnyvale, CA 94089 844-478-8349 www.datrium.com Technical Report About the Authors Lakshmi N. Bairavasundaram is a Member

More information

Embedded SSD Product Challenges and Test Mitigation

Embedded SSD Product Challenges and Test Mitigation Embedded SSD Product Challenges and Test Mitigation Flash Memory Summit, 2015 ATP Electronics, Inc. August 2015 1 Overview Embedded SSD Product Challenges The Factor of Industry Focus & Validation Challenges

More information

Differential RAID: Rethinking RAID for SSD Reliability

Differential RAID: Rethinking RAID for SSD Reliability Differential RAID: Rethinking RAID for SSD Reliability Mahesh Balakrishnan Asim Kadav 1, Vijayan Prabhakaran, Dahlia Malkhi Microsoft Research Silicon Valley 1 The University of Wisconsin-Madison Solid

More information

HeatWatch Yixin Luo Saugata Ghose Yu Cai Erich F. Haratsch Onur Mutlu

HeatWatch Yixin Luo Saugata Ghose Yu Cai Erich F. Haratsch Onur Mutlu HeatWatch Improving 3D NAND Flash Memory Device Reliability by Exploiting Self-Recovery and Temperature Awareness Yixin Luo Saugata Ghose Yu Cai Erich F. Haratsch Onur Mutlu Storage Technology Drivers

More information

Temperature Considerations for Industrial Embedded SSDs

Temperature Considerations for Industrial Embedded SSDs Solid State Storage and Memory White Paper Temperature Considerations for Industrial Embedded SSDs Introduction NAND flash-based SSDs and memory cards continue to be the dominant storage media for most

More information

TEMPERATURE MANAGEMENT IN DATA CENTERS: WHY SOME (MIGHT) LIKE IT HOT

TEMPERATURE MANAGEMENT IN DATA CENTERS: WHY SOME (MIGHT) LIKE IT HOT TEMPERATURE MANAGEMENT IN DATA CENTERS: WHY SOME (MIGHT) LIKE IT HOT Nosayba El-Sayed, Ioan Stefanovici, George Amvrosiadis, Andy A. Hwang, Bianca Schroeder {nosayba, ioan, gamvrosi, hwang, bianca}@cs.toronto.edu

More information

Flash Trends: Challenges and Future

Flash Trends: Challenges and Future Flash Trends: Challenges and Future John D. Davis work done at Microsoft Researcher- Silicon Valley in collaboration with Laura Caulfield*, Steve Swanson*, UCSD* 1 My Research Areas of Interest Flash characteristics

More information

Understanding SSD Reliability in Large-Scale Cloud Systems

Understanding SSD Reliability in Large-Scale Cloud Systems Understanding SSD Reliability in Large-Scale Cloud Systems Erci Xu Mai Zheng Feng Qin Yikang Xu Jiesheng Wu Ohio State University Iowa State University Ohio State University Aliyun Alibaba Aliyun Alibaba

More information

Vulnerabilities in MLC NAND Flash Memory Programming: Experimental Analysis, Exploits, and Mitigation Techniques

Vulnerabilities in MLC NAND Flash Memory Programming: Experimental Analysis, Exploits, and Mitigation Techniques Vulnerabilities in MLC NAND Flash Memory Programming: Experimental Analysis, Exploits, and Mitigation Techniques Yu Cai, Saugata Ghose, Yixin Luo, Ken Mai, Onur Mutlu, Erich F. Haratsch February 6, 2017

More information

Improving LDPC Performance Via Asymmetric Sensing Level Placement on Flash Memory

Improving LDPC Performance Via Asymmetric Sensing Level Placement on Flash Memory Improving LDPC Performance Via Asymmetric Sensing Level Placement on Flash Memory Qiao Li, Liang Shi, Chun Jason Xue Qingfeng Zhuge, and Edwin H.-M. Sha College of Computer Science, Chongqing University

More information

Increasing NAND Flash Endurance Using Refresh Techniques

Increasing NAND Flash Endurance Using Refresh Techniques Increasing NAND Flash Endurance Using Refresh Techniques Yu Cai 1, Gulay Yalcin 2, Onur Mutlu 1, Erich F. Haratsch 3, Adrian Cristal 2, Osman S. Unsal 2 and Ken Mai 1 DSSC, Carnegie Mellon University 1

More information

What Can One Billion Hours of Spinning Hard Drives Tell Us?

What Can One Billion Hours of Spinning Hard Drives Tell Us? What Can One Billion Hours of Spinning Hard Drives Tell Us? What When Who Storage Developer Conference September 2016 Gleb Budman, CEO @GlebBudman Overview Our environment How we diagnose sick drives Drive

More information

ECC Approach for Correcting Errors Not Handled by RAID Recovery

ECC Approach for Correcting Errors Not Handled by RAID Recovery ECC Approach for Correcting Errors Not Handled by RAID Recovery Jeff Yang Siliconmotion Flash Memory Summit 27 Note: All the material are the concept proof and simulation.s It is not the real Siliconmotion

More information

Presented by: Nafiseh Mahmoudi Spring 2017

Presented by: Nafiseh Mahmoudi Spring 2017 Presented by: Nafiseh Mahmoudi Spring 2017 Authors: Publication: Type: ACM Transactions on Storage (TOS), 2016 Research Paper 2 High speed data processing demands high storage I/O performance. Flash memory

More information

Data Retention in MLC NAND Flash Memory: Characterization, Optimization, and Recovery

Data Retention in MLC NAND Flash Memory: Characterization, Optimization, and Recovery Data Retention in MLC NAND Flash Memory: Characterization, Optimization, and Recovery Yu Cai, Yixin Luo, Erich F. Haratsch*, Ken Mai, Onur Mutlu Carnegie Mellon University, *LSI Corporation 1 Many use

More information

Pseudo SLC. Comparison of SLC, MLC and p-slc structures. pslc

Pseudo SLC. Comparison of SLC, MLC and p-slc structures. pslc 1 Pseudo SLC In the MLC structures, it contains strong pages and weak pages for 2-bit per cell. Pseudo SLC (pslc) is to store only 1bit per cell data on the strong pages of MLC. With this algorithm, it

More information

It Takes Guts to be Great

It Takes Guts to be Great It Takes Guts to be Great Sean Stead, STEC Tutorial C-11: Enterprise SSDs Tues Aug 21, 2012 8:30 to 11:20AM 1 Who s Inside Your SSD? Full Data Path Protection Host Interface It s What s On The Inside That

More information

Lecture 5: Scheduling and Reliability. Topics: scheduling policies, handling DRAM errors

Lecture 5: Scheduling and Reliability. Topics: scheduling policies, handling DRAM errors Lecture 5: Scheduling and Reliability Topics: scheduling policies, handling DRAM errors 1 PAR-BS Mutlu and Moscibroda, ISCA 08 A batch of requests (per bank) is formed: each thread can only contribute

More information

Improving the Reliability of Chip-Off Forensic Analysis of NAND Flash Memory Devices. Aya Fukami, Saugata Ghose, Yixin Luo, Yu Cai, Onur Mutlu

Improving the Reliability of Chip-Off Forensic Analysis of NAND Flash Memory Devices. Aya Fukami, Saugata Ghose, Yixin Luo, Yu Cai, Onur Mutlu Improving the Reliability of Chip-Off Forensic Analysis of NAND Flash Memory Devices Aya Fukami, Saugata Ghose, Yixin Luo, Yu Cai, Onur Mutlu 1 Example Target Devices for Chip-Off Analysis Fire damaged

More information

Evaluating Phase Change Memory for Enterprise Storage Systems

Evaluating Phase Change Memory for Enterprise Storage Systems Hyojun Kim Evaluating Phase Change Memory for Enterprise Storage Systems IBM Almaden Research Micron provided a prototype SSD built with 45 nm 1 Gbit Phase Change Memory Measurement study Performance Characteris?cs

More information

DPA: A data pattern aware error prevention technique for NAND flash lifetime extension

DPA: A data pattern aware error prevention technique for NAND flash lifetime extension DPA: A data pattern aware error prevention technique for NAND flash lifetime extension *Jie Guo, *Zhijie Chen, **Danghui Wang, ***Zili Shao, *Yiran Chen *University of Pittsburgh **Northwestern Polytechnical

More information

Flash Memory Overview: Technology & Market Trends. Allen Yu Phison Electronics Corp.

Flash Memory Overview: Technology & Market Trends. Allen Yu Phison Electronics Corp. Flash Memory Overview: Technology & Market Trends Allen Yu Phison Electronics Corp. 25,000 20,000 15,000 The NAND Market 40% CAGR 10,000 5,000 ($Million) - 2001 2002 2003 2004 2005 2006 2007 2008 2009

More information

A Self Learning Algorithm for NAND Flash Controllers

A Self Learning Algorithm for NAND Flash Controllers A Self Learning Algorithm for NAND Flash Controllers Hao Zhi, Lee Firmware Manager Core Storage Electronics Corp./Phison Electronics Corp. haozhi_lee@phison.com Santa Clara, CA 1 Outline Basic FW Architecture

More information

Differential RAID: Rethinking RAID for SSD Reliability

Differential RAID: Rethinking RAID for SSD Reliability Differential RAID: Rethinking RAID for SSD Reliability ABSTRACT Asim Kadav University of Wisconsin Madison, WI kadav@cs.wisc.edu Vijayan Prabhakaran vijayanp@microsoft.com Deployment of SSDs in enterprise

More information

Search Engines. Informa1on Retrieval in Prac1ce. Annota1ons by Michael L. Nelson

Search Engines. Informa1on Retrieval in Prac1ce. Annota1ons by Michael L. Nelson Search Engines Informa1on Retrieval in Prac1ce Annota1ons by Michael L. Nelson All slides Addison Wesley, 2008 Evalua1on Evalua1on is key to building effec$ve and efficient search engines measurement usually

More information

Sub-block Wear-leveling for NAND Flash

Sub-block Wear-leveling for NAND Flash IBM Research Zurich March 6, 2 Sub-block Wear-leveling for NAND Flash Roman Pletka, Xiao-Yu Hu, Ilias Iliadis, Roy Cideciyan, Theodore Antonakopoulos Work done in collaboration with University of Patras

More information

HeatWatch: Improving 3D NAND Flash Memory Device Reliability by Exploiting Self-Recovery and Temperature Awareness

HeatWatch: Improving 3D NAND Flash Memory Device Reliability by Exploiting Self-Recovery and Temperature Awareness HeatWatch: Improving 3D NAND Flash Memory Device Reliability by Exploiting Self-Recovery and Temperature Awareness Yixin Luo Saugata Ghose Yu Cai Erich F. Haratsch Onur Mutlu Carnegie Mellon University

More information

Stages of (Batch) Machine Learning

Stages of (Batch) Machine Learning Evalua&on Stages of (Batch) Machine Learning Given: labeled training data X, Y = {hx i,y i i} n i=1 Assumes each x i D(X ) with y i = f target (x i ) Train the model: model ß classifier.train(x, Y ) x

More information

3MG2-P Series. Customer Approver. Approver. Customer: Customer Part Number: Innodisk Part Number: Model Name: Date:

3MG2-P Series. Customer Approver. Approver. Customer: Customer Part Number: Innodisk Part Number: Model Name: Date: 3MG2-P Series Customer: Customer Part Number: Innodisk Part Number: Innodisk Model Name: Date: Innodisk Approver Customer Approver Table of Contents 1.8 SATA SSD 3MG2-P LIST OF FIGURES... 6 1. PRODUCT

More information

It s Not Your. Drive, or Is It? Transformational Storage Technology. John Scaramuzzo Senior VP/GM

It s Not Your. Drive, or Is It? Transformational Storage Technology. John Scaramuzzo Senior VP/GM It s Not Your Father s Hard Drive, or Is It? SSDs: A Transformational Storage Technology John Scaramuzzo Senior VP/GM 2 Transformational Technologies Disruptive innovation is transformational 3 Solid State

More information

islc Claiming the Middle Ground of the High-end Industrial SSD Market

islc Claiming the Middle Ground of the High-end Industrial SSD Market White Paper islc Claiming the Middle Ground of the High-end Industrial SSD Market Executive Summary islc is a NAND flash technology designed to optimize the balance between cost and performance. The firmware

More information

Quality Comparison of SLC, MLC and emlc.

Quality Comparison of SLC, MLC and emlc. Quality Comparison of SLC, MLC and emlc. Director of Engineer, InnoDisk C.C. Wu Santa Clara, CA 1 Agenda The features of SLC, MLC and emlc. Read, Write and Erase Time Read Bit Error VS. Program/Erase Error

More information

Reducing MLC Flash Memory Retention Errors through Programming Initial Step Only

Reducing MLC Flash Memory Retention Errors through Programming Initial Step Only Reducing MLC Flash Memory Retention Errors through Programming Initial Step Only Wei Wang 1, Tao Xie 2, Antoine Khoueir 3, Youngpil Kim 3 1 Computational Science Research Center, San Diego State University

More information

Designing Enterprise SSDs with Low Cost Media

Designing Enterprise SSDs with Low Cost Media Designing Enterprise SSDs with Low Cost Media Jeremy Werner Director of Marketing SandForce Flash Memory Summit August 2011 Santa Clara, CA 1 Everyone Knows Flash is migrating: To smaller nodes 2-bit and

More information

arxiv: v1 [cs.dc] 1 Jan 2019

arxiv: v1 [cs.dc] 1 Jan 2019 arxiv:1901.03401v1 [cs.dc] 1 Jan 2019 Large Scale Studies of Memory, Storage, and Network Failures in a Modern Data Center Submitted in partial fulfillment of the requirements for the degree of Doctor

More information

Open-Channel SSDs Offer the Flexibility Required by Hyperscale Infrastructure Matias Bjørling CNEX Labs

Open-Channel SSDs Offer the Flexibility Required by Hyperscale Infrastructure Matias Bjørling CNEX Labs Open-Channel SSDs Offer the Flexibility Required by Hyperscale Infrastructure Matias Bjørling CNEX Labs 1 Public and Private Cloud Providers 2 Workloads and Applications Multi-Tenancy Databases Instance

More information

Context based Online Configura4on Error Detec4on

Context based Online Configura4on Error Detec4on Context based Online Configura4on Error Detec4on Ding Yuan, Yinglian Xie, Rina Panigrahy, Junfeng Yang Γ, Chad Verbowski, Arunvijay Kumar MicrosoM Research, UIUC and UCSD, Γ Columbia University, 1 Mo4va4on

More information

ML4Bio Lecture #1: Introduc3on. February 24 th, 2016 Quaid Morris

ML4Bio Lecture #1: Introduc3on. February 24 th, 2016 Quaid Morris ML4Bio Lecture #1: Introduc3on February 24 th, 216 Quaid Morris Course goals Prac3cal introduc3on to ML Having a basic grounding in the terminology and important concepts in ML; to permit self- study,

More information

FlexECC: Partially Relaxing ECC of MLC SSD for Better Cache Performance

FlexECC: Partially Relaxing ECC of MLC SSD for Better Cache Performance FlexECC: Partially Relaxing ECC of MLC SSD for Better Cache Performance Ping Huang, Pradeep Subedi, Xubin He, Shuang He and Ke Zhou Department of Electrical and Computer Engineering, Virginia Commonwealth

More information

Access Characteristic Guided Read and Write Cost Regulation for Performance Improvement on Flash Memory

Access Characteristic Guided Read and Write Cost Regulation for Performance Improvement on Flash Memory Access Characteristic Guided Read and Write Cost Regulation for Performance Improvement on Flash Memory Qiao Li, Liang Shi, Chun Jason Xue Kaijie Wu, Cheng Ji, Qingfeng Zhuge, and Edwin H. M. Sha College

More information

[537] Flash. Tyler Harter

[537] Flash. Tyler Harter [537] Flash Tyler Harter Flash vs. Disk Disk Overview I/O requires: seek, rotate, transfer Inherently: - not parallel (only one head) - slow (mechanical) - poor random I/O (locality around disk head) Random

More information

WHITE PAPER. Know Your SSDs. Why SSDs? How do you make SSDs cost-effective? How do you get the most out of SSDs?

WHITE PAPER. Know Your SSDs. Why SSDs? How do you make SSDs cost-effective? How do you get the most out of SSDs? WHITE PAPER Know Your SSDs Why SSDs? How do you make SSDs cost-effective? How do you get the most out of SSDs? Introduction: Solid-State Drives (SSDs) are one of the fastest storage devices in the world

More information

Storage Systems : Disks and SSDs. Manu Awasthi July 6 th 2018 Computer Architecture Summer School 2018

Storage Systems : Disks and SSDs. Manu Awasthi July 6 th 2018 Computer Architecture Summer School 2018 Storage Systems : Disks and SSDs Manu Awasthi July 6 th 2018 Computer Architecture Summer School 2018 Why study storage? Scalable High Performance Main Memory System Using Phase-Change Memory Technology,

More information

Analysis of SSD Health & Prediction of SSD Life

Analysis of SSD Health & Prediction of SSD Life Analysis of SSD Health & Prediction of SSD Life Dr. M. K. Jibbe Technical Director NetApp, Inc Bernard Chan Senior Engineer NetApp, Inc Agenda Problem Statement SMART Endurance Reporting Wear Life Prediction

More information

InnoREC 3MV2-P. Customer Approver. Innodisk Approver. Customer: Customer Part Number: Innodisk Part Number: Innodisk Model Name: Date:

InnoREC 3MV2-P. Customer Approver. Innodisk Approver. Customer: Customer Part Number: Innodisk Part Number: Innodisk Model Name: Date: InnoREC 3MV2-P Customer: Customer Part Number: Innodisk Part Number: Innodisk Model Name: Date: Innodisk Approver Customer Approver Table of contents InnoREC SATA Slim 3MV2-P LIST OF FIGURES... 6 1. PRODUCT

More information

3ME Series. Customer Approver. Approver. Customer: Customer Part Number: Innodisk Part Number: Model Name: Date:

3ME Series. Customer Approver. Approver. Customer: Customer Part Number: Innodisk Part Number: Model Name: Date: 3ME Series Customer: Customer Part Number: Innodisk Part Number: Innodisk Model Name: Date: Innodisk Approver Customer Approver Table of contents 2.5 SATA SSD 3ME LIST OF FIGURES... 6 1. PRODUCT OVERVIEW...

More information

Error Detection/Correction And Bad Block Management

Error Detection/Correction And Bad Block Management An InnoDisk White Paper August 2012 Error Detection/Correction And Bad Block Management Early Detection of Factory-Marked Bad Blocks And Normal Operation Tracking Of Bad Blocks in Solid-State Drives (SSDs)

More information

Controller Concepts for 1y/1z nm and 3D NAND Flash

Controller Concepts for 1y/1z nm and 3D NAND Flash Controller Concepts for 1y/1z nm and 3D NAND Flash Erich F. Haratsch Santa Clara, CA 1 NAND Evolution Planar NAND scaling is coming to an end in the sub- 20nm process 15nm and 16nm NAND are the latest

More information

Storage Systems : Disks and SSDs. Manu Awasthi CASS 2018

Storage Systems : Disks and SSDs. Manu Awasthi CASS 2018 Storage Systems : Disks and SSDs Manu Awasthi CASS 2018 Why study storage? Scalable High Performance Main Memory System Using Phase-Change Memory Technology, Qureshi et al, ISCA 2009 Trends Total amount

More information

Sentient Storage: Do SSDs have a mind of their own? Tom Kopchak

Sentient Storage: Do SSDs have a mind of their own? Tom Kopchak Sentient Storage: Do SSDs have a mind of their own? Tom Kopchak :: @tomkopchak About me Why we're here Current forensic practices for working with hard drives are well-defined Solid state drives behave

More information

3MV2-P Series. Customer Approver. Approver. Customer: Customer Part Number: Innodisk Part Number: Model Name: Date:

3MV2-P Series. Customer Approver. Approver. Customer: Customer Part Number: Innodisk Part Number: Model Name: Date: 3MV2-P Series Customer: Customer Part Number: Innodisk Part Number: Innodisk Model Name: Date: Innodisk Approver Customer Approver Table of contents LIST OF FIGURES... 6 1. PRODUCT OVERVIEW... 7 1.1 INTRODUCTION

More information

Read Disturb Errors in MLC NAND Flash Memory: Characterization, Mitigation, and Recovery

Read Disturb Errors in MLC NAND Flash Memory: Characterization, Mitigation, and Recovery Carnegie Mellon University Research Showcase @ CMU Department of Electrical and Computer Engineering Carnegie Institute of Technology 6-2015 Read Disturb Errors in MLC NAND Flash Memory: Characterization,

More information

3ME3 Series. Customer Approver. Innodisk Approver. Customer: Customer Part Number: Innodisk Part Number: Innodisk Model Name: Date:

3ME3 Series. Customer Approver. Innodisk Approver. Customer: Customer Part Number: Innodisk Part Number: Innodisk Model Name: Date: 3ME3 Series Customer: Customer Part Number: Innodisk Part Number: Innodisk Model Name: Date: Innodisk Approver Customer Approver Table of contents msata 3ME3 LIST OF FIGURES... 6 1. PRODUCT OVERVIEW...

More information

CS 6140: Machine Learning Spring Final Exams. What we learned Final Exams 2/26/16

CS 6140: Machine Learning Spring Final Exams. What we learned Final Exams 2/26/16 Logis@cs CS 6140: Machine Learning Spring 2016 Instructor: Lu Wang College of Computer and Informa@on Science Northeastern University Webpage: www.ccs.neu.edu/home/luwang Email: luwang@ccs.neu.edu Assignment

More information

CS 6140: Machine Learning Spring 2016

CS 6140: Machine Learning Spring 2016 CS 6140: Machine Learning Spring 2016 Instructor: Lu Wang College of Computer and Informa?on Science Northeastern University Webpage: www.ccs.neu.edu/home/luwang Email: luwang@ccs.neu.edu Logis?cs Assignment

More information

Don t Let RAID Raid the Lifetime of Your SSD Array

Don t Let RAID Raid the Lifetime of Your SSD Array Don t Let RAID Raid the Lifetime of Your SSD Array Sangwhan Moon Texas A&M University A. L. Narasimha Reddy Texas A&M University Abstract Parity protection at system level is typically employed to compose

More information

Customer: Customer Part Number: Innodisk Part Number: Innodisk Model Name: Date: Innodisk Approver. Customer Approver

Customer: Customer Part Number: Innodisk Part Number: Innodisk Model Name: Date: Innodisk Approver. Customer Approver Customer: Customer Part Number: Innodisk Part Number: Innodisk Model Name: Date: Innodisk Approver Customer Approver Table of contents CFast 3IE4 REVISION HISTORY... 4 LIST OF TABLES... 5 LIST OF FIGURES...

More information

3MG2-P Series. Customer Approver. Innodisk Approver. Customer: Customer Part Number: Innodisk Part Number: Innodisk Model Name: Date:

3MG2-P Series. Customer Approver. Innodisk Approver. Customer: Customer Part Number: Innodisk Part Number: Innodisk Model Name: Date: 3MG2-P Series Customer: Customer Part Number: Innodisk Part Number: Innodisk Model Name: Date: Innodisk Approver Customer Approver Table of contents 1. PRODUCT OVERVIEW... 7 1.1 INTRODUCTION OF INNODISK

More information

S-FTL: An Efficient Address Translation for Flash Memory by Exploiting Spatial Locality

S-FTL: An Efficient Address Translation for Flash Memory by Exploiting Spatial Locality S-FTL: An Efficient Address Translation for Flash Memory by Exploiting Spatial Locality Song Jiang, Lei Zhang, Xinhao Yuan, Hao Hu, and Yu Chen Department of Electrical and Computer Engineering Wayne State

More information

3MG2-P Series. Customer Approver. Innodisk Approver. Customer: Customer Part Number: Innodisk Part Number: Innodisk Model Name: Date:

3MG2-P Series. Customer Approver. Innodisk Approver. Customer: Customer Part Number: Innodisk Part Number: Innodisk Model Name: Date: 3MG2-P Series Customer: Customer Part Number: Innodisk Part Number: Innodisk Model Name: Date: Innodisk Approver Customer Approver Table of contents SATA Slim 3MG2-P LIST OF FIGURES... 6 1. PRODUCT OVERVIEW...

More information

3ME4 Series. Customer: Customer Part Number: Innodisk Part Number: Innodisk Model Name: Date: Innodisk Approver. Customer Approver

3ME4 Series. Customer: Customer Part Number: Innodisk Part Number: Innodisk Model Name: Date: Innodisk Approver. Customer Approver 3ME4 Series Customer: Customer Part Number: Innodisk Part Number: Innodisk Model Name: Date: Innodisk Approver Customer Approver Table of Contents Slim SSD 3ME4 LIST OF FIGURES... 6 1. PRODUCT OVERVIEW...

More information

Holistic Flash Management for Next Generation All-Flash Arrays

Holistic Flash Management for Next Generation All-Flash Arrays Holistic Flash Management for Next Generation All-Flash Arrays Roman Pletka, Nikolas Ioannou, Ioannis Koltsidas, Nikolaos Papandreou, Thomas Parnell, Haris Pozidis, Sasa Tomic IBM Research Zurich Aaron

More information

CTWP015: S.M.A.R.T. Features for Cactus Technologies Industrial-Grade Flash-Storage Products

CTWP015: S.M.A.R.T. Features for Cactus Technologies Industrial-Grade Flash-Storage Products CTWP015: S.M.A.R.T. Features for Cactus Technologies Industrial-Grade Flash-Storage Products Covered Products: 203, 303, 503 CF cards, 806/808 SD cards, 300 UFD and 900S SATA products 1 Introduction Self

More information

COS 318: Operating Systems. Storage Devices. Vivek Pai Computer Science Department Princeton University

COS 318: Operating Systems. Storage Devices. Vivek Pai Computer Science Department Princeton University COS 318: Operating Systems Storage Devices Vivek Pai Computer Science Department Princeton University http://www.cs.princeton.edu/courses/archive/fall11/cos318/ Today s Topics Magnetic disks Magnetic disk

More information

Secondary storage. CS 537 Lecture 11 Secondary Storage. Disk trends. Another trip down memory lane

Secondary storage. CS 537 Lecture 11 Secondary Storage. Disk trends. Another trip down memory lane Secondary storage CS 537 Lecture 11 Secondary Storage Michael Swift Secondary storage typically: is anything that is outside of primary memory does not permit direct execution of instructions or data retrieval

More information

NAND Flash Memory. Jinkyu Jeong Computer Systems Laboratory Sungkyunkwan University

NAND Flash Memory. Jinkyu Jeong Computer Systems Laboratory Sungkyunkwan University NAND Flash Memory Jinkyu Jeong (Jinkyu@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu ICE3028: Embedded Systems Design, Fall 2018, Jinkyu Jeong (jinkyu@skku.edu) Flash

More information

What can we learn from 100,000 spinning hard drives? Andrew Klein Backblaze

What can we learn from 100,000 spinning hard drives? Andrew Klein Backblaze What can we learn from 100,000 spinning hard drives? Andrew Klein Backblaze 1 Overview History Drive Stats Reliability over time Enterprise vs. consumer drives SMART Stats Is predicting drive failure possible?

More information

Storwize in IT Environments Market Overview

Storwize in IT Environments Market Overview Storwize in IT Environments Market Overview Topic Challenges in Tradi,onal IT Environment Types of informa,on Storage systems required Storage for private clouds where tradi,onal IT is involved Storwize

More information

Main Points. File systems. Storage hardware characteris7cs. File system usage Useful abstrac7ons on top of physical devices

Main Points. File systems. Storage hardware characteris7cs. File system usage Useful abstrac7ons on top of physical devices Storage Systems Main Points File systems Useful abstrac7ons on top of physical devices Storage hardware characteris7cs Disks and flash memory File system usage pa@erns File System Abstrac7on File system

More information

3ME2 Series. Customer Approver. Innodisk Approver. Customer: Customer Part Number: Innodisk Part Number: Innodisk Model Name: Date:

3ME2 Series. Customer Approver. Innodisk Approver. Customer: Customer Part Number: Innodisk Part Number: Innodisk Model Name: Date: 3ME2 Series Customer: Customer Part Number: Innodisk Part Number: Innodisk Model Name: Date: Innodisk Approver Customer Approver Table of contents 2.5 SATA SSD 3ME2 LIST OF FIGURES... 6 1. PRODUCT OVERVIEW...

More information

3MG2-P Series. Customer Approver. Approver. Customer: Customer Part Number: Innodisk Part Number: Model Name: Date:

3MG2-P Series. Customer Approver. Approver. Customer: Customer Part Number: Innodisk Part Number: Model Name: Date: 3MG2-P Series Customer: Customer Part Number: Innodisk Part Number: Innodisk Model Name: Date: Innodisk Approver Customer Approver Table of contents msata 3MG2-P LIST OF FIGURES... 6 1. PRODUCT OVERVIEW...

More information

Self-Adaptive NAND Flash DSP

Self-Adaptive NAND Flash DSP Self-Adaptive NAND Flash DSP Wei Xu 2018/8/9 Outline NAND Flash Data Error Recovery Challenges of NAND Flash Data Integrity A Self-Adaptive DSP Technology to Improve NAND Flash Memory Data Integrity 6

More information

ML pipelines at OK. Dmitry Bugaychenko

ML pipelines at OK. Dmitry Bugaychenko ML pipelines at OK Dmitry Bugaychenko OK is 70 000 000+ monthly unique users OK is 800 000 000+ family links in the social graph OK is A place where people share their posi9ve feelings OK is 10000+ servers

More information

3SE4 Series. Customer Approver. Innodisk Approver. Customer: Customer Part Number: Innodisk Part Number: Innodisk Model Name: Date:

3SE4 Series. Customer Approver. Innodisk Approver. Customer: Customer Part Number: Innodisk Part Number: Innodisk Model Name: Date: 3SE4 Series Customer: Customer Part Number: Innodisk Part Number: Innodisk Model Name: Date: Innodisk Approver Customer Approver Table of contents LIST OF FIGURES... 6 1. PRODUCT OVERVIEW... 7 1.1 INTRODUCTION

More information

Cascade Mapping: Optimizing Memory Efficiency for Flash-based Key-value Caching

Cascade Mapping: Optimizing Memory Efficiency for Flash-based Key-value Caching Cascade Mapping: Optimizing Memory Efficiency for Flash-based Key-value Caching Kefei Wang and Feng Chen Louisiana State University SoCC '18 Carlsbad, CA Key-value Systems in Internet Services Key-value

More information

Technical Notes. Considerations for Choosing SLC versus MLC Flash P/N REV A01. January 27, 2012

Technical Notes. Considerations for Choosing SLC versus MLC Flash P/N REV A01. January 27, 2012 Considerations for Choosing SLC versus MLC Flash Technical Notes P/N 300-013-740 REV A01 January 27, 2012 This technical notes document contains information on these topics:...2 Appendix A: MLC vs SLC...6

More information

3ME3 Series. Customer Approver. InnoDisk Approver. Customer: Customer Part Number: InnoDisk Part Number: InnoDisk Model Name: Date:

3ME3 Series. Customer Approver. InnoDisk Approver. Customer: Customer Part Number: InnoDisk Part Number: InnoDisk Model Name: Date: 3ME3 Series Customer: Customer Part Number: InnoDisk Part Number: InnoDisk Model Name: Date: InnoDisk Approver Customer Approver Table of contents SATADOM-SH 3ME3 LIST OF FIGURES... 6 1. PRODUCT OVERVIEW...

More information

Reducing Solid-State Storage Device Write Stress Through Opportunistic In-Place Delta Compression

Reducing Solid-State Storage Device Write Stress Through Opportunistic In-Place Delta Compression Reducing Solid-State Storage Device Write Stress Through Opportunistic In-Place Delta Compression Xuebin Zhang, Jiangpeng Li, Hao Wang, Kai Zhao and Tong Zhang xuebinzhang.rpi@gmail.com ECSE Department,

More information

3ME4 Series. Customer Approver. Innodisk Approver. Customer: Customer Part Number: Innodisk Part Number: Innodisk Model Name: Date:

3ME4 Series. Customer Approver. Innodisk Approver. Customer: Customer Part Number: Innodisk Part Number: Innodisk Model Name: Date: 3ME4 Series Customer: Customer Part Number: Innodisk Part Number: Innodisk Model Name: Date: Innodisk Approver Customer Approver Table of contents LIST OF FIGURES... 6 1. PRODUCT OVERVIEW... 7 1.1 INTRODUCTION

More information

About the Course. Reading List. Assignments and Examina5on

About the Course. Reading List. Assignments and Examina5on Uppsala University Department of Linguis5cs and Philology About the Course Introduc5on to machine learning Focus on methods used in NLP Decision trees and nearest neighbor methods Linear models for classifica5on

More information

D E N A L I S T O R A G E I N T E R F A C E. Laura Caulfield Senior Software Engineer. Arie van der Hoeven Principal Program Manager

D E N A L I S T O R A G E I N T E R F A C E. Laura Caulfield Senior Software Engineer. Arie van der Hoeven Principal Program Manager 1 T HE D E N A L I N E X T - G E N E R A T I O N H I G H - D E N S I T Y S T O R A G E I N T E R F A C E Laura Caulfield Senior Software Engineer Arie van der Hoeven Principal Program Manager Outline Technology

More information

Datasheet. Embedded Storage Solutions. Industrial. SATA III 2.5 Solid State Drive. SED2FIV Series CSXXXXXRC-XX09XXXX. 04 Jul 2016 V1.

Datasheet. Embedded Storage Solutions. Industrial. SATA III 2.5 Solid State Drive. SED2FIV Series CSXXXXXRC-XX09XXXX. 04 Jul 2016 V1. Embedded Storage Solutions Industrial SATA III 2.5 Solid State Drive SED2FIV Series CSXXXXXRC-XX09XXXX 1 Table of Contents 1. Product Description... 4 1.1. Product Overview... 4 1.2. Product Features...

More information

Intelligent Secure Storage for Industrial IoT

Intelligent Secure Storage for Industrial IoT Intelligent Secure Storage for Industrial IoT Creating Reliable Embedded Systems with Flash Memory Design Tools for IoT Storage Scott Phillips VP, Marketing Virtium Solid State Storage and Memory 30052

More information

WWW. FUSIONIO. COM. Fusion-io s Solid State Storage A New Standard for Enterprise-Class Reliability Fusion-io, All Rights Reserved.

WWW. FUSIONIO. COM. Fusion-io s Solid State Storage A New Standard for Enterprise-Class Reliability Fusion-io, All Rights Reserved. Fusion-io s Solid State Storage A New Standard for Enterprise-Class Reliability iodrive Fusion-io s Solid State Storage A New Standard for Enterprise-Class Reliability Fusion-io offers solid state storage

More information

Could We Make SSDs Self-Healing?

Could We Make SSDs Self-Healing? Could We Make SSDs Self-Healing? Tong Zhang Electrical, Computer and Systems Engineering Department Rensselaer Polytechnic Institute Google/Bing: tong rpi Santa Clara, CA 1 Introduction and Motivation

More information

NAND Flash Basics & Error Characteristics

NAND Flash Basics & Error Characteristics NAND Flash Basics & Error Characteristics Why Do We Need Smart Controllers? Thomas Parnell, Roman Pletka IBM Research - Zurich Santa Clara, CA 1 Agenda Part I. NAND Flash Basics Device Architecture (2D

More information

3ME3 Series. Customer Approver. InnoDisk Approver. Customer: Customer Part Number: InnoDisk Part Number: InnoDisk Model Name: Date:

3ME3 Series. Customer Approver. InnoDisk Approver. Customer: Customer Part Number: InnoDisk Part Number: InnoDisk Model Name: Date: 3ME3 Series Customer: Customer Part Number: InnoDisk Part Number: InnoDisk Model Name: Date: InnoDisk Approver Customer Approver Table of contents SATADOM-ML 3ME3 LIST OF FIGURES... 6 1. PRODUCT OVERVIEW...

More information

Preface. Fig. 1 Solid-State-Drive block diagram

Preface. Fig. 1 Solid-State-Drive block diagram Preface Solid-State-Drives (SSDs) gained a lot of popularity in the recent few years; compared to traditional HDDs, SSDs exhibit higher speed and reduced power, thus satisfying the tough needs of mobile

More information

NAND Flash-based Storage. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

NAND Flash-based Storage. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University NAND Flash-based Storage Jin-Soo Kim (jinsookim@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu Today s Topics NAND flash memory Flash Translation Layer (FTL) OS implications

More information

Characterizing Application Memory Error Vulnerability to Optimize Datacenter Cost via Heterogeneous-Reliability Memory

Characterizing Application Memory Error Vulnerability to Optimize Datacenter Cost via Heterogeneous-Reliability Memory Characterizing Application Memory Error Vulnerability to Optimize Datacenter Cost via Heterogeneous-Reliability Memory Yixin Luo, Sriram Govindan, Bikash Sharma, Mark Santaniello, Justin Meza, Aman Kansal,

More information

Purity: building fast, highly-available enterprise flash storage from commodity components

Purity: building fast, highly-available enterprise flash storage from commodity components Purity: building fast, highly-available enterprise flash storage from commodity components J. Colgrove, J. Davis, J. Hayes, E. Miller, C. Sandvig, R. Sears, A. Tamches, N. Vachharajani, and F. Wang 0 Gala

More information

Secure data storage -

Secure data storage - Secure data storage - NAND Flash technologies and controller mechanisms Ricky Gremmelmaier Head of Business Development Embedded Computing Rutronik at a Glance Founded in 1973 / 2016 Revenue: 872 Mio Headquartered

More information

SSD Innovations. Report No. FI-NFL-SSDI By: Luca De Ambroggi & Francesco Petruzziello

SSD Innovations. Report No. FI-NFL-SSDI By: Luca De Ambroggi & Francesco Petruzziello Report No. FI-NFL-SSDI-0410 By: Luca De Ambroggi & Francesco Petruzziello April 2010 2010 Forward Insights. All Rights Reserved. Reproduction and distribution of this publication in any form in whole or

More information

White Paper: Increase ROI by Measuring the SSD Lifespan in Your Workload

White Paper: Increase ROI by Measuring the SSD Lifespan in Your Workload White Paper: Using SMART Attributes to Estimate Drive Lifetime Increase ROI by Measuring the SSD Lifespan in Your Workload Using SMART Attributes to Estimate Drive Endurance The lifespan of storage has

More information

NAND Flash Architecture and Specification Trends

NAND Flash Architecture and Specification Trends NAND Flash Architecture and Specification Trends Michael Abraham (mabraham@micron.com) NAND Solutions Group Architect Micron Technology, Inc. August 2011 1 Topics NAND Flash trends SSD/Enterprise application

More information