Analysis of SSD Health & Prediction of SSD Life

Similar documents
Solid State Drive (SSD) Cache:

It Takes Guts to be Great

McKesson mixes SSDs with HDDs for Optimal Performance and ROI. Bob Fine, Dir., Product Marketing

Smart Hybrid Storage Using Intelligent Data Access Classification

White Paper: Understanding the Relationship Between SSD Endurance and Over-Provisioning. Solid State Drive

Method to Establish a High Availability and High Performance Storage Array in a Green Environment

NEC Express5800 Servers System Configuration Guide

White Paper: Increase ROI by Measuring the SSD Lifespan in Your Workload

Vendor: Hitachi. Exam Code: HH Exam Name: Hitachi Data Systems Storage Fondations. Version: Demo

Understanding SSD overprovisioning

Lenovo Enterprise Capacity Solid State Drives Product Guide

Using MLC Flash to Reduce System Cost in Industrial Applications

Managing Storage Adapters

Solid-State Solutions as a Catalyst for Evolving Data Center Requirements

Virtual Storage Tier and Beyond

EPTDM Features SATA III 6Gb/s msata SSD

NEXT GENERATION EMC: LEAD YOUR STORAGE TRANSFORMATION

ReDefine Enterprise Storage

MS800 Series. msata Solid State Drive Datasheet. Product Feature Capacity: 32GB,64GB,128GB,256GB,512GB Flash Type: MLC NAND FLASH

New HPE 3PAR StoreServ 8000 and series Optimized for Flash

Breakthrough Cloud Performance NeoSapphire All-Flash Arrays. AccelStor, Inc.

It s Not Your. Drive, or Is It? Transformational Storage Technology. John Scaramuzzo Senior VP/GM

SATA RAID For The Enterprise? Presented at the THIC Meeting at the Sony Auditorium, 3300 Zanker Rd, San Jose CA April 19-20,2005

SimpliVity Best of Both Worlds

Flash Memory. SATA SSD vs. PCIe NVMe SSD. White Paper F-WP003

Chapter 2 Using WebBIOS This chapter explains the WebBIOS setup procedures. WebBIOS is a basic utility to set up and manage the array controller.

Field Update Expanded Deduplication Sizing Guidelines. Oct 2015

SED2FIII-MP Series. Embedded Storage Solutions. SATA III 2.5 Solid State Drive. SED2FIII Series Datasheet. Enterprise Grade CSXXXXXXB-XXXXB3XX

SSD WRITE AMPLIFICATION

Could We Make SSDs Self-Healing?

Webscale, All Flash, Distributed File Systems. Avraham Meir Elastifile

Technical Notes. Considerations for Choosing SLC versus MLC Flash P/N REV A01. January 27, 2012

HP VMA-series Memory Arrays

Purity: building fast, highly-available enterprise flash storage from commodity components

Software Defined Storage for the Evolving Data Center

Intel Solid State Drive Toolbox

Case study: Building bi-directional DR. Joep Piscaer, VMware vexpert, VCDX #101

Deep Learning Performance and Cost Evaluation

Hitachi Virtual Storage Platform Family

Overprovisioning and the SanDisk X400 SSD

Datasheet. Embedded Storage Solutions. Industrial. SATA III 2.5 Solid State Drive. SED2FV Series. V Aug

Get More Out of Storage with Data Domain Deduplication Storage Systems

Adrian Proctor Vice President, Marketing Viking Technology

Guardian Technology Platform Webcast

Optimizing Performance and Reliability of Mobile Storage Memory

Datasheet. Embedded Storage Solutions. Industrial. SATA III 2.5 Solid State Drive. SED2FV Series. V Aug

Kingston s Data Reduction Technology for longer SSD life and greater performance

SATA III 6Gb/S 2.5 SSD Industrial Temp AQS-I25S I Series. Advantech. Industrial Temperature. Datasheet. Rev

SSD Admission Control for Content Delivery Networks

Solid State Performance Comparisons: SSD Cache Performance

3MG2-P Series. Customer Approver. Innodisk Approver. Customer: Customer Part Number: Innodisk Part Number: Innodisk Model Name: Date:

Clouds, Convergence & Consolidation

When Hadoop-like Distributed Storage Meets NAND Flash: Challenge and Opportunity

Best of VMworld Juni Wolfgang Richter Regional Sales Manager, Central Europe

Datasheet. Embedded Storage Solutions. Industrial. SATA III 2.5 Solid State Drive. SED2FIV Series CSXXXXXRC-XX09XXXX. 04 Jul 2016 V1.

User Guide. Storage Executive. Introduction. Storage Executive User Guide. Introduction

Backups to the Cloud. Agenda. Introductions. Our Journey to Move Beyond Tape. pollev.com/ryanbass401. Ryan Bass

ZD-XL SQL Accelerator 1.6

PowerVault MD3 Storage Array Enterprise % Availability

The World s Fastest Backup Systems

HPE Storage news. Mauro Colombo Hybrid IT sales & presales manager 23 rd May 2018

System Impacts of HDD and Flash Reliability Steven Hetzler IBM Fellow Manager Storage Architecture Research

Nimble Storage vs HPE 3PAR: A Comparison Snapshot

Dell PowerEdge Express Flash NVMe PCIe SSD Adapter P4500/P4600. User s Guide

Optimizing Flash Storage With Linearization Software. Doug Dumitru CTO EasyCO LLC. EasyCo. Santa Clara, CA USA August

GLS87BP032G3/064G3/128G3/256G3/512G3/001T3 Industrial Temp SATA M.2 ArmourDrive PX Series

An Introduction. (for Military Storage Application) Redefining Flash Storage

VMware Virtual SAN Technology

HGST: Market Creator to Market Leader

Micron Quad-Level Cell Technology Brings Affordable Solid State Storage to More Applications

Next Generation Data Center : Future Trends and Technologies

Optimizing Software-Defined Storage for Flash Memory

Datasheet. Embedded Storage Solutions. Industrial. SATA III 2.5 Solid State Drive. SED2FIV Series CSXXXXXRC-XX10XXXX. 22 Aug 2017 V1.

Advantech. AQS-I42N I Series. Semi-Industrial Temperature. Datasheet. SATA III 6Gb/s M.2 SSD Semi-Industrial Temp AQS-I42N I series

2.5-Inch SATA SSD -7.0mm PSSDS27xxx3

Wear Leveling Implementation on SQFlash

Open CloudServer OCS Solid State Drive Version 2.1

HPE Smart Array SR Gen10 User Guide

DMP SATA DOM SDM-4G-V SDM-8G-V SDM-16G-V

Nearline Storage. Enterprise Product List Internal Drives May Enterprise Capacity 3.5 HDD. Page 1

Controller Concepts for 1y/1z nm and 3D NAND Flash

朱义普. Resolving High Performance Computing and Big Data Application Bottlenecks with Application-Defined Flash Acceleration. Director, North Asia, HPC

SMORE: A Cold Data Object Store for SMR Drives

Innodisk ismart Windows 4.0.X

IBM FlashSystem. IBM FLiP Tool Wie viel schneller kann Ihr IBM i Power Server mit IBM FlashSystem 900 / V9000 Storage sein?

Drive Scope Registration

QNAP OpenStack Ready NAS For a Robust and Reliable Cloud Platform

ECONOMICAL, STORAGE PURPOSE-BUILT FOR THE EMERGING DATA CENTERS. By George Crump

Designing SSDs for large scale cloud workloads FLASH MEMORY SUMMIT, AUG 2014

The SHARED hosting plan is designed to meet the advanced hosting needs of businesses who are not yet ready to move on to a server solution.

2TB DATA SHEET Preliminary

PowerVault MD3 SSD Cache Overview

Convergence and Flash: Reinventing Storage (Again)

D E N A L I S T O R A G E I N T E R F A C E. Laura Caulfield Senior Software Engineer. Arie van der Hoeven Principal Program Manager

Balakrishnan Nair. Senior Technology Consultant Back Up & Recovery Systems South Gulf. Copyright 2011 EMC Corporation. All rights reserved.

Reconstruyendo una Nube Privada con la Innovadora Hiper-Convergencia Infraestructura Huawei FusionCube Hiper-Convergente

G i v e Y o u r H a r d D r i v e w h a t i t s B e e n M i s s i n g P e r f o r m a n c e The new Synapse

XCubeFAS Series White Paper

Take control of storage performance

How does a Client SSD Controller Fit the Bill in Hyperscale Applications?

Transcription:

Analysis of SSD Health & Prediction of SSD Life Dr. M. K. Jibbe Technical Director NetApp, Inc Bernard Chan Senior Engineer NetApp, Inc

Agenda Problem Statement SMART Endurance Reporting Wear Life Prediction Recovery from Common SSD Errors Drive Data Migration Summary 2

Problem Statement As a storage admin, how can I minimize the impact of SSDs failing? pro-active in handling data in the SSD before it fails? warning when SSD about to wear out copy data off to another SSD fail non-performing SSD maintain data access efficiency & availability 3

What is S.M.A.R.T.? Self-Monitoring Analysis and Reporting Technology Disk Drive s feature to provide various monitoring indicators of disk reliability Intent is to enable the anticipation of hardware failures so data can be copied off to another device prior to drive failure 4

What is S.M.A.R.T.? SSD Program Fail Count (171) SSD Erase Fail Count (172) SSD Wear Leveling Count (173) Erase Fail Count (176) Wear Range Delta (177) SSD Life Left (231) Available Reserved Space (232) Media Wearout Indicator (233) 5

Solid State Media log page (11h) 6

Solid State Media log page (11h) 7

Endurance Reporting NetApp Unique The AVERAGE BLOCKS ERASED field indicates the number of blocks erased per die averaged over all of the dies of the drive. The SPARE BLOCKS REMAINING PERCENTAGE field indicates the percentage of total spare blocks that remain for the device and is tracked over the life of the device. The ENDURANCE REMAINING PERCENTAGE field indicates an estimate of the percentage of 8 device life that has been used.

Effect of Power-on Years to % Endurance Used 9

High Worn SSD Commodity Trader Powered on for ~240 days E2760 with 37TB storage Vendor A 400G for SSD Cache 10 DWPD emlc NAND 1 8% (1.7PB written / 114TB read) 1 9% (1.7PB written / 114TB read) Prediction: 5 years = 69% (9% in 240 days) 10

High Worn SSD Commodity Trader Data ingest per day (Vendor A 400G 10DWPD) 100% = (400G x 10 DWPD x 365 days x 5 years) 1% = ~70TB 9% = 630TB in 240 days = 2.6TB / day Vendor B 3.8TB (1 DWPD) vs 3.2TB (3 DWPD) 2.6 TB/day / 3.8TB: 68% worn at end of 5 years 2.6 TB/day / 9.6TB: 27% worn at end of 5 years Vendor 15.3TB (1 DWPD) 2.6 TB/day / 15.3TB: 17% worn at end of 5 years 11

How Does NetApp Predict SSD Health? Recovery from Common SSD Errors Power cycle the SSD Check the thresholds; Migrate data if necessary and there is a spare drive Mark Impending drive failure and continue to use the drive if there is no spare 12

Drive Data Migration Feature Purpose: have a mechanism to copy data from an impending failure drive in lieu of fail/reconstruct Faster to copy than reconstruct Less impact on foreground I/Os 13

Drive Data Migration Feature Triggers on SSD s Predicted Failure Analysis (PFA) Synthesized PFA (SPFA) error exceeding threshold in a time window Media Error (5 per day) Recovered Error(150 per day) Hardware Error(2 per day) Fast I/O Timeout (8 per day) Stagnant I/O ( 30 count) 14

Drive Data Migration cont d Drive Data Migration thresholds was based upon analysis of failures in populations of 250k Drives from three different vendors HDD Nearline drives, HDD Enterprise drive, SSD drive from two different vendors with capacity ranging from 200GB 8TB The Normal distribution probabilities of the failed drive data shows that the Recovered Error SPFA threshold is increased from 60 to 150. 60 50 40 30 20 10 0 500 400 300 200 100 SSD Recorded Recovered Error Counts For Failed and Non-Failed Drives 0 1 2 3 4 5 6 7 8 9 10 SSD Projected Recovered Error Count Average 1 2 3 4 5 6 7 8 9 10 11 12 13 14 Failed Drive Recovered Error Average Projected Recovered Error Count Average 15

Summary and Conclusions Two methods presented to protect array system from SSD failures SSD parameters are used to predict drive life time SSD error types were used to predict the usability of an SSD drive in a SAN Possible SSD PFA and SPFA triggers to migrate the data off a drive which is expected to fail Data backup can be on site or to a cloud 16

17