Estimating Hardware Storage Costs

Similar documents
HPSS RAIT. A high performance, resilient, fault-tolerant tape data storage class. 1

Automated Storage Tiering on Infortrend s ESVA Storage Systems

SATA RAID For The Enterprise? Presented at the THIC Meeting at the Sony Auditorium, 3300 Zanker Rd, San Jose CA April 19-20,2005

Lenovo RAID Introduction Reference Information

The term "physical drive" refers to a single hard disk module. Figure 1. Physical Drive

Chapter 6 - External Memory

SYSTEM UPGRADE, INC Making Good Computers Better. System Upgrade Teaches RAID

COMP283-Lecture 3 Applied Database Management

Hard Disk Storage Deflation Is There a Floor?

NetApp SolidFire and Pure Storage Architectural Comparison A SOLIDFIRE COMPETITIVE COMPARISON

I/O CANNOT BE IGNORED

IBM N Series. Store the maximum amount of data for the lowest possible cost. Matthias Rettl Systems Engineer NetApp Austria GmbH IBM Corporation

Optimizing the Data Center with an End to End Solutions Approach

Mladen Stefanov F48235 R.A.I.D

SolidFire and Pure Storage Architectural Comparison

Storage Systems. Storage Systems

Cold Storage: The Road to Enterprise Ilya Kuznetsov YADRO

QuickSpecs. Models. Overview

5.11 Parallelism and Memory Hierarchy: Redundant Arrays of Inexpensive Disks 485.e1

Storage Devices for Database Systems

Certified Information Systems Auditor (CISA)

3.3 Understanding Disk Fault Tolerance Windows May 15th, 2007

RAID SEMINAR REPORT /09/2004 Asha.P.M NO: 612 S7 ECE

Storage Update and Storage Best Practices for Microsoft Server Applications. Dennis Martin President, Demartek January 2009 Copyright 2009 Demartek

Database Management Systems, 2nd edition, Raghu Ramakrishnan, Johannes Gehrke, McGraw-Hill

IST346. Data Storage

How to recover a failed Storage Spaces

Take control of storage performance

PeerStorage Arrays Unequalled Storage Solutions

vsan 6.6 Performance Improvements First Published On: Last Updated On:

William Stallings Computer Organization and Architecture 8 th Edition. Chapter 6 External Memory

Storage Area Network (SAN)

Dell PowerVault MD Family. Modular storage. The Dell PowerVault MD storage family

MD4 esata. 4-Bay Rack Mount Chassis. User Manual February 6, v1.0

Guide to SATA Hard Disks Installation and RAID Configuration

Today: Secondary Storage! Typical Disk Parameters!

Future of the Data Center

I/O CANNOT BE IGNORED

DEDUPLICATION BASICS

BC31: A Case Study in the Battle of Storage Management. Scott Kindred, CBCP esentio Technologies

PC-based data acquisition II

High Performance Computing Course Notes High Performance Storage

HPC File Systems and Storage. Irena Johnson University of Notre Dame Center for Research Computing

V. Mass Storage Systems

After the Attack. Business Continuity. Planning and Testing Steps. Disaster Recovery. Business Impact Analysis (BIA) Succession Planning

The World s Fastest Backup Systems

RAID: The Innovative Data Storage Manager

Storage Optimization with Oracle Database 11g

RAID Technology White Paper

Online data storage service strategy for the CERN computer Centre G. Cancio, D. Duellmann, M. Lamanna, A. Pace CERN, Geneva, Switzerland

How Hard Drives Work

DELL EMC DATA DOMAIN SISL SCALING ARCHITECTURE

Taurus Super-S LCM. Dual-Bay RAID Storage Enclosure for two 3.5 Serial ATA Hard Drives. User Manual July 27, v1.2

Taurus Super-S Combo

Storage systems. Computer Systems Architecture CMSC 411 Unit 6 Storage Systems. (Hard) Disks. Disk and Tape Technologies. Disks (cont.

Specifying Storage Servers for IP security applications

Models Smart Array 6402/128 Controller B21 Smart Array 6404/256 Controller B21

Deduplication Storage System

Analyzing the Economic Benefits of Datrium Cloud DVX

Hardware RAID, RAID 6, and Windows Storage Server

NT2 U3. 2-Bay RAID Storage Enclosure. User Manual May 18, 2010 v1.1

The Microsoft Large Mailbox Vision

PESIT Bangalore South Campus

COMPARING COST MODELS - DETAILS

Business Benefits of Policy Based Data De-Duplication Data Footprint Reduction with Quality of Service (QoS) for Data Protection

A track on a magnetic disk is a concentric rings where data is stored.

EMC Disk Tiering Technology Review

QuickSpecs. Models. Overview

Guide to SATA Hard Disks Installation and RAID Configuration

FOUR WAYS TO LOWER THE COST OF REPLICATION

Executive Summary. The Need for Shared Storage. The Shared Storage Dilemma for the SMB. The SMB Answer - DroboElite. Enhancing your VMware Environment

Lecture 29. Friday, March 23 CS 470 Operating Systems - Lecture 29 1

InfoSphere Warehouse with Power Systems and EMC CLARiiON Storage: Reference Architecture Summary

EMC DATA DOMAIN PRODUCT OvERvIEW

QuickSpecs. Models. HP Smart Array 642 Controller. Overview. Retired

Chapter 12: Mass-Storage

EMC VMAX 400K SPC-2 Proven Performance. Silverton Consulting, Inc. StorInt Briefing

Definition of RAID Levels

LATEST INTEL TECHNOLOGIES POWER NEW PERFORMANCE LEVELS ON VMWARE VSAN

Technical Brief. NVIDIA Storage Technology Confidently Store Your Digital Assets

Is Tape Really Cheaper Than Disk?

Hydra Super-S LCM. 4-Bay RAID Storage Enclosure (3.5 SATA HDD) User Manual July 30, v1.2

Infrastructure Innovation Opportunities Y Combinator 2013

Introduction to I/O and Disk Management

From SCSI to SATA A Viable, Affordable Alternative for Server and Network Attached Storage Environments

Virtual Security Server

Innodisk FlexiArray Storage Appliance. Accelerates Your I/O-Intensive Applications with High IOPS.

Introduction to I/O and Disk Management

Computer Architecture 计算机体系结构. Lecture 6. Data Storage and I/O 第六讲 数据存储和输入输出. Chao Li, PhD. 李超博士

INFINIDAT Storage Architecture. White Paper

Get More Out of Storage with Data Domain Deduplication Storage Systems

William Stallings Computer Organization and Architecture 6 th Edition. Chapter 6 External Memory

Architecture Flexibility with HP BladeSystem & VMware Infrastructure

QuickSpecs. Models. HP Smart Array P400i Controller. Overview

DELL POWERVAULT MD FAMILY MODULAR STORAGE THE DELL POWERVAULT MD STORAGE FAMILY

FLASHARRAY//M Business and IT Transformation in 3U

Retail Banking Critical Infrastructure. Shopping Malls

IBM Storwize V7000 TCO White Paper:

Reducing Costs in the Data Center Comparing Costs and Benefits of Leading Data Protection Technologies

VSTOR Vault Mass Storage at its Best Reliable Mass Storage Solutions Easy to Use, Modular, Scalable, and Affordable

Transcription:

Estimating Hardware Storage Costs Jenny Woolley William Black ICEAA 2014 Denver, CO * The views expressed in this presentation are those of the presenters and do not imply endorsement by the Office of the Director of National Intelligence or any other US Government agency

Overview RAID storage deflation research became a priority when data was provided for estimates in terms of storage volume, rather than a bill of materials (BOM) Needed way to translate volume into cost to support future estimates Past research across multiple agencies has provided conflicting results on cost and how it changes over time Agenda Storage Background RAID Details Storage Deflation Data Analysis Conclusions Next Steps 2

Types of Storage Types of storage typically seen in government programs to be estimated: Tape storage Utilizes tape drives and tape libraries Still used for long-term storage, but has become less prevalent as costs for disk storage have decreased Disk storage Current standard for short to mid-term storage and often used for long-term archival purposes Most commonly seen type of storage in recent cost estimates Solid-state storage High-performance plug-and-play storage device that contains no moving mechanical components Likely to see expanded use in the next several years, but pricing is currently prohibitive for many government programs RAID = Redundant Array of Independent Disks Combines two or more physical drives into a logical unit presented as a single hard drive to the operating system Different configurations (called levels ) of RAID utilize multiple techniques to provide varying degrees of reliability (ability to withstand drive failure) and availability (speed of Input/Output) 3

RAID Terms Mirroring: Duplicating data to more than one disk Mirroring Can speed read times because the system can read data from more than one disk A1 A2 A1 A2 Can slow write times if the system must confirm that data is correctly written to each disk Striping Striping: Writing data across a number of disks in parallel A1 A3 A2 A4 Speeds read/write performance Parity: Redundancy information is calculated for each piece of data stored Distributed Parity If a drive fails, the missing data can be reconstructed from the remaining data and parity data B1 A1 B2 A2 BP A3 B3 AP 4

RAID Levels Level Striping Mirroring Parity Notes Storage for ~1 TB RAID 0 X Provides no fault tolerance 1 TB RAID 1 RAID 2 X RAID 3 X X RAID 4 X X RAID 5 X X RAID 6 X X RAID 10 X X RAID 0+1 X X RAID 50 X X X Provides fault tolerance, but can cause a slight drag on performance Striping at bit (rather than block) level; not currently used Byte level striping with a dedicated parity disk; rare in practice Block level striping with a dedicated parity disk; rare in practice Block level striping with parity data distributed across all member disks; fault tolerance against one drive failure Block level striping with two parity blocks distributed across all member disks; fault tolerance against two drive failures Stripe set composed of two or more mirrored sets; can operate as long as drives on both mirror sets do not fail Mirror set composed of two or more stripe sets; low level of scalability Striping data across multiple RAID 5 sets; can sustain up to 4 drive failures 2 TB Not Used Not Used Not Used 1.5 TB 2 TB 2 TB 2 TB 1.5 TB 5

Calculating Storage Volume Typically estimate storage costs in one of two ways: Include cost of BOM Other individuals determine hardware needs and obtain vendor quotes Estimate storage based on the amount of data to be received, which requires consideration of Downlink limitations Amount of data compression that will occur Removal of data that is not usable Products, reports, and metadata that must be stored (in addition to the original data) Storage policies (i.e., requirements for duration of storage) Chosen RAID level Standard 15-20% additional open storage recommended to ensure the system does not slow down 1 TB/day received from source.5 TB/day after removing unusable data.25 TB/day after compression.75 TB/day with products and metadata 1.5 TB/day For RAID level 6 1.8 TB/day with 20% open storage 6

Storage Deflation 7

Storage Deflation As technology has advanced over time, the cost of storage has decreased Change largely driven by decreasing cost of disk drives While other types of HW have also evolved, they have not demonstrated the same consistent decrease in cost Ex: Capability of servers increase while price stays about the same Groups estimate the changing cost of storage differently Some estimate a consistent annual decrease in storage costs (X% each year), leading to a lower total cost of storage over time Others have indicated that while storage deflation does occur, a group may just purchase additional storage to compensate for the cost of deflation (results in no cost change over time, but expanded storage capability) 8

Deflation Estimating Challenges Estimating using projected volume and a $/TB that includes the cost of all peripheral HW/SW may be misleading While storage costs deflate, other associated COTS HW/SW costs may not Need a full breakout of COTS purchases to apply deflation to only storage Must ensure that the $/TB used is applicable Avoid double counting or underestimating other COTS HW/SW products Using a $/TB that includes multiple types of HW makes capturing unique recap costs difficult (e.g., recap the physical rack every 15 years, but replace COTS SW every 3 years) Deflation has occurred historically, but previous research does not indicate when/if deflation might slow or cease entirely (i.e., does a floor exist?)* Storage may have already deflated so much that it is only a small portion of the total $/TB Cost of materials and production may limit how low the cost of storage can become Storage deflation does not take into account the possibility of additional technology advances Deflation estimates may not cover a program s transition to a new, more expensive type of storage If a type of storage becomes obsolete, the cost of that storage may actually begin to increase Existing burdens (e.g., SEITPM, maintenance) do not take deflation into account Unlikely that it will become less expensive to manage and maintain more complex HW Burdens may be correlated more with volume of storage rather than cost * Earlier Research: Cost Deflation vs. Technology Inflation of RAID Storage Systems Converse, Watkins, SCEA, 2006 9

Data Collection General research Confirms trend of decreasing cost per TB, but unclear on future impacts How long will cost continue to decrease? How will external factors impact cost? State of economy Transition to Infrastructure as a Service approach Recovery from 2011 Asian tsunami Data collected from available BOMs Searched for commonality within a single BOM and between different BOMs Looked for procurement of the same piece of equipment in multiple years Avoided HW with vague descriptions because a piece of HW with the same general name can have multiple configurations Ensured apples-to-apples comparison Did not compare prices that include maintenance with ones that did not Evaluated individual pieces of HW, rather than aggregate $/TB (based on level of information available) Unable to determine whether costs were influenced by purchasing agreements or enterprise licenses 10

Combined Data View $/GB Over Time $7.00 $6.00 $5.00 BY12$ $4.00 $3.00 $2.00 $1.00 $- 0 0.5 1 1.5 2 2.5 3 # Years After 2008 All available data points shown Data available from 2008 to 2011 Color indicates different disk drive Revolutions Per Minute (RPMs: green = 7200 RPM, purple = 15K RPM) Graph indicates that disk drive RPM is a determining factor for cost and rate of deflation 11

7200 RPM Disk Drive Data RPM: Revolutions per minute The faster the disk spins, the faster the drive operates Data used Equipment: Disk Drive, 1 TB or 500 GB, 7200 RPM, SATA 16 data points available Strong R 2 for data set Line of best fit indicates that annual change in cost is not as consistent as other research has indicated BY12$ $2.50 $2.00 $1.50 $1.00 $0.50 $- $/GB for 7200 RPM Disk Drives Over Time y = 1.0744x -0.463 R² = 0.9309 0 0.5 1 1.5 2 2.5 3 # Years After 2008 Trendline Predictions - 7200 RPM FY BY12$/GB % Change 2009 $1.07 2010 $0.78 27% 2011 $0.65 17% 2012 $0.57 12% 2013 $0.51 10% 2014 $0.47 8% 2015 $0.44 7% 2016 $0.41 6% 2017 $0.39 5% 2018 $0.37 5% 2019 $0.35 4% 2020 $0.34 4% 2021 $0.33 4% 2022 $0.32 3% 2023 $0.31 3% 2024 $0.30 3% 2025 $0.29 3% 2026 $0.28 3% 2027 $0.27 2% 2028 $0.27 2% 2029 $0.26 2% 2030 $0.26 2% 12

15K RPM Disk Drive Data Higher RPM = higher performance disks Data used Equipment: Disk Drive, 300 or 144 GB, 15K RPM, 4 GB, FC 10 data points available R 2 not as strong as for 7200 RPM disk drives Limited number of available data points Cost per TB higher than 7200 RPM, but deflation occurs more slowly BY12$ $7.00 $6.00 $5.00 $4.00 $3.00 $2.00 $1.00 $- $/GB for 15K RPM Disk Drives Over Time y = 3.7615x -0.357 R² = 0.7076 0 0.5 1 1.5 2 2.5 3 # Years After 2008 Trendline Predictions - 15K RPM FY BY12$/GB % Change 2009 $3.76 2010 $2.94 22% 2011 $2.54 13% 2012 $2.29 10% 2013 $2.12 8% 2014 $1.98 6% 2015 $1.88 5% 2016 $1.79 5% 2017 $1.72 4% 2018 $1.65 4% 2019 $1.60 3% 2020 $1.55 3% 2021 $1.51 3% 2022 $1.47 3% 2023 $1.43 2% 2024 $1.40 2% 2025 $1.37 2% 2026 $1.34 2% 2027 $1.31 2% 2028 $1.29 2% 2029 $1.27 2% 2030 $1.25 2% 13

Disk Drive Summary Trendline Predictions - 7200 RPM FY BY12$/GB % Change 2009 $1.07 2010 $0.78 27% 2011 $0.65 17% 2012 $0.57 12% 2013 $0.51 10% 2014 $0.47 8% 2015 $0.44 7% 2016 $0.41 6% 2017 $0.39 5% 2018 $0.37 5% 2019 $0.35 4% 2020 $0.34 4% 2021 $0.33 4% 2022 $0.32 3% 2023 $0.31 3% 2024 $0.30 3% 2025 $0.29 3% 2026 $0.28 3% 2027 $0.27 2% 2028 $0.27 2% 2029 $0.26 2% 2030 $0.26 2% Cost per TB significantly different Deflation for 15K RPM consistently lower 7200 RPM disk drives deflate more quickly Trendline Predictions - 15K RPM FY BY12$/GB % Change 2009 $3.76 2010 $2.94 22% 2011 $2.54 13% 2012 $2.29 10% 2013 $2.12 8% 2014 $1.98 6% 2015 $1.88 5% 2016 $1.79 5% 2017 $1.72 4% 2018 $1.65 4% 2019 $1.60 3% 2020 $1.55 3% 2021 $1.51 3% 2022 $1.47 3% 2023 $1.43 2% 2024 $1.40 2% 2025 $1.37 2% 2026 $1.34 2% 2027 $1.31 2% 2028 $1.29 2% 2029 $1.27 2% 2030 $1.25 2% 14

Estimating Applications Method described on previous slides is most accurate when storage is provided as a service or shared between multiple programs If storage is a service, a program can be charged for a portion of a rack If storage is purchased by programs individually, a program might have to buy an entire rack when only half of a rack is needed To fully estimate storage costs, must capture additional HW needed to support disk drives Includes chassis, servers, networking equipment, etc. May need to include cost of COTS SW licenses if not captured separately Equipment needs appear to change with relative frequency Ex: Rarely buy the same server to support storage when it is time for a recap 15

Conclusions General storage conclusions Must consider the chosen storage strategy when estimating storage costs RAID level contributes to the volume of storage required RPMs and other attributes of storage impact total cost and rate of cost deflation Receipt of detailed BOMs improves costing insight, especially in conjunction with volume data Provides details on storage costs vs. other HW/SW costs Allows for accurate application of RAID deflation and estimation of recaps Disk drive deflation conclusions Cost of disk drives decreases over time, approaching a floor Suggests use of a single factor to estimate deflation may not be sufficient Could be used as a cross check for programs with more detailed storage costs available Results of research conflicts with some previous storage deflation studies Research shows deflation rate decreasing, rather than remaining constant Need more detailed data to reconcile varying results 16

Next Steps Evaluate impacts of pricing agreements typically offered to the government Investigate life expectancy of disks at different RPMs Also consider different usage (e.g., long or short-term storage) Research impacts to factors and CERs Assess changes over time due to variation in storage costs Consider impacts of potential shifts in storage needs (e.g., solid state drives) 17

Back-ups 18

Contact Information William Black: wblack@scitor.com Jenny Woolley: jwoolley@scitor.com 19

Recognition Acknowledgments Carrie Gamble Brian Wells 20