Technologies for Green Storage. Alan G. Yoder, Ph.D. NetApp

Similar documents
Green Data Center and Storage Technologies. Alan G. Yoder, Ph.D., NetApp

Green Storage Technologies, CAPEX and OPEX. Alan G. Yoder, Ph.D. NetApp

Green Storage Technologies and Your Bottom Line. Alan G. Yoder, Ph.D. NetApp

ADVANCED DEDUPLICATION CONCEPTS. Thomas Rivera, BlueArc Gene Nagle, Exar

ADVANCED DATA REDUCTION CONCEPTS

LEVERAGING FLASH MEMORY in ENTERPRISE STORAGE

Trends in Data Protection and Restoration Technologies. Mike Fishman, EMC 2 Corporation

Tom Sas HP. Author: SNIA - Data Protection & Capacity Optimization (DPCO) Committee

Notes & Lessons Learned from a Field Engineer. Robert M. Smith, Microsoft

Trends in Data Protection and Restoration Technologies. Jason Iehl, NetApp

Data Deduplication Methods for Achieving Data Efficiency

COM Verification. PRESENTATION TITLE GOES HERE Alan G. Yoder, Ph.D. SNIA Technical Council Huawei Technologies, LLC

GREEN STORAGE II: Metrics and Measurement. Patrick B. Chu, Ph.D., Seagate Technology Erik Riedel, Ph.D., Seagate Technology

The Benefits of Solid State in Enterprise Storage Systems. David Dale, NetApp

Storage Performance Management Overview. Brett Allison, IntelliMagic, Inc.

Restoration Technologies

Tiered File System without Tiers. Laura Shepard, Isilon

Intro to Capacity Optimization Methods (COMs)

Effective Storage Tiering for Databases

Everything You Wanted To Know About Storage (But Were Too Proud To Ask) The Basics

Application Recovery. Andreas Schwegmann / HP

ASPECTS OF DEDUPLICATION. Dominic Kay, Oracle Mark Maybee, Oracle

Scaling Data Center Application Infrastructure. Gary Orenstein, Gear6

Trends in Worldwide Media and Entertainment Storage

Create a Smarter and More Economic Cloud Storage Architecture. November 7, 2018

Benefits of Storage Capacity Optimization Methods (COMs) And. Performance Optimization Methods (POMs)

SNIA Emerald SNIA Emerald Power Efficiency Measurement Specification. SNIA Emerald Program

Virtualization Practices: Providing a Complete Virtual Solution in a Box

Virtualization Practices:

Storage Virtualization II Effective Use of Virtualization - focusing on block virtualization -

Storage in combined service/product data infrastructures. Craig Dunwoody CTO, GraphStream Incorporated

A Promise Kept: Understanding the Monetary and Technical Benefits of STaaS Implementation. Mark Kaufman, Iron Mountain

SRM: Can You Get What You Want? John Webster, Evaluator Group.

Storage Virtualization II. - focusing on block virtualization -

Facing an SSS Decision? SNIA Efforts to Evaluate SSS Performance. Ray Lucchesi Silverton Consulting, Inc.

Overview and Current Topics in Solid State Storage

in Transition to the Cloud David A. Chapa, CTE EVault, a Seagate Company

Overview and Current Topics in Solid State Storage

IBM N Series. Store the maximum amount of data for the lowest possible cost. Matthias Rettl Systems Engineer NetApp Austria GmbH IBM Corporation

High Availability Using Fault Tolerance in the SAN. Wendy Betts, IBM Mark Fleming, IBM

SNIA Tutorial 1 A CASE FOR FLASH STORAGE HOW TO CHOOSE FLASH STORAGE FOR YOUR APPLICATIONS

Multi-Cloud Storage: Addressing the Need for Portability and Interoperability

How NAS Systems Participate in Data Protection. Paul Massiglia, Symantec Corporation

The Role of WAN Optimization in Cloud Infrastructures. Josh Tseng, Riverbed

Smart Data Centres. Robert M Pe, Data Centre Consultant HP Services SEA

Business Benefits of Policy Based Data De-Duplication Data Footprint Reduction with Quality of Service (QoS) for Data Protection

Thomas Rivera / Hitachi Data Systems. Author: SNIA - Data Protection & Capacity Optimization (DPCO) Committee

Planning For Persistent Memory In The Data Center. Sarah Jelinek/Intel Corporation

Identifying and Eliminating Backup System Bottlenecks: Taking Your Existing Backup System to the Next Level

Automated Storage Tiering on Infortrend s ESVA Storage Systems

50 TB. Traditional Storage + Data Protection Architecture. StorSimple Cloud-integrated Storage. Traditional CapEx: $375K Support: $75K per Year

Felix Xavier CloudByte Inc.

Felix Xavier CloudByte Inc.

SRM: Can You Get What You Want? John Webster Principal IT Advisor, Illuminata

The Best Storage for Virtualized Environments

Architectural Principles for Networked Solid State Storage Access

7 Best Practices for Increasing Efficiency, Availability and Capacity. XXXX XXXXXXXX Liebert North America

How to create a synthetic workload test. Eden Kim, CEO Calypso Systems, Inc.

Deploying Public, Private, and Hybrid. Storage Cloud Environments

Mark Rogov, Dell EMC Chris Conniff, Dell EMC. Feb 14, 2018

Get More Out of Storage with Data Domain Deduplication Storage Systems

Power Smart IEEE Conference Lunch Presentation

Don t Run out of Power: Use Smart Grid and Cloud Technology

NetApp SolidFire and Pure Storage Architectural Comparison A SOLIDFIRE COMPETITIVE COMPARISON

Testing. Michael Ault, IBM Oracle FlashSystem Consulting Manager

STORAGE CONSOLIDATION WITH IP STORAGE. David Dale, NetApp

Flash Decisions: Which Solution is Right for You?

STORAGE CONSOLIDATION WITH IP STORAGE. David Dale, NetApp

Expert Insights: Expanding the Data Center with FCoE

Why would you want to use outdoor air for data center cooling?

Data Center Power and Cooling (and More)

Introducing Tegile. Company Overview. Product Overview. Solutions & Use Cases. Partnering with Tegile

PRESENTATION TITLE GOES HERE. Understanding Architectural Trade-offs in Object Storage Technologies

Trends in Data Protection CDP and VTL

Cold Storage: The Road to Enterprise Ilya Kuznetsov YADRO

Boost your data protection with NetApp + Veeam. Schahin Golshani Technical Partner Enablement Manager, MENA

Chapter 10: Mass-Storage Systems

Hyperconvergence and Medical Imaging

Chapter 10: Mass-Storage Systems. Operating System Concepts 9 th Edition

Capacity and Power Management: The Forgotten Factors in Disaster Recovery Planning Presented by: Clemens Pfeiffer CTO Power Assure, Inc.

LTO and Magnetic Tapes Lively Roadmap With a roadmap like this, how can tape be dead?

Partner Introduction

Cooling on Demand - scalable and smart cooling solutions. Marcus Edwards B.Sc.

vsan All Flash Features First Published On: Last Updated On:

The End of Redundancy. Alan Wood Sun Microsystems May 8, 2009

Careful What You Wish For: Poor Colocation Design Could Cost You. Thorough assessment of upfront costs to prevent future system failures

Warehouse-Scale Computers to Exploit Request-Level and Data-Level Parallelism

Nimble Storage vs HPE 3PAR: A Comparison Snapshot

Lightning Fast Rock Solid

Combining Cold Aisle Containment with Intelligent Control to Optimize Data Center Cooling Efficiency

The Data-Protection Playbook for All-flash Storage KEY CONSIDERATIONS FOR FLASH-OPTIMIZED DATA PROTECTION

The File Systems Evolution. Christian Bandulet, Sun Microsystems

Considerations to Accurately Measure Solid State Storage Systems

SAP Applications on IBM XIV System Storage

Green IT Project: The Business Case

PRESENTATION TITLE GOES HERE

A More Sustainable Approach to Enterprise Data Storage: Reducing Power Consumption with InfiniBox. Executive Brief

Configuring Short RPO with Actifio StreamSnap and Dedup-Async Replication

Management Update: Storage Management TCO Considerations

Server and Storage Consolidation with iscsi Arrays. David Dale, NetApp Suzanne Morgan, Microsoft

Transcription:

Technologies for Green Storage Alan G. Yoder, Ph.D. NetApp

SNIA Legal Notice The material contained in this tutorial is copyrighted by the SNIA. Member companies and individuals may use this material in presentations and literature under the following conditions: Any slide or slides used must be reproduced without modification The SNIA must be acknowledged as source of any material used in the body of any document containing material from these presentations. This presentation is a project of the SNIA Education Committee. Neither the Author nor the Presenter is an attorney and nothing in this presentation is intended to be nor should be construed as legal advice or opinion. If you need legal advice or legal opinion please contact an attorney. The information presented herein represents the Author's personal opinion and current understanding of the issues involved. The Author, the Presenter, and the SNIA do not assume any responsibility or liability for damages arising out of any reliance on or use of this information. NO WARRANTIES, EXPRESS OR IMPLIED. USE AT YOUR OWN RISK. 2

Abstract Technologies for Green Storage Hardware efficiencies are essential to reducing the amount of power used by storage. Equally real savings are obtained by reducing the number of copies of your data that must be made, kept and managed. This talk presents a number of technologies, ranging from thin provisioning and virtualization to flywheel UPSs, that each address part of the problem, and illustrates the impact that each technology can have on your data center footprint. 3

Outline Objectives Need for redundancy (extra copies of data) Software technologies for reducing excessive redundancy Other technologies for energy saving center technologies Typical savings

Objective Get more work done for less money Translation: reduce data center footprint in space less storage equipment to buy in energy more energy-efficient equipment less equipment to cool better cooling methodologies better power management in administrative costs less storage equipment to manage

Need for redundancy RAID 10 protect against multiple disk failures DR Mirror protect against whole-site disasters Backups protect against failures and unintentional deletions/changes Compliance archive protect against heavy fines /dev copies protect live data from mutilation by unbaked code Overprovisioning protect against volume out of space application crashes quicker and more efficient backups

Result of redundancy - Power consumption is roughly linear in the number of naïve (full) copies 10 TB ~10x + Backup Backup Backup 5 TB RAID10 RAID10 RAID10 RAID10 1 TB RAID10 RAID10 RAID10 RAID10 RAID10 RAID10 RAID10 App RAID 10 Overhead Overprovision DR Mirror Disk Backup Compliance /Dev copies

Effect of green technologies - Green storage technologies use less raw capacity to store and use the same data set - Power consumption falls accordingly 10 TB 5 TB 1 TB Backup RAID10 RAID10 Backup Backup Backup Backup Backup RAID 5/6 Thin Provisioning Multi- Use Backups Virtual Clones Dedupe & Compression

Green Storage Technologies Enabling technologies Storage virtualization Storage capacity planning Green software Compression Virtual (writeable) clones Thin provisioning Non-mirrored RAID Deduplication and SIS Resizeable volumes

Green storage technologies (cont.) Other storage technologies and power saving techniques Capacity vs. high performance drives ILM / HSM MAID SSDs Power supply and fan efficiencies Facilities-side technologies Hot aisle/cold aisle Water & natural cooling Flywheel UPSs

Enabling technologies Storage virtualization Storage capacity planning

Storage Virtualization Mapping from physical location to virtual location May exist at multiple layers In and of itself, not green wrt storage No reduction in dataset size Very green wrt servers But foundational for some green technologies Thin provisioning Resizeable volumes and virtual clones (depending on implementation) Also contributes in other areas Flexibility, manageability, etc.

Storage capacity planning Obtain and analyse baseline data 70 60 50 40 Identify inefficiencies 30 20 10 address each inefficiency found 0 Many toolkits available from storage and storage management vendors Toolkits usually slanted toward more purchase of said vendors products Vendors usually eager to help find issues with other vendors solutions Identify which green software technologies will Ask vendors for proposals 1 2 3 4 5 6 Series1

Green software technologies Compression Virtual (writeable) clones Thin provisioning Non-mirrored RAID Deduplication and SIS Resizeable volumes

Compression Compression Old and venerable Origins in signaling, number and coding theory Many formats already compressed Backup Lossless compression (LZW) necessary for unknown data types Configuration Backup matters RAID10 Backup RAID DP Therefore usually done at the stream or object level RAID10 Motivated by limited bandwidth and lossiness of satellite communications Scattered throughout the data stack JPEG, MPEG, MP3, etc. Backup Backup Backup Compress before encrypting, decrypt before decompressing Difficult in block-based environments

NOT just wholesale copies of the data We call those PIT copies sharing Form of deduplication in snapshot shared with live data until live data is written Backup Two fundamental techniques Backup Copy Out on Backup Write Backup Backup Backup RAID10 Write to new RAID live DP location RAID10

Virtual clones Writeable snapshots, essentially shared with master copy until written Master copy is typically a recent snapshot of some live data Original live data left intact Typical use What-if Backup scenarios ing Backup of application changes against up-to-date datasets RAID10 Backup Backup Backup Backup ing of new applications with near-online data Booting/running of VM images from a golden master RAID10

Thin provisioning Similar in concept to filesystem quotas Quotas Quota limit Increase quota Hit quota limit: writes fail Backup Backup RAID10 Storage admin just Backup manages Backup Backup the underlying Backup storage RAID10 Need more physical storage Thin provisioned volume Nominal volume size Fill up volume: writes fail Increase volume size Need more physical storage

RAID 5 Allows any (one) drive in a RAID set to fail without data loss Requires only one extra drive in a RAID set Much less raw capacity required than for mirroring Note: RAID 3 and RAID 4 have the same overhead Backup as RAID 5 Backup Note: these numbers RAID10 Backup Backup Backup would be Backup 14.3% and 100% respectively RAID10 Typical: 8-disk RAID 5 set: 12.5% overhead vs. 50% for mirroring if figured as overhead on top of, as opposed to as a percentage of

RAID 6 More dependable than mirroring Mirroring: can survive two failures in a disk group if they re not in the same mirrored pair RAID 6: can survive failure of any two drives in the group However, typically somewhat larger RAID sets Requires two extra drives per RAID set Necessary as drive sizes increase Probability of a disk failure during RAID 5 parity reconstruct is getting Backup too high Backup More RAID10 green than mirroring Backup Backup Backup 50% overhead in RAID 1 mirroring RAID10 14.3% overhead in a 14-disk RAID 6 raidset Backup 16.7% and 100% if figured on top of

Deduplication and SIS Find duplicates at some level, substitute pointers to a single shared copy Block or sub-file based (dedup) Content or name based (SIS, file folding ) Streaming (in-band) and offline techniques Backup Savings increase with number of copies found Backup Up to 95% savings, depending on dataset RAID10 Backup Backup Backup Backup RAID10

Resizeable volumes Does away with need for overprovisioning Run at 70% capacity instead of 20-30% Manage underlying storage pool instead of the volumes May require application-level support LUNs usually, filesystems Backup less so Backup RAID10 Backup Backup Backup RAID10 Backup

Other technologies Capacity vs. high performance drives ILM / HSM MAID SSDs Power supply and fan efficiencies

Capacity vs. High Performance Capacity focused on GB/watt at rest 1 TB SATA: 15W 4 x 250 GB FC: 64W also tend to have better $/GB NOTE: power use is quadratic with respect to rotational speed Use the slowest drives that will fit your needs Performance focused on seek time 1 TB SATA: 12 15 ms 300 GB FC: 3 4 ms also designed for higher RAS environments

ILM / HSM Exploit cost differences between storage tiers Idea: automatically move data to an appropriate storage platform at each period in its lifetime Tier change must have substantial value to make the overhead worth it Cost of ILM/HSM system Cost of administration Cost of data movement Practice Storage declines in value as it ages Manual movement of data sets New wrinkle Footprint considerations preclude keeping older gear Check out SNIA Tutorial: The Secret Sauce of ILM Assessment

MAID (Massive Array of Idle Disks) Idea: spin down disks when not in use Pros Cons Disks use no power when spun down > 50% power savings at idle Most data near-online (access times of several seconds) Background disk housekeeping difficult Often the same data center requirements (power supplies, CRAC units, PDUs etc.), but these are used at lower efficiencies Impending competition from SSDs Competition from...

Tape Our old buddy Pros Cons Tapes use no power when inactive > 90% power savings at idle is at best near-online (access times of several seconds) Not a random access format Lack of true resilience to format failure RAIT? Check out SNIA Tutorial: Introduction to Protection: Backup to Tape, Disk and Beyond

SSDs (Solid State Disks) Usually refers to FLASH-based disks Pros Great READ performance At rest power consumption = 0 No access time penalty when idle (cf. MAID) No need to keep some disks spinning (cf. MAID) Cons WRITE performance < mechanical disks Cost >> mechanical disks except at very high perf points Wear leveling requires a high space overhead Note: these dynamics changing rapidly with time SSSI SNIA Solid State Storage Initiative

Power supply and fan efficiencies Efficiency of power supply an up front waste Formerly 60-70% Nowadays 80-95% Climate Savers 80+ group Note: Efficient PSs are more expensive Variable speed fans Common nowadays Software (OS) control Energy consumption is quadratic in the rotational speed

Facilities-side technologies Monitoring Hot aisle cold aisle technologies Water and natural cooling Flywheel UPSs

Monitoring Critical to increased efficiencies Lights out operation Tightening up of temperature tolerances Better staff utilization Anomaly detection

Hot aisle / cold aisle technologies Segregate airflows into hot and cold aisles (backs and fronts of servers) More precise control Allows higher input temperatures Several emerging approaches Hot air plenum Complete containment

Water and natural cooling Water cooling of increasing interest Issues include corrosion, warm water disposal Natural cooling Economizers Exhaust hot aisle air when temp > outdoor air Use outdoor air when temp < cold aisle air May need humidity control

Flywheel UPSs Operational flow: normal condition Grid power keeps flywheel turning Flywheel Motor/ Alternator Clutch Diesel

Flywheel UPSs Operational flow: outage Flywheel powers alternator while diesel starts up Flywheel Motor/ Alternator Clutch Diesel

Flywheel UPSs Operational flow: diesel power special clutch transfers mechanical linkage from flywheel to diesel Flywheel Motor/ Alternator Clutch Diesel

Flywheel UPSs Pros No lead-acid batteries Environmental impact Ongoing maintenance pain More reliable No AC to DC to AC conversion More efficient Motor/ Clutch ~98% vs. ~94% best Alternator in class ~400KW difference in 10MW data center Cons More expensive Emerging competition CAES (compressed air flywheels )

Savings calculations Space savings Equipment power savings Facilities power savings

Savings calculations (storage) Calculations herein are for space savings Relationship of space to $$ is loose But every TB of disks you don t buy saves you CapEx (Capital Expenditure) for the equipment CapEx for the footprint CapEx for power conditioning and cooling OpEx (Operational Expenditure) for equipment power OpEx for power conditioning and cooling OpEx for storage management OpEx for service contract fees

Typical savings Compression 20 40% Remember, no savings from already compressed file formats 90 95% per snapshot, compared to raw PIT copies Only data written since snapshot needs to be copied Writeable clones 80 95%, compared to raw PIT copies Only data that is written needs to be duplicated

Typical savings Thin provisioning 40-60% Average 30% utilization over 80% utilization RAID 6 35% For 14-disk RAID 6 set, compared to RAID 1/10 Deduplication 40 95%, depending on dataset and time interval ~ 40 50% average over time Resizeable volumes 20 50%

Caveat Savings estimates are real, but best taken as anecdotal YMMV your mileage may vary Make your vendors prove their claims

Are these savings multiplicative? Sometimes yes RAID 6 + writeable clone Assume 1000GB writeable clone = 2000GB needed for a raw writeable copy on RAID 1 storage 90% writeable clone savings takes us to 200GB 35% RAID 6 savings takes us from there to 130GB 130GB needed for a writeable clone on RAID 6 Sometimes no Thin provisioning + resizeable volumes Similar effects, but you only get the savings once Sometimes maybe Snapshot + deduplication Can t dedup readonly snapshots are a form of deduplication, so there s less to dedup OTOH, already deduped data can be snapshotted efficiently

Savings matrix Savings multiply in combinations with checkboxes Compression (C) (SS) Virtual Clones (VC) Thin Provisioning (TP) RAID (R) Deduplication (DD) Resizeable Vols (RV) C SS VC TP R DD RV

Savings matrix (cont.) E.g. Thin provisioning with snapshots, RAID 6, and Dedup big win! C SS VC TP R DD RV Compression (C) (SS) Virtual Clones (VC) Thin Provisioning (TP) RAID (R) Deduplication (DD) Resizeable Vols (RV)

Equipment power savings Server virtualization up to 80% savings much depends on load Power supply efficiencies 10 20% difficult to do anything about this in a vacuum probably okay to just ride industry trends Variable speed fans up to 80% savings power consumption quadratic in rotational speed increased data center cold aisle temperatures are key

Facilities savings State of the art data centers PUE* drops from 2.25 to 1.25 = 45% savings 10MW 5.5MW $6.0M $3.3M annually Rebates in the $M from utilities on top of savings * Power Utilization Efficiency see www.thegreengrid.org/gg_content

Q&A / Feedback Please send any questions or comments on this presentation to trackgreenstorage@snia.org Many thanks to the following individuals for their contributions to this tutorial. - SNIA Education Committee SNIA GSI Marketing Committee SNIA reviewers Larry Freeman NetApp 48