PCAP: Performance-Aware Power Capping for the Disk Drive in the Cloud

Mohammed G. Khatib & Zvonimir Bandic
WDC Research
2/24/16

HDD's power impact on its cost
- 3-yr server & 10-yr infrastructure amortization
[Figure: pie chart with segments HDDs, Power Distribution & Cooling, Power, and Other Infrastructure; shares shown: 13%, 10%, 3%, 74%.]
[Figure: total cost ($) per disk model, split into acquisition cost and power-related cost overhead (annotation: peak-power & PUE dependent). Totals: (1.07, $9) $35.27; (1.80, $9) $44.13; (1.07, $12) $42.69; (1.80, $12) $51.55.]
Based on Hamilton's DC cost model [http://perspectives.mvdirona.com/2010/09/overall-data-centercosts/]

Two types of datacenters
[Figure: the average activity distribution of a sample of 2 Google clusters, each containing over 20,000 servers, over a period of 3 months. Annotation: "Focus of this talk".]
Source: Luiz A. Barroso et al. [The Datacenter as a Computer, 2013]

Power over-subscription
- For maximum cost effectiveness, use provisioned power fully
  - If a facility operates at 50% of its peak power capacity, the effective provisioning cost per Watt used is doubled! (Luiz A. Barroso [The Datacenter as a Computer, 2013])
- How many servers fit within a given budget? Hard question!
  - Specs are very conservative → Dell & HP offer online power calculators
  - Actual power consumption varies significantly with load
  - Hard to predict the peak power consumption of a group of servers: while any particular server might temporarily run at 100% utilization, the maximum utilization of a group of servers probably isn't 100%.
- Problem: using any power numbers but the specs runs the risk of facility power over-subscription
  → Power capping becomes necessary as a safety mechanism
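
The "cost per Watt used is doubled" remark follows from simple division. A minimal sketch, assuming provisioning cost scales linearly with provisioned Watts; the function name and the $10/W figure are illustrative and not taken from the talk:

```python
def effective_cost_per_used_watt(cost_per_provisioned_watt: float,
                                 utilization: float) -> float:
    """Provisioning cost per Watt actually drawn.

    utilization is the fraction of provisioned peak power in use (0 < u <= 1).
    """
    if not 0.0 < utilization <= 1.0:
        raise ValueError("utilization must be in (0, 1]")
    return cost_per_provisioned_watt / utilization


# Illustrative value only: $10 per provisioned Watt.
full = effective_cost_per_used_watt(10.0, 1.0)   # facility fully used
half = effective_cost_per_used_watt(10.0, 0.5)   # facility at 50% of peak
print(full, half)
```

Running the facility closer to its provisioned peak lowers this figure, which is why over-subscription is attractive and why a cap is then needed as a safety net.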

Power capping
- What is power capping? Preventing the datacenter's total power usage from violating (crossing) a predefined limit, the power cap (i.e., preventing power overshoot)
- Techniques:
  - Software techniques such as workload re-scheduling
  - Duty cycle adaptation
- This work:
  - Focuses on the 3.5-inch enterprise HDD
  - Explores techniques inherently related to the underlying hardware
  - Investigates using the queue size to cap the HDD's power consumption

Key contributions
- Investigate throttling the HDD's throughput to cap power
  - No strict positive correlation
  - HDD is underutilized
- Investigate resizing the HDD's queues and its impact on power
  - Higher HDD utilization
  - Performance differentiation: throughput & tail-latency
  - Limitations under low concurrency and workload
- PCAP system based on queue resizing
  - Make it stable, agile and performance-aware
  - Compare it to throttling
  - Study it for different workloads & settings

PCAP in the works

How do we do it?

Setup
- A JBOD with 16x 4TB HDDs
  - Exercising a single HDD only
- Workload generators: FIO; YCSB & MongoDB
- Design space exploration:
  - Reads/writes/mixed
  - 4kB to 2MB
  - Threads: 10-256
- Varying queue depth (QD)
  - HDD: 1-32
  - IO stack: 4-128
- Deadline scheduler (default) is used
[Figure: JBOD power breakdown, Power [W], for PS1, PS2, Fans, Disks, and Rest; bar labels: 20, 110, 20, 20, 20.]

HDD's dynamic power
[Figure: static and dynamic power [W] for HDD-A, HDD-B, HDD-C, and HDD-D.]

Existing techniques
1. Power off disks
2. Throttling throughput (adapting duty cycle)
→ negatively impact throughput & latency (later)
→ no strict positive correlation between power & throughput

Number of outstanding requests (queue size) matters
[Figure: small queue vs. large queue, shown relative to the platter's spinning direction.]

Proposal: Resizing queues
- We propose to resize the HDD queues
- Queues:
  - OS I/O scheduler queue (IOQ)
  - HDD internal queue (NCQ)
- Resize dynamically as the power cap changes
  - Allows power to be controlled
  - Bearing in mind that queue size influences throughput and tail-latency
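
The talk does not say which operating-system knobs were used to resize the two queues. As a hedged sketch, assuming a Linux host, the block layer's `queue/nr_requests` and the device's `device/queue_depth` sysfs attributes are the usual controls for the I/O scheduler queue and the NCQ depth. The function name, the device name, and the `sysfs_root` parameter are hypothetical; the bounds are the ranges listed on the Setup slide (IO stack: 4-128, HDD: 1-32).

```python
from pathlib import Path


def set_queue_sizes(device: str, ioq: int, ncq: int,
                    sysfs_root: str = "/sys") -> None:
    """Write new IOQ and NCQ sizes for one block device (needs root on a real host)."""
    if not 4 <= ioq <= 128:
        raise ValueError("IOQ size outside the 4-128 range explored in the talk")
    if not 1 <= ncq <= 32:
        raise ValueError("NCQ size outside the 1-32 range explored in the talk")
    block = Path(sysfs_root) / "block" / device
    # Assumed mapping: IOQ -> block-layer request queue size.
    (block / "queue" / "nr_requests").write_text(f"{ioq}\n")
    # Assumed mapping: NCQ -> device queue depth.
    (block / "device" / "queue_depth").write_text(f"{ncq}\n")
```

A call such as `set_queue_sizes("sdb", ioq=32, ncq=8)` would then apply both sizes to a (hypothetical) drive `sdb`; the `sysfs_root` argument exists only so the function can be exercised against a scratch directory.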

Queue-size & power: causality
- Queue size influences the seek distance between requests
- A small queue results in long seek distances due to limited scheduling
- Long distances require acceleration & deceleration
- Acceleration and deceleration take relatively large power
- And vice versa
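
The first two steps of this chain can be illustrated with a toy model. The sketch below assumes request positions are uniform on a normalized [0, 1) stroke and that the scheduler always serves the nearest pending request; this is a simplification for illustration, not the drive's actual scheduling policy, and it says nothing about power itself.

```python
import random


def mean_seek_distance(queue_size: int, n_requests: int = 2000,
                       seed: int = 0) -> float:
    """Average head travel per request with nearest-first service
    over a queue that is kept full at `queue_size` entries."""
    rng = random.Random(seed)
    head = 0.5
    pending = [rng.random() for _ in range(queue_size)]
    total = 0.0
    for _ in range(n_requests):
        nearest = min(pending, key=lambda pos: abs(pos - head))
        total += abs(nearest - head)
        head = nearest
        pending.remove(nearest)
        pending.append(rng.random())  # a new request arrives
    return total / n_requests


for q in (1, 4, 32):
    print(q, round(mean_seek_distance(q), 3))
```

With more requests to choose from, the nearest one tends to be closer, so the average travel shrinks as the queue grows.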

Power vs. Queue size
[Figure: two panels, Power vs. IOQ and Power vs. NCQ.]

Power vs. NCQ
- Long seeks → More power
- Throughput saturates → Stable power
- Low throughput → Less power
- Throughput increases → Power increases (diminishing returns)

PCAP design
- Reduce the HDD's power quickly to bring it below the power cap
- Max out the HDD's performance when more power is available, but cautiously so that the power cap is not violated
  → Different scaling factors of the queue sizes: α_UP and α_DN
- Reduces oscillations around the target power
  → Hysteresis with margins [-ε, +ε]
- Periodically adapts queues to ensure the power cap
  → Tunable period (T)
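
One control step with these ingredients might look like the sketch below. It is not the authors' algorithm: the slide names α_UP and α_DN without stating which applies where or whether scaling is multiplicative, so the sketch assumes multiplicative scaling with two hypothetical factors, `factor_over` (used above the cap) and `factor_under` (used below it). The example values assume an operating region in which a larger queue lowers power, as the causality slide suggests; in other regions the direction may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PcapParams:
    cap_w: float         # power cap [W]
    eps_w: float         # hysteresis margin ε [W]
    factor_over: float   # assumed scaling when power > cap + ε (act fast)
    factor_under: float  # assumed scaling when power < cap - ε (act cautiously)
    q_min: int           # smallest allowed queue size
    q_max: int           # largest allowed queue size


def next_queue_size(queue: int, power_w: float, p: PcapParams) -> int:
    """Queue size to apply for the next period T, given measured power."""
    if power_w > p.cap_w + p.eps_w:
        factor = p.factor_over
    elif power_w < p.cap_w - p.eps_w:
        factor = p.factor_under
    else:
        return queue  # inside [cap - ε, cap + ε]: hold to avoid oscillation
    return max(p.q_min, min(p.q_max, round(queue * factor)))


# Illustrative parameters only (8 W is one of the caps shown in the talk).
params = PcapParams(cap_w=8.0, eps_w=0.2, factor_over=2.0,
                    factor_under=0.75, q_min=4, q_max=128)
print(next_queue_size(32, 9.0, params))
```

A caller would invoke `next_queue_size` once per period T with a fresh power reading. Using a larger step above the cap than below it mirrors the "quickly down, cautiously up" intent, and clamping to `[q_min, q_max]` keeps the queue within bounds.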

PCAP: basic
[Figure annotation: outside the controllable range.]

PCAP: Agile (bounded queues)

PCAP: Agile w/ improved throughput

PCAP: Dual-mode
[Figure: two panels, Throughput and Tail-latency.]

Summary of performance

                       Avg. throughput [IOPS]   % requests < 100ms   Max. latency [ms]
Throttling             117                      10%                  2.5
PCAP - Throughput      154 (32%)                20%                  2.7
PCAP - Tail latency    102 (-15%)               60%                  1.3

[Figure: % requests vs. response time [ms] (log scale, 10 to 1000) for Throttling, Throughput, and Tail-latency.]

PCAP limitations
- Effective queue size matters
[Figure, panel "Maximized load (100%)": average power [W] vs. number of threads (log scale, 1 to 256).]
[Figure, panel "Varied load": average power [W] vs. HDD utilization [%] (0% to 100%).]
[Series: Cap=8W, Cap=9W, PCAP=8W, PCAP=9W.]

Conclusions
- Throttling underutilizes the HDD's performance
  - Useful under low concurrency and sequential throughput
- PCAP: resizing queues
  - Improves the HDD's utilization
  - 32% more throughput
  - 50% more requests < 100ms
  - Worst-case (WC) latency reduced by 2x
- PCAP performance is limited under
  - Low concurrency
  - Light workloads
- PCAP works for multiple HDDs
- Please see the paper for more observations and results

Thanks for your attention! Q & A
Interested in an internship at WDC Research? Apply at: http://bit.do/fast16
Email: mohammed.khatib@hgst.com