Balancing Fairness and Efficiency in Tiered Storage Systems with Bottleneck-Aware Allocation

Balancing Fairness and Efficiency in Tiered Storage Systems with Bottleneck-Aware Allocation
Hui Wang, Peter Varman
Rice University
FAST '14, Feb 2014

Tiered Storage
Tiered storage: HDs and SSDs.
Advantages:
- Performance
- Cost
Challenges:
- Fair resource allocation
- High system efficiency
- Variable system throughput

Tiered Storage Model
- Clients: make requests to the SSD (hit) and the HD (miss) in a certain ratio
- Scheduler: aware of each request's target, dispatches requests to storage
- Storage: SSD and HD are independent, without frequent data migrations

Fairness and Efficiency in Tiered Storage
How do we define fairness?
- How should fairness be defined over multiple resources?
- Fair allocation may cause low efficiency.
How do we improve the efficiency of both devices?
- Focusing only on efficiency may cause unfairness.

Existing Solutions for QoS Scheduling
Proportional sharing in storage / IO scheduling:
- Extended from network and CPU scheduling
- Additional reservation and limit controls
- All of them are designed for a single resource!
Dominant Resource Fairness (DRF) [NSDI '11]:
- Designed for allocating multiple resources
- DRF does not explicitly address system utilization

Talk Outline
- Motivation
- Bottleneck-Aware Allocation (BAA)
- Evaluation
- Conclusions and future work

Example: Single Device Type
Configuration:
- A single HD with capacity 100 IOPS
- Two clients with equal weights, fully backlogged; work-conserving scheduler
- Proportional sharing
Results:
- Each client gets 50 IOPS
- Utilization is 100%
A single device can be fully utilized under any allocation ratio.

What if there are multiple resources?

Example: Multiple Devices (Fairness)
A natural policy: Weighted Fair Queuing (WFQ).
Configuration:
- HD capacity 100 IOPS, SSD capacity 500 IOPS
- Two clients with hit ratios h1 = 0.9 and h2 = 0.5
- Conventional WFQ with 1:1 weights
Results:
- Each client gets 167 IOPS (client 1: 16.7 HD + 150 SSD; client 2: 83.3 HD + 83.3 SSD)
- HD utilization is 100%, but the SSD is only 47% utilized and otherwise idle
Simply carrying WFQ over to multiple resources creates an efficiency problem!
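
The slide's numbers fall out of a short calculation: under 1:1 WFQ every backlogged client receives the same total IOPS, so whichever device saturates first caps everyone. A minimal Python sketch (mine, not from the talk) that reproduces the arithmetic:

```python
# Under 1:1 WFQ each backlogged client gets the same total IOPS T; the first
# device to saturate caps T for everyone, and the other device sits idle.

C_HD, C_SSD = 100.0, 500.0   # device capacities (IOPS)
hit_ratios = [0.9, 0.5]      # h_i = fraction of client i's requests served by the SSD

# T is limited by whichever device fills first:
T_hd = C_HD / sum(1 - h for h in hit_ratios)   # HD-limited per-client throughput
T_ssd = C_SSD / sum(hit_ratios)                # SSD-limited per-client throughput
T = min(T_hd, T_ssd)

hd_used = sum((1 - h) * T for h in hit_ratios)
ssd_used = sum(h * T for h in hit_ratios)
print(f"per-client IOPS = {T:.1f}")                # 166.7
print(f"HD utilization  = {hd_used / C_HD:.0%}")   # 100%
print(f"SSD utilization = {ssd_used / C_SSD:.0%}") # 47%
```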

Example: Multiple Devices (Efficiency)
Configuration:
- HD capacity 100 IOPS, SSD capacity 500 IOPS
- Two clients with hit ratios h1 = 0.9 and h2 = 0.5
Results:
- Utilization of both devices is 100%
- Client 1 gets 500 IOPS (50 HD + 450 SSD); client 2 gets 100 IOPS (50 HD + 50 SSD)
It is not possible to precisely set both the relative allocations (fairness) and the system utilization (efficiency).

DRF (Dominant Resource Fairness)
Configuration:
- HD 100 IOPS, SSD 500 IOPS
- Two clients: h1 = 0.9 (dominant resource: SSD), h2 = 0.5 (dominant resource: HD)
What does DRF do? It equalizes the dominant shares.
Results:
- Client 1: 36 HD + 324 SSD IOPS (dominant share: 64% of the SSD)
- Client 2: 64 HD + 64 SSD IOPS (dominant share: 64% of the HD)
- HD utilization is 100%; SSD utilization is only 77%, the rest idle
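
For reference, a small sketch (again mine, not the authors' code) of DRF's progressive filling for this example: every client's dominant share is held equal at s, and s grows until the first device saturates. Helper names are illustrative.

```python
# DRF sketch: equalize dominant shares, grow them until a device saturates.

C = {"HD": 100.0, "SSD": 500.0}
clients = [0.9, 0.5]  # SSD hit ratios

def demand(h):
    """Fraction of each device consumed per unit of client throughput."""
    return {"HD": (1 - h) / C["HD"], "SSD": h / C["SSD"]}

def throughput(h, s):
    """Client throughput when its dominant share equals s."""
    return s / max(demand(h).values())

# Binary-search the largest feasible common dominant share s.
lo, hi = 0.0, 1.0
for _ in range(60):
    s = (lo + hi) / 2
    load = {d: sum(demand(h)[d] * throughput(h, s) for h in clients) for d in C}
    lo, hi = (s, hi) if all(v <= 1.0 for v in load.values()) else (lo, s)

for h in clients:
    print(f"h = {h}: {throughput(h, lo):.0f} IOPS")
# ~357 and ~129 IOPS, i.e. the slide's ~64% dominant shares after rounding
```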

DRF Does Not Address Efficiency
- Add a third client with h3 = 0.1
- SSD utilization drops further, to 48% (client 1: 22 HD + 196 SSD; client 2: 39 HD + 39 SSD; client 3: 39 HD + 5 SSD; all dominant shares 39%)
- It gets worse as more clients bottleneck on the HD

One More HD-bound Client
[Figure: side-by-side DRF allocations, two clients (HD 100%, SSD 77%) vs. three clients (HD 100%, SSD 48%); HD 100 IOPS, SSD 500 IOPS, normalized.]

Talk Outline
- Motivation
- Bottleneck-Aware Allocation (BAA)
- Evaluation
- Conclusions and future work

Fair Shares
Fair share of a client: the IOPS it would get if each resource were partitioned equally among the clients.
Example: two devices, HD 150 IOPS and SSD 300 IOPS, so each of three clients owns a 1/3 slice of each device.
- Client 1: h1 = 4/9
- Client 2: h2 = 4/9
- Client 3: h3 = 5/6

Fair Shares
With HD 150 IOPS, SSD 300 IOPS, and three clients, the fair shares f_i are:
- Client 1 (h1 = 4/9): 90 IOPS (50 HD + 40 SSD)
- Client 2 (h2 = 4/9): 90 IOPS (50 HD + 40 SSD)
- Client 3 (h3 = 5/6): 120 IOPS (20 HD + 100 SSD)
The fair share depends only on the client's hit ratio and the device capacities: f_i = min(C_D / (n(1 - h_i)), C_S / (n h_i)) for n clients, as computed in the sketch below.
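
A minimal sketch (the function name is mine) computing fair shares straight from the definition: partition each device equally among the n clients, then see how many IOPS client i can push through its two slices given its hit ratio.

```python
# Fair share = throughput achievable from an equal 1/n slice of each device.

def fair_share(h, C_D, C_S, n):
    hd_slice, ssd_slice = C_D / n, C_S / n
    # Throughput T must satisfy (1 - h) * T <= hd_slice and h * T <= ssd_slice.
    caps = []
    if h < 1.0:
        caps.append(hd_slice / (1 - h))
    if h > 0.0:
        caps.append(ssd_slice / h)
    return min(caps)

for i, h in enumerate([4/9, 4/9, 5/6], start=1):
    print(f"client {i}: fair share = {fair_share(h, 150, 300, 3):.0f} IOPS")
# client 1: 90, client 2: 90, client 3: 120
```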

Fairness Policy
Should we allocate in the ratio of the fair shares?
- The fair share reflects what a client would get if running alone on its equal partition.
Problem:
- Enforcing exact ratios throttles clients across devices, just as in the DRF example.
Solution:
- Bottleneck-aware allocation.

Bottleneck-Aware Allocation
Bottleneck sets:
- Define the load-balancing hit ratio h_bal = C_S / (C_S + C_D)
- If h_i <= h_bal: client i is in the HD-bottlenecked set D
- If h_i > h_bal: client i is in the SSD-bottlenecked set S
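
Continuing the same sketch, classification is one comparison per client: a client with hit ratio exactly h_bal loads the two devices in proportion to their capacities, so anyone below it leans on the HD and anyone above it on the SSD. For the running example, h_bal = 300 / (300 + 150) = 2/3.

```python
# Classify clients into bottleneck sets around the balance point h_bal.

def bottleneck_set(h, C_D, C_S):
    h_bal = C_S / (C_S + C_D)
    return "D (HD-bottlenecked)" if h <= h_bal else "S (SSD-bottlenecked)"

for i, h in enumerate([4/9, 4/9, 5/6], start=1):
    print(f"client {i}: {bottleneck_set(h, 150, 300)}")
# clients 1 and 2 fall in D, client 3 in S
```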

Fairness Requirements of BAA
- Sharing Incentive (SI): no client gets fewer IOPS than it would from an equal partition of each resource
- Envy-Freedom (EF): every client prefers its own allocation to that of any other client
- Local Fair-Share Ratio: clients in the same bottleneck set get IOPS in proportion to their fair shares
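
These two properties can be tested mechanically. A sketch under my reading of the model (a client with hit ratio h holding HD/SSD budgets (d, s) can sustain min(d / (1 - h), s / h) IOPS); the helper names are illustrative, not from the paper.

```python
# Property checks for Sharing Incentive and Envy-Freedom.

def achievable(h, d, s):
    """Throughput a client with hit ratio h can sustain from budgets (d, s)."""
    caps = []
    if h < 1.0:
        caps.append(d / (1 - h))
    if h > 0.0:
        caps.append(s / h)
    return min(caps)

def sharing_incentive(hit_ratios, allocs, C_D, C_S):
    n = len(hit_ratios)
    return all(achievable(h, *a) >= achievable(h, C_D / n, C_S / n) - 1e-9
               for h, a in zip(hit_ratios, allocs))

def envy_free(hit_ratios, allocs):
    return all(achievable(h_i, *allocs[i]) >= achievable(h_i, *allocs[j]) - 1e-9
               for i, h_i in enumerate(hit_ratios)
               for j in range(len(hit_ratios)) if j != i)

# Example: the DRF allocation from the earlier slide (HD 100, SSD 500).
hs = [0.9, 0.5]
allocs = [(36.0, 324.0), (64.0, 64.0)]
print(sharing_incentive(hs, allocs, 100.0, 500.0))  # True
print(envy_free(hs, allocs))                        # True
```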

Bottleneck-Aware Allocation
- Maximize system throughput
- Satisfy the fairness requirements

Solution Space Satisfying All Properties
- BAA matches the SI and EF properties of DRF
- BAA achieves utilization at least as high as DRF's
[Figure: Venn diagram of the Sharing Incentive, Envy-Freedom, and Local Fair-Share Ratio properties; BAA's search area is their intersection, with DRF shown as a point satisfying SI and EF.]

Fairness Constraints of BAA
- Fairness between clients in D: allocations in proportion to their fair shares
- Fairness between clients in S: allocations in proportion to their fair shares
- Fairness between a client in D and a client in S
[The constraint equations on this slide were not captured in the transcription.]

Optimization for Allocation (2-variable LP)
[The four numbered LP equations (1)-(4) were not captured in the transcription.]
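
The equations themselves are missing from the transcription, but the LP's shape follows from the constraints above: within a bottleneck set, allocations stay proportional to fair shares, so one scaling variable per set suffices, hence a 2-variable LP. Below is a hedged SciPy sketch using variables beta_D and beta_S (each set's multiple of fair share), device capacity constraints, and beta >= 1 for sharing incentive; the paper's exact cross-set envy-freedom constraints are not reproduced here, so this shows the general flavor, not BAA's precise program.

```python
# 2-variable LP sketch: maximize throughput with per-set fair-share scaling.
from scipy.optimize import linprog

C_D, C_S = 150.0, 300.0
clients = [4/9, 4/9, 5/6]                                   # hit ratios
n = len(clients)
fair = [min(C_D / n / (1 - h), C_S / n / h) for h in clients]
h_bal = C_S / (C_S + C_D)
D = [i for i, h in enumerate(clients) if h <= h_bal]        # HD-bottlenecked set

# Variables x = [beta_D, beta_S]; client i receives beta_set(i) * fair[i] IOPS.
def load_row(weight):  # one capacity row: sum_i weight(h_i) * beta_set(i) * fair[i]
    row = [0.0, 0.0]
    for i, h in enumerate(clients):
        row[0 if i in D else 1] += weight(h) * fair[i]
    return row

A_ub = [load_row(lambda h: 1 - h), load_row(lambda h: h)]   # HD load, SSD load
b_ub = [C_D, C_S]
c = [-sum(fair[i] for i in D), -sum(fair[i] for i in range(n) if i not in D)]
res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(1, None), (1, None)])

beta_D, beta_S = res.x
for i, h in enumerate(clients):
    beta = beta_D if i in D else beta_S
    print(f"client {i+1}: {beta * fair[i]:.0f} IOPS")
# With these numbers both devices end up fully utilized (~96, 96, 257 IOPS),
# though without the cross-set EF constraints this can differ from true BAA.
```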

Talk Outline
- Motivation
- Bottleneck-Aware Allocation (BAA)
- Evaluation
- Conclusions and future work

Evaluation
Simulation:
- Evaluate BAA's efficiency
- Evaluate BAA's dynamic behavior when the workload changes
Linux:
- Prototype built by interposing the BAA scheduler in the IO path
- Evaluate BAA's efficiency and fairness (SI and EF)

Simulation (Efficiency, 2 Clients)
- Two clients: h1 = 0.5, h2 = 0.95
- Two devices: HD 100 IOPS, SSD 5000 IOPS
SSD utilization:
- FQ: 7%
- DRF: 65%
- BAA: 100%

Simulation (Efficiency, 3 Clients)
- A third client with h3 = 0.8 is added
SSD utilization:
- FQ: 6%
- DRF: 45%
- BAA: 71% (bounded by the fairness constraints)

Simulation (Dynamic Behavior)
Two clients:
- h1 = 0.45, dropping to 0.2 after 510 s
- h2 = 0.95
Two devices:
- HD 200 IOPS
- SSD 3000 IOPS
After the workload change, utilization is pulled back up to a high level within a short period.

Linux (Efficiency: Throughput)
Two clients:
- Financial workload (h1 = 0.3)
- Exchange workload (h2 = 0.95)
Total throughput:
- BAA: 1396 IOPS
- DRF: 810 IOPS
- CFQ: 1011 IOPS

Linux (Efficiency: Utilization)
Average utilization:
- BAA: HD 94%, SSD 92%
- DRF: HD 99%, SSD 78%
- CFQ: HD 99.8%, SSD 83%

Linux (Fairness: Sharing Incentive)
Four financial clients:
- h1 = 0.2 (set D)
- h2 = 0.4 (set D)
- h3 = 0.98 (set S)
- h4 = 1.0 (set S)
[Figure: fair share vs. measured throughput per client, log-scale IOPS from 1 to 10000.]
- Every client receives at least its fair share
- Allocations are proportional to fair shares

Linux (Fairness: Envy-Freedom)
[Figure: per-client HD and SSD allocations, log-scale IOPS.]
No client envies another's allocation:
- No client gets a higher allocation on every device
- Set D clients get the higher HD allocations; set S clients get the higher SSD allocations

Talk Outline
- Motivation
- Bottleneck-Aware Allocation (BAA)
- Evaluation
- Conclusions and future work

Conclusions and Future Work
BAA is a new model that balances fairness and efficiency.
Fairness:
- Sharing incentive
- Envy-freedom
- Local fair-share ratio
Efficiency:
- Maximizes utilization subject to the fairness constraints

Ongoing Work
- Apply BAA to broader multi-resource allocation: CPU, memory, networks
- Other fairness policies: cost, reservations
- Cache model: SSD as a cache of the HD, with data migration
