PARDA: Proportional Allocation of Resources for Distributed Storage Access

PARDA: Proportional Allocation of Resources for Distributed Storage Access
Ajay Gulati, Irfan Ahmad, Carl Waldspurger
Resource Management Team, VMware Inc.
USENIX FAST '09 Conference, February 26, 2009

The Problem
[figure: an online store VM and a low-priority data mining VM, on separate hosts, share a LUN on a storage array. Panels: "What you see": the data mining load degrades the online store's IO performance. "What you want to see": the online store's performance is isolated from the low-priority data mining load.]

Setup: Distributed Storage Access
- VMs running across multiple hosts
- Hosts share LUNs using a cluster filesystem
- No central control on the IO path

Issues
- Hosts adversely affect each other
- Difficult to respect per-VM allocations

Proportional shares (a.k.a. weights) specify relative priority.
Goal: provide isolation while maximizing array utilization.
[figure: VMs with various shares (10-100) spread across hosts that share a storage array LUN]

Host-Local Scheduling Is Inadequate
[figure: four OLTP VMs and two Iometer VMs with shares 20:10:10:10:20:10, placed across four hosts with aggregate host shares 30:20:20:10; bar chart of per-host throughput in IOPS]
- Local SFQ schedulers respect VM shares within a host, BUT not across hosts
- Hosts get equal IOPS, so a VM's IOPS depend on its placement!

Outline
- Problem Context
- Analogy to Network Flow Control
- Control Mechanism
- Experimental Evaluation
- Conclusions and Future Work

Analogy to Network Flow Control

Networks | Storage
Network is a black box | Array is a black box
Congestion detected using RTT and packet loss | Array congestion estimated using IO latency (packet loss very unlikely)
TCP adapts the window size | Adapt the number of pending IO requests (a.k.a. window size)
TCP ensures fair sharing; FAST TCP* provides proportional sharing | Adapt the window size based on shares/weights

* Low et al., "FAST TCP: Motivation, Architecture, Algorithms, Performance." Proc. IEEE INFOCOM '04.

Outline (next: Control Mechanism)

PARDA Architecture
[figure: per-host SFQ schedulers feed host-level issue queues, which share the array queue at the storage array; PARDA varies the host queue lengths dynamically]

Per-Host Control Algorithm

$$w(t+1) = (1-\gamma)\,w(t) + \gamma\left(\frac{L}{L(t)}\,w(t) + \beta\right)$$

- Adjust the window size w(t) using the cluster-wide average latency L(t)
- L: latency threshold, the operating point for IO latency
- β: proportional to the aggregate VM shares of the host
- γ: smoothing parameter between 0 and 1
- Motivated by the FAST TCP mechanism
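A minimal sketch of this update rule in Python (the function name, the clamping bounds, and the default gamma are illustrative choices, not values from the paper):

```python
def update_window(w, avg_latency, latency_threshold, beta,
                  gamma=0.2, w_min=1, w_max=256):
    """One step of the PARDA per-host window update.

    w:                 current window size w(t)
    avg_latency:       cluster-wide average latency L(t)
    latency_threshold: operating point for IO latency (the paper's L)
    beta:              proportional to the host's aggregate VM shares
    gamma:             smoothing parameter in (0, 1)

    The bounds w_min/w_max and gamma=0.2 are illustrative defaults.
    """
    w_next = (1 - gamma) * w + gamma * (
        (latency_threshold / avg_latency) * w + beta)
    # Keep the window within sane hardware limits.
    return max(w_min, min(w_max, w_next))
```

When the cluster-wide latency L(t) rises above the threshold L, the ratio L/L(t) falls below 1 and the window shrinks; when the array is underloaded, the window grows, keeping the array near its operating point.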

Control Algorithm Features

$$w(t+1) = (1-\gamma)\,w(t) + \gamma\left(\frac{L}{L(t)}\,w(t) + \beta\right)$$

- Maintains high utilization at the array: the overall array queue is proportional to throughput × L
- Allocates queue size in proportion to host shares: at equilibrium, host window sizes are proportional to β
- Controls the overall latency of the cluster: the cluster operates close to L or below
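The proportional-allocation property follows directly from the update rule. At equilibrium, with $w(t+1) = w(t) = w$ and a steady cluster-wide latency $L(t) > L$:

$$w = (1-\gamma)\,w + \gamma\left(\frac{L}{L(t)}\,w + \beta\right)
\quad\Longrightarrow\quad
w\left(1 - \frac{L}{L(t)}\right) = \beta
\quad\Longrightarrow\quad
w = \frac{\beta}{1 - L/L(t)}.$$

Since every host computes the same cluster-wide $L(t)$, the equilibrium windows of any two hosts stand in the ratio of their $\beta$ values.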

Unit of Allocation
Two main units exist in the literature:
- Bandwidth (MB/s)
- Throughput (IOPS)
Both have problems:
- Allocating bandwidth may hurt workloads with large IO sizes
- Allocating IOPS may hurt VMs with sequential IOs
PARDA instead allocates queue slots at the array:
- Carves out the array queue among VMs
- Workloads can recycle their queue slots faster or slower
- Maintains high efficiency
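As a rough illustration of what carving the array queue among VMs means, here is a hypothetical proportional split of a host's window into per-VM slots. In PARDA the local SFQ scheduler does this implicitly and without strict partitioning; the strict split below is a simplification:

```python
def carve_window(window_size, shares):
    """Split a host's window of queue slots among VMs by their shares.

    Uses a largest-remainder split so the slots sum exactly to
    window_size. Strict partitioning is a simplification: PARDA's
    local SFQ scheduler lets active VMs borrow slots from idle ones.
    """
    total = sum(shares)
    exact = [window_size * s / total for s in shares]
    slots = [int(x) for x in exact]
    # Hand out leftover slots to the VMs with the largest remainders.
    leftovers = window_size - sum(slots)
    order = sorted(range(len(shares)), key=lambda i: exact[i] - slots[i],
                   reverse=True)
    for i in order[:leftovers]:
        slots[i] += 1
    return slots

# Example: a 32-slot window split among VMs with shares 10, 30, 10, 10.
print(carve_window(32, [10, 30, 10, 10]))  # -> [6, 16, 5, 5]
```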

Storage-Specific Issues
Issues:
- Throughput varies by 10x depending on workload characteristics
- IO sizes vary by up to 1000x (512 B to 512 KB)
- Array complexity: caching, different paths for reads vs. writes
- Hosts may observe different latencies
PARDA solutions:
- Latency normalization for large IO sizes
- Compute the cluster-wide average latency using a shared file (sketched below)
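A minimal sketch of the shared-file aggregation, assuming one fixed-size slot per host in a small file on the shared volume; the slot layout, the constants, and the averaging of non-zero entries are illustrative assumptions, not details from the paper:

```python
import struct

# Illustrative layout: a tiny file on the shared volume holds one
# fixed-size slot per host.
SLOT_FMT = "d"                           # one double: the host's avg latency (ms)
SLOT_SIZE = struct.calcsize(SLOT_FMT)
NUM_HOSTS = 8

def init_shared_file(path):
    """Create the file with one zeroed slot per host."""
    with open(path, "wb") as f:
        f.write(b"\x00" * SLOT_SIZE * NUM_HOSTS)

def publish_latency(path, host_id, avg_latency_ms):
    """Each host periodically writes its locally observed average latency
    into its own slot; slots never overlap, so hosts do not conflict."""
    with open(path, "r+b") as f:
        f.seek(host_id * SLOT_SIZE)
        f.write(struct.pack(SLOT_FMT, avg_latency_ms))

def cluster_average_latency(path):
    """Any host can read all slots and average the active (non-zero) ones
    to obtain the cluster-wide average latency L(t)."""
    with open(path, "rb") as f:
        samples = [struct.unpack(SLOT_FMT, f.read(SLOT_SIZE))[0]
                   for _ in range(NUM_HOSTS)]
    active = [s for s in samples if s > 0.0]
    return sum(active) / len(active) if active else 0.0
```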

Handling Bursts: Two Time Scales
[figure: PARDA slot assignment across VM 1-4; when VM 2 idles, the other VMs take advantage of its slots]
Workload variation over short time periods:
- Handled by the existing local SFQ scheduler
- No strict partitioning of the host-level queue
VM idleness over the longer term:
- Re-compute β per host based on VM activity
- Effectively scales a VM's shares based on its utilization
- Utilization computed as (# outstanding IOs) / (VM window size)
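A sketch of the long-term adjustment, under the assumption that each VM's shares are scaled by its measured utilization before being summed into the host's β. The slides say only that β is re-computed from VM activity, with utilization = outstanding IOs / window size; the exact scaling form and the constant k here are illustrative:

```python
from dataclasses import dataclass

@dataclass
class VmState:
    shares: int            # the VM's proportional shares
    outstanding_ios: int   # IOs currently in flight
    window_slots: int      # queue slots currently assigned to this VM

def host_beta(vms, k=1.0):
    """Recompute a host's beta from its VMs' shares, scaled by utilization.

    An idle VM has low utilization, so its shares are effectively
    scaled down and the host's window shrinks accordingly.
    """
    beta = 0.0
    for vm in vms:
        util = min(1.0, vm.outstanding_ios / max(1, vm.window_slots))
        beta += vm.shares * util
    return k * beta

# Example: VM 2 idles, so the host's beta (and hence its window) shrinks.
vms = [VmState(20, 8, 8), VmState(10, 0, 4), VmState(10, 4, 4)]
print(host_beta(vms))  # 20*1.0 + 10*0.0 + 10*1.0 = 30.0
```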

Outline (next: Experimental Evaluation)

Experimental Setup
VMware cluster:
- 1-8 hosts running the ESX hypervisor
- Each host: two dual-core Intel Xeons, 8 GB RAM, FC-SAN attached
Storage:
- EMC CLARiiON CX3-40 storage array (similar results on a NetApp FAS6030)
- Two volumes used: 200 GB, 10 disks, RAID-0 (striped); 400 GB, 5 disks, RAID-5

PARDA: Proportional Sharing
Setup: 4 hosts with shares 1:1:2:2; latency threshold L = 25 ms; workload: 16 KB, 100% random reads.
Results:
- Window sizes are in proportion to shares
- Latency stays close to 25 ms
- IOPS match shares, but are affected by other factors
- Aggregate IOPS with vs. without PARDA: 3400 vs. 3360

PARDA: Dynamic Load Adaptation
Setup: 6 hosts, equal shares, uniform workload; latency threshold L = 30 ms; three VMs are stopped at 145, 220, and 310 seconds.
Results:
- PARDA adapts to the load
- Latency stays close to 30 ms
- IOPS increase as window sizes grow
- Aggregate IOPS with vs. without PARDA: 3090 vs. 3155

PARDA: End-to-End Control
Setup: the same six-VM configuration as before (VM shares 20:10:10:10:20:10, host shares 30:20:20:10); latency threshold L = 25 ms.
[figure: bar chart of per-host throughput in IOPS]
With PARDA: shares are respected independent of VM placement.

PARDA: Handling Bursts
Setup: one VM idles from 140 to 310 seconds.
Result: PARDA adjusts the β values at each host; VMs sharing the host with the idle VM get no undue advantage.

PARDA: SQL Workloads
Setup: 2 hosts; 2 Windows VMs running SQL Server (250 GB data disk, 50 GB log disk); shares 1:4; latency threshold L = 15 ms.
Without PARDA: no fairness across hosts; both hosts get ~600 IOPS.
With PARDA: shares are respected across hosts; hosts 1 and 2, with shares 1:4, get 6952 and 12356 OPM (orders per minute).

PARDA: Burst Handling
[figure panels: β values, latency, and window size over time]
Results:
- PARDA adjusts the β values at each host as the pending IO count changes
- VM 2, despite its high shares, is unable to fill its window
- IOPS and OPM are differentiated based on the β values

Evaluation Recap
Effectiveness of the PARDA mechanism:
- Fairness
- Load and throughput variations
- Burst handling
- End-to-end control
- Enterprise workloads
Evaluation of control parameters (without PARDA):
- Latency variation with workload
- Latency variation with overall load
- Queue length as the control mechanism for fairness

Outline (next: Conclusions and Future Work)

Conclusions and Future Work
PARDA contributions:
- Efficient distributed IO resource management without any support from the storage array
- Fair end-to-end VM allocation, proportional to VM shares
- Control over the overall cluster latency
Future work:
- Latency threshold estimation or adaptation
- Detecting and handling uncontrolled (non-PARDA) hosts
- NFS adaptation

Questions?
{agulati,irfan,carl}@vmware.com
Storage in Virtualization BoF tonight @ 7:30 pm