Empirical Evaluation of Latency-Sensitive Application Performance in the Cloud


Sean Barker and Prashant Shenoy
University of Massachusetts Amherst, Department of Computer Science

Cloud Computing
- Cloud platforms built with data centers: large-scale, concentrated server clusters
  - Machines rented out to companies or individuals
  - Hosting for arbitrary applications
  - May supplement local resources
- Cheap enough to rent machines by the hour

Type    CPUs   Memory    Disk       Cost/hr
Small   1      1.7 GB    160 GB     $0.085
Large   4      7.5 GB    850 GB     $0.34
XL      8      15 GB     1690 GB    $0.68
Current prices on Amazon Elastic Compute Cloud (EC2)

Multimedia Cloud Computing Scenarios
- Clouds designed primarily for web & e-commerce apps, but may also be used for multimedia
- Rent game server for an evening
  - No firewall or bandwidth issues, only a few dollars
- Rent high-CPU machines for HD video transcoding
  - Home PC may take several hours to transcode one video; cloud can transcode many in a fraction of this time
- Rent servers for webcast of live event
  - Large, inexpensive temporary bandwidth allocation

Resource Sharing in the Cloud
- Data center servers are typically well-equipped
  - Providers share individual machines among multiple users
[Figure: one server's four cores, RAM (4 GB, 8 GB, and 4 GB slices), and two 1000 GB disks divided among users]
- Example: one user runs game server, another runs high-performance database on same machine
- Multimedia has unique performance requirements
  - Low latency for games, low jitter & high bandwidth for streaming
- Are cloud platforms designed for conventional web applications suitable for multimedia?

Outline
- Motivation
- Virtualized clouds
- Amazon EC2 study
- Laboratory cloud study
- Real world multimedia case studies
- Related work & conclusions

Virtualized Clouds
- Cloud platforms are virtualized data centers
- Virtualization facilitates machine distribution among multiple users with virtual machines (VMs)
[Figure: VMs for customers A, B, and C (game server, web server, media server) running on shared hardware]

Virtual Machine Isolation
- Each VM is assigned a slice of physical resources
- VM access to hardware managed by hypervisor
  - Enforces limits and isolates VMs from each other
[Figure: two stacks of users, apps A, B, and C, VMs, hypervisor, and hardware; one stack is labeled "resource starvation"]
- Are these resource sharing mechanisms suitable for the timeliness constraints of multimedia?

EC2 Study Overview
- Amazon Elastic Compute Cloud (EC2)
  - Popular virtualized cloud platform
- Unknown applications coexisting on machine
  - No control over VM placement
- Goal: evaluate performance with unknown background server load
- Methodology: measured CPU, disk, and network consistency over a period of days
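The slides do not show the measurement harness, so the following is only a minimal sketch of how a consistency probe of this kind might look: time a fixed amount of CPU-bound work at regular intervals and flag samples well above the mean. The names `cpu_probe`, `summarize`, and `run_study`, the workload size, and the 1.5x outlier threshold are all illustrative assumptions, not the authors' tooling.

```python
import statistics
import time


def cpu_probe(iterations=200_000):
    """Run a fixed amount of CPU-bound work; return elapsed wall time in ms."""
    start = time.perf_counter()
    acc = 0
    for i in range(iterations):
        acc += i * i
    return (time.perf_counter() - start) * 1000.0


def summarize(samples_ms, outlier_factor=1.5):
    """Return the mean, standard deviation, and samples above outlier_factor x mean."""
    mean = statistics.mean(samples_ms)
    stdev = statistics.pstdev(samples_ms)
    outliers = [s for s in samples_ms if s > outlier_factor * mean]
    return {"mean_ms": mean, "stdev_ms": stdev, "outliers": outliers}


def run_study(num_samples=5, interval_s=0.0, probe=cpu_probe):
    """Collect num_samples probe timings, sleeping interval_s between them."""
    samples = []
    for _ in range(num_samples):
        samples.append(probe())
        time.sleep(interval_s)
    return summarize(samples)
```

Setting `interval_s=300` would sample every five minutes. On a machine with no competing load the outlier list would be expected to stay mostly empty, while a shared host may show occasional spikes.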

EC2 CPU Performance
[Figure: CPU time (ms, 0 to 1400) vs. time (5 minute intervals) for EC2 and a local server. Annotations: "2.5x average"; "outliers: 1.5-2x avg"; "no competing VMs: no outliers".]
Volatility on EC2 vs stability on dedicated server

EC2 Disk Performance
[Figure: long write time (ms, 0 to 90000) vs. time (5 minute intervals) for EC2 and a local server. Annotation: "widely fluctuating disk performance".]
Similarly: inconsistent EC2 disk performance

EC2 Network Latency (LAN)
[Figure: first three hops latency (ms, 0 to 250) vs. time (5 minute intervals).]
Latency variations in EC2 LAN

EC2 Study Summary
- Performance variations observed on EC2
  - Not observed on local server running a single VM
- Can only speculate on causes without access to the hypervisor
- Need to experiment on a controlled platform similar to Amazon's

Laboratory Cloud Study Overview
- Local cloud running the Xen hypervisor
  - Same virtualization technology used by EC2
  - Advantage: local cloud gives us control of interference
- Built-in mechanisms for sharing hardware between VMs
  - CPU credit scheduler
  - Round-robin disk servicing
  - Linux-level tool tc for network sharing
- How well do these tools isolate background work?
- Methodology: evaluated performance impact of competing VM

CPU Performance with Background Load
[Figure: CPU time (ms, 0 to 200) vs. time (5 second intervals). Annotations: "Max background work: VM gets 50% CPU"; "No background work: VM gets 100% CPU".]
Default 1 to 1 sharing with variable background load
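The behavior described here (100% of the CPU with an idle neighbor, 50% with a fully busy one) is what a work-conserving proportional-share scheduler with equal weights produces. The sketch below is a simplified model of that idea, not Xen's credit scheduler itself; the function names, the weight value 256, and the 100 ms base task time are illustrative assumptions.

```python
def cpu_share(my_weight, other_weight, other_demand, cap=None):
    """Fraction of one CPU our always-busy VM receives.

    other_demand is the fraction of the CPU the competing VM wants (0.0 to 1.0).
    cap, if given, is a hard upper bound on our share.
    """
    fair = my_weight / (my_weight + other_weight)
    other_fair = 1.0 - fair
    # The competitor can use at most its fair share while we are busy.
    other_used = min(other_demand, other_fair)
    # Work-conserving: we pick up whatever the competitor leaves idle.
    share = 1.0 - other_used
    if cap is not None:
        share = min(share, cap)
    return share


def task_time_ms(base_ms, share):
    """Time for a task that needs base_ms of CPU when it only gets `share` of it."""
    return base_ms / share


# Equal weights ("1 to 1 sharing"), hypothetical 100 ms task:
idle_neighbor = task_time_ms(100, cpu_share(256, 256, other_demand=0.0))
busy_neighbor = task_time_ms(100, cpu_share(256, 256, other_demand=1.0))
capped = task_time_ms(100, cpu_share(256, 256, other_demand=0.0, cap=0.5))
```

In this model the task time swings between 100 ms and 200 ms as the neighbor's load varies, while a 50% cap pins it at 200 ms regardless of the neighbor: a lower but predictable share.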

Disk Performance with Background Load
[Figure: performance impact (%, 0 to 100) vs. disk thread pairs on collocated VM (1, 2, 3, 4, 8). Series: Fair Share, Small Read, Small Write, Read Throughput, Write Throughput. Annotation: "unfair impact".]
Degraded by half over fair, but stable with increasing load

Laboratory Cloud Study Summary
- Significant interference possible from background VMs
- Xen configuration can guarantee share of CPU
  - Default settings allow fluctuation in shared CPU
- Disk sharing less fair and harder to control
  - Consistent with observed EC2 behavior
- Network sharing effects evaluated in case studies on laboratory cloud (next)

Case Study 1: Doom 3 Game Server
- Multiplayer Doom 3 game server
- Introduced controlled interference as before
- Measured map load times and server latency
- Network sharing configuration via tc:
  - Idle: No bandwidth usage by resource-hog VM
  - Off (default): No rate-limiting, network free-for-all
  - Shared: 50% (min) to 100% (max) of bandwidth per VM
  - Dedicated: 50% (max) of bandwidth per VM
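The difference between the "shared" and "dedicated" settings can be illustrated with a small allocation model for two VMs on one link. This is a sketch of the policy as the slide describes it, not of the tc configuration actually used (which is not shown); the `allocate` function and the 100 Mbps link are hypothetical.

```python
def allocate(link_mbps, demand_a, demand_b, mode):
    """Split a link between two VMs under the 'shared' or 'dedicated' policy.

    dedicated: each VM is hard-capped at 50% of the link.
    shared:    each VM is guaranteed 50% and may borrow unused bandwidth up to 100%.
    """
    half = link_mbps / 2
    a = min(demand_a, half)
    b = min(demand_b, half)
    if mode == "dedicated":
        return a, b
    if mode == "shared":
        # Lend whatever is left over to a VM that still has unmet demand.
        spare = link_mbps - a - b
        extra_a = min(demand_a - a, spare)
        a += extra_a
        spare -= extra_a
        b += min(demand_b - b, spare)
        return a, b
    raise ValueError(f"unknown mode: {mode}")


# A light game server (10 Mbps) next to a bandwidth hog (wants 100 Mbps):
shared = allocate(100, 10, 100, "shared")
dedicated = allocate(100, 10, 100, "dedicated")
```

In both modes the light user keeps its full 10 Mbps. The modes differ only in whether the hog may use the idle remainder, which is one plausible reason the two settings trade latency against jitter differently in the measurements.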

Game Server Map Load
[Figure: average server load time (ms, 0 to 5000) by collocated VM activity: Idle, Disk, CPU, Disk + CPU.]
Interference produces up to 50% degradation

Game Server Latency

Configuration            Avg. Latency (ms)   Std. Deviation (jitter)   Timeouts
No interference          8.1                 10.2                      0%
tc off (free-for-all)    N/A                 N/A                       100%
tc, sharing b/w          33.9                16.9                      2%
tc, dedicated b/w        23.6                29.6                      7%

- Server crippled without bandwidth controls (tc off)
- Dedicated vs shared bandwidth:
  - Dedicated: lower latency, higher jitter
  - Sharing: higher latency, lower jitter
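The three columns of the table can be computed from a list of probe results. The sketch below assumes each probe yields a round-trip time in ms, or `None` if it timed out, and treats jitter as the standard deviation of the successful samples, matching the table header. The function name and input format are illustrative assumptions, not the authors' measurement code.

```python
import statistics


def latency_summary(rtts_ms):
    """Summarize probe results: average latency, jitter (std. deviation), timeout rate.

    rtts_ms holds one entry per probe: a round-trip time in ms, or None on timeout.
    """
    ok = [r for r in rtts_ms if r is not None]
    timeout_pct = 100.0 * (len(rtts_ms) - len(ok)) / len(rtts_ms)
    if not ok:
        # Every probe timed out, so latency and jitter are undefined (N/A).
        return {"avg_ms": None, "jitter_ms": None, "timeout_pct": timeout_pct}
    return {
        "avg_ms": statistics.mean(ok),
        "jitter_ms": statistics.pstdev(ok),
        "timeout_pct": timeout_pct,
    }
```

When every probe times out, as in the "tc off" row, the average and jitter are reported as `None`, which corresponds to the N/A entries.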

Case Study 2: Darwin Streaming Server
- Streaming video to multiple clients
- Introduced controlled interference as before
- Measured sustained streaming bandwidth and stream jitter (latency variation)
- Varied tc settings and number of clients
  - Max video stream rate of 1 Mbps per client

Streaming Server Bandwidth
[Figure: average bitrate per stream (kbps, 0 to 1000) by tc sharing type (idle (fair), off, shared, dedicated) for 4 streams and 8 streams. Annotation: "decreased stream quality".]
Both tc configurations recovered bandwidth

Streaming Server Jitter
[Figure: average stream jitter (ms, 0 to 16) by tc sharing type (idle (fair), off, shared, dedicated) for 4 streams and 8 streams.]
Jitter improved by shared, but worsened by dedicated

Real World Case Studies Summary
- Real applications show substantial impacts from background interference
- Network is particularly vulnerable without administrative controls
- Proper configuration is important
  - CPU and network isolation tools fairly well-developed
  - Disk isolation needs better mechanisms

Related Work
- Fair-share schedulers and quality-of-service
  - Nieh and Lam (SOSP '97) for multimedia
  - Sundaram et al. (ACM MM '00) for QoS-aware OS
- Virtualization and hypervisors
  - Xen, VMware ESX Server
- Improving performance isolation
  - Gupta et al. (Middleware '06) for Xen mechanisms
- We focus on evaluation of existing mechanisms with specific attention to multimedia

Conclusions
- Clouds exhibit performance variations
  - Applications with timeliness requirements are particularly sensitive
- Appropriate hypervisor configuration can help
  - In some cases, prevents resource starvation
  - Some resource sharing mechanisms need improvement
- Future work: evaluation of non-Xen platforms
- Questions?