Scheduler Support for Video-oriented Multimedia on Client-side Virtualization

Similar documents
Virtual Asymmetric Multiprocessor for Interactive Performance of Consolidated Desktops

Spring 2017 :: CSE 506. Introduction to. Virtual Machines. Nima Honarmand

Janus: A-Cross-Layer Soft Real- Time Architecture for Virtualization

Chapter 5 C. Virtual machines

RT- Xen: Real- Time Virtualiza2on. Chenyang Lu Cyber- Physical Systems Laboratory Department of Computer Science and Engineering

Resource Containers. A new facility for resource management in server systems. Presented by Uday Ananth. G. Banga, P. Druschel, J. C.

Xen and the Art of Virtualization. CSE-291 (Cloud Computing) Fall 2016

references Virtualization services Topics Virtualization

REAL PERFORMANCE RESULTS WITH VMWARE HORIZON AND VIEWPLANNER

Virtualization. Michael Tsai 2018/4/16

RT- Xen: Real- Time Virtualiza2on from embedded to cloud compu2ng

RT#Xen:(Real#Time( Virtualiza2on(for(the(Cloud( Chenyang(Lu( Cyber-Physical(Systems(Laboratory( Department(of(Computer(Science(and(Engineering(

I/O Systems. Jinkyu Jeong Computer Systems Laboratory Sungkyunkwan University

Virtual Machines Disco and Xen (Lecture 10, cs262a) Ion Stoica & Ali Ghodsi UC Berkeley February 26, 2018

ParalleX. A Cure for Scaling Impaired Parallel Applications. Hartmut Kaiser

Virtual Machines. Jinkyu Jeong Computer Systems Laboratory Sungkyunkwan University

Xen scheduler status. George Dunlap Citrix Systems R&D Ltd, UK

GPU Consolidation for Cloud Games: Are We There Yet?

Vulkan: Scaling to Multiple Threads. Kevin sun Lead Developer Support Engineer, APAC PowerVR Graphics

CFS-v: I/O Demand-driven VM Scheduler in KVM

Real-time scheduling for virtual machines in SK Telecom

EVALUATING WINDOWS 10: LEARN WHY YOUR USERS NEED GPU ACCELERATION

Real-Time Cache Management for Multi-Core Virtualization

Real-Time Internet of Things

15: OS Scheduling and Buffering

CS370 Operating Systems

CS370 Operating Systems

CS 470 Spring Virtualization and Cloud Computing. Mike Lam, Professor. Content taken from the following:

Empirical Evaluation of Latency-Sensitive Application Performance in the Cloud

IO virtualization. Michael Kagan Mellanox Technologies

CS-580K/480K Advanced Topics in Cloud Computing. VM Virtualization II

Virtual Machines. To do. q VM over time q Implementation methods q Hardware features supporting VM q Next time: Midterm?

QoS support for Intelligent Storage Devices

Xen and the Art of Virtualization. Nikola Gvozdiev Georgian Mihaila

I/O Device Controllers. I/O Systems. I/O Ports & Memory-Mapped I/O. Direct Memory Access (DMA) Operating Systems 10/20/2010. CSC 256/456 Fall

I/O Systems. Amir H. Payberah. Amirkabir University of Technology (Tehran Polytechnic)

Preserving I/O Prioritization in Virtualized OSes

Reserves time on a paper sign-up sheet. Programmer runs his own program. Relays or vacuum tube hardware. Plug board or punch card input.

Networks and Operating Systems Chapter 11: Introduction to Operating Systems

Request-Oriented Durable Write Caching for Application Performance

Abstract. Testing Parameters. Introduction. Hardware Platform. Native System

Sistemi in Tempo Reale

CS 350 Winter 2011 Current Topics: Virtual Machines + Solid State Drives

CPU Scheduling. Operating Systems (Fall/Winter 2018) Yajin Zhou ( Zhejiang University

Process Scheduling Queues

Virtualization. Pradipta De

Re-architecting Virtualization in Heterogeneous Multicore Systems

Using MySQL in a Virtualized Environment. Scott Seighman Systems Engineer Sun Microsystems

Operating Systems (2INC0) 2018/19. Introduction (01) Dr. Tanir Ozcelebi. Courtesy of Prof. Dr. Johan Lukkien. System Architecture and Networking Group

Four Components of a Computer System

OS Virtualization. Why Virtualize? Introduction. Virtualization Basics 12/10/2012. Motivation. Types of Virtualization.

What s New in VMware vsphere 4.1 Performance. VMware vsphere 4.1

Enlightening the I/O Path: A Holistic Approach for Application Performance

Main Points of the Computer Organization and System Software Module

Supporting Time-sensitive Applications on a Commodity OS

Performance Evaluation of Virtualization Technologies

Transparent Fault Tolerance of Device Drivers for Virtual Machines CS/TR June 29, K A I S T Department of Computer Science

How it can help your organisation

Unit 5: Distributed, Real-Time, and Multimedia Systems

Managing Performance Variance of Applications Using Storage I/O Control

3.1 Introduction. Computers perform operations concurrently

Fault Tolerant Java Virtual Machine. Roy Friedman and Alon Kama Technion Haifa, Israel

Virtual Machine Monitors!

Virtualization. Operating Systems, 2016, Meni Adler, Danny Hendler & Amnon Meisels

Real-Time Component Software. slide credits: H. Kopetz, P. Puschner

Processes and Non-Preemptive Scheduling. Otto J. Anshus

Lecture 4: Memory Management & The Programming Interface

Guest-Aware Priority-Based Virtual Machine Scheduling for Highly Consolidated Server

QuartzV: Bringing Quality of Time to Virtual Machines

The Architecture of Virtual Machines Lecture for the Embedded Systems Course CSD, University of Crete (April 29, 2014)

CS370 Operating Systems

Scotch: Combining Software Guard Extensions and System Management Mode to Monitor Cloud Resource Usage

Chapter 4: Threads. Chapter 4: Threads. Overview Multicore Programming Multithreading Models Thread Libraries Implicit Threading Threading Issues

Shadowfax: Scaling in Heterogeneous Cluster Systems via GPGPU Assemblies

An Efficient Virtual CPU Scheduling Algorithm for Xen Hypervisor in Virtualized Environment

Multi-Hypervisor Virtual Machines: Enabling An Ecosystem of Hypervisor-level Services

Server Virtualization Approaches

Full Scalable Media Cloud Solution with Kubernetes Orchestration. Zhenyu Wang, Xin(Owen)Zhang

EVALUATING WINDOWS 10 LEARN WHY YOUR USERS NEED GPU ACCELERATION

VM Migration, Containers (Lecture 12, cs262a)

Introduction. What is an Operating System? A Modern Computer System. Computer System Components. What is an Operating System?

Distributed Systems COMP 212. Lecture 18 Othon Michail

NFVnice: Dynamic Backpressure and Scheduling for NFV Service Chains

I/O and virtualization

Advanced Operating Systems (CS 202) Scheduling (2)

GViM: GPU-accelerated Virtual Machines

Virtualization with XEN. Trusted Computing CS599 Spring 2007 Arun Viswanathan University of Southern California

238P: Operating Systems. Lecture 14: Process scheduling

Supporting Soft Real-Time Tasks in the Xen Hypervisor

Microsoft RemoteFX for Remote Desktop Virtualization Host Capacity Planning Guide for Windows Server 2008 R2 Service Pack 1

COMPUTER ARCHITECTURE. Virtualization and Memory Hierarchy

On the DMA Mapping Problem in Direct Device Assignment

Operating Systems. Scheduling

Operating System: Chap13 I/O Systems. National Tsing-Hua University 2016, Fall Semester

Chapter 6: CPU Scheduling. Operating System Concepts 9 th Edition

Knut Omang Ifi/Oracle 20 Oct, Introduction to virtualization (Virtual machines) Aspects of network virtualization:

System Virtual Machines

Chapter 1: Introduction Operating Systems MSc. Ivan A. Escobar

Khronos and the Mobile Ecosystem

PAC094 Performance Tips for New Features in Workstation 5. Anne Holler Irfan Ahmad Aravind Pavuluri

Transcription:

Scheduler Support for Video-oriented Multimedia on Client-side Virtualization Hwanju Kim 1, Jinkyu Jeong 1, Jaeho Hwang 1, Joonwon Lee 2, and Seungryoul Maeng 1 Korea Advanced Institute of Science and Technology (KAIST) 1 Sungkyunkwan University 2 ACM Multimedia Systems (MMSys) February, 22-24, Chapel Hill, North Carolina, USA

Virtualization Everywhere A new layer between OS and HW =hypervisor OS Hypervisor OS Multiple OSes High resource utilization Strong Isolation Easy management Live maintenance Fast provisioning Server-side virtualization Client-side virtualization Cloud computing Virtual desktop infrastructure 2/22

Client-side Virtualization Multiple OS instances on a local device Primary use cases Different OSes for application compatibility Consolidating business and personal computing environments on a single device BYOD: Bring Your Own Device Managed domain Business Personal Hypervisor 3/22

Multimedia on Virtualized Clients Multimedia is ubiquitous on any Video Playback Compilation Data Processing 3D game Video conference Downloading Windows Linux Business Personal Business Personal Hypervisor Hypervisor Hypervisor 1. Multimedia workloads are dominant on virtualized clients 2. Interactive systems can have concurrently mixed workloads 4/22

Issues on Multi-layer Scheduling A multimedia-agnostic hypervisor invalidat es OS policies for multimedia Larger CPU proportion & Timely dispatching BVT [SOSP 99] SMART [TOCS 03] Rialto [SOSP 97] BEST [MMCN 02] HuC [TOMCCAP 06] Redline [OSDI 08] RSIO [SIGMETRICS 10] Windows MMCSS Task OS scheduler CPU Task Additional abstraction Task I m unaware of any multimedia -specific OS policies in a, since I see each as a black box. Task OS scheduler Virtual CPU Hypervisor Scheduler CPU Task Task OS Scheduler Virtual CPU Semantic gap! 5/22

Multimedia-agnostic Hypervisor Multimedia QoS degradation Average FPS 30 25 20 15 10 5 0 Two s with equal CPU shares Multimedia + Competing Video playback (720p) on VLC media player Average FPS 100 90 80 70 60 50 40 30 20 10 0 Video playback or 3D game Xen hypervisor Credit scheduler Quake III Arena (demo1) Competing workloads Competing workloads in another Competing workloads in another 6/22

Possible Solutions to Semantic Gap Explicit vs. Implicit Explicit OS cooperation OS scheduler Hypervisor Scheduler + Accurate - OS modification - Infeasible w/o multimedia-friendly OS schedulers Explicit User involvement OS scheduler Hypervisor Scheduler + Simple - Inconvenient - Unsuitable for dynamic workloads Implicit Hypervisor-only OS scheduler Workload monitor Hypervisor Scheduler + Transparency - Difficult to identify workload demands at the hypervisor 7/22

Proposed Approach Multimedia-aware hypervisor scheduler Transparent scheduler support for multimedia No modifications to upper layer SW (OS & apps) Feedback-driven scheduling Multimedia monitor Hypervisor Multimedia manager (feedback-driven) Scheduling command (e.g., CPU share or priority) CPU scheduler Estimated multimedia QoS Audio Video CPU Challenges 1. How to estimate multimedia QoS based on a small set of HW events? 2. How to control CPU scheduler based on the estimated information 8/22

Multimedia QoS Estimation What is estimated as multimedia QoS? Display rate (i.e., frame rate) Used by HuC scheduler [TOMCCAP 06] How is a display rate captured at the hypervisor? Two types of display 1. Memory-mapped display (e.g., video playback) Display interface Memory-mapped Graphics Library 2. GPU-accelerated display (e.g., 3D game) Framebuffer Acceleration unit Video device 9/22

Memory-mapped Display (1/2) How to estimate a display update rate on the memory-mapped framebuffer Write-protection for virtual address space mapped to framebuffer Display interface write The hypervisor can inspect any attempt to map memory Hypervisor page fault handler { Update display rate } Virtual address space Framebuffer memory Write-protection Sampling to reduce trap overheads (1/128 pages, by default) 10/22

Memory-mapped Display (2/2) Accurate estimation Maintaining display rate per task An aggregated display rate does not represent multimedia QoS Tracking guest OS task at the hypervisor Task 25 FPS Task 10 FPS Inspecting address space switches (Antfarm [USENIX 06]) Monitoring audio access (RSIO [SIGMETRIC 10]) Inspecting audio buffer access with write-protection A task with a high display rate and audio access à a multimedia task 11/22

GPU-accelerated Display (1/2) Naïve method Inspecting GPU command buffer with write-protection or polling Too heavy due to huge amount of GPU commands Lightweight method Little overhead, but less accuracy 3D games are less sensitive to frame rate degradation than video playback GPU interrupt-based estimation An interrupt is typically used for an application to manage buffer memory Hypothesis A GPU interrupt rate is in proportion to a display rate 12/22

# of GPU interrupt / sec GPU-accelerated Display (2/2) Linear relationship between display rates and GPU interrupt rates 12000 10000 8000 6000 4000 2000 0 Intel GMA 950 (Apple MacBook) Quake3 demo1 (640x480) Quake3 demo2 (640x480) Quake3 demo1 (1024x768) # of GPU interrupt / sec 160 140 120 100 80 Nvidia 6150 Go (HP Pavillion tablet) Quake3 demo1 (640x480) Quake3 demo2 (640x480) Quake3 demo1 (1024x768) 60 0 0 20 40 60 A 80 GPU 100 interrupt 0 20 rate 40 can 60 be 80 used 100 to estimate 0 20 a display 40 rate 60 FPS FPS FPS without additional overheads # of GPU interrupt / sec 400 350 300 250 200 150 100 50 PowerVR (Samsung GalaxyS) Quake3 demo4 (320x240) Quake3 demo4 (640x480) Exponential weighted moving average (EWMA) is used to reduce fluctuation EWMA t = (1-w) x EWMA t-1 + w x current value 13/22

Multimedia Manager A feedback-driven CPU allocator Base assumption Additional CPU share (or higher priority) improves a display rate Desired frame rate (DFR) A currently achievable display rate Multiplied by tolerable ratio (0.8) /* Exceptional cases: * 1) No relationship between CPU and FPS * 2) FPS is saturated below DFR * 3) Local CPU contention in a */ If no FPS improvement by CPU share increase (3 times) Then Decrease CPU share by half IF current FPS < previous FPS AND current FPS < DFR THEN Increase CPU share If in initial phase Then Exponential increase Else Linear increase 14/22

Priority Boosting Responsive dispatching Problem The hypervisor does not distinguish the types of events for priority boosting A that will handle a multimedia event cannot preempt a currently running handling a normal event. Higher priority for multimedia-related events e.g., video, audio, one-shot timer Priority MMBOOST IOBOOST Normal priority Multimedia events Other events Based on remaining CPU shares 15/22

Evaluation Experimental environment Intel MacBook with Intel GMA 950 Xen 3.4.0 with Ubuntu 8.04 Implementation based on Xen Credit scheduler Two- scenario One with direct I/O + one with indirect (hosted) I/O Presenting the case of direct I/O in this talk See the paper for the details of the indirect I/O case 16/22

Estimation Accuracy Estimation accuracy Error rates: 0.55%~3.05% Video playback (720p) (w/ CPU-bound ) Real FPS Estimated FPS multimedia manager disabled FPS 30 25 20 15 10 5 0 0 5 10 15 20 25 30 35 40 45 50 55 60 65 70 75 80 85 Time (sec) Quake 3 (w/ CPU-bound ) Real FPS Estimated FPS Estimated FPS (EWMA, w=0.2) FPS 100 80 60 40 20 0 0 5 10 15 20 25 30 35 40 45 50 55 60 65 Time (sec) 17/22

Estimation Overhead CPU overhead caused by page faults Video playback 0.3~1% with sampling Less than 5% with tracking all pages Overhead All pages Sampling 1/8 pages 1/32 pages 1/128 pages Low resolution (640x354) High resolution (1280x720) 4.95% 1.10% 0.54% 0.58% 3.91% 1.04% 0.69% 0.33% 18/22

Multimedia Manager Video playback (720p) + CPU-bound FPS DFR CPU share (%) FPS 25 20 15 25 20 15 10 5 0 25 100 90 0 10 20 30 40 50 60 70 80 80 20 70 60 50 Time (sec) 15 100 90 80 70 60 50 40 30 20 100 10 090 80 70 60 CPU share (%) 10 40 50 5 5 6 7 8 9 10 30 20 10 80 81 82 83 84 40 19/22

Performance Improvement Performance improvement Closed to maximum achievable frame rates Video playback or 3D game Hypervisor Competing workloads Credit scheduler Video playback (720p) on VLC media player Credit scheduler w/ multimedia support Quake III Arena (demo1) Credit scheduler Credit scheduler w/ multimedia support Average FPS 25 20 15 10 5 0 Average FPS 100 80 60 40 20 0 Competing workloads in another Competing workloads in another 20/22

Limitations & Discussion Network-streamed multimedia Additional preemption support required for multimedia-related network packets Multiple multimedia workloads in a Multimedia manager algorithm should be refined to satisfy QoS of mixed multimedia workloads in the same Adaptive management for SMP s Adaptive vcpu allocation based on hosted multimedia workloads 21/22

Conclusions Demands for multimedia-aware hypervisor Multimedia are increasingly dominant in virtualized systems Multimedia-friendly hypervisor scheduler Transparent and lightweight multimedia support on client-side virtualization Future directions Multimedia for server-side VDI Multicore extension for SMP s Considerations for network-streamed multimedia 22/22

Thank You! Questions and comments Contact hjukim@calab.kaist.ac.kr 23/22

EXTRA SLIDES

Workload Details Descriptions and CPU usages for used workloads 25/22

Task-based Display Rate Management Different types of frame rate accounting For Unix-like OSes, a display rate is accounted to a previously scheduled task when a framebuffer write is observed (this heuristic works well because a display interface is interactive and is highly likely to be scheduled promptly on requests) Video playback (640x354) + Directory listing on the screen (ls alr /) (w/ CPU-bound ) Real FPS Estimated FPS FPS 30 25 20 15 10 5 0 0 5 10 15 20 25 30 35 40 45 50 55 60 65 70 75 80 85 Time (sec) Error rate = 1.78% 26/22

Local CPU Contention in a Increasing the number of CPU-bound tasks with a video playback application Native Linux Virtualized Linux w/ our scheme Ø Our scheme provides sufficient CPU to a (IDD) that hosts a multimedia workload even though local CPU contention increases Ø No starvation of another CPU-bound (guest) 27/22