Experimental Study of Virtual Machine Migration in Support of Reservation of Cluster Resources


Experimental Study of Virtual Machine Migration in Support of Reservation of Cluster Resources
Ming Zhao, Renato J. Figueiredo
Advanced Computing and Information Systems (ACIS) Laboratory
Electrical and Computer Engineering, University of Florida
{ming, renato}@acis.ufl.edu

Motivating Application
- Dynamic Data-driven Brain-Machine Interface (BMI)
- Resource-demanding parallel application
- Time-critical during online experiments: must finish each closed loop within 1ms
- No stringent time requirement for offline study
[Figure: closed-loop BMI pipeline: in vivo data acquisition feeds BMI models 1..K running as parallel computing on virtualized resources, which drive closed-loop control; data storage supports offline study (http://www.acis.ufl.edu/dddasbmi)]

Motivating Application (cont.)
Motivation for VM-based resource reservation:
- Online experiments: dedicate cluster resources to the VMs running BMI models; without reservation, the deadline is often violated
- Offline study: share cluster resources with VMs running other workloads
- Cluster reservation: migrate the existing VMs to other resources
[Figure: online experimenting vs. offline study; the closed loop of data polling, data transfer, computing, and robot control must complete within 1ms (http://www.acis.ufl.edu/dddasbmi)]

Overview
- Goal: efficient VM-based cluster resource reservation
- Challenge: vacate resources in time (to meet the schedule) but not too early (to fully utilize resources)
- Solution: build a migration overhead model through experimental analysis

Outline
- Architecture
- Experimental analysis: methodology; migrating a single VM; migrating a sequence of VMs; migrating VMs in parallel
- Summary

VM-based Resource Reservation
- Virtual resource manager: global management of virtualized resources; presents an abstracted interface for resource requests; hides the complexity of using virtualized resources; coordinates with VM schedulers to reserve and allocate resources with VMs
- Virtual machine scheduler: per-host management of VMs; presents a unified interface for managing VMs; hides the complexity of using different VM software
[Figure: reservation protocol between the Job Manager, the Virtual Resource Manager, and per-host VM Schedulers: 1) resource request: need 1GHz CPU, 1GB RAM at 1am; 2) admission control: OK; 3) reservation: create a VM with... by 1am; 4) VM instantiation; 5) resource handlers: machine IP 1.5.15.3; a sketch of this flow follows below]
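To make the reservation flow concrete, here is a minimal Python sketch of the request/admission-control exchange between the Job Manager, the Virtual Resource Manager, and per-host VM schedulers. All names (ResourceRequest, VMScheduler, reserve, ...) are hypothetical illustrations of the protocol on the slide, not APIs from the actual system.

```python
from dataclasses import dataclass

@dataclass
class ResourceRequest:
    cpu_ghz: float   # 1) resource request, e.g. "need 1GHz CPU, 1GB RAM at 1am"
    ram_mb: int
    start_time: str

@dataclass
class ResourceHandle:
    machine_ip: str  # 5) resource handler returned to the job manager

class VMScheduler:
    """Per-host VM scheduler: one unified interface hiding the VM software."""
    def __init__(self, host_ip: str, free_cpu_ghz: float, free_ram_mb: int):
        self.host_ip = host_ip
        self.free_cpu_ghz = free_cpu_ghz
        self.free_ram_mb = free_ram_mb

    def can_host(self, req: ResourceRequest) -> bool:
        return req.cpu_ghz <= self.free_cpu_ghz and req.ram_mb <= self.free_ram_mb

    def instantiate_vm(self, req: ResourceRequest) -> None:
        # 4) VM instantiation (a real scheduler would invoke the VM software here)
        self.free_cpu_ghz -= req.cpu_ghz
        self.free_ram_mb -= req.ram_mb

class VirtualResourceManager:
    """Global manager: admission control plus coordination with the schedulers."""
    def __init__(self, schedulers: list):
        self.schedulers = schedulers

    def reserve(self, req: ResourceRequest) -> ResourceHandle:
        for sched in self.schedulers:
            if sched.can_host(req):        # 2) admission control: OK
                sched.instantiate_vm(req)  # 3) reservation -> VM creation
                return ResourceHandle(machine_ip=sched.host_ip)
        raise RuntimeError("admission control: request rejected")

vrm = VirtualResourceManager([VMScheduler("1.5.15.3", 2.0, 2048)])
print(vrm.reserve(ResourceRequest(cpu_ghz=1.0, ram_mb=1024, start_time="1am")))
```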

Migration-based Cluster Vacating Example
[Figure: three stages of vacating physical host P1]
(I) P1 is shared by VMs running various workloads
(II) P1 is reserved by migrating the existing VMs to P2 and P3
(III) P1 is dedicated to the VMs created for a parallel application

Experimental Setup
- Physical cluster: each node has four CPUs (.33GHz), GB memory; Gigabit Ethernet interconnect
- Virtual machines: VMware Server
  - VM disk: shared read-only image on NFS (.vmdk); independent changes (redo logs) on local disks (.REDO)
  - VM memory: mapped to a file (.vmem) stored on local disks
[Figure: two nodes, each holding local .vmx/.REDO/.vmem files, sharing a read-only .vmdk image on NFS]

Experimental Setup (cont.)
Migration strategy (see the sketch below):
- Suspend: synchronizes the VM memory state to a file
- Copy: uses FTP to transfer the memory state file and the disk redo files; maximum throughput is about 1MB/s
- Resume: restores the memory state from the file in the foreground
Not based on VMware solutions (VMotion)
[Figure: the VM's .vmx/.REDO/.vmem files move from Node 1's local disk to Node 2's; the shared .vmdk image stays on NFS]
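A minimal sketch of this suspend/copy/resume strategy, assuming passwordless ssh/scp between the nodes (the talk used FTP for the copy) and a hypothetical vm-control command standing in for whatever suspends and resumes the VM; command names and paths are illustrative, not the authors' scripts.

```python
import subprocess
from pathlib import Path

def suspend_vm(vmx: Path) -> None:
    """Suspend: the hypervisor synchronizes the VM memory state to the .vmem file.
    'vm-control' is a hypothetical stand-in for the real control command."""
    subprocess.run(["vm-control", str(vmx), "suspend"], check=True)

def copy_state(vmx: Path, dest_host: str, dest_dir: str) -> None:
    """Copy: transfer the memory state file and the disk redo files.
    The shared read-only .vmdk stays on NFS and is not copied."""
    for suffix in (".vmx", ".vmem", ".REDO"):
        for f in vmx.parent.glob(f"*{suffix}"):
            subprocess.run(["scp", str(f), f"{dest_host}:{dest_dir}/"], check=True)

def resume_vm(vmx_path: str, dest_host: str) -> None:
    """Resume: restore the memory state from file, in the foreground."""
    subprocess.run(["ssh", dest_host, f"vm-control {vmx_path} start"], check=True)

def migrate(vmx: Path, dest_host: str, dest_dir: str) -> None:
    suspend_vm(vmx)                                 # phase 1: suspend
    copy_state(vmx, dest_host, dest_dir)            # phase 2: copy
    resume_vm(f"{dest_dir}/{vmx.name}", dest_host)  # phase 3: resume
```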

Migration Overhead of a Single VM
- Different VM memory sizes: the suspend and resume phases are fast; copy time grows as memory size increases
- Different methods to store the migrated state: RAMFS removes the impact of disk I/Os on the migration process
[Figure: per-phase migration time (suspend, copy, resume) vs. VM memory size, using disks vs. RAMFS on the destination host to store the migrated state files]

Modeling the Copy Phase
- Regression over copy time vs. memory size: polynomial when using disk, linear when using RAMFS
  - Using disks: y = 1E-5x^2 + .3x + 1.7 (R^2 = .997)
  - Using RAMFS: y = .9x + .75 (R^2 = 1)
- The RAMFS setup is used for all following experiments
[Figure: measured copy time (s) vs. memory size (MB) with the fitted regression curves]
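The regression itself is a few lines of NumPy. The measurements below are placeholders invented for illustration; only the fitting procedure (degree-2 polynomial for the disk case, linear for RAMFS) mirrors the slide.

```python
import numpy as np

# Placeholder data: copy time (s) for several VM memory sizes (MB).
mem_mb  = np.array([128, 256, 384, 512, 640, 768, 896, 1024], dtype=float)
t_disk  = np.array([3.0, 6.5, 10.5, 15.0, 20.0, 25.5, 31.5, 38.0])
t_ramfs = np.array([1.5, 2.8, 4.1, 5.4, 6.7, 8.0, 9.3, 10.6])

disk_fit  = np.polyfit(mem_mb, t_disk, deg=2)   # [a, b, c] of a*x^2 + b*x + c
ramfs_fit = np.polyfit(mem_mb, t_ramfs, deg=1)  # [m, b] of m*x + b

def r_squared(y: np.ndarray, y_hat: np.ndarray) -> float:
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

print("disk:", disk_fit, "R^2 =", r_squared(t_disk, np.polyval(disk_fit, mem_mb)))
print("ramfs:", ramfs_fit, "R^2 =", r_squared(t_ramfs, np.polyval(ramfs_fit, mem_mb)))
```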

Modeling the Suspend and Resume Phases
- Resume is typically short: the memory state is already buffered after the copy phase
- Suspend is typically short: the memory state is frequently synchronized to the file
[Figure: suspend and resume times (s) for a baseline and several VM memory sizes]

Modeling the Suspend and Resume Phases (cont.)
- Running a memory-intensive workload in the migrated VM: a rogue program that continuously modifies memory
- Suspend time increases as more synchronization is needed
[Figure: suspend and resume times (s) under the memory-intensive workload, for a baseline and several VM memory sizes]

Migrating a Sequence of VMs
- Per-VM migration time is consistent regardless of the number of sequentially migrated VMs
[Figure: per-VM suspend, copy, resume, and total times when migrating a single VM vs. sequences of two, four, or eight VMs, shown for two VM RAM sizes]

Migrating VMs with a CPU-intensive Workload
- CPU-intensive workload: runs iteratively and consumes 100% of a CPU
- Performance degradation = impacted iteration time / regular iteration time
- It takes longer for the applications in the VMs to recover to full performance than for the migrations to finish
[Figure: migration delay (s) and iteration time (s) over time while migrating four VMs in sequence; each VM has 51MB-RAM and runs a CPU-intensive benchmark]

Migrating VMs with a Memory-intensive Workload
- Memory-intensive workload: runs iteratively and touches 1% of memory
- Performance degradation = time of impacted iterations / (regular iteration time × number of impacted iterations)
- Migration is itself a memory-intensive process
[Figure: migration delay (s) and iteration time (s) over time while migrating four VMs in sequence; each VM has 51MB-RAM and runs a memory-intensive benchmark]
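Both degradation metrics can be computed straight from per-iteration timings. The helper below is an illustrative reading of the two formulas; the 10% slowdown threshold used to decide whether an iteration was "impacted" is an arbitrary assumption, not from the talk.

```python
def cpu_degradation(impacted_iter_time: float, regular_iter_time: float) -> float:
    """CPU-intensive case: impacted iteration time / regular iteration time."""
    return impacted_iter_time / regular_iter_time

def mem_degradation(iter_times: list, regular_iter_time: float,
                    threshold: float = 1.1) -> float:
    """Memory-intensive case: total time of the impacted iterations over
    (regular iteration time x number of impacted iterations)."""
    impacted = [t for t in iter_times if t > threshold * regular_iter_time]
    if not impacted:
        return 1.0  # no iteration was disturbed
    return sum(impacted) / (regular_iter_time * len(impacted))

# Example: regular iterations take 2s; three iterations overlapped a migration.
print(cpu_degradation(5.0, 2.0))                             # 2.5x
print(mem_degradation([2.0, 2.1, 5.0, 6.0, 4.0, 2.0], 2.0))  # 2.5x
```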

Migrating VMs with Web Workloads
- Apache Web server in each VM serves constant-rate requests from outside clients (1 request/s each)
- It takes longer for the throughputs to recover than for the migrations to complete
[Figure: per-VM and aggregate reply rates over time while migrating four VMs in sequence; each VM has 51MB-RAM and runs a Web server]

Migrating VMs in Parallel
- Parallel migration has a shorter total migration time: faster vacating of cluster resources
- Sequential migration has a shorter per-VM migration time: less impact on the applications inside the VMs
[Figure: total and per-VM migration times (s) vs. number of VMs, comparing parallel (P) and sequential (S) migration for two VM RAM sizes]
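The orchestration difference is small in code. In the sketch below, migrate_vm stands in for the per-VM suspend/copy/resume driver sketched earlier and simply sleeps; on a real network the parallel copies would share bandwidth, which is exactly why per-VM time grows even as total time shrinks.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def migrate_vm(vm: str) -> float:
    """Stand-in for the per-VM suspend/copy/resume driver."""
    start = time.time()
    time.sleep(1.0)  # emulate a fixed 1s copy phase
    return time.time() - start

vms = [f"vm{i}" for i in range(4)]

# Sequential migration: per-VM time stays ~1s, total is ~4s.
t0 = time.time()
per_vm = [migrate_vm(v) for v in vms]
print(f"sequential: total {time.time() - t0:.1f}s, per-VM {max(per_vm):.1f}s")

# Parallel migration: total drops toward ~1s in this toy model; with real
# copies the shared network would instead stretch each per-VM transfer.
t0 = time.time()
with ThreadPoolExecutor(max_workers=len(vms)) as pool:
    per_vm = list(pool.map(migrate_vm, vms))
print(f"parallel:   total {time.time() - t0:.1f}s, per-VM {max(per_vm):.1f}s")
```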

Related Work
- VM-based resource management: In-VIGO, Virtual Workspaces, Virtuoso, Shirako
- VM migration based resource reallocation: VIOLIN, Sandpiper
- VM migration optimizations: VMware VMotion, Xen live migration

Summary
Conclusions:
- It is feasible to predict the migration time of VMs with different configurations and different numbers of VMs
- It takes longer for a VM's application to recover than the migration time itself
- Sequential migration has less impact on the VMs' applications; parallel migration is faster for migrating multiple VMs
Future work:
- Generalize and validate the model across different physical platforms and VM technologies
- Consider modeling of live migrations

Acknowledgements
- DDDASBMI project team: http://www.acis.ufl.edu/dddasbmi
- Sponsorship from NSF
- Publication approval from VMware
Questions? ming@acis.ufl.edu, http://www.acis.ufl.edu/~ming