Virtualization and High-Availability

LAAS, 30 November 2009
François Armand, OpenWide, Université Paris 7
francois.armand@openwide.fr

Agenda
- Reminder about virtualization, HA, SA Forum
- HA challenge introduced by virtualization
- Some failover scenarios
- Reminder on virtualization and devices
- Local failover scenario and issues
- Some initial modeling work

Virtualization Enables Consolidation
- Taking advantage of more powerful hardware
[Figure: applications A, B, C, D, first on separate SMP systems, then consolidated onto a VMM running on multicore hardware]

Classic VMs
- Make it possible to run multiple independent OSs (guest OSs) simultaneously on the same processor, each in its own virtual machine
- Two main approaches:
  - Native VMs: introduce a software layer between the hardware and the OS: the Virtual Machine Monitor (VMM), or bare-metal hypervisor
  - Hosted VMs: require a host OS to start first
[Figure: native stack (Apps / Guest OSs / VMM / Hardware) next to hosted stack (Apps / VMM / Host OS / Hardware)]

Taxonomy (derived from J. E. Smith & R. Nair)
System Level (ISA) VMs
- Same ISA: system virtualization / hardware virtualization (classic VM)
  - Native, Type I
    - Paravirtualized: Xen, VLX
    - HW assisted: Xen, VLX
    - Transparent full/native virtualization, dynamic binary translation: VMware ESX
  - Hosted, Type II: VMware WS, KVM, VirtualBox, ...
- Different ISA: hardware emulation, whole system: Bochs, QEMU
Process Level (ABI) VMs
- Same ISA
  - Multiprogrammed systems, multitask OS
  - OS virtualization, virtual servers: Virtuozzo, Solaris Zones
  - OS translator: WINE
- Different ISA
  - Dynamic translators
  - ISA & ABI translator: FX!32
  - ISA & OS translator: Transitive
  - High-level language: Java

Virtualization and Availability
- VM live migration for planned downtime
- Hypervisor-based fault tolerance
- VMM rejuvenation: in-RAM suspension (ACPI S3) + kexec-style VMM reboot
- Marathon HA and FT (lock step), VMware FT
- Kemari (Xen, KVM): synchronisation of a passive VM

HA is about surviving failures
- To survive a single failure you need a redundant component (spare, standby, ...)
- The system can detect faults and reconfigure itself to use a redundant component
[Figure: fault -> error -> failure chain propagating from component to system to user; with a standby component the error is detected and the user sees no failure]
Copyright 2006 Service Availability Forum, Inc

4 Means to achieve HA
Fault Prevention (acts on MTBF)
- Quality assurance
- Avoid operator errors
- Avoid overload situations
- Supported by virtualization:
  - Independent management of OSs
  - Fair-share scheduling policy in VMM
Fault Removal (acts on MTBF)
- Remove faults after/before
- HW maintenance: corrective, preventive (FRU)
- Collection of evidence: logs, dumps
- Supported by virtualization:
  - Software upgrades
Fault Tolerance (acts on MTTR + MTBF)
- Survive in spite of failures
- Detection, isolation, recovery, repair => HA middleware
- Supported by virtualization:
  - Fully isolated guest OSs
  - Independent reboot of OSs
  - Fast restart
Fault Forecasting (acts on MTBF)
- Evaluation of system behaviour
- Qualitative: identify, classify, rank failures
- Quantitative: evaluate the probability with which the attributes of dependability are satisfied

What's (usually) needed
- Redundant hardware: costly but easy
- Redundant runtime software instances. This creates additional needs:
  - Need to determine which instance is active / passive
  - Need to determine failure and instruct the passive instance to go active
  - Need to help the active instance send state to the passive one (checkpoints)
  - And much more
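The three needs listed above (role assignment, failure detection, checkpointing) can be illustrated with a minimal sketch. The class and method names (`Instance`, `RedundantPair`, `checkpoint`, `monitor`) are hypothetical and are not SA Forum APIs; failure detection is reduced to reading a liveness flag, where a real system would use heartbeats or a health-check service.

```python
class Instance:
    """One runtime software instance (hypothetical model)."""

    def __init__(self, name):
        self.name = name
        self.role = "standby"
        self.alive = True
        self.state = {}


class RedundantPair:
    """Active/standby pair with checkpointing; an illustration only."""

    def __init__(self, first, second):
        # Role assignment: decide which instance is active and which is passive.
        self.active, self.standby = first, second
        self.active.role = "active"

    def checkpoint(self):
        # The active instance sends a copy of its state to the standby.
        self.standby.state = dict(self.active.state)

    def monitor(self):
        # Failure detection, then instruct the standby to go active.
        if not self.active.alive and self.standby.alive:
            self.active.role = "failed"
            self.standby.role = "active"
            self.active, self.standby = self.standby, self.active
        return self.active


pair = RedundantPair(Instance("vm-a"), Instance("vm-b"))
pair.active.state["sessions"] = 3
pair.checkpoint()
pair.active.state["sessions"] = 4   # this update is never checkpointed
pair.active.alive = False           # simulated failure of the active instance
survivor = pair.monitor()
```

After the failover, `survivor` is the former standby and it resumes from the last checkpoint (3 sessions), not from the lost update. How often to checkpoint is the usual trade-off between runtime overhead and the amount of state lost on failover.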

Service Availability Forum
- SAF provides specifications to standardize APIs for software providing availability services:
  - Checkpoint, log, notification, alarms, events, messaging
  - Availability Management Framework (AMF)
  - Hardware Platform Interface (HPI)
  - Software Management Framework (SMF)
  - Platform Management (PLM)
  - Information Model Management (IMM)
  - And more
- See http://www.saforum.org/

Virtualization modifies the underlying HA assumptions / paradigm
[Figure: stack of Application, HA Middleware, VMM and cores]

Challenge arising from HW evolution: virtualization must meet HA requirements
Traditional single-core based platform
- 1-to-1 dependency of Appli / OS / HW
- 1 application per processing blade
- [Monocore processors on blades]
New multicore virtualized platform
- Virtualization enables consolidation
- Many-core processors are part of the next designs
- Virtualization will be a key element of platforms
[Figure: left, applications a, b, c and HA management, each on its own blade; right, the same applications and HA management as VMs on cores 1 to 4 of one multicore blade, with virtualization management]

Challenge arising from HW evolution: virtualization adds a new dimension to HA
Traditional HA
- 1-to-1 dependency of Appli / OS / HW
- 1 application per processing blade
- Redundancy & HA managed at blade level
New HA
- Virtualization introduces a new entity: VMs
- The HA dependency chain is modified
- Having HA manage the VMs enhances platform availability
[Figure: left, HA management and applications each on their own HW; right, HA management and applications on a virtualization layer over shared HW]

Remote Failover
- Resilience to software and hardware failures
- HA management is redundant as well
[Figure: two hardware platforms, each with a virtualization layer hosting three VMs: HA mngt and two applications (x, y), marked Active on one side and Sby on the other]

Failover on Hardware Failure
- Multiple failovers can be handled simultaneously
- Failovers could be directed to different physical machines
- Redundancy models: 2N: OK; N+1: ???; N+M: ???
[Figure: two hardware platforms, each with a virtualization layer hosting three VMs: HA mngt, Appli a and Appli x; both applications Active on the first platform and Sby on the second]

Local Failover
- Low-cost hardware solution
- Resilience to software failure only
- Restart the failed VM (policy defined)
- Can reboot the hardware upon escalation
[Figure: the same platform before failover (application VMs Active and Sby) and after failover (the surviving application VM marked ALONE)]
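A policy-defined restart with escalation to a hardware reboot could look like the sketch below. The slide does not define the policy; the function name `recovery_action`, the threshold `max_restarts` and the observation `window` are assumptions chosen for illustration.

```python
def recovery_action(failure_times, now, max_restarts=3, window=600.0):
    """Pick a recovery action for a failed VM (hypothetical policy).

    failure_times: timestamps (seconds) of this VM's earlier failures.
    max_restarts and window are assumed values, not taken from the slides.
    """
    # Only failures inside the observation window count towards escalation.
    recent = [t for t in failure_times if now - t <= window]
    if len(recent) >= max_restarts:
        # Restarting the VM has not helped: escalate to the hardware.
        return "reboot_hardware"
    return "restart_vm"


first = recovery_action([], now=100.0)
repeated = recovery_action([10.0, 20.0, 30.0], now=100.0)
stale = recovery_action([10.0, 20.0, 30.0], now=1000.0)
```

Counting only recent failures keeps an old, unrelated burst of failures from triggering a board reboot much later.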

Single Hardware Issues
- VM performing HA management
- VM providing device access
  - Adds a SPOF in addition to HW and VMM
- HW (board, devices): it's OK, it's a design choice
- VMM: it's OK, limited amount of code
- HA management can run replicated by design
  - Partly solves the problem
- Issue: device access / management
  - More than 80% of system failures stem from device drivers (cf. Nooks)
[Figure: one hardware platform with a virtualization layer hosting three VMs: HA mngt, Appli Active, Appli Sby]

Virtualization and devices
- Shared devices: accessed by more than one VM
  - Ex: a disk is shared, partitions are not
  - Ex: Ethernet, actually bridging/routing between virtual and physical
  - Virtualized by the VMM
  - Virtualized within a dedicated VM: Dom0, Dom I/O in Xen, any VM in VLX
- Non-shared devices: used exclusively by a single VM
  - Ex: network interface
  - Direct physical device access from the VM: VT-d, PCI support / extensions, VMDQ, ... / VLX, ...

Virtualization and Devices
- Different ways to provide access to devices: transparent I/O or para-virtualized I/O
- Pros and cons in both cases
[Figure: left (transparent I/O), applications and driver in the VM, I/O conversion and real driver in the VMM, device controller; right (para-virtualized I/O), a VM with native driver and back-end driver, a VM with applications and front-end driver, VMM, device controller]

Virtualization and Devices (Cont'd)
- Better hardware support: PCI SR-IOV, MR-IOV, Intel VT-d, ...
- Specific controllers (e.g. VMDQ)
- Or specific VMM implementations: VLX
- Unmodified drivers, better performance
[Figure: applications and native driver in the VM, directly above the VMM and the device controller]

Sharing Devices
- Shared devices are a concern for failure resilience
- Shared devices provided by the VMM:
  - Failure of the driver implies failure of the VMM
  - And failure of all VMs
[Figure: two VMs (applications + driver) above a VMM containing I/O conversion and the real driver, then the device controller]

Sharing Devices
- Sharing provided by a VM, through a back-end driver
  - Failure of the driver => failure of that VM
  - Only client VMs are impacted
  - Restart under condition
[Figure: a VM with native driver and back-end driver, a client VM with applications and front-end driver, VMM, device controller]

Not Sharing Devices
- Multiple I/O-able VMs could solve the dependability issue
- At the cost of more devices
[Figure: two VMs, each with applications and a native driver, above the VMM; one device controller per VM]
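The three placements of a device driver discussed above differ in how far a driver failure spreads. The sketch below encodes that reasoning; the function name `impacted_vms` and the placement labels (`"vmm"`, `"backend_vm"`, `"direct"`) are hypothetical names introduced for illustration.

```python
def impacted_vms(placement, all_vms, driver_vm=None, client_vms=()):
    """Return the set of VMs affected when a device driver fails.

    placement is one of the assumed labels:
      "vmm"        - shared device, driver inside the VMM
      "backend_vm" - shared device, driver in a VM exporting a back-end
      "direct"     - non-shared device, native driver in the owning VM
    """
    if placement == "vmm":
        # Driver failure implies VMM failure, hence failure of all VMs.
        return set(all_vms)
    if placement == "backend_vm":
        # The driver VM fails; only its client VMs are impacted.
        return {driver_vm} | set(client_vms)
    if placement == "direct":
        # Only the VM owning the device is affected.
        return {driver_vm}
    raise ValueError(f"unknown placement: {placement}")


vms = ["io", "app1", "app2", "ha"]
in_vmm = impacted_vms("vmm", vms)
in_backend = impacted_vms("backend_vm", vms, driver_vm="io", client_vms=["app1"])
not_shared = impacted_vms("direct", vms, driver_vm="app1")
```

The blast radius shrinks from every VM, to the driver VM and its clients, to a single VM, which is the dependability argument for dedicating devices when the hardware budget allows it.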

Fully Independent VMs (for I/O)
- VMs physically independent of each other
- Not a typical SBC device (disk) configuration
- But provides redundancy and HA with limited SPOFs: hardware and virtualization layer
[Figure: active and standby HA Mgt VMs, each with its own native Ethernet and native disk plus a virtual Ethernet, above the virtualization layer; board ports P2, P3, P9, P10 and Ethernet]

Realistic Hardware Configuration
- Requires moving ownership of a device from one VM to the other
  - Support from the virtualization layer (I/O permission, DMA, IRQ routing)
  - Support from the OS: device hot plug, or device activation in sync with application failover!
- HA Mgt replicated, co-located with the application
[Figure: a VM I/O owning the native disk and native Ethernet (physical devices); two user-application VMs (HA Mgt active and HA Mgt standby), each with a virtual disk and virtual Ethernet; virtualization layer, board ports P2, P3, P9, P10, Ethernet and device]

Relaxing the VM I/O SPOF issue
- Upon failure of the VM I/O, restart it; HA Mgt sees it as a non-replicated resource
- Virtualized devices wait for VM I/O recovery
- Issue: reset of devices without reset of the board!
[Figure: a VM I/O owning the native disk and native Ethernet (physical devices); two user-application VMs (HA Mgt active and HA Mgt standby), each with a virtual disk and virtual Ethernet; virtualization layer, board ports P2, P3, P9, P10, Ethernet and device]

Complex scenario
- Upon active VM failure, the standby takes over (=> alone)
- The alone VM grabs the physical devices (disk, Ethernet) owned by the failed VM
  - Need multipath support in the OS, without page fault during the switch!
- The alone VM exports virtual devices to the other VMs, which rebind!
  - Front-end device drivers must be able to rebind
[Figure: HA Mgt active VM with native Ethernet, native disk and virtual Ethernet; HA Mgt alone VM with virtual Ethernet, virtual disk and native Ethernet; virtualization layer, board ports P2, P3, P9, P10, Ethernet]

Simple configuration (SC): 1 SBC, 1 dual-core processor
Assumptions:
- Simplified failure model of a single board with 1 dual-core processor, without software
- Currently, failure of a single core implies failure of the processor (i.e. of both cores)
- Failure of the processor is identical to failure of the board
- Failure of I/O peripherals is considered equivalent to failure of the board
- Repair requires changing the board
[Figure: two-state model: OK -> HW failed at rate λ_hw, HW failed -> OK at rate µ_hw; board with two cores]
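For a two-state model like this one, the steady-state availability has the closed form µ/(λ + µ) = MTBF/(MTBF + MTTR). The sketch below evaluates it with the hardware values guessed later in the deck (MTBF of 17 520 hours, MTTR of 6 hours); the function names and the 8 760-hour year are assumptions made for illustration.

```python
HOURS_PER_YEAR = 8760.0  # assumed length of a year, in hours


def availability(mtbf_hours, mttr_hours):
    """Steady-state availability of a two-state (OK / failed) model."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


def downtime_hours_per_year(mtbf_hours, mttr_hours):
    """Expected yearly downtime implied by the availability."""
    return (1.0 - availability(mtbf_hours, mttr_hours)) * HOURS_PER_YEAR


# Hardware-only model: board fails once every 2 years, 6 hours to change it.
hw_availability = availability(17520.0, 6.0)
hw_downtime = downtime_hours_per_year(17520.0, 6.0)
```

With these inputs the hardware alone accounts for roughly three hours of downtime per year, so board replacement time dominates whatever the software layers add.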

SC + 1 SMP OS + 1 application
- Failure of any component leads to unavailability of service
- Application repair: restart the application
- OS repair: reboot the OS and restart the application (might be a fast restart or a hard reset)
- Board repair: change the board
[Figure: state model with states OK, App failed, OS failed, HW failed; failure rates λ_App, λ_OS, λ_hw; repair rates µ_App, µ_OS, µ_hw. Stack: Application A / SMP OS / board with two cores]
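A state model of this kind can be solved numerically as a continuous-time Markov chain. The sketch below is a generic steady-state solver in plain Python, applied to a four-state chain. The transition set is an assumption (the figure is only partly recoverable), the rates are derived from the deck's guessed MTBF/MTTR values, and the result is not expected to reproduce the MEADEP figures exactly.

```python
def steady_state(n_states, rates):
    """Solve pi * Q = 0 with sum(pi) = 1 for a small continuous-time Markov chain.

    rates maps (source, target) state indices to a transition rate per hour.
    """
    # Generator matrix Q: off-diagonal rates, diagonal = minus the row sum.
    q = [[0.0] * n_states for _ in range(n_states)]
    for (src, dst), rate in rates.items():
        q[src][dst] += rate
        q[src][src] -= rate
    # Balance equations use the transpose of Q; the last one is replaced
    # by the normalisation sum(pi) = 1.
    a = [[q[j][i] for j in range(n_states)] for i in range(n_states)]
    b = [0.0] * n_states
    a[-1] = [1.0] * n_states
    b[-1] = 1.0
    # Gaussian elimination with partial pivoting.
    for col in range(n_states):
        pivot = max(range(col, n_states), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n_states):
            factor = a[row][col] / a[col][col]
            for k in range(col, n_states):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    pi = [0.0] * n_states
    for row in range(n_states - 1, -1, -1):
        known = sum(a[row][k] * pi[k] for k in range(row + 1, n_states))
        pi[row] = (b[row] - known) / a[row][row]
    return pi


OK, APP_FAILED, OS_FAILED, HW_FAILED = range(4)

# Rates per hour, from the deck's guessed MTBF/MTTR values.
lam_hw, mu_hw = 1 / 17520, 1 / 6
lam_os, mu_os = 1 / 4380, 1 / (2 / 60)
lam_app, mu_app = 1 / 4380, 1 / (30 / 3600)

# Assumed transition set: any failed state can still suffer a HW failure,
# and a failed application can still suffer an OS failure.
rates = {
    (OK, APP_FAILED): lam_app, (OK, OS_FAILED): lam_os, (OK, HW_FAILED): lam_hw,
    (APP_FAILED, OK): mu_app, (APP_FAILED, OS_FAILED): lam_os,
    (APP_FAILED, HW_FAILED): lam_hw,
    (OS_FAILED, OK): mu_os, (OS_FAILED, HW_FAILED): lam_hw,
    (HW_FAILED, OK): mu_hw,
}

pi = steady_state(4, rates)
availability = pi[OK]                      # only the OK state delivers service
downtime_hours_per_year = (1 - availability) * 8760
```

Only the `OK` state counts as available here, matching the statement that failure of any component makes the service unavailable. Adding states (for example per-OS and per-application states under a VMM) only requires extending the `rates` dictionary.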

SC + VLX + 2 OSs + 2 applications
- States: OK, 1 App failed, 2 App failed, 1 OS failed, 1 OS + 1 App failed, 2 OS failed, VLX failed, HW failed
- Green states: available
- Red states: unavailable
[Figure: state model over the states above; transition labels recovered from the figure: 2 * λ_App, λ_App, µ_App, 2 * λ_OS, λ_OS, µ_OS, MIN(µ_App, µ_OS), λ_vlx, µ_vlx, λ_hw, µ_hw. Stack: 2 x Appl. A / 2 OSs / VLX / board with two cores]

States of: SC + VLX + 2 OSs + 2 applications
- 1 App failed (system said available)
  - The other application is still up and running; whether the system is said available or not is up to the end user
  - Repair: restart the failed application
- 2 App failed
  - After failure of the 1st app instance, the second one fails
  - System is unavailable
  - Repair: restart the failed applications (done in parallel)
- 1 OS failed (system said available)
  - The application running on that OS is failed too
  - The other application is still up and running; whether the system is said available or not is up to the end user
  - Repair: reboot that OS and its application

States of: SC + VLX + 2 OSs + 2 applications (Cont'd)
- 1 OS + 1 App failed
  - Only an OS remaining up and running
  - System unavailable
  - Repair: restart the failed app, and reboot the failed OS and its application
- 2 OS failed
  - System unavailable
  - Repair: restart the 2 failed OSs and their applications
- VLX failed
  - System unavailable
  - Repair: reboot VLX (board reset or not)
- HW failed
  - System unavailable
  - Repair: change the board

Guessed Failure and Repair rates (hardware and OS)
Configuration: Application A / SMP OS / board with two cores
- MTBF hw: once every 2 years = 17 520 hours; λ_hw: 57 077 FIT
- MTTR hw: 6 hours; µ_hw: 166 666 666 FIT
- MTBF OS: twice / year = 4 380 hours; λ_OS: 228 310 FIT
- MTTR OS: 2 min = 0.0333 hours; µ_OS: 30 x 10^9 FIT
Configuration: 2 x Appl. A / 2 OSs / VLX / board with two cores
- MTBF hw: once every 2 years = 17 520 hours; λ_hw: 57 077 FIT
- MTTR hw: 6 hours; µ_hw: 166 666 666 FIT
- MTBF OS: twice / year = 4 380 hours; λ_OS: 228 310 FIT
- MTTR OS: 45 sec (repair is faster); µ_OS: 80 x 10^9 FIT

Guessed Failure and Repair rates (application and VLX)
Configuration: Application A / SMP OS / board with two cores
- MTBF App: twice / year = 4 380 hours; λ_App: 228 310 FIT
- MTTR App: 30 sec; µ_App: 120 x 10^9 FIT
Configuration: 2 x Appl. A / 2 OSs / VLX / board with two cores
- MTBF App: twice / year = 4 380 hours; λ_App: 228 310 FIT
- MTTR App: 30 sec; µ_App: 120 x 10^9 FIT
- MTBF VLX: once every 2 years = 17 520 hours; λ_VLX: 57 077 FIT
- MTTR VLX: 2 min = 0.0333 hours; µ_VLX: 30 x 10^9 FIT
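The FIT figures above follow from the mean times by a single conversion: one FIT is one event per 10^9 hours, so a rate in FIT is 10^9 divided by the mean time in hours. The deck applies the same unit to repair rates. A minimal sketch, with the helper name `rate_in_fit` being an assumption:

```python
FIT_HOURS = 1e9  # one FIT = one event per 10^9 hours


def rate_in_fit(mean_time_hours):
    """Convert a mean time (MTBF or MTTR, in hours) to a rate in FIT."""
    return FIT_HOURS / mean_time_hours


lambda_hw = rate_in_fit(17520)            # hardware failure rate
mu_hw = rate_in_fit(6)                    # hardware repair rate
lambda_os = rate_in_fit(4380)             # OS (and application) failure rate
mu_os_smp = rate_in_fit(2 / 60)           # OS repair in 2 minutes
mu_os_vlx = rate_in_fit(45 / 3600)        # OS repair in 45 seconds under VLX
mu_app = rate_in_fit(30 / 3600)           # application restart in 30 seconds
```

Truncated to whole FIT, the first three values match the slide figures (57 077, 166 666 666 and 228 310); the repair rates for 2 minutes, 45 seconds and 30 seconds come out at 30, 80 and 120 x 10^9 FIT.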

Resulting Computed Availability
- Application A / SMP OS / board with two cores: results obtained with MEADEP: downtime 3.0405862 hours per year
- 2 x Appl. A / 2 OSs / VLX / board with two cores: results obtained with MEADEP: downtime 3.0155951 hours per year
- Uptime increased by ~90 seconds / year, independent of HW
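The ~90 seconds figure is the difference between the two MEADEP downtimes, converted to seconds. The short check below also turns each downtime into an availability ratio; the 8 760-hour year is an assumption, since the slides do not state which year length MEADEP used.

```python
HOURS_PER_YEAR = 8760.0  # assumed; not stated on the slides

downtime_smp = 3.0405862  # hours per year, single SMP OS configuration
downtime_vlx = 3.0155951  # hours per year, VLX configuration with 2 OSs


def availability_from_downtime(hours_down):
    """Fraction of the year during which the service is available."""
    return 1.0 - hours_down / HOURS_PER_YEAR


gain_seconds = (downtime_smp - downtime_vlx) * 3600.0
availability_smp = availability_from_downtime(downtime_smp)
availability_vlx = availability_from_downtime(downtime_vlx)
```

Both configurations stay in the same availability class (between three and four nines); the gain from virtualization in this model is small because hardware repair dominates the downtime.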

Bibliography
- Kenichi Kourai, Shigeru Chiba. A Fast Rejuvenation Technique for Server Consolidation with Virtual Machines. DSN 2007.
- Thomas Bressoud, Fred Schneider. Hypervisor-Based Fault-Tolerance. ACM TOCS, 1996.