Virtualization. Operating Systems, 2016, Meni Adler, Danny Hendler & Amnon Meisels


What is virtualization?
Creating a virtual version of something
  o Hardware, operating system, application, network, memory, storage
The construction of an isomorphism between a guest system and a host [Popek, Goldberg, 74]

Example: virtual disk
Partition a single hard disk into multiple virtual disks
  o Each virtual disk has virtual tracks & sectors
Implement a virtual disk as a file
Map between virtual disk and real disk contents
Virtual disk write/read is mapped to file write/read in the host system
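The mapping from virtual sectors to a host file can be sketched as follows. This is a minimal illustration, not how any particular hypervisor stores disk images: the class name VirtualDisk, the 512-byte sector size and the file name guest.img are all assumptions made for the example.

```python
import os
import tempfile

SECTOR_SIZE = 512  # bytes per virtual sector (assumed value)


class VirtualDisk:
    """A virtual disk whose sectors live in one ordinary host file."""

    def __init__(self, path, num_sectors):
        self.path = path
        self.num_sectors = num_sectors
        # Create the backing file, filled with zero bytes.
        with open(path, "wb") as f:
            f.write(bytes(num_sectors * SECTOR_SIZE))

    def _offset(self, sector):
        # Map a virtual sector number to a byte offset in the host file.
        if not 0 <= sector < self.num_sectors:
            raise ValueError("sector out of range")
        return sector * SECTOR_SIZE

    def write_sector(self, sector, data):
        if len(data) != SECTOR_SIZE:
            raise ValueError("data must be exactly one sector long")
        with open(self.path, "r+b") as f:
            f.seek(self._offset(sector))
            f.write(data)

    def read_sector(self, sector):
        with open(self.path, "rb") as f:
            f.seek(self._offset(sector))
            return f.read(SECTOR_SIZE)


def demo():
    # Write one sector, then read it and an untouched sector back.
    with tempfile.TemporaryDirectory() as d:
        disk = VirtualDisk(os.path.join(d, "guest.img"), num_sectors=8)
        disk.write_sector(3, b"A" * SECTOR_SIZE)
        return disk.read_sector(3)[:4], disk.read_sector(0)[:4]
```

Every guest read or write of a sector becomes a seek plus a read or write on the host file, which is the mapping the slide describes.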

What is virtualization? (continued)
A way to run multiple operating systems and applications on the same hardware (virtual machines)
Only the virtual machine manager (a.k.a. hypervisor) has full system control
Virtual machines are completely isolated from each other (or so we hope)

Basic concepts
Virtual Machine (VM)
Host
Guest
Hypervisor (type ) / Virtual Machine Monitor


Types of virtualization
Full virtualization: guest OS runs unmodified
Para-virtualization: guest OS must be aware of virtualization; source-code modifications required
Hardware virtualization support may be used for both
Our focus is on full virtualization

Virtualization advantages
Cost-effectiveness: less hardware
  o Multiple virtual machines / operating systems / services on a single physical machine (server consolidation)
  o Various forms of computation as a service
Isolation
  o Good for security
  o Great for reliability and recovery: if a VM crashes it can be rebooted without affecting other services (fault containment)
  o VM migration
Development tool
  o Work on multiple OSs in parallel
  o Develop and debug an OS in user mode
  o Origins of VMware as a tool for developers

Virtualization vs. Multi-Processing
[Figure: two layered stacks, listed top to bottom.
Multiprocessing: Process 1, Process 2 | user space / kernel separation | OS | HW interface | HW (disk, NIC, ...).
Virtualization: VMs, each running Pr 1 and Pr 2 on its own OS (OS 1, OS 2) | virtual HW interface | VMM/Hypervisor | real HW interface | HW (disk, NIC, ...).]

Type 1 and type 2 hypervisors
Type 1 examples: VMware ESX, Microsoft Hyper-V, Xen
Type 2 examples: VMware Workstation, Microsoft Virtual PC, Sun VirtualBox, QEMU, KVM
Figure 7-1. Location of type 1 and type 2 hypervisors.

Type 1 and type 2 hypervisors (continued)
Figure 7-2. Examples of the various combinations of virtualization type and hypervisor. Type 1 hypervisors always run on the bare metal whereas type 2 hypervisors use the services of an existing host operating system.

What's required of a (classic) hypervisor
A hypervisor should provide the following:
Safety: have full control of virtualized resources
Fidelity: program behavior on a VM should be identical to its behavior on bare hardware
Efficiency: as much as possible, run directly on hardware without hypervisor intervention
  o Full interpretation isn't efficient

Classic virtualization: trap and emulate
[Figure: VM 1 and VM 2 run above the VMM, which runs above the HW. (1) Trap; (2) the interrupt handler in the VMM performs HW emulation; (3) return to process.]
Emulation is the process of implementing the functionality/interface of one system on a system having different functionality/interface
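The three numbered steps can be sketched as a toy control loop. This is only an illustration under stated assumptions: the two-instruction "ISA" (add, cli/sti), the VCpu fields and the function names are hypothetical, and the Python exception stands in for a hardware trap.

```python
PRIVILEGED = {"cli", "sti"}  # toy set of privileged instructions (assumed)


class Trap(Exception):
    """Stands in for the hardware trap raised in user mode."""

    def __init__(self, instr):
        super().__init__(instr)
        self.instr = instr


class VCpu:
    """Virtual CPU state kept by the VMM for one guest."""

    def __init__(self):
        self.acc = 0
        self.interrupts_enabled = True  # the guest's *virtual* interrupt flag


def run_deprivileged(instr, vcpu):
    # Models the hardware executing guest code directly in user mode.
    op = instr[0]
    if op in PRIVILEGED:
        raise Trap(instr)            # (1) trap
    if op == "add":
        vcpu.acc += instr[1]
    else:
        raise ValueError("unknown instruction: %r" % (instr,))


def emulate(instr, vcpu):
    # (2) the VMM applies the instruction's effect to virtual state only.
    if instr[0] == "cli":
        vcpu.interrupts_enabled = False
    elif instr[0] == "sti":
        vcpu.interrupts_enabled = True


def vmm_run(program, vcpu):
    traps = 0
    for instr in program:
        try:
            run_deprivileged(instr, vcpu)
        except Trap as t:
            emulate(t.instr, vcpu)
            traps += 1
        # (3) the loop continues: control returns to the guest
    return traps
```

Running [("add", 2), ("cli",), ("add", 3)] takes one trap; the two add instructions never involve the VMM, which is where the efficiency of the approach comes from.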

Trap and emulate: difficulties
Sensitive instructions: behave differently in kernel/supervisor mode and in user mode
  o I/O instructions, enable/disable interrupts, ...
Privileged instructions: cause a trap if executed in user mode
Theorem [Popek and Goldberg, 1974]: A machine can be virtualized [using trap and emulate] if every sensitive instruction is privileged.
This condition is not met by x86 processors prior to 2005
In 2005, Intel/AMD introduced virtualization HW support.

What is sensitive?
CPU registers
MMU
  o Page table
  o Segments
Interrupts
Timers
IO devices

X86 virtualization problem I
The x86 architecture (w/o virtualization extensions) can't be virtualized by trap and emulate.
Some sensitive instructions are not privileged.
Example: the popf instruction
  o Pops 16 bits from the stack into the flags register
  o One of the flags masks (i.e. disables) interrupts
  o The instruction is not privileged
  o What happens if the OS of a VM runs popf?
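One way to see the answer to the closing question is a simplified model of popf. The sketch below assumes IOPL = 0 and ignores every flag except the interrupt-enable flag (bit 9 of FLAGS); the function name and parameters are illustrative, not a real API.

```python
IF_MASK = 1 << 9  # interrupt-enable flag (IF) is bit 9 of the x86 FLAGS register


def popf(current_flags, popped_value, cpl):
    """Simplified popf: return the new FLAGS value.

    Assumption: IOPL is 0, so only ring 0 code may change IF.
    At any other privilege level the IF bit is silently kept and
    no trap is raised, which is exactly the virtualization problem.
    """
    if cpl == 0:
        return popped_value
    return (popped_value & ~IF_MASK) | (current_flags & IF_MASK)


flags_with_interrupts_on = IF_MASK
# A guest kernel tries to disable interrupts by popping a value with IF = 0.
native_result = popf(flags_with_interrupts_on, 0, cpl=0)
guest_result = popf(flags_with_interrupts_on, 0, cpl=1)
```

On bare hardware (ring 0) the flag is cleared. In a de-privileged guest (ring 1) the flag stays set and the hypervisor is never notified, so it has no chance to emulate the intended effect.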

X86 virtualization problem II
Some instructions (push, pop, mov) can take segment selectors (cs, ds, ss) as arguments even in user mode, so the selectors can be read
The selectors have two bits that hold the current privilege level
  o In x86 (beginning with the 386) there are four privilege levels (ring 0 to ring 3)
  o Each resource is assigned a level.
  o The two lower bits of the cs register are the Current Privilege Level (CPL) of the code.
  o Guest OS thinks that it is in ring 0.
  o Guest OS is actually in ring 1
Result: guest OS confusion.
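Reading the privilege level out of a selector is a two-bit mask, as the short sketch below shows. The selector values are hypothetical and chosen only so that they differ in their two low bits.

```python
def privilege_level(selector):
    """Return the two low bits of a 16-bit segment selector (the ring number)."""
    return selector & 0b11


# Hypothetical selector values: same upper bits, different ring bits.
SELECTOR_RING0 = 0x0008   # what a guest kernel expects to see in cs
SELECTOR_RING1 = 0x0009   # what it actually sees when de-privileged

expected_ring = privilege_level(SELECTOR_RING0)
observed_ring = privilege_level(SELECTOR_RING1)
```

Because reading cs does not trap, the hypervisor cannot intercept the read and substitute the value the guest expects.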

Implementation options
Emulation
  o Full emulation: hypervisor executes the code of the VM step by step, testing each instruction; prohibitive overhead.
  o Trap and emulate: works only if all sensitive instructions are privileged instructions.
Change sensitive instructions
  o Interpretation: equivalent to emulation (BOCHS, JSLinux).
  o Binary translation: change them in the guest binary (VMware, QEMU).
Para-virtualization: re-compile guest OS (XEN, Denali).
Hardware assistance: Intel VT-x and AMD-V (used by KVM, XEN, VMware).

Outline
Concepts, classical CPU virtualization
  o Basic interpretation
Memory virtualization

Binary translation
Binary translation is the process of translating one instruction set to another.
Approach I: statically translate the entire code base.
  o In our case the result is para-virtualization.
  o Problems:
    Dynamically linked libraries are not known at compile time.
    Self-modifying code, e.g. a program that generates code and runs it, is not covered.

Dynamic binary translation
Approach II: translate code on the fly (Just In Time).
Simplest approach
  o Keep a table mapping old instructions to new instructions.
  o Fetch old instruction.
  o Use the table to translate.
  o Execute new instruction(s).
Problem: performance
  o Overhead for every instruction, similar to interpretation.

Dynamic BT with caching
Cache translated code regions:
  o After translation, run from the cache.
  o Translation occurs only once.
Static translation cannot handle dynamic control transfer, such as:
  o A jump whose target depends on a memory address.
  o An indirect function call (through a function pointer).
Translation of dynamic control transfers must be done at execution time.
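Translate-once, run-from-cache can be sketched with a toy guest program. Everything here is an assumption made for illustration: the five-instruction guest ISA, the vcpu_clear_if replacement operation, and the use of a Python dict as the translation cache. It is not VMware's or QEMU's actual design.

```python
# Toy guest "binary": address -> instruction (hypothetical ISA).
GUEST = {
    0: ("set", 3),   # counter = 3
    1: ("cli",),     # sensitive instruction: must not run natively
    2: ("dec",),     # counter -= 1
    3: ("jnz", 1),   # jump to address 1 while counter != 0
    4: ("halt",),
}
CONTROL_FLOW = {"jnz", "halt"}


def translate_block(pc):
    """Translate one basic block starting at pc; stop at control flow."""
    block = []
    while True:
        instr = GUEST[pc]
        # Rewrite the sensitive instruction into an operation on virtual state.
        block.append(("vcpu_clear_if",) if instr[0] == "cli" else instr)
        pc += 1
        if instr[0] in CONTROL_FLOW:
            return block, pc  # pc is now the fall-through address


def execute(block, fallthrough, state):
    """Run a translated block; return the next guest pc (None to stop)."""
    for instr in block:
        op = instr[0]
        if op == "set":
            state["counter"] = instr[1]
        elif op == "dec":
            state["counter"] -= 1
        elif op == "vcpu_clear_if":
            state["virtual_if"] = False
        elif op == "jnz":
            return instr[1] if state["counter"] != 0 else fallthrough
        elif op == "halt":
            return None
    return fallthrough


def run():
    cache = {}  # guest pc of block start -> (translated block, fall-through pc)
    translations = executed = 0
    state = {"counter": 0, "virtual_if": True}
    pc = 0
    while pc is not None:
        if pc not in cache:              # translate only on first use
            cache[pc] = translate_block(pc)
            translations += 1
        block, fallthrough = cache[pc]
        pc = execute(block, fallthrough, state)
        executed += 1
    return state, translations, executed
```

The loop body at address 1 executes twice but is translated once; the target of each jump is resolved at execution time, which is what static translation cannot do.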

Virtualization prior to HW support
Figure 7-4. The binary translator rewrites the guest operating system running in ring 1, while the hypervisor runs in ring 0.

VMware binary translation: example
[Figure: C code, its 64-bit binary, and the binary's (hex) representation.]
Translator reads guest memory at the address indicated by the guest PC
Decodes instructions, creates Intermediate Representation (IR) objects
Accumulates IR objects into translation units (TUs)
  o Basic blocks (BB); a TU stops upon control flow
[Figure: the first TU and its compiled code fragment (CCF), annotated in successive steps: identical code, translation of jump BB, translation of fall-through BB.]
Which basic block will be translated next?
[Figure: VMware binary translation example: output.]

VMware binary translation operation
Translation cache (TC) stores translations done so far
A hash table tracks the input-to-output correspondence
Chaining optimization allows one CCF to jump directly to another without calling out of the translation cache
As the TC gradually captures the guest's working set, the proportion of translation decreases
User code does not have to be translated

Dealing with privileged instructions: example
The cli (clear interrupt flag) instruction is privileged
Translated to: vcpu.flags.IF=0
Much faster than the source binary!

Outline
Concepts, classical CPU virtualization
  o Basic interpretation
Memory virtualization

Memory allocation
Each VM usually receives a contiguous set of physical addresses.
  o 512 MB to 4 GB are typical values.
As far as the VM is concerned, this is the physical memory of the machine.
The guest OS allocates pages or segments to guest processes.

Memory management
Assumptions of the OS in a VM:
  o Physical memory is a contiguous block of addresses from 0 to some n.
  o The OS can map any virtual page to any page frame.
Hypervisor must:
  o Partition memory among VMs.
  o Ensure virtual pages are mapped only to assigned page frames.
A TLB miss in a HW-managed TLB (e.g. x86) causes the HW to look the page up in the page table.
  o The OS of a VM must not manage the real page table.

Option 1: brute force
[Figure: the guest OS's page directory and page table are defined as not R/W. An access raises an interrupt and the VMM corrects the address using its VM memory layout. The HW layer holds CR3, the TLB and the CPU.]

Brute force description
Guest page tables are read- and write-protected in the host system.
If the guest OS reads a page table (e.g. for page eviction), writes a page table (e.g. after a page fault), or changes CR3, the system traps.
The hypervisor then uses a VM memory layout to:
  o Return answers to the VM
  o Update the layout
The hypervisor switches the VM memory layout when a new VM is scheduled.

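Option 2: shadow page tables
[Figure: the guest OS holds its page directory and page table, pointed to by G-CR3. The VMM holds the shadow page table, pointed to by the real CR3 and used by the TLB and CPU. On an interrupt, the VMM corrects the page table.]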

Shadow page tables description
Hypervisor maintains shadow page tables.
Guest page tables map: Guest VA -> Guest PA
Shadow tables map: Guest VA -> Host PA
Hypervisor does not trap guest updates to its page table.
  o Result: inconsistent guest page table and shadow page table.
When a guest process accesses a virtual address
  o The physical address is not in the guest page table, but in the shadow page table.
  o HW translates correctly, because it is aware only of the shadow tables.

Shadow page tables description (continued)
If the address is in the TLB: TLB hit and no problem.
When a guest process causes a page fault
  o Hypervisor begins execution.
  o Hypervisor updates guest page table with new page.
  o Hypervisor updates shadow page table.
Performance is as good as native execution as long as there are no page faults.
Shadow page tables should be cached so that once a VM is re-scheduled the page table does not have to be rebuilt from scratch.
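The relationship between the three mappings can be sketched with dictionaries standing in for page tables. This is a simplified model under assumptions: single-level tables, a 4 KiB page size, and a lazy fill policy in which a missing shadow entry plays the role of the page fault. The class and attribute names are hypothetical.

```python
PAGE_SIZE = 4096  # assumed page size


class ShadowMMU:
    """Builds Guest VA -> Host PA entries from two other mappings."""

    def __init__(self, guest_pt, gpa_to_hpa):
        self.guest_pt = guest_pt      # guest virtual page -> guest physical page
        self.gpa_to_hpa = gpa_to_hpa  # guest physical page -> host physical frame
        self.shadow = {}              # guest virtual page -> host physical frame
        self.faults = 0

    def translate(self, gva):
        vpn, offset = divmod(gva, PAGE_SIZE)
        if vpn not in self.shadow:
            # The hardware only sees the shadow table, so a missing
            # entry is a page fault that enters the hypervisor.
            self.faults += 1
            self._fill(vpn)
        return self.shadow[vpn] * PAGE_SIZE + offset

    def _fill(self, vpn):
        # Hypervisor composes the guest's mapping with its own.
        gpn = self.guest_pt[vpn]      # a KeyError here models a true guest fault
        self.shadow[vpn] = self.gpa_to_hpa[gpn]


mmu = ShadowMMU(guest_pt={0: 5, 1: 2}, gpa_to_hpa={5: 40, 2: 17})
first = mmu.translate(0x10)             # faults, then fills the shadow entry
second = mmu.translate(PAGE_SIZE + 8)   # faults for a different page
third = mmu.translate(0x20)             # same page as first: no fault
```

After the first access to each page, translations are served from the shadow table without involving the hypervisor, which matches the claim that performance is native as long as there are no page faults.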

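Option 3: nested page tables
[Figure: the guest OS holds its page directory and page table, pointed to by CR3. The VMM holds the host page table, pointed to by EPTP. The HW layer holds CR3, EPTP, the TLB and the CPU.]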

Nested page tables - description
The name implies having page tables within page tables.
The essence of the idea is a hardware assist.
  o Hardware has an extra pointer and the ability to walk an extra set of page tables.
  o The idea is called Extended Page Tables (EPT) by Intel
Guest page tables hold the Guest VA -> Guest PA mapping, accessed through the standard CR3
Extended page tables hold the Host VA -> Host PA mapping, accessed through EPTP (EPT pointer).
  o Host VA = Guest PA

Nested page tables description (cont'd)
The TLB, as usual, holds Guest VA -> Host PA
On memory access
  o If found in the TLB: no problem.
  o If not in the TLB, but no page fault: hardware walks both tables and updates the TLB.
  o If page fault: hardware transfers control to the hypervisor, which gets a host physical page and provides the host virtual page (guest physical) to the VM.
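The three cases of a memory access can be sketched as a two-stage lookup. As before this is a simplified model, not the real hardware walk: each table is a single-level dictionary, the page size is assumed to be 4 KiB, and the function name and the returned status strings are made up for the example.

```python
PAGE_SIZE = 4096  # assumed page size


def nested_translate(gva, guest_pt, ept, tlb):
    """Translate a guest virtual address; return (host PA or None, what happened).

    guest_pt: guest virtual page  -> guest physical page (reached via CR3)
    ept:      guest physical page -> host physical frame (reached via EPTP)
    tlb:      guest virtual page  -> host physical frame (filled on a walk)
    """
    vpn, offset = divmod(gva, PAGE_SIZE)
    if vpn in tlb:
        return tlb[vpn] * PAGE_SIZE + offset, "tlb hit"
    gpn = guest_pt.get(vpn)
    if gpn is None:
        return None, "guest page fault"   # handled by the guest OS
    hpn = ept.get(gpn)
    if hpn is None:
        return None, "EPT violation"      # handled by the hypervisor
    tlb[vpn] = hpn                        # cache the combined translation
    return hpn * PAGE_SIZE + offset, "walked both tables"


guest_pt = {0: 7}
ept = {7: 42}
tlb = {}
miss = nested_translate(5, guest_pt, ept, tlb)
hit = nested_translate(5, guest_pt, ept, tlb)
```

Unlike the shadow scheme, the hypervisor never has to compose the two mappings itself; the hardware performs both lookups and caches the combined result in the TLB.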

Sources
Modern Operating Systems, 4th edition, A. Tanenbaum and H. Bos
Virtual Machines, J. E. Smith and R. Nair
A presentation by Niv Gilboa from CSE@BGU
Formal requirements for virtualizable third generation architectures, G. J. Popek and R. P. Goldberg, CACM, 1974
A comparison of software and hardware techniques for x86 virtualization, K. Adams and O. Agesen, ASPLOS 2006