Knut Omang Ifi/Oracle 6 Nov, 2017

Similar documents
Knut Omang Ifi/Oracle 20 Oct, Introduction to virtualization (Virtual machines) Aspects of network virtualization:

CSE 120 Principles of Operating Systems

Module 1: Virtualization. Types of Interfaces

COMPUTER ARCHITECTURE. Virtualization and Memory Hierarchy

What is KVM? KVM patch. Modern hypervisors must do many things that are already done by OSs Scheduler, Memory management, I/O stacks


Virtualization. Pradipta De

Virtualization. Operating Systems, 2016, Meni Adler, Danny Hendler & Amnon Meisels

System Virtual Machines

Virtual Machine Monitors (VMMs) are a hot topic in

Spring 2017 :: CSE 506. Introduction to. Virtual Machines. Nima Honarmand

Intel Virtualization Technology Roadmap and VT-d Support in Xen

System Virtual Machines

Advanced Operating Systems (CS 202) Virtualization

Virtualization and memory hierarchy

Introduction to Virtual Machines. Carl Waldspurger (SB SM 89 PhD 95) VMware R&D

Intel s Virtualization Extensions (VT-x) So you want to build a hypervisor?

Virtualization. Starting Point: A Physical Machine. What is a Virtual Machine? Virtualization Properties. Types of Virtualization

Virtualization. ! Physical Hardware Processors, memory, chipset, I/O devices, etc. Resources often grossly underutilized

OS Virtualization. Why Virtualize? Introduction. Virtualization Basics 12/10/2012. Motivation. Types of Virtualization.

Cloud Computing Virtualization

COSC6376 Cloud Computing Lecture 14: CPU and I/O Virtualization

Background. IBM sold expensive mainframes to large organizations. Monitor sits between one or more OSes and HW

Chapter 5 C. Virtual machines

The Architecture of Virtual Machines Lecture for the Embedded Systems Course CSD, University of Crete (April 29, 2014)

CS 550 Operating Systems Spring Introduction to Virtual Machines

references Virtualization services Topics Virtualization

Virtualization and Virtual Machines. CS522 Principles of Computer Systems Dr. Edouard Bugnion

Nested Virtualization and Server Consolidation

Micro VMMs and Nested Virtualization

Lecture 5: February 3

24-vm.txt Mon Nov 21 22:13: Notes on Virtual Machines , Fall 2011 Carnegie Mellon University Randal E. Bryant.

Virtual Machines. Part 2: starting 19 years ago. Operating Systems In Depth IX 1 Copyright 2018 Thomas W. Doeppner. All rights reserved.

The Challenges of X86 Hardware Virtualization. GCC- Virtualization: Rajeev Wankar 36

Virtualization History and Future Trends

Virtual Virtual Memory

I/O and virtualization

I/O virtualization. Jiang, Yunhong Yang, Xiaowei Software and Service Group 2009 虚拟化技术全国高校师资研讨班

Virtualization. Michael Tsai 2018/4/16

Virtual Machines Disco and Xen (Lecture 10, cs262a) Ion Stoica & Ali Ghodsi UC Berkeley February 26, 2018


Multiprocessor Scheduling. Multiprocessor Scheduling

Virtualization. Virtualization

CS370 Operating Systems

Virtual Machines. Jinkyu Jeong Computer Systems Laboratory Sungkyunkwan University

CSC 5930/9010 Cloud S & P: Virtualization

DISCO and Virtualization

CHAPTER 16 - VIRTUAL MACHINES

A Survey on Virtualization Technologies

Lecture 7. Xen and the Art of Virtualization. Paul Braham, Boris Dragovic, Keir Fraser et al. 16 November, Advanced Operating Systems

Server Virtualization Approaches

CS370: Operating Systems [Spring 2017] Dept. Of Computer Science, Colorado State University

VIRTUALIZATION: IBM VM/370 AND XEN

Virtualization. Adam Belay

Virtual Machines. To do. q VM over time q Implementation methods q Hardware features supporting VM q Next time: Midterm?

Operating Systems 4/27/2015

CSCE 410/611: Virtualization!

Virtualization. Dr. Yingwu Zhu

Virtual Leverage: Server Consolidation in Open Source Environments. Margaret Lewis Commercial Software Strategist AMD

Virtual Machine Security

Fast access ===> use map to find object. HW == SW ===> map is in HW or SW or combo. Extend range ===> longer, hierarchical names

VIRTUALIZATION. Dresden, 2011/12/6. Julian Stecklina

Virtual Machine Monitors!

CSE543 - Computer and Network Security Module: Virtualization

Multi-level Translation. CS 537 Lecture 9 Paging. Example two-level page table. Multi-level Translation Analysis

Nested Virtualization Update From Intel. Xiantao Zhang, Eddie Dong Intel Corporation

CS 350 Winter 2011 Current Topics: Virtual Machines + Solid State Drives

Virtualisation: The KVM Way. Amit Shah

CSCE 410/611: Virtualization

Distributed Systems COMP 212. Lecture 18 Othon Michail

Xen and the Art of Virtualization. CSE-291 (Cloud Computing) Fall 2016

CS5460: Operating Systems. Lecture: Virtualization. Anton Burtsev March, 2013

COS 318: Operating Systems. Virtual Machine Monitors

Fast access ===> use map to find object. HW == SW ===> map is in HW or SW or combo. Extend range ===> longer, hierarchical names

Advanced Systems Security: Virtual Machine Systems

CS 152 Computer Architecture and Engineering

Advanced Systems Security: Virtual Machine Systems

Xen is not just paravirtualization

Introduction to Cloud Computing and Virtualization. Mayank Mishra Sujesha Sudevalayam PhD Students CSE, IIT Bombay

The Price of Safety: Evaluating IOMMU Performance

1 Virtualization Recap

Cloud Networking (VITMMA02) Server Virtualization Data Center Gear

Virtualization Overview NSRC

Virtualization with XEN. Trusted Computing CS599 Spring 2007 Arun Viswanathan University of Southern California

Virtualization, Xen and Denali

Virtually Impossible

COLORADO, USA; 2 Usov Aleksey Yevgenyevich - Technical Architect, RUSSIAN GOVT INSURANCE, MOSCOW; 3 Kropachev Artemii Vasilyevich Manager,

Nested EPT to Make Nested VMX Faster. Red Hat Author Gleb Natapov October 21, 2013

CSE543 - Computer and Network Security Module: Virtualization

Disco: Running Commodity Operating Systems on Scalable Multiprocessors

Performance Aspects of x86 Virtualization

Hardware- assisted Virtualization

Master s Thesis! Improvement of the Virtualization Support in the Fiasco.OC Microkernel! Julius Werner!

Background. IBM sold expensive mainframes to large organiza<ons. Monitor sits between one or more OSes and HW

Virtual Memory. Daniel Sanchez Computer Science & Artificial Intelligence Lab M.I.T. November 15, MIT Fall 2018 L20-1

Address Translation. Jinkyu Jeong Computer Systems Laboratory Sungkyunkwan University

Virtual Memory. Daniel Sanchez Computer Science & Artificial Intelligence Lab M.I.T. April 12, 2018 L16-1

Linux and Xen. Andrea Sarro. andrea.sarro(at)quadrics.it. Linux Kernel Hacking Free Course IV Edition

Virtual Memory. Patterson & Hennessey Chapter 5 ELEC 5200/6200 1

Overview of System Virtualization: The most powerful platform for program analysis and system security. Zhiqiang Lin

Transcription:

Software and hardware support for Network Virtualization part 1 Knut Omang Ifi/Oracle 6 Nov, 2017 1 Motivation Goal: Introduction to challenges in providing fast networking to virtual machines Prerequisites: What is virtualization? Understand interplay between software ideas/application abstractions hardware evolution compatibility requirements! Understand some of the recent hardware additions to CPUs and chipsets Understanding some of the underlying APIs we virtualize upon! 2

Overview Introduction to virtualization (Virtual machines) Aspects of network virtualization: Virtual network infrastructure, interfaces, adapters Network interface attach points (PCI, PCIe) Software emulation of a network interface Paravirtualized network interfaces Hardware support for sharing a network adapter (SR/IOV) Use cases, challenges, risks and tradeoffs 3 Virtualization Present an abstraction to the application (guest OS, user program..) About resource sharing, resource utilization Not new: ex. process, virtual memory - just taking it further.. Virtual memory, virtual disk head, virtual CPU, virtual computer.. Containers... As in virtual machines: Host operating system (often called hypervisor) sees whole computer Guest operating system only sees a partition of the real computer Protection and transparency Flexible use of machine resources 4

Virtualization of resources Motivated from the programming side (software) Implementation in software faces problems: performance security Hardware: How can we support it better? Think about basic OS abstractions.. Important driver for hardware development Applies to network side as well 5 Virtualization isolation Popek and Goldberg,1974: Sensitive instructions: Instructions that for protection reasons must be executed in kernel mode Privileged instructions: Instructions that causes a trap A machine is virtualizable iff the set of sensitive instructions is a subset of the set of privileged instructions. 6

Virtualization before ca.1995 IBM CP/CMS -> VM/370, 1979 Hardware support: Traps sensitive instructions Architecture still in use for IBM mainframes Largely ignored by others Taken up by Sun and HP in 1990's x86-world? Difficult because: Some sensitive instructions ignored in user mode! Some sensitive instructions allowed from user mode! 7 Virtualization in the (limited) x86 Problems: Performance: I/O Page faults Interrupts (when?) Host resource usage Avoidig 'leaking' instructions Pentium allows instruction that makes it possible to determine if it is executed in kernel mode Might confuse OS.. 8

Virtualization in the (limited) x86 Solutions: Interpretation (emulating the instruction set) Performance penalty of factor 5-10 Benefit: May emulate any type of CPU Full virtualization Privileged instructions in guest OS'es rewritten by virtualization software (binary translation) Stanford DISCO --> VmWare workstation Did not require source code of OS! Paravirtualization Replacing parts of the guest operating system with custom code for virtualization 9 Xen PV (Xen Paravirtualization) Uses x86 privilege levels differently: Rings: 0, 1, 2, 3 (highest to lowest privilege) Normally OS executes in ring 0 and applications execute in ring 3 With Xen 0 Hypervisor 1 Guest OS 2 unused 3 Applications Guest OS modified for privileged instructions Still used for dom0 (privileged guest mode) in Xen VMWare ESX: similar approach 10

Initial hw support for virtualization on x86_64 VT-x(Intel) and SVM(AMD): Inspired by VM/370 Set of operations that trap Present in most (all?) newer 64 bit versions of AMD/Intel processors controlled by bitmap managed by host OS/hypervisor must sometimes be enabled in BIOS Delivers isolation according to Popek & Goldberg Effectively privileged mode, guest privileged mode and user mode.. On linux: cat /proc/cpuinfo egrep 'svm vmx' 11 Intel x86_64 page tables 12

Simple Intel page table example ( bare metal ) 16K process rep. as 4 * 4K pages, virt.cont. VA starts at 0x403FF000 Process accesses 0x40400080 process pages pmd offset 0x80 = 128 bytes into page 0 pud cr3 pgd 0 1 511 0 0 511 511 511 entries are 8 byte pte = page phys.addr (PA) + ctrl bits 13 Shadow page tables Guest page table is GVA ->GPA Hypervisor maintains shadow page tables Trap all changes to guest page tables Sync shadow page table: GVA HPA Very expensive to keep these tables in sync: lots of traps! memory overhead of extra page tables! 14

Shadow page tables guest GVA starts at guest phys.addr (GPA) shadow copy of guest page tables: GVA HPA Traps all write accesses MMU uses shadow translation, transparent to guest Host phys.addr (HPA) 15 Extended/nested page tables Shadow pages problem: Very expensive virtual machine context switches! OSes expect to own address space Two virtual machines could have the same guest physical addresses Need extra level of page tables but these physical pages were pointing to different host physical pages All state about pages must be flushed upon machine switch (VM exit) Hardware solution: Intel: Extended page tables (ept + vpid) AMD: Nested page tables (npt) Introduces hardware support to distinguish guest physical addresses from different machines Extra page table: GPA HPA 16

Extended page tables actual pages used in guest guest GVA starts at pmd guest phys.addr (GPA) All guest pages can be looked up in a second level page table nested pt pointer guest page table pages pud pgd 2nd level table translation: GPA HPA Both tables cached in TLB with vpid Host phys.addr (HPA) 17 Network virtualization: 1) Virtual infrastructure Ethernet as example: Hardware: broadcast point to point HUBs to switches VLAN: Sharing physical links Wireless/mobile.. Speed technology Pure software: virtual switches and links 18

Ethernet... 19 Ethernet CSMA/CD (Carrier Sense Multiple Access w/collision Detection) Half-duplex, serial, single wire pair Designed for optimistic, unreliable broadcast w/repeaters Today: somewhat related usage on Wifi But: Mostly reliable point-to-point w/switches (parallel) full-duplex Speed: 10,100,1000Mb/s, 10,40,100Gb/s,? Software aspects of Ethernet as success factor: Extensible, flexible protocol high penetration... 20

Virtual Ethernet? VLAN (IEEE 802.1Q): Extra VLAN ID field in Ethernet packet Allows several logical networks to use same medium Smart switches and routers define who will see which logical streams of packets Benefits: Flexibility, saves expensive wire resources, network capacity Drawbacks? Complexity, security issues, requires switch support 21 Ethernet and machine virtualization? Virtual ethernet switches Virtual packet forwarding tap, bonding devices Virtual ethernet devices in virtual machines logical transport links between VMs within a host Host virtual ethernet to be able to route packets to a virtual machine emulated (OS sees a real device) paravirtualized (custom interface with hypervisor) In common: All operates on Ethernet packets benefits? drawbacks? 22

Network interface attach points (Linux) Ethernet not the only network interface, abstraction layer: serial ports parallel ports tun (IP packets) IPC sockets USB Firewire Bluetooth RDMA devices (later) 23 Virtual ethernet support within a Linux host Virtual switches (bridges) Virtual ethernet interfaces (tap devices) tunctl Packet filtering brctl ebtables Tunneling 24

Virtual IP network support on Linux Tun devices Packet filtering tunctl tunn -n dev iptables, ip route + NAT Queuing and manipulation of packet queues tc (traffic control) 25 Overview Introduction to virtualization (Virtual machines) Aspects of network virtualization: Virtual network infrastructure, interfaces, adapters Network interface attach points (PCI, PCIe) Software emulation of a network interface Paravirtualized network interfaces Hardware support for sharing a network adapter (SR/IOV) Use cases, challenges, risks and tradeoffs 26

The system I/O bus Before the PC: Proprietary, incompatible expensive! IBM PC: AT bus - tried MicroChannel Architecture ISA (Industry Standard Architecture) bus! clone manufacturer effort parallel broadcast medium, 8 or 16 bit at a time Hardware design: slot design, pinout standardized w/extensions 486: ISA bus too slow for video req VESA local bus: 32 bit isa Pentium: PCI + ISA for bw comp 27 PCI (Peripheral Component Interconnect) DMA (Direct Memory Access) support for devices New, more compact physical design Standardized, extensible software interface! 3 Address space types: Config space I/O ports (ISA compat++) Memory mapped I/O (MMIO) Config space has standardized layout, standardized semantics 28

ISA vs PCI 29 Why do we care about details of an obsolete I/O bus? Common software implementations of virtualization emulates a PCI based system architecture Most OSes automatically recognize and are able to tell something about PCI devices Some OSes would probably not even boot if no PCI host bridge was detected! It is basically a good API to base a virtual device on! PCI Express is an extension of PCI from a software perspective! (remember old and new Ethernet? ;-) ) 30

PCI config space 31 PCI config space, Pentium emulation 32

PCI (Peripheral Component Interconnect) DMA (Direct Memory Access) support for devices New, more compact physical design Standardized, extensible software interface! 3 Address space types: Config space I/O ports (ISA compat++) Memory mapped I/O (MMIO) Config space has standardized layout, standardized semantics 33 PCI BAR (Base Address Register) spaces OS writes 0xfffffff to determine size/validity - size is 2^n Only the bits needed to set an aligned address are writable OS can deduce size - writes back desired PCI address. Device listens to PCI requests to range from BAR address + BAR size. Up to 6 32 bits regions Up to 3 64 bits regions Registers used in pair BAR type information in read only lower bit 34