I/O and virtualization

CSE-C3200 Operating systems, Autumn 2015 (I), Lecture 8
Vesa Hirvisalo

Today
  I/O management
    Control of I/O
    Data transfers, DMA (Direct Memory Access)
    Buffering
      Single buffering
      Double buffering
  Virtualization
    The process abstraction defines a virtualization
      Why do we need something else?
    Compared to emulation and simulation
    Different classes of virtualization
    Virtual machines, virtualization engines, etc., and their support

I/O management

Introduction
  Reminder: the main tasks of an OS
    Resource management
    Abstraction of hardware
  Peripheral devices
    Often many, of different types
    Must be
      Managed
      Abstracted
    They appear as the filesystem, the network, etc.
  Usually
    A significant proportion of a large OS is driver code
    The driver code is typically also the I/O code

I/O and devices
  Often there is plenty of structure
    Buses, bridges, controllers, etc. (below a classical PC)
    Accessing the structure and its parts is not trivial
  From old mainframes to newer devices
    Computer hardware is evolving rapidly (no "universal computer")
    Stand-alone computers are (more or less) dead
  No such thing
  Huge variations
    The traditional computer architecture (the PC)
    The novel computer architecture

Devices
  Devices come in a wide variety
    human readable: display, keyboard, mouse, etc.
    machine readable: hard disks, USB keys, sensors
    communication: modems, etc.
  Key differences
    data rate: orders-of-magnitude differences
    application: e.g., a hard disk vs. a keyboard
    complexity of control: e.g., disk vs. printer
    unit of transfer: block vs. character (stream) oriented
    data representation: data encoding differs from device to device
    error conditions: how errors are handled and reported back
  T.Lilja

Control of I/O
  Programmed I/O (typically polling)
    the OS (CPU) issues I/O commands on behalf of a process
    the process waits until the operation is complete
  Interrupt-driven I/O
    if the instruction is non-blocking, the process continues
    if the instruction is blocking, the process is moved to the blocked state
    once the I/O is finished, an interrupt is issued
  Direct Memory Access (DMA)
    the processor initiates the data transfer
    the DMA module transfers the data independently
    the DMA module issues an interrupt once the transfer is completed
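The difference between programmed and interrupt-driven I/O can be sketched with a toy simulation. This is only an illustration under stated assumptions: `FakeDevice`, its tick-based latency, and both driver functions are hypothetical names invented here, not part of the lecture or of any real driver API.

```python
class FakeDevice:
    """Hypothetical device: one operation completes after a fixed number of ticks."""

    def __init__(self, latency_ticks):
        assert latency_ticks >= 1, "sketch assumes the operation takes at least one tick"
        self.remaining = latency_ticks
        self.on_complete = None  # interrupt handler, if one is installed

    def tick(self):
        # Advance device time by one step; raise the "interrupt" on completion.
        if self.remaining > 0:
            self.remaining -= 1
            if self.remaining == 0 and self.on_complete is not None:
                self.on_complete()

    def ready(self):
        return self.remaining == 0


def programmed_io(device):
    """Polling: the CPU busy-waits on the status flag. Returns the number of polls."""
    polls = 0
    while not device.ready():
        polls += 1          # CPU time spent doing nothing useful
        device.tick()
    return polls


def interrupt_driven_io(device, other_work):
    """The CPU runs other work until the completion handler fires."""
    done = []
    device.on_complete = lambda: done.append(True)
    units_of_work = 0
    while not done:
        other_work()        # CPU time spent on something else
        units_of_work += 1
        device.tick()
    return units_of_work
```

With a latency of five ticks, `programmed_io` burns five polls, whereas `interrupt_driven_io` spends the same five steps calling `other_work`.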

Data transfers
  Small amounts can be transferred by the processor
    big amounts call for HW support
    otherwise, the CPU will become a bottleneck
  DMA transfer
    operation type: read or write
    address of the I/O device involved (e.g., network card)
    memory address where to start the operation
    number of words to be read or written
  PCI architecture includes arbitration
    PCI devices can request control of the bus and issue memory read/write operations
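The four parameters of a DMA transfer listed above map naturally onto a request descriptor. The following is a minimal sketch, assuming a 4-byte word and modelling memory and devices as `bytearray` objects; `DmaRequest`, `run_dma`, and `WORD` are hypothetical names, not a real controller interface.

```python
from dataclasses import dataclass

WORD = 4  # assumed word size in bytes


@dataclass
class DmaRequest:
    operation: str        # "read" (device to memory) or "write" (memory to device)
    device_address: int   # which I/O device is involved
    memory_address: int   # where in memory the transfer starts
    word_count: int       # number of words to transfer


def run_dma(req, memory, devices, on_interrupt):
    """Carry out one transfer without the CPU copying data, then signal completion."""
    dev = devices[req.device_address]
    n = req.word_count * WORD
    start = req.memory_address
    # Reject transfers that would fall outside memory or the device buffer.
    if start not in range(0, len(memory) - n + 1) or n > len(dev):
        raise ValueError("transfer out of bounds")
    if req.operation == "read":
        memory[start:start + n] = dev[:n]
    elif req.operation == "write":
        dev[:n] = memory[start:start + n]
    else:
        raise ValueError("operation must be 'read' or 'write'")
    on_interrupt(req)  # the DMA module interrupts once the transfer is complete
```

The processor's only involvement is filling in the descriptor and handling the final interrupt, which is the point of the slide.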

Organization
  Sharing the system bus with memory and CPU
    the DMA module performs programmed I/O to transfer data
    inefficient: the DMA module must issue a transfer request and a transfer, which both go over the same system bus!
  Explicit paths from DMA to I/O modules
    saves bus cycles: avoids the transfer request
    I/O systems might have their own DMA modules or share them
  Dedicated I/O bus
    allows sharing the DMA module among I/O devices
    easily expandable
    communication among I/O devices is possible without going to memory

Layered I/O handling
  Layered architecture
    the lower the level, the closer to the HW
    layers should communicate through well-defined APIs
  Logical I/O
    provides the interface for the processes
    open, close, read, write
  Device I/O
    data is converted to I/O instructions, channel and control commands
    buffering techniques may be used
  Scheduling and control
    I/O requests are scheduled and executed
    handles interrupts, memory transfer, status updates

Buffering

I/O Buffering
  Typical issues in I/O
    there are significant speed differences
      efficient transfers must be large bursts of data
    there can be latencies
      no one at the other end is listening
  Therefore
    considering control: requests are deferred (asynchronous handling)
    considering data: buffers (i.e., reserved memory areas) are used
  There typically are several buffers
    user, kernel, device, read, write, etc.

I/O Device Types
  Different classifications per OS
    We use the UNIX ones
  Stream or character oriented
    data is transferred one unit at a time
    random access is not usually supported
    e.g., keyboard, mouse
  Block oriented
    data is moved in blocks
    block size for an HDD is, e.g., 4096 bytes
    random access to data is possible (or even fast)
    usually linearly addressed

Single Buffer: Block-Oriented Devices
  Block-oriented device: reading
    I/O device input is first transferred to the kernel-space buffer
    once completed, the buffer is moved to user space and another request for I/O is issued (read-ahead)
    at the end, one unnecessary read is done
  Block-oriented device: writing
    process data is first copied to a kernel buffer
    the user process is free to continue its run
    data is written to the I/O device at a later time
  The process need not hang waiting for I/O to be completed
    swapping the userland side of the I/O buffer is possible

Single Buffer: Stream-Oriented Devices
  Line-at-a-time buffering
    the kernel buffer is filled until a line termination character is found in the input stream
    e.g., terminal window
  Byte-at-a-time buffering
    a single byte is read to the kernel and moved to the user
    avoids writing the data directly to the user address space
    e.g., mouse in a GUI
  Stream-oriented device: reading
    the user process is suspended until a line of input is read
    avoids context switches between the user process and I/O handling
  Stream-oriented device: writing
    the user process can write a line and continue
    it must suspend only if another line of output is available before the previous one gets written to the I/O device (flushed)
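Line-at-a-time buffering can be illustrated with a small kernel-side buffer that hands data to the reader only when a full line has arrived. This is a sketch, not how any particular terminal driver is written; `LineBuffer` and `feed` are hypothetical names, and `\n` is assumed to be the line terminator.

```python
class LineBuffer:
    """Hypothetical kernel-side buffer for a stream-oriented device."""

    def __init__(self):
        self._pending = bytearray()

    def feed(self, data):
        """Device side: append incoming bytes and return every line completed so far."""
        self._pending += data
        lines = []
        while True:
            idx = self._pending.find(b"\n")
            if idx == -1:
                break  # no complete line yet: the reader stays suspended
            lines.append(bytes(self._pending[:idx + 1]))
            del self._pending[:idx + 1]
        return lines
```

Feeding `b"ls -"` returns nothing; only after the terminating newline arrives does the whole line become available, so the user process is woken once per line rather than once per byte.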

Double Buffering
  We can assign two kernel buffers for the I/O operation
    one is used by the process for reading/writing
    the other is used by the kernel/device driver for I/O
    when the operation(s) is (are) completed, the buffers are swapped
  Allows simultaneous access for both the OS and the process
    single buffering: the process must block if the device driver is currently updating the buffer
  Using more than two buffers (circular buffering)
    Needed if the process performs rapid bursts of complex I/O
  Note: buffer = an area with concurrency control
    The point here is the use of locks, semaphores, or similar
    Smaller granularity gives more opportunities for parallelism
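The note that a buffer is "an area with concurrency control" can be made concrete with a bounded circular buffer guarded by a condition variable. The sketch below uses Python's standard `threading` module; `CircularBuffer` and `transfer` are hypothetical names. With `slots=2` it behaves like double buffering, and larger values give circular buffering.

```python
import threading


class CircularBuffer:
    """Fixed number of slots shared by a producer (driver) and a consumer (process)."""

    def __init__(self, slots):
        self._slots = [None] * slots
        self._head = 0    # index of the next slot to read
        self._count = 0   # number of filled slots
        self._cond = threading.Condition()

    def put(self, block):
        with self._cond:
            while self._count == len(self._slots):
                self._cond.wait()           # all slots full: producer blocks
            tail = (self._head + self._count) % len(self._slots)
            self._slots[tail] = block
            self._count += 1
            self._cond.notify_all()

    def get(self):
        with self._cond:
            while self._count == 0:
                self._cond.wait()           # nothing to read: consumer blocks
            block = self._slots[self._head]
            self._slots[self._head] = None
            self._head = (self._head + 1) % len(self._slots)
            self._count -= 1
            self._cond.notify_all()
            return block


def transfer(blocks, slots=2):
    """Move a list of blocks from a producer thread to the caller through the buffer."""
    buf = CircularBuffer(slots)

    def produce():
        for block in blocks:
            buf.put(block)

    producer = threading.Thread(target=produce)
    producer.start()
    received = [buf.get() for _ in blocks]
    producer.join()
    return received
```

The producer can fill one slot while the consumer drains another, and each side blocks only when every slot is full or empty respectively.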

Virtualization

Introduction
  Process abstraction is a virtualization
    But not a very good one (roughly: a program in execution)
  Process descriptors are large
    They refer to a whole lot of data
    Complex memory structures (incl. sharing), complex open files, etc.
    The data binds processes tightly onto the platform
    Migration of processes is heavy (sometimes almost impossible)
  Programs need to co-operate
    Process context switches are heavy
    Therefore IPC (Inter-Process Communication) is slow
  Threads are no good, either
    Parallel processing with threads is a nightmare
    Difficult to fix (we get back to this issue in the next lecture)
    HW threads are hard to replace

Virtualization
  Virtualization
    using resources that do not match real resources
    virtualization is based on real resources
    a virtual printer may be implemented by using several physical printers to do its job
    a virtual machine is usually based on the computational resources of another (physically existing) machine
  Emulation
    a form of virtualization where there exists a physical device whose behavior we mimic (without having the physical device)
  Simulation
    we imitate the operation of a system based on a model
    e.g., we have an abstract model of a memory system and simulate the operation of the memory system by using the model

Virtual computers (1/2)
  CPU
    emulates the CPU instruction set, registers and other internal state
    e.g., a MIPS simulator run on x86 would allow executing MIPS binaries, provided that the binary makes no system calls or external device accesses
  Peripherals
    individual device emulation, like memory, hard disks, networks
    e.g., a distributed role system provides an illusion of a single hard disk even though data is spread over a set of networked disks
  Full system virtualization
    models all parts of a real or fictive system
    e.g., CPU, memory, PCI bus, network interface
    if such hardware exists and is supported by an SDK, we can run real binaries unmodified

Virtual computers (2/2)
  Operating system virtualization
    instances share the same kernel
    several isolated user-space instances
    user-space instances have their own independent file system hierarchy
    resources can be allocated on a per-instance basis
    e.g., Solaris containers
  Application level virtualization
    emulates some run-time requirements of applications
    e.g., system calls of a kernel:
      FreeBSD's Linux system call emulation allows running unmodified Linux binaries on a FreeBSD host
      WINE allows running Windows binaries on Linux
  Programming language virtual machines
    Java Virtual Machine, providing runtime support for Java byte code
    for these, usually there is no real hardware counterpart

Classification of virtualization (1/2)
  Separation of guest and host (and their OSes)
    The host operating system runs on real hardware
    The guest operating system runs in a virtualized environment
  Bare metal architecture
    Hypervisor on real hardware
    Guest on the hypervisor
  Hosted architecture
    Host operating system on real hardware
    Hypervisor runs on the operating system
  In both cases, the guest is running on the hypervisor
    Basically, multiple different guest operating systems on top of real hardware

Classification of virtualization (2/2)
  Full virtualization
    the whole system is completely modelled (CPU, disk, NIC, etc.)
    allows running unmodified guest operating systems
    e.g., system-mode QEMU
  Partial virtualization
    parts of the hardware are simulated, allowing some code to run unmodified
    not a full-blown kernel but some user-land binaries
    e.g., virtual 8086 mode on the x86 architecture
  Paravirtualization
    hardware is not necessarily emulated at all
    the guest OS is modified to be able to run in a paravirtualized environment

Hardware virtualization (1/3)
  A Virtual Machine Monitor (VMM)/hypervisor is capable of virtualizing the full set of hardware resources when the following criteria are met
    equivalence: a program running under the virtual machine monitor behaves essentially identically to one running on an equivalent (real) machine
    safety: the hypervisor or virtual machine monitor must be in complete control of the virtualized resources
    performance: most of the instructions should be executed without virtual machine monitor intervention
  If the safety criterion is broken
    a guest program can take control of the virtualized resources without ever giving control back to the VMM
  If the performance criterion is broken
    it may be too slow to provide any useful service

Hardware virtualization (2/3)
  To derive the conditions for virtualizing a hardware architecture, we have to classify the ISA of a CPU into three categories
  Privileged instructions
    cause a trap or exception when run in user mode
    do not cause any exception when run in kernel mode
  Control sensitive instructions
    change the configuration or state of a resource
    e.g., processor execution mode
  Behavior sensitive instructions
    the result depends on the configuration of a resource
    e.g., the content of the relocation register or the processor mode

Hardware virtualization (3/3)
  An effective VM can be constructed if the set of sensitive instructions is a subset of the set of privileged instructions
  Why?
    All instructions that can affect the functioning of the VMM (i.e., sensitive instructions) must pass control to the VMM
      guarantees the safety criterion
    Non-privileged instructions are executed natively
      guarantees the performance criterion
  Classic or trap-and-emulate virtualization
    the VMM must trap and emulate every sensitive instruction
    run non-sensitive instructions natively, and for the sensitive instructions install a trap handler that is run instead of the OS trap handler
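The subset condition and the trap-and-emulate loop can be written down directly. The sketch below models an ISA as plain sets of instruction names; the instruction names in the usage note (`set_mode`, `read_mode`, and so on) are made up for illustration, and all function names are hypothetical.

```python
def is_classically_virtualizable(sensitive, privileged):
    """True if every sensitive instruction is also privileged (and so traps in user mode)."""
    return set(sensitive).issubset(privileged)


def critical_instructions(sensitive, privileged):
    """Sensitive instructions that do NOT trap: these break trap-and-emulate."""
    return set(sensitive) - set(privileged)


def run_guest(program, privileged):
    """Trace a guest program: privileged instructions trap to the VMM, the rest run natively."""
    trace = []
    for instr in program:
        if instr in privileged:
            trace.append((instr, "trap"))    # VMM emulates the instruction
        else:
            trace.append((instr, "native"))  # executed directly on the CPU
    return trace
```

For an ISA where `set_mode` is both sensitive and privileged, the check succeeds. Add a sensitive but unprivileged `read_mode` and `critical_instructions` reports it, because `run_guest` would let it run natively, outside the VMM's control.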

Virtualization with Intel x86 (1/2)
  The classical x86 architecture had some sensitive instructions that did not produce traps (critical instructions)
    e.g., critical instructions change processor or resource state without allowing the virtual machine monitor to intervene
  This causes problems when the VMM runs multiple OSes
    OS #1 issues the SIDT (Store Interrupt Descriptor Table Register) instruction and installs an interrupt handler vector
    OS #2 issues SIDT
    OS #1 invokes an interrupt
    OS #1 ends up in the OS #2 interrupt handler
    trap-and-emulate would not work
  Classical x86 can perform virtualization by binary translation
    replace sensitive instructions that do not produce traps with instructions that transfer control to the virtual machine monitor
    but the performance for critical instructions is poor
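The core idea of binary translation, rewriting critical instructions into explicit transfers to the VMM, can be sketched on a list of mnemonics. Real translators work on machine code and cache translated blocks; this toy version only shows the substitution step. `translate_block` and the `vmm_call` marker are hypothetical names.

```python
def translate_block(block, critical):
    """Replace each critical instruction with a call into the VMM; keep the rest as-is."""
    translated = []
    for instr in block:
        if instr in critical:
            # The VMM will emulate the original instruction on the guest's behalf.
            translated.append(("vmm_call", instr))
        else:
            translated.append(instr)
    return translated
```

For example, with `critical = {"sidt", "popf"}` (two x86 instructions that are sensitive but do not trap in user mode), the block `["mov", "sidt", "add"]` becomes `["mov", ("vmm_call", "sidt"), "add"]`. The extra control transfer on every critical instruction is where the performance cost noted on the slide comes from.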

Virtualization with Intel x86 (2/2)
  AMD released a virtualization extension called AMD-V in 2005
  Intel followed in 2006, releasing an extension called VT-x, which modifies x86 behavior when running a VMM
    two operation modes: VMM and guest mode
    own address spaces for the VMM and guest OSes
    transfers control to the VMM when an OS uses sensitive instructions
    virtualized interrupt vectors for the guest OS
    Virtual Machine Control Structure used for context switching between guest OS and VMM
  This provided the basic HW virtualization of the CPU
    but peripheral device virtualization was still not very efficient

QEMU
  An emulator using Dynamic Binary Translation (DBT)
    Full software virtualization (no specific HW support required)
  User mode
    user code is natively executed after DBT
    functional emulation of the processor ISA
    the OS and the rest of the system are emulated by QEMU
  System mode
    all code is natively executed after DBT
    all HW is emulated by QEMU
  Supports
    various CPUs: x86, PowerPC, ARM, SPARC, etc.
    a number of various peripherals, PCI and ISA bridges
    network cards, audio cards, USB controllers, hard disks, etc.
  QEMU is lacking proper multicore support
    QEMU can emulate a multicore guest, but using a unicore host (or one core of a multicore)
    memory sharing and coherency are issues here

Xen (1/2)
  The Xen hypervisor is run when the system boots
  dom0: runs a modified version of the Linux kernel (host OS)
    the guest is aware that it is a virtual machine
    makes hypercalls directly, rather than issuing privileged instructions
    provides device drivers for all guests
    uses the Xend daemon to control execution of the guest OSes
    XenStore provides statistics collection
  domU: runs guest operating systems
    an unmodified OS if hardware-assisted virtualization is supported
    otherwise, guests must be paravirtualized (critical instructions translated and device access remapped)

Xen (2/2)

I/O virtualization

Virtual I/O devices
  As with computing, I/O can be virtualized
    I/O operations are done with virtual devices
  The underlying HW implementation may differ significantly
    A memory copy may be realized as network operations
    A network transfer may be realized as a memory copy
  Programmability
    Code portability, migrations, etc. are hard
    Do such things under the hood
  Performance
    I/O operations are typically very slow
    (Parallel) hardware acceleration is often the answer
    Forget about "hand-coded assembler is faster"
    This is obsolete: modern systems are far too complex

KVM
  KVM consists of a loadable generic kernel module (kvm.ko) and specific modules for AMD/Intel
    guest OSes are run under a modified version of the QEMU emulator
    part of the Linux kernel, and uses its scheduler and memory management to do the resource division
    easy to set up (no boot needed)
    no paravirtualization for the CPU, but may support it for I/O
  With respect to QEMU
    QEMU is purely software-based and somewhat slow
  With respect to Xen
    Xen is an external hypervisor
    the host OS needs to be specifically compiled
    supports paravirtualization

Docker
  Applications in software containers
    Abstracts the platform structure away
    Operating system-level virtualization
  Not actually a virtual machine
    Basically a virtualization engine
    Uses Linux container mechanisms
  Process isolation and co-operation
    By using the kernel mechanisms
  Toward distributed systems
    Abstracts the network connection
    Multiple processes, apps, tasks, etc.
    Run on single or multiple hosts
  Allows for lightweight communication
    Docker uses the kernel directly
    Allows for using quotas
    Does not support live migration

I/O virtualization with Intel x86
  Extended Page Tables
    Translate guest-physical to host-physical addresses
    The guest OS can modify its own page tables without the VMM
  IOMMU
    Allows a single guest OS direct access to I/O devices
    techniques: DMA and interrupt remapping
    AMD-Vi and Intel VT-d
  Toward the CPU: Intel CMT and CAT
  Network virtualization
    Network card hardware must support this
    Allows sharing a single network device among multiple guest OSes
    Allows hardware-accelerated I/O operations
    Intel VT-c, SR-IOV, MR-IOV
  PCI-SIG I/O virtualization
    PCIe-standardized, non-x86-specific I/O virtualization methods
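The role of Extended Page Tables is easiest to see as a second translation stage stacked on the guest's own page table. The sketch below is a deliberate simplification: real page tables are multi-level structures walked by hardware, whereas here each table is a flat dictionary from page number to page number. `PAGE_SIZE`, `translate`, and `guest_virtual_to_host_physical` are hypothetical names.

```python
PAGE_SIZE = 4096  # assumed page size in bytes


def translate(addr, table):
    """One translation stage: map the page number through the table, keep the offset."""
    page, offset = divmod(addr, PAGE_SIZE)
    if page not in table:
        raise LookupError(f"no mapping for page {page}")
    return table[page] * PAGE_SIZE + offset


def guest_virtual_to_host_physical(gva, guest_page_table, ept):
    """Stage 1 is owned by the guest OS, stage 2 (the EPT) by the VMM."""
    gpa = translate(gva, guest_page_table)  # guest-virtual to guest-physical
    return translate(gpa, ept)              # guest-physical to host-physical
```

Because the guest only ever edits `guest_page_table`, it can change its own mappings without involving the VMM, while the VMM keeps control of where guest-physical pages really live through `ept`.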

Resource management tools (1/2)
  Virtualization needs basic tools
    Virtualization is an abstraction by definition
    But: there must be resource management, too
    And security, dependability, etc. (remember the basic tasks of the OS)
  Linux provides a wide range of tools, e.g.
  cgroups (control groups)
    An evolution of various mechanisms (check the 2.6.x history)
    A unified interface to many different use cases
    E.g., a memory usage limit for a subsystem
  Namespaces
    Isolating the namespace of a subsystem from the others
    pid, mount, NIC, hostnames, etc.
    E.g., virtual machine isolation
  Governors
    E.g., in the CPUfreq subsystem
    e.g., Performance Governor, Powersave Governor, On-demand Governor, Conservative Governor (what is available depends on the system)
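One way to see which control groups a process belongs to is to read `/proc/<pid>/cgroup` on Linux, where each line has the form `hierarchy-ID:controller-list:cgroup-path`. The parser below is a small sketch of that format; the function name `parse_cgroup_line` and the sample paths in the usage note are illustrative assumptions, and the file's contents vary between systems and cgroup versions.

```python
def parse_cgroup_line(line):
    """Split one /proc/<pid>/cgroup line into (hierarchy_id, controllers, path)."""
    hierarchy_id, controllers, path = line.rstrip("\n").split(":", 2)
    # The controller list is comma-separated; it is empty for a cgroup v2 entry.
    names = controllers.split(",") if controllers else []
    return int(hierarchy_id), names, path
```

A line such as `4:memory:/user.slice` parses to `(4, ["memory"], "/user.slice")`, showing that the process sits in a group to which a memory usage limit could be applied.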

Resource management tools (2/2)