A Userspace Packet Switch for Virtual Machines

SHRINKING THE HYPERVISOR ONE SUBSYSTEM AT A TIME
A Userspace Packet Switch for Virtual Machines
Julian Stecklina, OS Group, TU Dresden
jsteckli@os.inf.tu-dresden.de
VEE 2014, Salt Lake City

Outline
1 Motivation
2 Userspace Switch
3 Evaluation
4 Summary
TU Dresden Userspace Packet Switch slide 2 of 30

01 Microkernels and TCB
Systems built on microkernels are usually structured as multiserver systems with strong isolation between subsystems. Applications depend only on the subsystems they use.
[Diagram labels: kernel, user, µkernel, Application, Networking, Filesystem]

01 Towards Monolithic Hypervisors
[Diagram built up over four slides, split into kernel and user. Labels per step: (1) Qemu, virtio-net; (2) KVM, Qemu, virtio-net; (3) KVM, vapic, Qemu, virtio-net; (4) KVM, vapic, v-net, ...]

01 User vs. Kernel
Attacks by malicious guest code are a serious concern.
Successful attacks on Qemu achieve unprivileged code execution. Dangerous, but manageable.
Successful attacks on KVM achieve code execution in kernel mode. Game over. (You are here.)

01 KVM/vhost Trusted Code
In a modern KVM installation, the complete networking path is in the TCB of all applications on the host: (simple) instruction decoding, the virtio-net device implementation, and the NIC driver. Does it have to be?

02 KVM Networking
[Diagram built up over three slides, split into kernel, user, and guest. Labels in step 1: notification, VCPU A, VHOST, VHOST, VCPU B, sv3, VM Exit, Inject IRQ, VM A, VM B. In step 2 the VHOST labels no longer appear and sv3 appears twice. Step 3 adds CPU 1, CPU 2, CPU 3.]

02 A Userspace Switch: sv3
sv3 is a userspace packet switch that runs as an ordinary process on top of the host Linux. It implements virtio-net, (no) packet memory management, and the NIC driver. Every sv3 instance is a complete, isolated networking subsystem.

02 Vhost and KVM
Qemu loosely ties KVM and vhost together using eventfds, which it sets up via ioctl. KVM can trigger eventfds on VM exits. eventfds can be used to trigger IRQ injection in KVM. eventfds can be used from userspace as well, without using vhost.

02 Tying sv3 into Qemu
Qemu was enhanced to support out-of-process PCI devices. Qemu connects to sv3 via an AF_LOCAL socket. Qemu exchanges fds to establish shared memory. Qemu exchanges eventfds for VM exit notification and IRQ injection. sv3 implements the complete virtio-net logic, and it feels like L4!
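
Passing a descriptor over a local socket and mapping the memory behind it can be sketched as follows. This is an assumption-laden illustration, not the Qemu/sv3 protocol: a `socketpair` stands in for the Qemu-to-sv3 connection, a temporary file stands in for the guest RAM backing object, and `send_memory` / `recv_memory` are hypothetical helper names. It needs a Unix system and Python 3.9+ (`socket.send_fds`); `AF_UNIX` is the Python spelling of `AF_LOCAL`.

```python
import mmap
import os
import socket
import tempfile

PAGE = mmap.PAGESIZE

def send_memory(sock: socket.socket, fd: int) -> None:
    """Hand a file descriptor to the peer as SCM_RIGHTS ancillary data."""
    socket.send_fds(sock, [b"mem"], [fd])

def recv_memory(sock: socket.socket) -> mmap.mmap:
    """Receive one descriptor and map the memory behind it."""
    _msg, fds, _flags, _addr = socket.recv_fds(sock, 16, 1)
    mapping = mmap.mmap(fds[0], PAGE)
    os.close(fds[0])  # the mapping stays valid after the fd is closed
    return mapping

# Stand-ins for the two processes and the guest RAM backing object.
qemu_end, switch_end = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
with tempfile.TemporaryFile() as backing:
    backing.truncate(PAGE)
    guest_view = mmap.mmap(backing.fileno(), PAGE)
    send_memory(qemu_end, backing.fileno())
    switch_view = recv_memory(switch_end)

guest_view[:5] = b"hello"                  # written through one mapping
seen_by_switch = bytes(switch_view[:5])    # visible through the other
```

The receiver never learns a path or name for the memory; holding the descriptor is the only way to reach it, which is why the summary compares fd passing to capabilities.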

02 Zero-Copy Packet Transmission
sv3 creates linear mappings of guest memory with mmap. Packet data can be copied with a plain memcpy. No additional copies are necessary. If no buffer space is available in the receiving VM, the packet is dropped. No dynamic memory management for packets is needed.
[Diagram labels: VM A, sv3, VM B]
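
The copy-or-drop rule above can be modelled in a few lines. This is a toy model under stated assumptions, not the sv3 data path: anonymous mappings stand in for guest memory, the `Port` class and `forward` function are invented for the example, and treating a too-small receive buffer like a missing one is an assumption of this sketch.

```python
import mmap
from collections import deque

GUEST_SIZE = 4096

class Port:
    """One VM as seen by the switch: mapped memory plus free RX buffers."""
    def __init__(self) -> None:
        self.mem = mmap.mmap(-1, GUEST_SIZE)  # stand-in for guest RAM
        self.rx_free = deque()                # (offset, capacity) pairs
        self.dropped = 0

def forward(src: Port, offset: int, length: int, dst: Port) -> bool:
    """Copy one packet from src guest memory into a dst RX buffer."""
    if not dst.rx_free or dst.rx_free[0][1] < length:
        dst.dropped += 1  # no buffer space: drop, never queue
        return False
    dst_off, _capacity = dst.rx_free.popleft()
    # One copy, guest memory to guest memory (the "plain memcpy").
    src_view = memoryview(src.mem)[offset:offset + length]
    memoryview(dst.mem)[dst_off:dst_off + length] = src_view
    return True
```

Because a packet is either copied straight into a buffer the receiver already posted or discarded, the switch itself never has to allocate or queue packet memory.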

02 External Communication
Userspace driver for Intel X GBit NIC using VFIO (requires IOMMU), supporting static offloads: TCP Segmentation Offload, Large Receive Offload, Checksum Offload.
Translating virtio descriptors to HW descriptors allows for zero-copy send with all offloads.
Better reuse existing drivers next time...

02 Switching Loop
sv3 is mostly single-threaded and lockless. Userspace RCU is used to synchronize adding and removing switch ports.
1 disable events on all virtio queues
2 disable HW IRQs
3 poll for work until queues empty
4 enable events/IRQs
5 poll a last time, if packet seen goto 1
6 block on eventfd
In overload scenarios, sv3 naturally operates in polling mode.
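
The six steps can be sketched as a small simulation. This is an illustrative model, not sv3 code: `Queue`, `poll_once`, and `run_until_idle` are hypothetical names, the queues are plain Python deques, and the final blocking read on the eventfd (step 6) is left to the caller.

```python
from collections import deque

class Queue:
    """Stand-in for a virtio queue or NIC ring with a notification switch."""
    def __init__(self) -> None:
        self.packets = deque()
        self.events_enabled = True

def poll_once(queues, handle) -> bool:
    """Process everything currently queued; report whether work was seen."""
    seen = False
    for q in queues:
        while q.packets:
            handle(q.packets.popleft())
            seen = True
    return seen

def run_until_idle(queues, handle) -> int:
    """Run steps 1-5; return the round count once it is safe to block."""
    rounds = 0
    while True:
        rounds += 1
        for q in queues:                   # steps 1 and 2: silence notifications
            q.events_enabled = False
        while poll_once(queues, handle):   # step 3: poll until queues are empty
            pass
        for q in queues:                   # step 4: re-enable notifications
            q.events_enabled = True
        if not poll_once(queues, handle):  # step 5: catch late arrivals
            return rounds                  # step 6: caller blocks on the eventfd
```

The extra poll in step 5 covers a packet that arrives after the queues were found empty but before notifications were switched back on; without it, that packet could sit unnoticed while the switch sleeps. Under sustained load step 3 keeps finding work, so notifications stay off and the loop behaves like a pure poller.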

03 Resource Consumption
Processes are lightweight alternatives to driver VMs.
  sv3          < 2 MiB
  NIC driver   + 14 MiB
  sv3 total    < 16 MiB
  smallest VM  32 MiB [1]
  netback VM   128 MiB
[1] Breaking Up is Hard to Do: Security and Functionality in a Commodity Hypervisor, Colp et al., SOSP '11

03 Evaluation System
Intel Core i7 3770S (Ivy Bridge); C-states, HT, and frequency scaling disabled
16 GiB 159 Gbit/s
Host: Fedora 19 with Linux 3.10 (vanilla)
Guest: Linux 3.10 (vanilla), 256 MiB RAM
Qemu 1.5 (plus patches)

03 Cost of Userspace Execution
[Chart: roundtrip latency in μs, vhost vs. sv3]
Notification-to-IRQ-injection times for vhost vs. sv3 without any packet processing. Cost of the additional trip to the syscall layer and the mode switch.

03 Latency
[Chart: roundtrip latency in μs for vhost-external, vhost-vm, sv3-external, sv3-vm]
Latency measured using netperf UDP_RR.

03 VM-to-VM Bandwidth
[Chart labels: vhost, tso; sv3, tso; sv3 only; CPU Utilization; GBit/s]

03 VM-to-VM Bandwidth (cont'd)
[Chart labels: vhost, no tso; sv3, no tso; sv3 only; CPU Utilization; GBit/s]

03 external-to-vm Bandwidth
[Chart labels: vhost, tso; sv3, tso; sv3 only; CPU Utilization; GBit/s]

04 Summary
sv3 is an efficient lockless userspace packet switch for VMs running on (unmodified) Linux/KVM.
Code:
Linux/KVM has all the mechanisms to make a microkernel-style design possible and efficient:
  rights transfer via AF_LOCAL sockets (capabilities)
  efficient notifications via eventfds
  drivers in userspace via VFIO, using eventfds for IRQs
  tying eventfds to VM exits / IRQ injection
  address space switch cost not a factor in performance
Few reasons to write new systems functionality in kernel mode.
Questions?

04 external-to-vm Bandwidth
[Chart labels: sv3, no tso; sv3 only; CPU Utilization; GBit/s]


More information

The Challenges of X86 Hardware Virtualization. GCC- Virtualization: Rajeev Wankar 36

The Challenges of X86 Hardware Virtualization. GCC- Virtualization: Rajeev Wankar 36 The Challenges of X86 Hardware Virtualization GCC- Virtualization: Rajeev Wankar 36 The Challenges of X86 Hardware Virtualization X86 operating systems are designed to run directly on the bare-metal hardware,

More information

Zero-copy Receive for Virtualized Network Devices

Zero-copy Receive for Virtualized Network Devices Zero-copy Receive for Virtualized Network Devices Kalman Meth, Joel Nider and Mike Rapoport IBM Research Labs - Haifa {meth,joeln,rapoport}@il.ibm.com Abstract. When receiving network traffic on guest

More information

KVM on POWER Status update & IO Architecture

KVM on POWER Status update & IO Architecture KVM on POWER Status update & IO Architecture Benjamin Herrenschmidt benh@au1.ibm.com IBM Linux Technology Center November 2012 Linux is a registered trademark of Linus Torvalds. Reminders 2 different virtualization

More information

Virtio-blk Performance Improvement

Virtio-blk Performance Improvement Virtio-blk Performance Improvement Asias He , Red Hat Nov 8, 2012, Barcelona, Spain KVM FORUM 2012 1 Storage transport choices in KVM Full virtualization : IDE, SATA, SCSI Good guest

More information

Generic Buffer Sharing Mechanism for Mediated Devices

Generic Buffer Sharing Mechanism for Mediated Devices Generic Buffer Sharing Mechanism for Mediated Devices Tina Zhang tina.zhang@intel.com 1 Agenda Background Generic Buffer Sharing in MDEV Framework Status Summary 2 Virtual Function I/O Virtual Function

More information

RESOURCE MANAGEMENT MICHAEL ROITZSCH

RESOURCE MANAGEMENT MICHAEL ROITZSCH Faculty of Computer Science Institute of Systems Architecture, Operating Systems Group RESOURCE MANAGEMENT MICHAEL ROITZSCH AGENDA done: time, drivers today: misc. resources architectures for resource

More information

Support for Smart NICs. Ian Pratt

Support for Smart NICs. Ian Pratt Support for Smart NICs Ian Pratt Outline Xen I/O Overview Why network I/O is harder than block Smart NIC taxonomy How Xen can exploit them Enhancing Network device channel NetChannel2 proposal I/O Architecture

More information

An O/S perspective on networks: Active Messages and U-Net

An O/S perspective on networks: Active Messages and U-Net An O/S perspective on networks: Active Messages and U-Net Theo Jepsen Cornell University 17 October 2013 Theo Jepsen (Cornell University) CS 6410: Advanced Systems 17 October 2013 1 / 30 Brief History

More information

Overview. This Lecture. Interrupts and exceptions Source: ULK ch 4, ELDD ch1, ch2 & ch4. COSC440 Lecture 3: Interrupts 1

Overview. This Lecture. Interrupts and exceptions Source: ULK ch 4, ELDD ch1, ch2 & ch4. COSC440 Lecture 3: Interrupts 1 This Lecture Overview Interrupts and exceptions Source: ULK ch 4, ELDD ch1, ch2 & ch4 COSC440 Lecture 3: Interrupts 1 Three reasons for interrupts System calls Program/hardware faults External device interrupts

More information

Revisiting virtualized network adapters

Revisiting virtualized network adapters Revisiting virtualized network adapters Luigi Rizzo, Giuseppe Lettieri, Vincenzo Maffione, Università di Pisa, Italy rizzo@iet.unipi.it, http://info.iet.unipi.it/ luigi/vale/ Draft 5 feb 2013. Please do

More information

RESOURCE MANAGEMENT MICHAEL ROITZSCH

RESOURCE MANAGEMENT MICHAEL ROITZSCH Faculty of Computer Science Institute of Systems Architecture, Operating Systems Group RESOURCE MANAGEMENT MICHAEL ROITZSCH AGENDA done: time, drivers today: misc. resources architectures for resource

More information

VIRTUALIZATION. Dresden, 2013/12/3. Julian Stecklina

VIRTUALIZATION. Dresden, 2013/12/3. Julian Stecklina Department of Computer Science Institute of Systems Architecture, Operating Systems Group VIRTUALIZATION Julian Stecklina (jsteckli@os.inf.tu-dresden.de) Dresden, 2013/12/3 00 Goals Give you an overview

More information

ARM-KVM: Weather Report Korea Linux Forum

ARM-KVM: Weather Report Korea Linux Forum ARM-KVM: Weather Report Korea Linux Forum Mario Smarduch Senior Virtualization Architect m.smarduch@samsung.com 1 ARM-KVM This Year Key contributors Linaro, ARM Access to documentation & specialized HW

More information

Virtual Machine Virtual Machine Types System Virtual Machine: virtualize a machine Container: virtualize an OS Program Virtual Machine: virtualize a process Language Virtual Machine: virtualize a language

More information

Multi-Hypervisor Virtual Machines: Enabling An Ecosystem of Hypervisor-level Services

Multi-Hypervisor Virtual Machines: Enabling An Ecosystem of Hypervisor-level Services Multi-Hypervisor Virtual Machines: Enabling An Ecosystem of Hypervisor-level s Kartik Gopalan, Rohith Kugve, Hardik Bagdi, Yaohui Hu Binghamton University Dan Williams, Nilton Bila IBM T.J. Watson Research

More information

Advanced Operating Systems (CS 202) Virtualization

Advanced Operating Systems (CS 202) Virtualization Advanced Operating Systems (CS 202) Virtualization Virtualization One of the natural consequences of the extensibility research we discussed What is virtualization and what are the benefits? 2 Virtualization

More information

Network stack virtualization for FreeBSD 7.0. Marko Zec

Network stack virtualization for FreeBSD 7.0. Marko Zec Network stack virtualization for FreeBSD 7.0 Marko Zec zec@fer.hr University of Zagreb Network stack virtualization for FreeBSD 7.0 slide 1 of 18 Talk outline Network stack virtualization what, why, and

More information

Introduction Construction State of the Art. Virtualization. Bernhard Kauer OS Group TU Dresden Dresden,

Introduction Construction State of the Art. Virtualization. Bernhard Kauer OS Group TU Dresden Dresden, Virtualization Bernhard Kauer OS Group TU Dresden bk@vmmon.org Dresden, 2010-07-15 Motivation The vision: general-purpose OS secure trustworthy small fast fancy First problem: Legacy Application Supporting

More information

Live Migration with Mdev Device

Live Migration with Mdev Device Live Migration with Mdev Device Yulei Zhang yulei.zhang@intel.com 1 Background and Motivation Live Migration Desgin of Mediated Device vgpu Live Migration Implementation Current Status and Demo Future

More information

Using (Suricata over) PF_RING for NIC-Independent Acceleration

Using (Suricata over) PF_RING for NIC-Independent Acceleration Using (Suricata over) PF_RING for NIC-Independent Acceleration Luca Deri Alfredo Cardigliano Outlook About ntop. Introduction to PF_RING. Integrating PF_RING with

More information

VT-d Posted Interrupts. Feng Wu, Jun Nakajima <Speaker> Intel Corporation

VT-d Posted Interrupts. Feng Wu, Jun Nakajima <Speaker> Intel Corporation VT-d Posted Interrupts Feng Wu, Jun Nakajima Intel Corporation Agenda Motivation Difference btw CPU-based and VT-d Posted Interrupts Architecture Implementation Details Performance Summary 2

More information

Virtual Open Systems (VOSyS)

Virtual Open Systems (VOSyS) Virtual Open Systems (VOSyS) 2018-06-14 Company Profile contact@virtualopensystems.com 2018-05-05www.virtualopensystems.com Virtual Open Systems: Profile Virtual Open Systems (VOSyS) is a French fully

More information

Re-architecting Virtualization in Heterogeneous Multicore Systems

Re-architecting Virtualization in Heterogeneous Multicore Systems Re-architecting Virtualization in Heterogeneous Multicore Systems Himanshu Raj, Sanjay Kumar, Vishakha Gupta, Gregory Diamos, Nawaf Alamoosa, Ada Gavrilovska, Karsten Schwan, Sudhakar Yalamanchili College

More information