Real-Time KVM for the Masses Unrestricted Siemens AG All rights reserved

Similar documents
Achieve Low Latency NFV with Openstack*

Red Hat Enterprise Virtualization Hypervisor Roadmap. Bhavna Sarathy Senior Technology Product Manager, Red Hat

Nova Scheduler: Optimizing, Configuring and Deploying NFV VNF's on OpenStack

Who stole my CPU? Leonid Podolny Vineeth Remanan Pillai. Systems DigitalOcean

A Userspace Packet Switch for Virtual Machines

KVM Weather Report. Amit Shah SCALE 14x

KVM 在 OpenStack 中的应用. Dexin(Mark) Wu

TITANIUM CLOUD VIRTUALIZATION PLATFORM

Virtio-blk Performance Improvement

Configuring and Benchmarking Open vswitch, DPDK and vhost-user. Pei Zhang ( 张培 ) October 26, 2017

Vhost and VIOMMU. Jason Wang (Wei Xu Peter Xu

KVM PERFORMANCE OPTIMIZATIONS INTERNALS. Rik van Riel Sr Software Engineer, Red Hat Inc. Thu May

VDPA: VHOST-MDEV AS NEW VHOST PROTOCOL TRANSPORT

Status Update About COLO FT

Virtual Open Systems (VOSyS)

INSTALLATION RUNBOOK FOR Netronome Agilio OvS. MOS Version: 8.0 OpenStack Version:

Deflating the hype: Embedded Virtualization in 3 steps

DEPLOYING NFV: BEST PRACTICES

Red Hat OpenStack Platform 10

Real Safe Times in the Jailhouse Hypervisor Unrestricted Siemens AG All rights reserved

DPDK Vhost/Virtio Performance Report Release 17.08

Installing the Cisco IOS XRv 9000 Router in KVM Environments

HP Helion OpenStack Carrier Grade 1.1: Release Notes

Changpeng Liu, Cloud Software Engineer. Piotr Pelpliński, Cloud Software Engineer

Applying Polling Techniques to QEMU

QLOGIC SRIOV Fuel Plugin Documentation

Changpeng Liu. Senior Storage Software Engineer. Intel Data Center Group

HPE Helion OpenStack Carrier Grade 1.1 Release Notes HPE Helion

Xen on ARM. How fast is it, really? Stefano Stabellini. 18 August 2014

DPDK Vhost/Virtio Performance Report Release 18.05

KVM as The NFV Hypervisor

Deep Insights: High Availability VMs via a Simple Host-to-Guest Interface OpenStack Masakari Greg Waines (Wind River Systems)

KVM Virtualized I/O Performance

DPDK Vhost/Virtio Performance Report Release 18.11

1. What is Cloud Computing (CC)? What are the Pros and Cons of CC? Technologies of CC 27

VIRTIO: VHOST DATA PATH ACCELERATION TORWARDS NFV CLOUD. CUNMING LIANG, Intel

Helion OpenStack Carrier Grade 4.0 RELEASE NOTES

Zhang Chen Zhang Chen Copyright 2017 FUJITSU LIMITED

For Performance and Latency, not for Fun

CHALLENGES FOR NEXT GENERATION FOG AND IOT COMPUTING: PERFORMANCE OF VIRTUAL MACHINES ON EDGE DEVICES WEDNESDAY SEPTEMBER 14 TH, 2016 MARK DOUGLAS

Implementing a TCP Broadband Speed Test in the Cloud for Use in an NFV Infrastructure

Network stack virtualization for FreeBSD 7.0. Marko Zec

Data Path acceleration techniques in a NFV world

Live block device operations in QEMU

KVM Weather Report. Red Hat Author Gleb Natapov May 29, 2013

WIND RIVER TITANIUM CLOUD FOR TELECOMMUNICATIONS

VIRTIO-NET: VHOST DATA PATH ACCELERATION TORWARDS NFV CLOUD. CUNMING LIANG, Intel

Changpeng Liu. Cloud Storage Software Engineer. Intel Data Center Group

Developing cloud infrastructure from scratch: the tale of an ISP

Cloud and Datacenter Networking

Performance Optimization on Huawei Public and Private Cloud

Painless switch from proprietary hypervisor to QEMU/KVM. Denis V. Lunev

Next Gen Virtual Switch. CloudNetEngine Founder & CTO Jun Xiao

New Approach to OVS Datapath Performance. Founder of CloudNetEngine Jun Xiao

What is KVM? KVM patch. Modern hypervisors must do many things that are already done by OSs Scheduler, Memory management, I/O stacks

Accelerating NVMe I/Os in Virtual Machine via SPDK vhost* Solution Ziye Yang, Changpeng Liu Senior software Engineer Intel

Performance and Scalability of Server Consolidation

Red Hat OpenStack Platform 13

Landslide Virtual Machine Installation Instructions

Red Hat Virtualization 4.1 Technical Presentation May Adapted for MSP RHUG Greg Scott

Introduction to OpenStack Trove

VGA Assignment Using VFIO. Alex Williamson October 21 st, 2013

Real Time BoF ELC 2012

Improve VNF safety with Vhost-User/DPDK IOMMU support

Vhost dataplane in Qemu. Jason Wang Red Hat

Task Scheduling of Real- Time Media Processing with Hardware-Assisted Virtualization Heikki Holopainen

Low overhead virtual machines tracing in a cloud infrastructure

Contrail Release Release Notes

Libvirt: a virtualization API and beyond

Course Review. Hui Lu

Agenda. About us Why para-virtualize RDMA Project overview Open issues Future plans

Agilio CX 2x40GbE with OVS-TC

Fuel VMware DVS plugin testing documentation

Abstract. Testing Parameters. Introduction. Hardware Platform. Native System

Agenda How DPDK can be used for your Application DPDK Ecosystem boosting your Development Meet the Community Challenges

Measuring a 25 Gb/s and 40 Gb/s data plane

Hypervisors on ARM Overview and Design choices

Accelerating NVMe-oF* for VMs with the Storage Performance Development Kit

Installing the Cisco CSR 1000v in KVM Environments

GRNET Cloud Services

Performance & Scalability Testing in Virtual Environment Hemant Gaidhani, Senior Technical Marketing Manager, VMware

ARM-KVM: Weather Report Korea Linux Forum

Storage Performance Tuning for FAST! Virtual Machines

Finding your way through the QEMU parameter jungle

Automatic NUMA Balancing. Rik van Riel, Principal Software Engineer, Red Hat Vinod Chegu, Master Technologist, HP

Freescale, the Freescale logo, AltiVec, C-5, CodeTEST, CodeWarrior, ColdFire, C-Ware, t he Energy Efficient Solutions logo, mobilegt, PowerQUICC,

Spring 2017 :: CSE 506. Introduction to. Virtual Machines. Nima Honarmand

OpenNebula on VMware: Cloud Reference Architecture

Contrail Release Release Notes

KVM on Embedded Power Architecture Platforms

Chapter 5 C. Virtual machines

libvirt integration and testing for enterprise KVM/ARM Drew Jones, Eric Auger Linaro Connect Budapest 2017 (BUD17)

CLOUD ARCHITECTURE & PERFORMANCE WORKLOADS. Field Activities

Nested Virtualization and Server Consolidation

Real-Time Internet of Things

Build Cloud like Rackspace with OpenStack Ansible

Towards Massive Server Consolidation

Yabusame: Postcopy Live Migration for Qemu/KVM

XEN and KVM in INFN production systems and a comparison between them. Riccardo Veraldi Andrea Chierici INFN - CNAF HEPiX Spring 2009

Freescale, the Freescale logo, AltiVec, C-5, CodeTEST, CodeWarrior, ColdFire, C-Ware, t he Energy Efficient Solutions logo, mobilegt, PowerQUICC,

Transcription:

Siemens Corporate Technology August 2015 Real-Time KVM for the Masses Unrestricted Siemens AG 2015. All rights reserved

Real-Time KVM for the Masses Agenda Motivation & requirements Reference architecture Compute node setup Open Stack adaptions Summary & outlook Page 2 August 2015 Jan Kiszka, Corporate Technology Unrestricted Siemens AG 2015. All rights reserved

Real-Time Virtualization Drivers Communication systems (media streaming & switching, etc.) Trading systems (stocks, goods, etc.) Control systems (industry, healthcare, transportation, etc.) => Consolidation => Hardware standardization => Simpler maintenance => Fast fail-over Images: Ethernet switch by Ben Stanfield, licensed under CC BY-SA 2.0, stock market by Katrina.Tuliao, licensed under CC BY 2.0 Page 3 August 2015 Jan Kiszka, Corporate Technology Unrestricted Siemens AG 2015. All rights reserved

Real-Time KVM is working! Can I have it in the cloud? Page 4 August 2015 Jan Kiszka, Corporate Technology Unrestricted Siemens AG 2015. All rights reserved

Real-Time Clouds? No Problem! Oh, you wanna do I/O as well... Page 5 August 2015 Jan Kiszka, Corporate Technology Unrestricted Siemens AG 2015. All rights reserved

Real-Time Connectivity Required Page 6 August 2015 Jan Kiszka, Corporate Technology Unrestricted Siemens AG 2015. All rights reserved

Realistic Deployments Requirement: Fast enough links to close loops in time Data acquisition (physical world input) Transfer to VM Data processing ( in VM on RT-KVM) Transfer back Data application (physical world output) That means Private cloud / data center / server cluster close to physical process RT VMs will require access to special networks Isolated standard networks Real-time Ethernets Field buses Page 7 August 2015 Jan Kiszka, Corporate Technology Unrestricted Siemens AG 2015. All rights reserved

Confining the Real-Time Scope No QEMU in the loop (feasible but much harder) No RT disks (no use case yet, non-deterministic backends) I/O via Ethernet (common denominator) No device pass-through (feasible but complex) No live migration while RT-operational (out of reach so far) The reduced RT bill of material RT CPUs RT network Page 8 August 2015 Jan Kiszka, Corporate Technology Unrestricted Siemens AG 2015. All rights reserved

Management Layers Moving from the lab... Hand-crafted deployments & starter scripts Individual hosts Some dozen VMs per host...into the data center Hundreds of VMs, both RT and non-rt Many networks, also both RT and non-rt Flexible management and accounting models Cloud-grade, RT-capable managements stack required => OpenStack Broadly used for private clouds Good integration with KVM Page 9 August 2015 Jan Kiszka, Corporate Technology Unrestricted Siemens AG 2015. All rights reserved

Reference Architecture RT Guest Nova EMU Controller Services PREEMPT-RT with KVM Real-time CPUs (isolated) Compute Node Hardware Best-effort CPUs Separate HW or in VM Page 10 August 2015 Jan Kiszka, Corporate Technology Unrestricted Siemens AG 2015. All rights reserved

Real-Time Network Access Options Emulation Pass-through Para-virtual devices => virtio Need for RT data plane vhost-net: in host kernel vhost-user: in separate userspace process vhost-user enables more RT tuning DPDK-based switch/router Aggressive polling on interfaces, less event signaling Only irqfd (eventfd) from vhost process to vcpu thread Page 11 August 2015 Jan Kiszka, Corporate Technology Unrestricted Siemens AG 2015. All rights reserved

Reference Architecture (with Networking) RT Switch/ Router vhostuser vnic RT Guest EMU Nova Controller Services PREEMPT-RT with KVM pnic Real-time CPUs (isolated) Compute Node Hardware Best-effort CPUs Separate HW or in VM Page 12 August 2015 Jan Kiszka, Corporate Technology Unrestricted Siemens AG 2015. All rights reserved

Compute Node Setup PREEMPT-RT as host kernel Configuration and tuning according to https://rt.wiki.kernel.org Tune power management at kernel and also BIOS-level See also Rik van Riel's slides (KVM Forum 2015) Set up isolcpus for 2 sets vcpu threads RT switch data plane threads Sufficient non-isolated CPUs required Management processes & threads QEMU event threads We use rcu_nocbs == isolcpus so far (but not nohz_full found no relevant impact on worst-case latency) Page 13 August 2015 Jan Kiszka, Corporate Technology Unrestricted Siemens AG 2015. All rights reserved

Compute Node Setup (2) Think about RT thread throttling /proc/sys/kernel/sched_rt_period_us /proc/sys/kernel/sched_rt_runtime_us May suspend busy RT guests But infinitely looping RT guests can starve the host! isolcpus does not affect IRQ affinities Needs fine tuning via script and/or irqbalanced Even more tuning feasible... But... do your guests need really this? Page 14 August 2015 Jan Kiszka, Corporate Technology Unrestricted Siemens AG 2015. All rights reserved

Simplifying the Setup Bad news: Good news: Still lots of tuning... Can be replicated to similar hosts Better news: There is a tooling framework! https://github.com/openenealinux/rt-tools.git partrt - Create real time CPU partitions on SMP Linux Usage: partrt [options] <cmd> partrt [options] create [cmd-options] [cpumask] partrt [options] undo [cmd-options] partrt [options] run [cmd-options] <partition> <command> partrt [options] move [cmd-options] <pid> <partition> partrt [options] list [cmd-options] Uses cgroups + various tricks, avoids isolcpus (=> no reboot) Pending full evaluation, seems reusable so far Page 15 August 2015 Jan Kiszka, Corporate Technology Unrestricted Siemens AG 2015. All rights reserved

RT-KVM Control via libvirt libvirt only executing higher layers' commands, no own policies All required controls upstream since 1.2.13 For RT-vCPUs Pinning Scheduling parameters setting (policy, priority) Memory locking For RT networks QEMU settings to allow sharing of guest RAM with vhost-user process Connecting VM NICs to specific vhost-user ports (identification via socket path) Page 16 August 2015 Jan Kiszka, Corporate Technology Unrestricted Siemens AG 2015. All rights reserved

OpenStack Support for Real-Time Nova Compute Several pieces already available vcpu pinning pcpu dedication RT Blueprint under discussion (https://review.openstack.org/#/c/139688 ) Introduces flavor property hw:cpu_realtime Allows tagging of instances and images Requires hw:cpu_policy = dedicated Selects QEMU memory locking vcpu thread policy & priority tuning Deficits Hard-coded and inappropriate policy/priority (RR, prio 1) 2 nd CPU mask required to differentiate between RT and non-rt pcpus Page 17 August 2015 Jan Kiszka, Corporate Technology Unrestricted Siemens AG 2015. All rights reserved

Real-Time Nova Compute Status Patches by Sahid Ferdjaoui, Red Hat https://review.openstack.org/#/q/status:open+project:openstack/ nova+branch:master+topic:bp/libvirt-real-time,n,z Implements current blueprint over git master Not accepted for Liberty Blueprint needs to be merged first but window already closed New target: Mitaka Currently integrating Sahid's patches into our deployment Plan to come up with extensions to blueprint and code Page 18 August 2015 Jan Kiszka, Corporate Technology Unrestricted Siemens AG 2015. All rights reserved

OpenStack Support for Real-Time Neutron Networking If Neutron shall manage IP assignment for RT networks all done But RT networks tend to be special Network addresses managed by guests or externally Possibly no TCP/IP at all => new network type required Neutron patches work in progress @Siemens Introduce unmanaged networks (IP-free, no DHCP,... ) Agents on compute nodes will report connectivity (availability of specific physical networks) Page 19 August 2015 Jan Kiszka, Corporate Technology Unrestricted Siemens AG 2015. All rights reserved

Results? void get_measurements(void); Page 20 August 2015 Jan Kiszka, Corporate Technology Unrestricted Siemens AG 2015. All rights reserved

Summary & Outlook Simplify real-time for data centers & similar setups Standardize setup of basic RT scenarios Make RT VMs manageable and accountable Full RT stack of KVM & OpenStack feasible Baseline: PREEMPT-RT Standard QEMU & libvirt Patches for Nova and Neutron required Compute node tuning remains improvable Future work RT PCI device assignment (challenge: IRQ management) Compute node setup using rt-tools/partrt RT device emulation (requires reworked QEMU patches) Page 21 August 2015 Jan Kiszka, Corporate Technology Unrestricted Siemens AG 2015. All rights reserved

Any Questions? Thank you! Jan Kiszka <jan.kiszka@siemens.com> Page 22 August 2015 Jan Kiszka, Corporate Technology Unrestricted Siemens AG 2015. All rights reserved