Applications of KVM in OpenStack. Dexin (Mark) Wu

Agenda
* Overview
* CPU
* Memory
* Storage
* Network

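Architecture Overview
[Diagram: Nova services nova-api (REST API), nova-scheduler, nova-conductor, and nova-compute, with the DB and RPC calls between them. Below nova-compute: libvirt driver, libvirt, qemu driver, QMP monitor, Qemu/KVM. Cinder is shown with storage and the disk driver; Neutron is shown with tap devices and the network (switch, router).]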

KVM/QEMU Model
* Each VM is a QEMU process
* Each vCPU is a QEMU thread
* Kernel facilities are reused
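
Because each vCPU is an ordinary thread of the QEMU process, it can be observed under /proc like any other thread. The following is a minimal sketch, assuming a Linux host and assuming QEMU was started with thread naming enabled so that vCPU threads are named like `CPU 0/KVM`. The function name `list_vcpu_threads` and its `proc_root` parameter are hypothetical and only used for illustration.

```python
import os
import re

# Assumption: vCPU threads are named "CPU <n>/KVM" (QEMU thread naming enabled).
VCPU_NAME = re.compile(r"^CPU (\d+)/KVM$")


def list_vcpu_threads(pid, proc_root="/proc"):
    """Return {vcpu_index: thread_id} for the QEMU process `pid`."""
    task_dir = os.path.join(proc_root, str(pid), "task")
    vcpus = {}
    for tid in os.listdir(task_dir):
        try:
            with open(os.path.join(task_dir, tid, "comm")) as f:
                name = f.read().strip()
        except OSError:
            continue  # the thread exited while we were scanning
        match = VCPU_NAME.match(name)
        if match:
            vcpus[int(match.group(1))] = int(tid)
    return vcpus
```

The returned thread IDs are what kernel facilities such as the scheduler and cgroups operate on, which is why they can be reused for VMs without a separate mechanism.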

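Cgroup
* Weight
  * quota:cpu_shares
  * No hard limit
* Bandwidth control
  * quota:cpu_period
  * quota:cpu_quota
  * A VM can't run for more than cpu_quota microseconds in each cpu_period
[Diagram: weight hierarchy. Under the root, Gold has weight 3072 and Silver has weight 2048, giving them 60% and 40% of the CPU. Gold contains A and B, Silver contains C and D, each with weight 1024. A and B get 30% each, C and D get 20% each, for a total of 100%.]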
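
The percentages in the weight diagram follow from dividing each group's weight by the sum of its siblings' weights, level by level. The sketch below reproduces that arithmetic for the slide's numbers under full contention; it also shows the bandwidth cap as quota divided by period. The function names are hypothetical, and the quota and period are assumed to be in the same unit.

```python
def effective_shares(tree, parent_fraction=1.0):
    """Flatten a weight hierarchy into each group's CPU fraction under full contention.

    `tree` maps a group name to (cpu_shares, children), where children is a
    dict of the same shape (empty for a leaf).
    """
    total = sum(weight for weight, _ in tree.values())
    result = {}
    for name, (weight, children) in tree.items():
        fraction = parent_fraction * weight / total
        result[name] = fraction
        result.update(effective_shares(children, fraction))
    return result


def bandwidth_cap(cpu_quota, cpu_period):
    """Hard cap expressed as a number of CPUs: runtime allowed per period."""
    return cpu_quota / cpu_period


# Hierarchy from the slide: Gold (3072) and Silver (2048) under the root,
# each holding two groups with weight 1024.
slide_tree = {
    "Gold": (3072, {"A": (1024, {}), "B": (1024, {})}),
    "Silver": (2048, {"C": (1024, {}), "D": (1024, {})}),
}
shares = effective_shares(slide_tree)
```

Weights only matter when groups compete for the CPU; an idle sibling's share is available to the others, which is why weights give no hard limit while the quota does.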

CPU topology
[Diagram: two sockets, Socket 0 and Socket 1, each with its own local memory and forming NUMA Node 0 and NUMA Node 1; the other node's memory is remote. Each socket has four cores (Core 0 to Core 3), each core with its own L1 instruction, L1 data, and L2 caches, and one L3 cache per socket. Logical CPUs cpu0 to cpu7 are drawn on Socket 0 and cpu8 to cpu15 on Socket 1, two per core.]

vCPU topology
* Benefit
  * Remove licensing restrictions
  * Improve performance by working with vCPU pinning
* Implemented in Juno
  * hw:cpu_sockets=nn - preferred number of sockets to expose to the guest
  * hw:cpu_cores=nn - preferred number of cores to expose to the guest
  * hw:cpu_threads=nn - preferred number of threads to expose to the guest
  * hw:cpu_max_sockets=nn - maximum number of sockets to expose to the guest
  * hw:cpu_max_cores=nn - maximum number of cores to expose to the guest
  * hw:cpu_max_threads=nn - maximum number of threads to expose to the guest
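
To see how the maximum values constrain the topology, the sketch below enumerates every sockets/cores/threads combination whose product equals the flavor's vCPU count. It is an illustration of the constraint only, not Nova's actual selection logic; `possible_topologies` is a hypothetical helper.

```python
def possible_topologies(vcpus, max_sockets, max_cores, max_threads):
    """List every (sockets, cores, threads) whose product equals `vcpus`
    and that respects the given upper bounds."""
    found = []
    for sockets in range(1, min(vcpus, max_sockets) + 1):
        for cores in range(1, min(vcpus, max_cores) + 1):
            for threads in range(1, min(vcpus, max_threads) + 1):
                if sockets * cores * threads == vcpus:
                    found.append((sockets, cores, threads))
    return found


# An 8-vCPU guest limited to 2 sockets and 2 threads per core.
candidates = possible_topologies(8, max_sockets=2, max_cores=8, max_threads=2)
# candidates == [(1, 4, 2), (1, 8, 1), (2, 2, 2), (2, 4, 1)]
```

The preferred values (hw:cpu_sockets and so on) would then pick among such candidates; how ties are broken is left out here.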

vNUMA
* Benefit
  * Increase the effective utilization of compute resources
* Implemented in Juno: virt-driver-numa-placement.rst
  * hw:numa_nodes=nn - number of NUMA nodes to expose to the guest
  * hw:numa_mempolicy=preferred|strict - memory allocation policy
  * hw:numa_cpus.0=<cpu-list> - mapping of vCPUs N-M to NUMA node 0
  * hw:numa_cpus.1=<cpu-list> - mapping of vCPUs N-M to NUMA node 1
  * hw:numa_mem.0=<ram-size> - mapping N GB of RAM to NUMA node 0
  * hw:numa_mem.1=<ram-size> - mapping N GB of RAM to NUMA node 1
* QEMU and libvirt dependencies
  -object memory-ram,size=1024m,policy=bind,host-nodes=0,id=ram-node0 \
  -numa node,nodeid=0,cpus=0,memdev=ram-node0
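
The per-node keys only make sense together: the vCPU lists should partition the guest's vCPUs and the per-node memory should add up to the flavor's RAM. The sketch below checks those two properties for a set of extra specs. It is an assumption-laden illustration rather than Nova's validation code: `parse_cpu_list` and `check_numa_specs` are hypothetical helpers, and memory is assumed to be given in the same unit as the `ram` argument.

```python
def parse_cpu_list(text):
    """Expand a CPU list such as "0-3,8" into a set of integers."""
    cpus = set()
    for part in text.split(","):
        if "-" in part:
            first, last = part.split("-")
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return cpus


def check_numa_specs(extra_specs, vcpus, ram):
    """Return a list of problems found in hw:numa_* extra specs (empty if none)."""
    problems = []
    nodes = int(extra_specs["hw:numa_nodes"])
    seen = set()
    total_mem = 0
    for node in range(nodes):
        cpus = parse_cpu_list(extra_specs["hw:numa_cpus.%d" % node])
        if cpus & seen:
            problems.append("node %d reuses vCPUs %s" % (node, sorted(cpus & seen)))
        seen |= cpus
        total_mem += int(extra_specs["hw:numa_mem.%d" % node])
    if seen != set(range(vcpus)):
        problems.append("vCPU lists do not cover vCPUs 0..%d exactly" % (vcpus - 1))
    if total_mem != ram:
        problems.append("per-node memory sums to %d, expected %d" % (total_mem, ram))
    return problems


specs = {
    "hw:numa_nodes": "2",
    "hw:numa_cpus.0": "0-1",
    "hw:numa_cpus.1": "2-3",
    "hw:numa_mem.0": "2048",
    "hw:numa_mem.1": "2048",
}
```

Calling `check_numa_specs(specs, vcpus=4, ram=4096)` returns an empty list for this two-node layout.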

Other Features
* vCPU pinning
  * Approved in Kilo: virt-driver-cpu-pinning.rst
  * Dedicated CPU
  * Forbid overcommit of CPU
* vCPU hotplug
  * 'live-resize' proposed, but not approved yet
  * virsh command: setvcpus domain count --live
  * Automatically online the new vCPU in the guest
    * udev rule
    * Guest agent

Physical memory virtualization
* Guest physical memory is mapped into the QEMU virtual address space
* The mapping is maintained in memory slots
* QEMU uses malloc or mmap to allocate memory
* Kernel memory features are reused
  * Overcommit
  * Hugepage
  * KSM
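
A memory slot can be pictured as a record that ties a range of guest physical addresses to a range of the QEMU process's virtual addresses. The sketch below models that lookup in plain Python; it is a conceptual illustration, not KVM's data structure, and `MemorySlot` and `gpa_to_hva` are hypothetical names.

```python
from collections import namedtuple

# A guest-physical range backed by a range of the QEMU process's
# virtual address space (obtained with malloc or mmap).
MemorySlot = namedtuple("MemorySlot", "guest_phys_addr size userspace_addr")


def gpa_to_hva(slots, gpa):
    """Translate a guest physical address to a host virtual address."""
    for slot in slots:
        offset = gpa - slot.guest_phys_addr
        if 0 <= offset < slot.size:
            return slot.userspace_addr + offset
    raise LookupError("guest physical address 0x%x is not backed by any slot" % gpa)


# Two made-up slots: low memory and a region starting at 1 MiB.
slots = [
    MemorySlot(0x0, 0x1000, 0x7F0000000000),
    MemorySlot(0x100000, 0x2000, 0x7F0000100000),
]
```

Because the backing memory is ordinary process memory, the host kernel can overcommit it, back it with hugepages, or merge identical pages with KSM.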

Memory Hugepage
* Approved in Kilo: virt-driver-large-pages.rst
* Benefit
  * Higher TLB hit ratio
  * Smaller page table footprint
* Why not THP?
  * No hard guarantees
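
Both benefits come down to simple arithmetic: larger pages mean fewer pages to map the same memory, and each TLB entry covers more of it. The sketch below works this out for common x86 page sizes; the 64-entry TLB is a made-up figure used only for illustration, and the helper names are hypothetical.

```python
KIB, MIB, GIB = 1024, 1024 ** 2, 1024 ** 3


def pages_needed(memory_bytes, page_size):
    """Number of pages (and leaf page-table entries) needed to map the memory."""
    return -(-memory_bytes // page_size)  # ceiling division


def tlb_reach(entries, page_size):
    """Bytes of memory covered by a TLB holding `entries` translations."""
    return entries * page_size


guest_ram = 4 * GIB
small = pages_needed(guest_ram, 4 * KIB)   # 1048576 pages
large = pages_needed(guest_ram, 2 * MIB)   # 2048 pages
huge = pages_needed(guest_ram, 1 * GIB)    # 4 pages

reach_4k = tlb_reach(64, 4 * KIB)          # 256 KiB
reach_2m = tlb_reach(64, 2 * MIB)          # 128 MiB
```

For a 4 GiB guest, moving from 4 KiB to 2 MiB pages cuts the number of leaf mappings by a factor of 512.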

Memory Balloon (1)
* Memory overcommit
* Balloon device is added by default
* Missing overcommit manager
[Diagram: three guests, each running in its own QEMU process, with the balloon shown inflating and deflating.]

Memory Balloon (2)
* Guest memory stats query
  * More detailed and accurate
  * Re-enabled using polling instead of an asynchronous query
  * Not real time
* Nova support available in Juno
  * CONF.libvirt.mem_stats_period_seconds
* Ceilometer support available in Kilo
[Diagram: inside QEMU, a polling thread collects memory stats from the balloon in the guest; a client fetches the last update synchronously.]

Memory Hotplug
* Added in QEMU 2.1
* Libvirt support is under development
* QEMU commands
  (qemu) object_add memory-backend-ram,id=ram1,size=1g
  (qemu) device_add pc-dimm,id=d1,memdev=ram1
* Auto online via udev
  SUBSYSTEM=="cpu", ACTION=="add", TEST=="online", ATTR{online}=="0", ATTR{online}="1"
  SUBSYSTEM=="memory", ACTION=="add", TEST=="state", ATTR{state}=="offline", ATTR{state}="online"
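
The memory udev rule amounts to writing "online" into the sysfs state file of each newly added, still offline memory block. The sketch below does the same thing by hand inside the guest. It assumes the usual Linux sysfs layout under /sys/devices/system/memory and root privileges; `online_memory_blocks` is a hypothetical helper, and its `sysfs_root` parameter exists only so the logic can be tried against a scratch directory.

```python
import glob
import os


def online_memory_blocks(sysfs_root="/sys/devices/system/memory"):
    """Bring every offline memory block online; return the blocks changed."""
    changed = []
    pattern = os.path.join(sysfs_root, "memory*", "state")
    for state_path in sorted(glob.glob(pattern)):
        with open(state_path) as f:
            state = f.read().strip()
        if state == "offline":
            # Same effect as ATTR{state}="online" in the udev rule.
            with open(state_path, "w") as f:
                f.write("online")
            changed.append(os.path.basename(os.path.dirname(state_path)))
    return changed
```

A udev rule is still the better fit in practice because it reacts to the hotplug event itself instead of requiring someone to run a script afterwards.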

Storage Architecture
* Frontend: IDE, SCSI, Virtio
* Image format: Raw, Qcow2, QED, VMDK
* Backend: File, host, ceph, glusterfs, sheepdog, iscsi

Cache Mode

Cache Mode    | Host Page Cache | Guest Disk Cache           | Semantics
none          | No              | Yes                        | direct
directsync    | No              | No                         | direct + flush
writeback     | Yes             | Yes                        |
writethrough  | Yes             | No                         | writeback + flush
unsafe        | Yes             | Yes, but flush is ignored  | writeback - flush

* Configuration
  * direct = O_DIRECT
  * flush = fdatasync or fsync
  * disk_cachemodes = file=directsync,block=none
* Is writeback safe?
  * Data lost on power failure
  * Data corruption
  * Guest FS barrier
  * Live migration
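
The cache mode table can be expressed as three independent switches: whether the host page cache is bypassed, whether the guest sees a write cache, and whether flushes are ignored. The sketch below encodes the table that way and parses a `disk_cachemodes` value such as the one on the slide. QEMU calls the flush-ignoring mode `unsafe`; the helper names `describe` and `parse_disk_cachemodes` are hypothetical.

```python
# (bypass host page cache, guest sees a write cache, flushes ignored)
CACHE_MODES = {
    "none":         (True,  True,  False),
    "directsync":   (True,  False, False),
    "writeback":    (False, True,  False),
    "writethrough": (False, False, False),
    "unsafe":       (False, True,  True),
}


def describe(mode):
    """Return the table row for a cache mode as a dict of booleans."""
    direct, writeback, no_flush = CACHE_MODES[mode]
    return {
        "host_page_cache": not direct,   # O_DIRECT bypasses the host page cache
        "guest_disk_cache": writeback,   # guest must flush to make data durable
        "flush_honored": not no_flush,
    }


def parse_disk_cachemodes(value):
    """Parse a disk_cachemodes value such as "file=directsync,block=none"."""
    modes = {}
    for item in value.split(","):
        disk_type, mode = item.strip().split("=")
        if mode not in CACHE_MODES:
            raise ValueError("unknown cache mode: %s" % mode)
        modes[disk_type] = mode
    return modes
```

For example, `parse_disk_cachemodes("file=directsync,block=none")` yields `{"file": "directsync", "block": "none"}`, and `describe("none")` reports no host page cache but a guest-visible disk cache.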

I/O throttling
* Why not cgroup?
* Exposed by Cinder QoS spec
* Currently missing online update support
* Newer QEMU versions re-implement throttling based on a leaky bucket
  * Supports bursts
* Missing cluster-level I/O throttling
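
The leaky bucket idea is easy to show in a few lines: every request adds to a bucket that drains at the configured rate, and the bucket's capacity is the burst that can be absorbed at once. The class below is a toy model under those assumptions, not QEMU's throttling code; `LeakyBucket` and its method names are hypothetical, and time is passed in explicitly so the behavior is deterministic.

```python
class LeakyBucket:
    """Toy leaky bucket: the level drains at `rate` units per second and a
    request is admitted only if it fits under `burst` units."""

    def __init__(self, rate, burst):
        self.rate = float(rate)
        self.burst = float(burst)
        self.level = 0.0
        self.last = 0.0

    def _leak(self, now):
        # Drain the bucket for the time elapsed since the last call.
        elapsed = now - self.last
        self.level = max(0.0, self.level - elapsed * self.rate)
        self.last = now

    def try_consume(self, now, amount=1.0):
        """Admit `amount` units at time `now` (seconds); return True if allowed."""
        self._leak(now)
        if self.level + amount <= self.burst:
            self.level += amount
            return True
        return False

    def wait_time(self, now, amount=1.0):
        """Seconds until `amount` units would be admitted."""
        self._leak(now)
        overflow = self.level + amount - self.burst
        return max(0.0, overflow / self.rate)
```

With `LeakyBucket(rate=10, burst=10)`, ten one-unit requests at time 0 are admitted as a burst, the eleventh is rejected, and after half a second five more fit.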

Discard
* Returns freed blocks to the storage
* Two underlying specifications
  * ATA TRIM command
  * SCSI UNMAP
* Nova configuration
  * hw_disk_discard=unmap
* Image metadata
  * hw_scsi_model=virtio-scsi
* Issued from the guest
  * fstrim, mount option '-o discard'
* Supported in file, qcow2, rbd, glusterfs, sheepdog, iscsi

Virtio SCSI
* vHBA
  * Improve scalability
  * Enable advanced SCSI features
  * Recognized as 'sda', not 'vda'
* vhost-scsi
  * Better performance
  * No format driver support
  * Disallow live migration

Other features
* Snapshot
  * quiesced-image-snapshots-with-qemu-guestagent.rst
* drive-mirror
  * Storage live migration
* Multi-queue virtio-disk

Network
* Vhost-net
  * Fewer context switches
  * Zero-copy transmit
* Vhost-net + macvtap + SR-IOV
  * Live migration
* Multi-queue virtio NIC
  * Performance scales as the number of vCPUs increases
* Vhost-user
  * Approved in Kilo
  * Userspace equivalent of vhost-net
  * Used with a userspace switch

Reference
* http://www.slideshare.net/meituan/kvmopt-osforce-27669119
* http://www.linux-kvm.org/wiki/images/7/7b/kvm-forum-2013-openstack.pdf
* http://www.linux-kvm.org/wiki/images/f/f6/01x07a-vhost.pdf
* http://www.virtualizemydc.ca/2014/01/26/understanding-vnuma-virtual-non-uniform-memory-access/
* http://www.searchtb.com/2012/12/%e7%8e%a9%e8%bd%accpu-topology.html
* http://www.virtualopensystems.com/en/solutions/guides/snabbswitch-qemu/
* http://log.amitshah.net/wp-content/uploads/2014/11/virt-6-7-centos-dojo.pdf

Thanks! Email: wudx@awcloud.com