Performance Optimization on Huawei Public and Private Cloud

Similar documents
IN the cloud, the number of virtual CPUs (VCPUs) in a virtual

For Performance and Latency, not for Fun

Task Scheduling of Real- Time Media Processing with Hardware-Assisted Virtualization Heikki Holopainen

Towards Fair and Efficient SMP Virtual Machine Scheduling

ARM Interrupt Virtualization. Andre Przywara

Live Migration with Mdev Device

KVM Weather Report. Red Hat Author Gleb Natapov May 29, 2013

Analysis of Virtual Machine Scalability based on Queue Spinlock

To EL2, and Beyond! connect.linaro.org. Optimizing the Design and Implementation of KVM/ARM

Dirty Memory Tracking for Performant Checkpointing Solutions

KVM PERFORMANCE OPTIMIZATIONS INTERNALS. Rik van Riel Sr Software Engineer, Red Hat Inc. Thu May

CFS-v: I/O Demand-driven VM Scheduler in KVM

KVM Weather Report. Amit Shah SCALE 14x

VT-d Posted Interrupts. Feng Wu, Jun Nakajima <Speaker> Intel Corporation

Latency on preemptible Real-Time Linux

FOSDEM 2019

Vhost and VIOMMU. Jason Wang (Wei Xu Peter Xu

Monitoring and Analyzing Virtual Machines Resource Overcommitment Detection and Virtual Machine Classification

Xen scheduler status. George Dunlap Citrix Systems R&D Ltd, UK

CS-580K/480K Advanced Topics in Cloud Computing. VM Virtualization II


Changpeng Liu. Cloud Storage Software Engineer. Intel Data Center Group

Making Nested Virtualization Real by Using Hardware Virtualization Features

Intel Graphics Virtualization on KVM. Aug KVM Forum 2011 Rev. 3

Who stole my CPU? Leonid Podolny Vineeth Remanan Pillai. Systems DigitalOcean

Optimizing and Enhancing VM for the Cloud Computing Era. 20 November 2009 Jun Nakajima, Sheng Yang, and Eddie Dong

Chapter 5 C. Virtual machines

What is KVM? KVM patch. Modern hypervisors must do many things that are already done by OSs Scheduler, Memory management, I/O stacks

Painless switch from proprietary hypervisor to QEMU/KVM. Denis V. Lunev

VMware vsphere 4: The CPU Scheduler in VMware ESX 4 W H I T E P A P E R

Real-Time KVM for the Masses Unrestricted Siemens AG All rights reserved

KVM for IA64. Anthony Xu

Service Edge Virtualization - Hardware Considerations for Optimum Performance

Nested Virtualization Update From Intel. Xiantao Zhang, Eddie Dong Intel Corporation

Virtualization. Pradipta De

Scheduler Activations for Interference-Resilient SMP Virtual Machine Scheduling

Virtual Machines. Part 2: starting 19 years ago. Operating Systems In Depth IX 1 Copyright 2018 Thomas W. Doeppner. All rights reserved.

RT#Xen:(Real#Time( Virtualiza2on(for(the(Cloud( Chenyang(Lu( Cyber-Physical(Systems(Laboratory( Department(of(Computer(Science(and(Engineering(

The RCU-Reader Preemption Problem in VMs

Interrupt Coalescing in Xen

Lecture 5. KVM for ARM. Christoffer Dall and Jason Nieh. 5 November, Operating Systems Practical. OSP Lecture 5, KVM for ARM 1/42

Schedule Processes, not VCPUs

Nested Virtualization on ARM

Micro VMMs and Nested Virtualization

SFO17-403: Optimizing the Design and Implementation of KVM/ARM

Virtualization. ...or how adding another layer of abstraction is changing the world. CIS 399: Unix Skills University of Pennsylvania.

Multiprocessor System. Multiprocessor Systems. Bus Based UMA. Types of Multiprocessors (MPs) Cache Consistency. Bus Based UMA. Chapter 8, 8.

Status Update About COLO (COLO: COarse-grain LOck-stepping Virtual Machines for Non-stop Service)

Live Migration: Even faster, now with a dedicated thread!

PROTECTING VM REGISTER STATE WITH AMD SEV-ES DAVID KAPLAN LSS 2017

Multiprocessor Systems. COMP s1

Real-time scheduling for virtual machines in SK Telecom

KVM on s390: what's next?

Enhancing Linux Scheduler Scalability

BUD17-301: KVM/ARM Nested Virtualization. Christoffer Dall

RT- Xen: Real- Time Virtualiza2on. Chenyang Lu Cyber- Physical Systems Laboratory Department of Computer Science and Engineering

Multiprocessor Systems. Chapter 8, 8.1

Abstract. Testing Parameters. Introduction. Hardware Platform. Native System

Demand-Based Coordinated Scheduling for SMP VMs

Fakultät Informatik Institut für Systemarchitektur, Betriebssysteme THE NOVA KERNEL API. Julian Stecklina

KVM as The NFV Hypervisor

AMD SEV Update Linux Security Summit David Kaplan, Security Architect

Knut Omang Ifi/Oracle 20 Oct, Introduction to virtualization (Virtual machines) Aspects of network virtualization:

Nested Virtualizationon ARM

PROCESS SCHEDULING II. CS124 Operating Systems Fall , Lecture 13

Oracle 1Z Oracle VM 2 for x86 Essentials.

CS 431/531 Introduction to Performance Measurement, Modeling, and Analysis Winter 2019

Performance and Scalability of Server Consolidation

Towards Co-existing of Linux and Real-Time OSes

Virtualization History and Future Trends

Preemptable Ticket Spinlocks: Improving Consolidated Performance in the Cloud

CS370 Operating Systems

[537] Virtual Machines. Tyler Harter

Changpeng Liu, Cloud Software Engineer. Piotr Pelpliński, Cloud Software Engineer

Preserving I/O Prioritization in Virtualized OSes

QuartzV: Bringing Quality of Time to Virtual Machines

Secure Containers with EPT Isolation

CPSC/ECE 3220 Summer 2017 Exam 2

ò mm_struct represents an address space in kernel ò task represents a thread in the kernel ò A task points to 0 or 1 mm_structs

Achieve Low Latency NFV with Openstack*

Scheduling. Don Porter CSE 306

Red Hat Enterprise Virtualization Hypervisor Roadmap. Bhavna Sarathy Senior Technology Product Manager, Red Hat

Bare-Metal Performance for x86 Virtualization

Virtualization. Virtualization

RT- Xen: Real- Time Virtualiza2on from embedded to cloud compu2ng

Towards More Power Friendly Xen

Supporting Time-sensitive Applications on a Commodity OS

Increase KVM Performance/Density

Microkernel Construction

Red Hat Enterprise Virtualization and KVM Roadmap. Scott M. Herold Product Management - Red Hat Virtualization Technologies

KVM/ARM. Marc Zyngier LPC 12

238P: Operating Systems. Lecture 14: Process scheduling

Xen on ARM. How fast is it, really? Stefano Stabellini. 18 August 2014

VIRTIO: VHOST DATA PATH ACCELERATION TORWARDS NFV CLOUD. CUNMING LIANG, Intel

Synchronization-Aware Scheduling for Virtual Clusters in Cloud

Real-Time Cache Management for Multi-Core Virtualization

OS Virtualization. Why Virtualize? Introduction. Virtualization Basics 12/10/2012. Motivation. Types of Virtualization.

CS140 Operating Systems Midterm Review. Feb. 5 th, 2009 Derrick Isaacson

COREMU: a Portable and Scalable Parallel Full-system Emulator

The Shadowy Depths of the KVM MMU. KVM Forum 2007

Transcription:

Performance Optimization on Huawei Public and Private Cloud Jinsong Liu <liu.jinsong@huawei.com> Lei Gong <arei.gonglei@huawei.com>

Agenda Optimization for LHP Balance scheduling RTC optimization 2

Agenda Optimization for LHP Balance scheduling RTC optimization 3

LHP (Lock Holder Preemption) More obvious in virtualization vcpu scheduling Task preemption Then Potentially blocking the progress of other vcpus waiting to acquire the same lock Increasing synchronization latency Performance degradation 4

LHP (Lock Holder Preemption) How to solve or alleviate? PLE (Pause Loop Exiting) DLHS (Delay LH scheduling) Co-scheduling Balance scheduling 5

PLE Hardware support VMCS configuration Optimization for Lock Waiters VM Exit actively Avoid waste vcpu cycles for invalid spin Pros. Supported by upstream Cons. Setting appropriate values of ple_gap and ple_windows is difficult Workloads adjustment Find an appropriate vcpu to yield 6

DLHS (Delay Lock Holder Scheduling) Background & precondition Usually, lock holders are under interrupt disable contexts Normally, the period of holding lock is shortly Hardware support (e.g. intel VT-X) interrupt window exiting Software support Hrtimer, 7

DLHS (Delay Lock Holder Scheduling) Solution Set a grace period for LH before scheduling If one vcpu is LH Start one hrtimer, and Set interrupt window exiting for VMCS If the hrtimer expire Clear interrupt window Continue to schedule for vcpu Judge the vcpu release the lock PLE happened Interrupt window exiting happen then Cancel hrtimer Release grace period Schedule the vcpu immediately 8

Unit: sec Lower is better DLHS performance 300 Hackbench results(cpu overcommit 1:3) 250 200 150 100 VM1-patched VM2-patched VM3-patched VM1 VM2 VM3 50 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 9

Agenda Optimization for LHP Balance scheduling RTC optimization 10

Time X Co-scheduling & Balance scheduling Guest Guest vcpu vcpu vcpu vcpu vcpu vcpu Run all vcpus on Time x Disperse all vcpus AMAS pcpu pcpu pcpu pcpu pcpu pcpu Co-scheduling Balance-scheduling 11

Co-scheduling CPU fragmentation Reduces CPU utilization Delay vcpu execution Priority inversion Degrades I/O performance pcpu 0 xxx vcpu0 xxx vcpu0 pcpu 1 vcpu1 I/O vcpu1 T 0 T 1 T 2 T 3 T 4 12

Balance scheduling Balances vcpu siblings on pcpus without precisely scheduling the vcpus simultaneously How to? Uses a bitmap to record all used pcpus for VM Scheduler adjustment Enqueue & dequeue Migration/find_idle_cpu/select_task_rq etc. 13

Proportion of building chains Performance evaluation Workload: Pushserver in Huawei Private Cloud Continuous testing for 24 hours Results 120.00% Proportion of building chains (higher is better) 100.00% 93.50% 95.30% 80.00% 70% 60.00% 40.00% 20.00% 0.00% with balance sched without balance sched 1:1 vcpupin <10ms 93.50% 70% 95.30% 14

Agenda Optimization for LHP Balance scheduling RTC optimization 15

RTC on KVM Windows use RTC as clock event device RTC emulation in Qemu, three timers rtc_periodic_timer Generates periodic interrupts Programmable to occur according to interrupt rate rtc_update_timer Generates alarm interrupts Occur one per second to once per day rtc_coalesced_timer Generates compensation interrupts Slews the lost ticks since different reasons Getting worse and worse with the VM density increase Pain points Some operations need to hold BQL Context switching between user space and kernel space Interrupt injecting from user space Performance degradation Latency increase Windows guest density decrease 16

RTC optimizations on KVM Minimize influence of BQL Placing RTC memory region outside BQL Using irqfd inject interrupts Hyperv features hyperv clock, Decreases read/write ioports Decreases ioport access on 0x70/0x71 Move RTC emulation to hypervisor Inject interrupts in KVM Reduce context switching But Large attack surface Incompatible with new features, such as split irqchip Optimize RTC compensation solution 17

RTC compensation solution Slew RTC ticks in hypervisor directly Count the coalesced interrupts When an RTC interrupt injecting failed Adjust the count when RTC interrupt rate changes Inject coalesced interrupts after EOI handler if exist Don t need a separate timer More timely Throttle the speed if there is too many coalesced interrupts Live migration support Save the coalesced interrupts in src side Restore them in dest side Both KVM and Qemu need to be patched 18

Optimization evaluation Before optimization After optimization 19

Thank You!