DRS: Advanced Concepts, Best Practices and Future Directions

Transcription:

INF-VSP2825 DRS: Advanced Concepts, Best Practices and Future Directions Aashish Parikh, VMware, Inc. Ajay Gulati, VMware, Inc. #vmworldinf

Disclaimer: This session may contain product features that are currently under development. This session/overview of the new technology represents no commitment from VMware to deliver these features in any generally available product. Features are subject to change, and must not be included in contracts, purchase orders, or sales agreements of any kind. Technical feasibility and market demand will affect final delivery. Pricing and packaging for any new technologies or features discussed or presented have not been determined.


Talk Outline
  Resource settings and pools: specifying behavior under contention
  VM happiness: the key to application performance
  Load-balancing and handling multiple metrics: the other side of making VMs happy
  Future directions: a sneak peek into DRS labs!

Resource Settings and Pools: Specifying behavior under contention

Key Concepts
Each VM has:
  Configured capacity: 10 GHz, 24 GB RAM
  Demand for resource: 2 GHz, 4 GB RAM
Each host has:
  CPU and memory capacity: 40 GHz, 96 GB RAM
Q: How many VMs can fit on the host?
A: A lot, if you don't care about performance
  Too safe: 4 (based on configured size)
  Ideal: 20 (based on CPU demand), 32 (based on memory demand)
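The fit counts on this slide are simple divisions of host capacity by per-VM size. The sketch below redoes that arithmetic with the figures as transcribed; the function and variable names are illustrative, not part of any VMware tool. Note that with 96 GB of host memory and 4 GB of demand per VM, the division gives 24 rather than the 32 shown on the slide, so the slide's memory figure may rest on numbers not captured in this transcription.

```python
# Figures as transcribed from the slide; all names here are illustrative.
def vms_that_fit(host_capacity: float, per_vm: float) -> int:
    """Whole number of VMs whose per-VM footprint fits in the host capacity."""
    return int(host_capacity // per_vm)


host_cpu_ghz, host_mem_gb = 40, 96
configured_cpu_ghz, configured_mem_gb = 10, 24
demand_cpu_ghz, demand_mem_gb = 2, 4

# "Too safe": size by configured capacity, limited by the scarcer resource.
by_configured = min(
    vms_that_fit(host_cpu_ghz, configured_cpu_ghz),
    vms_that_fit(host_mem_gb, configured_mem_gb),
)

# "Ideal": size by what the VMs actually demand.
by_cpu_demand = vms_that_fit(host_cpu_ghz, demand_cpu_ghz)
by_mem_demand = vms_that_fit(host_mem_gb, demand_mem_gb)

print(by_configured, by_cpu_demand, by_mem_demand)
```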

Why do we need resource controls?
  You don't know the demand values
  You may want to strictly isolate some VMs from others
  Sometimes: sum of VM demand > cluster capacity
Resource controls:
  Determine who gets what under contention
  Enable high consolidation and allow safe over-commitment!
Q: Should you worry if sum of VM demand > host capacity?
A: No, DRS handles that case! You only need to make sure that the cluster has enough capacity

Resource Controls
  Reservation: guaranteed allocation
  Limit: guaranteed upper bound
  Shares: allocation in between
  Resource pools: allocation and isolation of groups of VMs
Actual allocation depends on the reservation, limit, shares and demand
[Figure: allocation scale starting at 0, marking Reserved (1 GHz), Actual (5 GHz), Limit (6 GHz) and Configured (8 GHz)]
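One way to read the relationship between these controls is as a clamp: whatever a VM would otherwise receive is raised to at least its reservation and capped at its limit (and its configured size). The sketch below is a simplification under that assumption, not the DRS algorithm itself; `fair_share` stands in for the share-based amount available under contention, and all names are hypothetical.

```python
import math


def actual_allocation(demand, fair_share, reservation=0.0,
                      limit=math.inf, configured=math.inf):
    """Simplified model: clamp the wanted amount between reservation and cap.

    demand      what the VM would consume if unconstrained
    fair_share  share-based amount available to it under contention (assumed input)
    """
    cap = min(limit, configured)          # a VM can never exceed its limit or size
    wanted = min(demand, fair_share)      # contention may give less than demand
    return min(max(wanted, reservation), cap)


# Values from the slide's figure: Reserved 1 GHz, Limit 6 GHz, Configured 8 GHz.
print(actual_allocation(demand=5, fair_share=7, reservation=1, limit=6, configured=8))
print(actual_allocation(demand=7.5, fair_share=8, reservation=1, limit=6, configured=8))
print(actual_allocation(demand=3, fair_share=0.5, reservation=1, limit=6, configured=8))
```

The three calls illustrate the three regimes: demand-bound (5 GHz, as in the figure), limit-bound (6 GHz), and reservation-bound (1 GHz).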

Resource Pools
  A powerful abstraction for aggregation and isolation of resources
  Customer creates a resource pool and sets business requirements
  Sharing within a pool, isolation across pools
[Figure: resource pool hierarchy with levels Organization, Departments, VMs]

Controls: Reservation (MHz or MB)
  Minimum MHz or MB guaranteed
  Parent reservation ≥ sum of children's reservations
  DRS will distribute the R value hierarchically using current demand
[Figure: Resource Pool with CPU Reservations. R_Root = 1000 MHz, R_Sales = 500 MHz, R_Marketing = 100 MHz; lower-level values 100, 50, 50, 0, 0]

Controls: Limit (MHz or MB)
  Maximum MHz or MB allowed
  Parent limit ≤ sum of child limits
  Child limits set based on parent limit and current VM demand
[Figure: Resource Pool with CPU Limits. L_Root = 5000 MHz, L_Sales = U, L_Marketing = 1000 MHz; lower-level values 1500 MHz, U, U, U, U. U: Unlimited]

Controls: Shares (No units)
  Relative priority between siblings in the tree
  Divide parent share in proportion to children's shares
[Figure: Resource Pool with Share settings]
  S_Root = Normal (1000): receives 1000
  S_Sales = High (2000): receives 800
  S_Marketing = Low (500): receives 200
  VMs under Sales: Normal (1000), Low (500), Low (500): receive 400, 200, 200
  VMs under Marketing: Custom (600), Custom (200): receive 150, 50
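The numbers in the shares figure follow from splitting each parent's amount among its children in proportion to their share values. A minimal sketch of that proportional split is below; the VM names (`vm1` to `vm5`) and the function name are hypothetical, and the sketch ignores reservations, limits and demand, which also affect the real distribution.

```python
def divide_by_shares(parent_amount, child_shares):
    """Split parent_amount among children in proportion to their share values."""
    total = sum(child_shares.values())
    return {name: parent_amount * s / total for name, s in child_shares.items()}


# Root holds 1000; Sales = High (2000), Marketing = Low (500).
pools = divide_by_shares(1000, {"Sales": 2000, "Marketing": 500})

# Sales VMs: Normal (1000), Low (500), Low (500).
sales_vms = divide_by_shares(pools["Sales"], {"vm1": 1000, "vm2": 500, "vm3": 500})

# Marketing VMs: Custom (600), Custom (200).
marketing_vms = divide_by_shares(pools["Marketing"], {"vm4": 600, "vm5": 200})

print(pools)          # Sales 800, Marketing 200
print(sales_vms)      # 400, 200, 200
print(marketing_vms)  # 150, 50
```

Because shares are only compared between siblings, a Custom (600) VM under Marketing ends up with less than a Low (500) VM under Sales.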

Virtual View of the World
  DRS resource pool as specified by user
  In practice, life isn't so simple!
  VMs get mapped to different hosts and move between them
[Figure labels: <R,L,S>; Sales, <R,L,S>; Marketing, <R,L,S>; VM level <r,l,s> settings; Host A 5000 MHz; Host B 4000 MHz]

How to set these controls: 4 simple rules
  1. Think VM, act RPs
  2. Use shares for prioritization
  3. Use a tiered reservations model (e.g. more for critical apps)
  4. Use limits to control usage (e.g. pay-per-use model)


Use Case 2: Multiple Application Tiers
Goal: Top tiers should suffer less than bottom tiers
Think VM first:
  Tier 1 apps: VM reservation = 25% (rule 3), shares = High (rule 2)
  Tier 2 apps: VM reservation = 10% (rule 3), shares = Low (rule 2)

                   Tier 1 VMs          Tier 2 VMs
  R (GHz):         2     2     2       0.8   0.8
  L (GHz):         U     U     U       U     U
  Shares:          2000  2000  2000    500   500
  Configured:      8     8     8       8     8

Use Case 2: Multiple Application Tiers
Apply rule 1:
  Tier 1 RP: R = sum of VM reservations, Shares = sum of VM shares
  Tier 2 RP: R = sum of VM reservations, Shares = sum of VM shares
  Set VM-level reservations = 0, shares = Normal for all VMs
Tier 1 RP: R = 6 GHz, Shares = 2000*3 = 6000
Tier 2 RP: R = 1.6 GHz, Shares = 500*2 = 1000

                   Tier 1 VMs          Tier 2 VMs
  R (GHz):         0     0     0       0     0
  L (GHz):         U     U     U       U     U
  Shares:          1000  1000  1000    1000  1000
  Configured:      8     8     8       8     8
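The resource pool values in this use case can be reproduced by first deciding per-VM settings and then summing them per tier. The sketch below does that for the two tiers; the helper name and its parameters are illustrative only, and the share constants follow the High/Normal/Low values used in the deck (2000/1000/500).

```python
HIGH, NORMAL, LOW = 2000, 1000, 500  # share values as used in the slides


def pool_settings(vm_count, configured_ghz, reservation_fraction, vm_shares):
    """'Think VM, act RPs': aggregate per-VM reservation and shares into pool values."""
    reservation_ghz = vm_count * configured_ghz * reservation_fraction
    shares = vm_count * vm_shares
    return reservation_ghz, shares


# Tier 1: 3 VMs of 8 GHz, 25% reservation, High shares.
tier1 = pool_settings(3, 8, 0.25, HIGH)
# Tier 2: 2 VMs of 8 GHz, 10% reservation, Low shares.
tier2 = pool_settings(2, 8, 0.10, LOW)

print(tier1)  # reservation in GHz, shares
print(tier2)
```

After the pool values are set, the VM-level settings go back to reservation 0 and Normal shares, so the tiering lives entirely in the pools.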

Use Case 3: Strict priority between Tiers
Goal: Top tiers should not suffer at all due to bottom tiers
Think VM first:
  VM reservation = 50%, 10% and 5% for three tiers (rule 3)
  VM shares = High, High/10, High/100 (rule 2)
Apply rule 1: Resource pool settings = sum of VM settings
  Tier 1 RP: R = 12 GHz, Shares = 2000*3 = 6000
  Tier 2 RP: R = 4 GHz, Shares = 200*2 = 400
  Tier 3 RP: R = 0.8 GHz, Shares = 20*2 = 40

                   Tier 1 VMs          Tier 2 VMs    Tier 3 VMs
  R (GHz):         0     0     0       0     0       0     0
  L (GHz):         U     U     U       U     U       U     U
  Shares:          1000  1000  1000    1000  1000    1000  1000
  Configured:      8     8     8       8     8       8     8

Key Takeaway
Resource controls allow high consolidation and safe over-commit
Follow four simple rules to determine resource controls:
  1. Think VM, act RPs
  2. Use shares for prioritization
  3. Use a tiered reservations model (e.g. more for critical apps)
  4. Use limits to control usage (e.g. pay-per-use model)
If you need some other behavior, let us know
References for more technical details:
  VMware Distributed Resource Management: Design, Implementation, and Lessons Learned (VMware Technical Journal)

VM Happiness: The key to application performance

The Giant Host Abstraction
  Treat the cluster as one giant host
  Capacity of this giant host = capacity of the cluster
[Figure: 6 hosts, each with CPU = 10 GHz and Memory = 64 GB, shown as 1 giant host with CPU = 60 GHz and Memory = 384 GB]

Why is that Hard?
  Main issue: fragmentation of resources across hosts
  Primary goal: Keep VMs happy by meeting their resource demands!
[Figure: 1 giant host, CPU = 60 GHz, Memory = 384 GB; VM demand < 60 GHz; all VMs running happily]
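Fragmentation is easiest to see with numbers. In the sketch below the free-capacity figures are hypothetical (they are not from the slides): the cluster as a whole has plenty of spare CPU, yet no single host can take a new 5 GHz VM until a migration consolidates free capacity on one host.

```python
def fits_giant_host(demand, free_per_host):
    """True if the cluster-wide free capacity covers the demand."""
    return demand <= sum(free_per_host)


def fits_some_host(demand, free_per_host):
    """True if at least one individual host has enough free capacity."""
    return any(demand <= free for free in free_per_host)


# Hypothetical: 6 hosts of 10 GHz, each with 3 GHz free (18 GHz free in total).
free_ghz = [3, 3, 3, 3, 3, 3]
new_vm_demand_ghz = 5

print(fits_giant_host(new_vm_demand_ghz, free_ghz))  # the giant host could take it
print(fits_some_host(new_vm_demand_ghz, free_ghz))   # but no real host can

# Moving 3 GHz of load from host 1 to host 2 frees enough room on host 1.
after_migration = [6, 0, 3, 3, 3, 3]
print(fits_some_host(new_vm_demand_ghz, after_migration))
```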


How are Demands computed?
  Demand indicates what a VM could consume given more resources
  Demand can be higher than utilization (think ready time)
  CPU demand is a function of many CPU stats
  Memory demand is computed by tracking pages in the guest address space and the percentage touched in a given interval

Why not Load-balance the DRS Cluster as the Primary Goal?
  Load-balancing is not free
    In case of low VM demand, load-balancing migrations have cost but no benefit
  Load-balancing is a criterion used to meet VM demands
  Consider these 4 hosts with almost-idle VMs
  Q: Should we balance VMs across hosts?
  Note: all VMs are getting what they need

Important Metrics and UI Controls
  CPU and memory entitlement: VM deserved value based on demand, resource settings and capacity
  Load is not balanced but all VMs are happy!

Key Takeaway
Goals of DRS are:
  To provide the giant host abstraction
  To meet VM demand and keep applications at high performance
  To use load balancing as a secondary metric while making VMs happy
If you care about load balancing as a key metric, it can be achieved. Stay awake!

Load-balancing and handling multiple metrics: The other side of making VMs happy

The balls and bins problem
Problem: assign n balls to m bins
  Balls could have different sizes
  Bins could have different sizes
Key challenges:
  Dynamic numbers and sizes of balls/bins
  Constraints on co-location, placement and others
Now, what is a fair distribution of balls among bins?
Welcome to DRS load-balancing
  VM resource entitlements are the balls
  Host resource capacities are the bins
  Dynamic load, dynamic capacity
  Various constraints: DRS-Disabled-On-VM, VM-to-Host affinity, ...


Goals of DRS Load-balancing
  Fairly distribute VM demand among hosts in a DRS cluster
  Enforce constraints
    Recommend mandatory moves
  Recommend moves that significantly improve imbalance
  Recommend moves with long-term benefit

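How is Imbalance Measured?
  Imbalance metric is a cluster-level metric
  For each host, we have: normalized entitlement = (sum of entitlements of the VMs on the host) / (host capacity)
  Imbalance metric is the standard deviation of these normalized entitlements
  Example: normalized entitlements of 0.65, 0.1 and 0.25 give imbalance metric = 0.2843
  Imbalance metric = 0 means perfect balance
  How much imbalance can you tolerate?
[Figure labels: AWESOME!, OUCH!, YIKES!, OOPS!]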
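The example value of 0.2843 can be reproduced directly. The sketch below assumes the sample standard deviation (dividing by n - 1), since that is what matches the slide's number for the three hosts; the function names are illustrative and the per-host formula is the usual entitlement-over-capacity ratio.

```python
import statistics


def normalized_entitlement(vm_entitlements, host_capacity):
    """Fraction of a host's capacity that its VMs are entitled to."""
    return sum(vm_entitlements) / host_capacity


def imbalance_metric(normalized_entitlements):
    """Standard deviation of per-host normalized entitlements.

    Uses the sample standard deviation, which reproduces the slide's 0.2843.
    """
    return statistics.stdev(normalized_entitlements)


hosts = [0.65, 0.1, 0.25]
print(round(imbalance_metric(hosts), 4))

# A perfectly balanced cluster has identical values on every host.
print(imbalance_metric([0.5, 0.5, 0.5]))
```

With the population standard deviation (`statistics.pstdev`) the same inputs would give roughly 0.232, so the choice of estimator matters when comparing against the slide.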

The Myth of Target Balance
  UI slider tells us what star threshold is acceptable
    Implicit target number for cluster imbalance metric
    n-star threshold: tolerance of imbalance up to n * 10% in a 2-node cluster
  Constraints can make it hard to meet target balance
  Meeting aggressive target balance may require many migrations
    DRS will recommend them only if they also make VMs happier
  If all VMs are happy, a little imbalance is not really a bad thing!


CostBenefit Filtering: the search for enduring quality
Benefit: higher resource availability
Cost:
  Migration cost: vMotion CPU and memory cost, VM slowdown
  Risk cost: benefit may not be sustained due to load variation
[Figure: gain and loss (MHz or MB) versus time (sec) over one invocation interval, showing migration cost during the migration time, then benefit and risk cost during the stable time]
Candidates that cannot meet these criteria are CostBenefit filtered

Handling Severe Imbalance
Causes for severe imbalance:
  Target too impractical, too many constraints
  Filters too aggressive for certain inventories
  Newly powered-on hosts or updated hosts (interplay with VUM)
DRS automatically detects and addresses severe imbalance
  Filters automatically relaxed or dropped; reinstated when the situation is fixed
*NEW* Entirely revamped in vSphere 5.1
  New advanced config options
  Also available in vSphere 4.1 U3 and vSphere 5.0 U2

Handling Severe Imbalance: Advanced Config Options
DRS relaxes filters to fix severe imbalance by default:
  SevereImbalanceRelaxMinGoodness = 1
  SevereImbalanceRelaxCostBenefit = 1
  FixSevereImbalanceOnly = 1
You can modify default behavior, handle with care!
  FixSevereImbalanceOnly = 0
  SevereImbalanceDropCostBenefit = 1

Handling Severe Imbalance: Older Releases and Updates
Clusters severely imbalanced but unable to upgrade?
Setting these options temporarily can provide immediate relief:
  MinGoodness = 0; allows any migration with goodness > 0
  CostBenefit = 0; considers expensive/ephemeral moves too
Setting these options can cause a large number of migrations!
Don't forget to set them back to 1 when balance is restored!
  DRS load-balancing runs better with filters

Handling Multiple Metrics
Optimizing for multiple metrics is hard
  DRS uses smart heuristics, tries not to hurt any metric
  Weights are computed dynamically based on resource utilization
Customers have requested even more metrics!
Extra metric: CPU ready time
  Handling severe imbalance reduces ready time in many cases
  Considering doing even more here, talk to us if this is important to you

Extra metric: Eggs-in-a-basket
Restrict #VMs per host
  Recall this example: this cluster is perfectly balanced w.r.t. CPU and memory
  However, spreading VMs around is important to some of you!
*NEW* Advanced config option in vSphere 5.1
  LimitVMsPerESXHost = 6
  DRS will not admit or migrate more than 6 VMs to any host
  This may impact VM happiness and load-balancing

Key Takeaway
  DRS recommends moves that reduce cluster imbalance
  Constraints can impact achievable imbalance metric
  Filters help improve the quality of moves and reduce noise
    Moves must satisfy MinGoodness and CostBenefit analyses to be recommended
  DRS automatically detects and addresses severe imbalance (NEW)
  DRS handles many metrics effectively

Future directions: A sneak peek into DRS Labs!

Turbo mode Load-balancing
Most aggressive load-balancing option
  One-time, manual-mode load-balancing call, no holds barred
Pros:
  Reach the lowest possible imbalance metric value
  Maximum exploration of solution space
  No MinGoodness or CostBenefit filtering
Cons:
  No MinGoodness or CostBenefit filtering!
  May recommend a large number of migrations

Eggs-in-a-basket: AutoTune
Recall the eggs-in-a-basket metric: LimitVMsPerESXHost = 6
[Figure: Host 1, Host 2, Host 3, Host 4]
  If 12 new VMs are powered on, Host 4 cannot accept any!
  Need to manually change the LimitVMsPerESXHost value
Instead, automatically adjust: value = mean + buffer% * mean
  Set new option for buffer% just once: LimitVMsPerESXHostPercent = 50
  Number of VMs = 20 + 12 (new); mean = 32 / 4 = 8; buffer = 50%
  New value is automatically set to 8 + 50% * 8 = 12
  Host 4 is now able to accept 6 more VMs!
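The AutoTune arithmetic on this slide can be written out as a small calculation. This is only a sketch of the formula as stated (value = mean + buffer% * mean); the function names are hypothetical, the slide does not say how non-integer results would be rounded, and the figure of 6 VMs already on Host 4 is inferred from the statement that it can accept 6 more at a limit of 12.

```python
def auto_tuned_limit(total_vms, host_count, buffer_percent):
    """Per-host VM limit computed as mean + buffer% * mean."""
    mean = total_vms / host_count
    return mean + (buffer_percent / 100) * mean


def headroom(vms_on_host, limit):
    """How many more VMs a host may accept under the given per-host limit."""
    return max(0, int(limit) - vms_on_host)


# 20 existing VMs + 12 new ones across 4 hosts, LimitVMsPerESXHostPercent = 50.
new_limit = auto_tuned_limit(total_vms=20 + 12, host_count=4, buffer_percent=50)
print(new_limit)

# Host 4 is assumed to already run 6 VMs (inferred from the slide).
print(headroom(vms_on_host=6, limit=6))          # fixed limit of 6: no room
print(headroom(vms_on_host=6, limit=new_limit))  # auto-tuned limit: room for 6 more
```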

What-if DRS: Building castles in a DRS sandbox
Safely operate DRS on hypothetical inventories and configurations
Real-world queries and DRS metrics:
  If DPM was enabled on the cluster, what would VM happiness look like?
  Which migrations will be triggered by breaking the vm 6 - vm 8 affinity rule?
  If host 2 is upgraded, how will any newly unlocked features work with DRS?
Complex queries:
  "If I added a clone of host 4, removed host 3 and added 10 clones of vm 24 to my cluster, will any of my constraints be violated? What will my new inventory look like after load-balancing?"
Use for troubleshooting and capacity planning!

In Summary
  Use resource pool settings instead of VM settings when possible
    Ease of management
    Safe over-commitment
  Resource control recipes can be used to handle various use cases
  VM happiness is the key goal of DRS
  Load balancing keeps VMs happy and resource utilization fair
  As long as all VMs are happy, a little imbalance is not always a bad thing!
  Tons of cool DRS stuff in the pipeline, feedback welcome!

