How to Find the Optimal Configuration for Your Virtualized Environment. Iordan K. Iotzov Blog:

Similar documents
The Oracle Database Appliance I/O and Performance Architecture

Deploy a High-Performance Database Solution: Cisco UCS B420 M4 Blade Server with Fusion iomemory PX600 Using Oracle Database 12c

DRS: Advanced Concepts, Best Practices and Future Directions

Capacity & Performance Management in Solaris Virtualized Environment

The Virtualized Server Environment

Oracle Multitenant What s new in Oracle Database 12c Release ?

Enhancing Cloud Resource Utilisation using Statistical Analysis

Consolidation Assessment Final Report

Storage Designed to Support an Oracle Database. White Paper

<Insert Picture Here> Enterprise Data Management using Grid Technology

Storage Optimization with Oracle Database 11g

Achieving high availability for Hyper-V

Copyright 2018, Oracle and/or its affiliates. All rights reserved.

Cloud Consolidation with Oracle (RAC) How much is too much?

<Insert Picture Here> Introducing Oracle WebLogic Server on Oracle Database Appliance

VxRAIL for the ClearPath Software Series

Random Search Report An objective look at random search performance for 4 problem sets

Fit for Purpose Platform Positioning and Performance Architecture

Technical Note P/N REV A01 March 29, 2007

ArcGIS Enterprise Performance and Scalability Best Practices. Andrew Sakowicz

Virtualizing Oracle on VMware

TPC-E testing of Microsoft SQL Server 2016 on Dell EMC PowerEdge R830 Server and Dell EMC SC9000 Storage

监控您的 SmartCloud 刘鹤 IBM 软件部

PYTHIA: Improving Datacenter Utilization via Precise Contention Prediction for Multiple Co-located Workloads

Hitachi Converged Platform for Oracle

vsphere 4 The Best Platform for Business-Critical Applications Gaetan Castelein Sr Product Marketing Manager VMware, Inc.

Adapting Mixed Workloads to Meet SLOs in Autonomic DBMSs

HP solutions for mission critical SQL Server Data Management environments

SANDPIPER: BLACK-BOX AND GRAY-BOX STRATEGIES FOR VIRTUAL MACHINE MIGRATION

Approximation Algorithms

Focus On: Oracle Database 11g Release 2

Best Practices for Setting BIOS Parameters for Performance

IaaS Vendor Comparison

A Randomized Algorithm for Minimizing User Disturbance Due to Changes in Cellular Technology

Roadmap to Availability

Evaluating Hyperconverged Full Stack Solutions by, David Floyer

Optimizing Testing Performance With Data Validation Option

Scaling ColdFusion. Presenter Mike Collins, Sr. ColdFusion Consultant - SupportObjective

StorageTek ACSLS Manager Software

UCP what? Unified Management UCP Director. Fully Integrated E2E QA, Support, Release Management. Compute. Storage. Connectivity FC And IP.

Accelerating Microsoft SQL Server 2016 Performance With Dell EMC PowerEdge R740

Session 1079: Using Real Application Testing to Successfully Migrate to Exadata - Best Practices and Customer Case Studies

Session 201-B: Accelerating Enterprise Applications with Flash Memory

NEC Express5800 A2040b 22TB Data Warehouse Fast Track. Reference Architecture with SW mirrored HGST FlashMAX III

Automatically and Continuously Optimizing Workload Configurations In Your Virtual Environment

Consolidating OLTP Workloads on Dell PowerEdge R th generation Servers

RAC Performance Monitoring and Diagnosis using Oracle Enterprise Manager. Kai Yu Senior System Engineer Dell Oracle Solutions Engineering

Understanding Data Locality in VMware vsan First Published On: Last Updated On:

Cloud Acceleration. Performance comparison of Cloud vendors. Tobias Deml DOAG2017

Milestone Solution Partner IT Infrastructure Components Certification Report

DELL EMC READY BUNDLE FOR VIRTUALIZATION WITH VMWARE AND FIBRE CHANNEL INFRASTRUCTURE

Consolidate and Prepare for Cloud Efficiencies Oracle Database 12c Oracle Multitenant Option

On Modularity Clustering. Group III (Ying Xuan, Swati Gambhir & Ravi Tiwari)

ORDER DELIVERY TRACKING ON PREMISE ENVIRONMENT RECOMMENDATIONS

Virtualization of the MS Exchange Server Environment

Experience the GRID Today with Oracle9i RAC

Data Sheet: Storage Management Veritas Storage Foundation for Oracle RAC from Symantec Manageability and availability for Oracle RAC databases

Reduce Costs & Increase Oracle Database OLTP Workload Service Levels:

Oracle Database Appliance Combo

The Late Night Blog Post SEO Checklist

Design and Architecture of Dell Acceleration Appliances for Database (DAAD): A Practical Approach with High Availability Guaranteed

DELL EMC VMAX UNISPHERE 360

Samsung s Green SSD (Solid State Drive) PM830. Boost data center performance while reducing power consumption. More speed. Less energy.

Discrete Optimization. Lecture Notes 2

Lenovo Database Configuration

Virtual SQL Servers. Actual Performance. 2016

Documentation Accessibility. Access to Oracle Support

Microsoft SQL Server in a VMware Environment on Dell PowerEdge R810 Servers and Dell EqualLogic Storage

High Availability Infrastructure for Cloud Computing

vsan 6.6 Performance Improvements First Published On: Last Updated On:

Manageability comparison: Oracle database 10g and Oracle 9i database

A Comparative Study of Microsoft Exchange 2010 on Dell PowerEdge R720xd with Exchange 2007 on Dell PowerEdge R510

Oracle Rdb Hot Standby Performance Test Results

This presentation is for informational purposes only and may not be incorporated into a contract or agreement.

Weight Annealing Heuristics for Solving the Two-Dimensional Bin Packing Problem

VERITAS Storage Foundation 4.0 TM for Databases

2017 Resource Allocations Competition Results

A Performance Characterization of Microsoft SQL Server 2005 Virtual Machines on Dell PowerEdge Servers Running VMware ESX Server 3.

Using EonStor DS Series iscsi-host storage systems with VMware vsphere 5.x

Product: Order Delivery Tracking

A Cool Scheduler for Multi-Core Systems Exploiting Program Phases

E-Store: Fine-Grained Elastic Partitioning for Distributed Transaction Processing Systems

February 19, Integer programming. Outline. Problem formulation. Branch-andbound

Hardware-Software Codesign

Parallel Algorithm Design. Parallel Algorithm Design p. 1

Oracle Database 12c: Performance Management and Tuning

Oracle Advanced Compression: Reduce Storage, Reduce Costs, Increase Performance Bill Hodak Principal Product Manager

Unsupervised Learning

EMC Virtual Infrastructure for Microsoft SharePoint Server 2010 Enabled by EMC CLARiiON and VMware vsphere 4

Oracle Database 10G. Lindsey M. Pickle, Jr. Senior Solution Specialist Database Technologies Oracle Corporation

Cluster Analysis. Prof. Thomas B. Fomby Department of Economics Southern Methodist University Dallas, TX April 2008 April 2010

Veritas Storage Foundation for Oracle RAC from Symantec

Integer Programming Theory

Oracle Real Application Clusters One Node

DELL EMC READY BUNDLE FOR VIRTUALIZATION WITH VMWARE AND ISCSI INFRASTRUCTURE

davidklee.net gplus.to/kleegeek linked.com/a/davidaklee

NEXGEN N5 PERFORMANCE IN A VIRTUALIZED ENVIRONMENT

Data Protection for Cisco HyperFlex with Veeam Availability Suite. Solution Overview Cisco Public

Virtualizing the SAP Infrastructure through Grid Technology. WHITE PAPER March 2007

Quality-Assured Cloud Bandwidth Auto-Scaling for Video-on-Demand Applications

Transcription:

How to Find the Optimal Configuration for Your Virtualized Environment Iordan K. Iotzov Blog: http://iiotzov.wordpress.com/

About me 15+ years of database administration and development experience MS in Computer Science, BS in Electrical Engineering Presented at RMOUG,Hotsos, NYOUG and Virta-Thon Active blogger and OTN participant

Agenda Overview of virtual server consolidation Modeling the problem Summary and Definitions Constraints Implementation Solving the model Discrete optimization Conclusion

Overview Competing Goals Minimize the computational footprint of your enterprise through virtualization pay less for hardware and licenses While making sure performance and business requirements are met ensure that end user experiences and business processes do not suffer due to the consolidation

Overview Target Audience Medium to large size enterprises For small systems, the efforts to get a great virtual to physical mapping probably do not justify the benefits Reasonable load volatility The mapping decisions are based on past performance, so the future load should not significantly deviate from the past. Do not attempt for systems that can go viral. Ability to measure, store and process various performance metrics Performance data should be made available in a single repository, ideally in a relational database

Overview Optimization Brief Introduction to Optimization: Minimize f x Subject to constraints Where and Constraints S can by represented as x S f: D R, D is the domain of f S D is the set of feasible solution x C i x = 0 C j x 0

Overview Continuous Optimization Continuous optimization: variable x has real values Example: f(x) Optimum Minimize: f x = x 3 2x 2 31x + 28 Subject to : x > 0 40x + f(x) > 0 x

Overview Discrete Optimization Discrete optimization: variables x i are discrete Example: Knapsack problem There are n items, each with value v i and weight w i. The goal is to maximize the sum of the values of the items in a bag with capacity W Maximize: n i=1 v i x i Subject to : n i=1 w i x i <W http://commons.wikimedia.org/wiki/file:knapsack.svg

Summary and Definitions Mapping of virtual to physical servers as an optimization problem Minimize: Total Cost of Ownership (TCO) Need to account for sunk costs It is OK to simplify Subject to: Technology and Business Requirements Minimal CPU oversubscription during certain hours (based on historic patterns) Guaranteed level of performance even when a virtual server goes berserk Certain virtual servers have to run on CPUs with at least x GHz and many more

Physical servers Modeling the Problem Summary and Definitions p i, where i in [1,n] each server has p cpu i number of CPUs, each with p cpu speed i speed in GHz each server has p mem i memory (GB) each server has p io_thru i IO throughput(gb/sec) p io_thru i p mem i p cpu i

Summary and Definitions Virtual servers v i, where i in [1,m] each server has been allocated v cpu i number of CPUs each server has been allocated v mem i memory (GB) each server has used no more than v io_thru i IO throughput(gb/sec) v io_thru i v mem i v cpu i

Graphical Representation: Modeling the Problem Summary and Definitions Number of CPUs Amount of RAM HBA throughput 4 CPUs 8G RAM 8GB HBA 2 CPUs 16G RAM 8GB HBA

Summary and Definitions Mapping of virtual servers to physical ones Virtual Servers: Physical Servers: Virtual Servers:

Summary and Definitions Possible mapping shortfalls : Too spread out Low chance of performance issues due to interference from other VMs, but possibly using more hardware/licenses than needed

Summary and Definitions Possible mapping shortfalls : Too tight Minimal hardware/licenses costs, but with significant chance of performance issues due to interference from other VMs

Summary and Definitions Just right! Minimal hardware/licenses costs, with low chance of performance issues due to interference from other VMs Less Hardware/ Licensing!

n j=1 x j p cpu j Subject to : Modeling the Problem Summary and Definitions Find a mapping between virtual and physical servers x i,j x i,j = 1 if virtual server i will reside on physical server j, 0 otherwise Minimize the number of CPUs (Since license cost are typically tied to the number of CPUs): x j = 0 iff x i,j = 0 for all I (physical server p i not used) x j = 1 iff x i,j = 1 for al least one I (physical server p i used) Note: The function to minimize can be modified if using Oracle approved hard partitioning For each virtual server is in one and only one physical server For each virtual server j, n i=1 x i,j = 1

Constraints Constraints allow us to specify performance and business requirements that the system must adhere to. A common requirement is to allow no overallocation of computing resources No CPU overallocation: No memory overallocation: No IO throughput overallocation: For each physical server i, x i,j v cpu i = p cpu i m i=1 For each physical server I, x i,j v mem i = p mem i m i=1 For each physical server I, x i,j v io_thu i = p io_thru i m i=1

Implementation Custom Solutions (PL/SQL package) ability to incorporate virtually any information ability to customize to any specific environment/licensing need Off the Shelf Solutions (Oracle OEM/ Consolidation Planner) minimal setup easy to use

Implementation/Custom A repository, ideally in a database, is a great place to host the optimization logic. OEM Repository Database Custom PL/SQL OEM Repository Managed Tables Custom Tables CUST_TAB. MGMT$METRIC_HOURLY MGMT$TARGET MGMT_ECM_HW.

Implementation/ Consolidation Planner Off the shelf solutions: Oracle Enterprise Manager Consolidation Planner Setup->Extendibility->Plugins

Implementation/ Consolidation Planner Consolidation Project o defines the scope of the consolidation effort Consolidation Scenario o specific requirements and constraints

Implementation/Custom Allocating enough resources, such as virtual CPU ( v cpu i ), to be able to sustain maximum load (as per history) would minimize the likelihood of a performance problems related to resource utilization. OEM Repository query for getting the max number of CPUs used: SELECT MAX(ceil(m)) FROM ( select max((a.maximum*c.cpu_count)/100) m, a.rollup_timestamp from mgmt$metric_hourly a, mgmt$target b, sysman.mgmt_ecm_hw c where a.metric_name = 'Load' and a.column_label = 'CPU Utilization (%)' and a.target_guid = b.target_guid and b.target_name = <hostname> and c.hostname = <hostname> and c.vendor_name = 'Intel Based Hardware' group by a.rollup_timestamp)

Constraints Sizing for max load per day Lower risk of contention Larger footprint Sizing for average load per hour Higher risk of contention Smaller footprint Sizing for max load can be quite conservative i.e. we are likely to get excellent performance, but we are going to allocate substantial resources. We can switch the balance a little bit we can slightly increase the chance of performance issues, but reduce the computational footprint. We can archive that by taking into account the timing of the load. We can come with a configuration that would not have resulted in an overalloaction during any time of the past. Overallocation in future is possible if the timing of the workloads changes.

Implementation/ Consolidation Planner Consolidation planner comes with pre-configured scenarios for three different points on the contention/footprint scale

Constraints Sizing for max. load : v cpu 1 + v cpu 2 Sizing for max. combined load : v cpu 1+2 < v cpu 1 + v cpu 2 CPU v cpu 1 + v cpu 2 v cpu 1+2 v cpu 2 v cpu 1 Combined Server 1 and Server 2 load Server 1 load Server 2 load time

Implementation/Custom The following query checks if a list of virtual servers would fit in a physical server SELECT COUNT(*) FROM (SELECT SUM((a.average*c.cpu_count)/100) m FROM mgmt$metric_hourly a, mgmt$target b, sysman.mgmt_ecm_hw c WHERE a.metric_name = ''Load'' AND a.column_label = ''CPU Utilization (%)'' AND a.target_guid = b.target_guid AND b.target_name IN (' <list of virt servers> ' ) AND c.hostname ''.'' c.domain = b.target_name AND c.vendor_name = ''Intel Based Hardware'' GROUP BY a.rollup_timestamp HAVING SUM((a.average*c.cpu_count)/100) > 0.9*' <CPUs of physical server> )

Implementation/Consolidation Planner OEM Consolidation Planner can consider either max, 80% or average load.

Constraints Major drawback of over-allocation if one of the VMs consumes unplanned amount of resources, the other VMs would suffer. Some virtualization providers allow us to guarantee each of the VM certain level of resources (CPU/memory) in case of over allocation. Physical CPU VM1 guaranteed CPU VM2 guaranteed CPU VM3 guaranteed CPU Physical CPU > VM1 guaranteed CPU + VM2 guaranteed CPU + VM3 guaranteed CPU

Constraints A reasonable compromise is to guarantee that under distress each VM will get resources that would be enough to accommodate the load in 95% of the time CPU v cpu g - CPU a VM is guaranteed to receive regardless of activities in other VMs v cpu g time

Implementation/Custom Find CPU level that is enough for the system 95% of the time SELECT MAX(ceil(m)) FROM (SELECT MIN(m) m FROM (SELECT m, percent_rank () over ( ORDER BY m) perc FROM (SELECT MAX((a.maximum*c.cpu_count)/100) m, a.rollup_timestamp FROM mgmt$metric_hourly a, mgmt$target b, sysman.mgmt_ecm_hw c WHERE a.metric_name = 'Load' AND a.column_label = 'CPU Utilization (%)' AND a.target_guid = b.target_guid AND b.target_name = i.hostname '.<domain_name>.com' ) AND c.hostname = i.hostname AND c.vendor_name = 'Intel Based Hardware' GROUP BY a.rollup_timestamp ) ) WHERE perc > 0.95

Constraints Reducing computational footprint by recognizing that some servers are needed only during certain hours. Most business requirement checks should be done for every hour of the day, taking into account which servers are active then. 12 am 2 am 4 am 6 am 8 am 10 am 12 pm 2 pm 4 pm 6 pm 8 pm 10 pm Offshore Dev Tool Repository Onshore UAT TroubleShooting

Constraints No two nodes of a RAC cluster should be on the same physical server RAC cluster: DB1 DB3 DB1 DB1 DB2 DB3 DB2 DB2 DB3

Implementation/ Consolidation Planner Specifying the RAC nodes constraint in the Consolidation Planner

Constraints Guarantee that a virtual machine runs on a physical server that has sufficient CPU speed 2933 MHz Requires at least 2500 MHz 1995 MHz 3066 MHz

Solving the Model Discrete Optimization Computational Complexity of Optimization Problems P can be solved in polynomial time NP the solution can be verified in polynomial time NP hard at least as difficult as any problem in NP NP complete NP hard and in NP

Solving the Model Discrete Optimization Computational Complexity of Optimization Problems From: http://en.wikipedia.org/wiki/np-hard#mediaviewer/file:p_np_np-complete_np-hard.svg

Solving the Model Discrete Optimization Finding the optimal solution for many real world problems may require enormous, frequently impractical, amount of computing resources. We usually need to settle for good, but not necessary optimal solutions. Here are some major techniques in Discrete Optimization: Constraint Programming Local Search Linear and Integer Programming

Solving the Model Discrete Optimization The problem we are trying to solve here can be considered a variation of the offline variable size Bin Packing Problem (BPP). Given: N items, each with weight w j M bin, each with capacity c i Minimize: c i, for all bins which have at least one item Subject to: Each item must be in exactly one bin w j < c i, for all items that are in bin i

Solving the Model Discrete Optimization Heuristics for solving classic BPPs Next-Fit: Put in as many items as possible in a bin, then move to the next one. First Fit: Put an item in the first bin that fits it. Start using a new bin only after trying all partially filled bins Best-Fit: Assign items in a way that minimizes the residual capacity of a bin Next-Fit Decreasing: Same as Next-Fit, but have the items ordered in decreasing order

Solving the Model Discrete Optimization Randomization a simple way to minimize the risk of a bad solution. It has intuitive local search interpretation. Starting from (somehow) random starting position Random hill-climbing moves Simulated Annealing randomly allowing moves that do not improve the solution Local minimums The global minimum

Conclusion Getting optimal virtual server consolidation is more of a science than an art Doing optimal virtual server consolidation right requires time and efforts, but it can have significant ROI There is no need to look for absolute optimality - getting a great, though not optimal solution, can make a huge difference.

Thank you