Deploy a High-Performance Database Solution: Cisco UCS B420 M4 Blade Server with Fusion iomemory PX600 Using Oracle Database 12c

White Paper

What You Will Learn

This document demonstrates the benefits of using the Cisco UCS B420 M4 Blade Server with Fusion iomemory PX600 for a high-performance transactional workload on Oracle Database 12c. Swingbench was used to populate the database and generate the I/O workload for this solution. The solution performed nearly 1 million transactions per minute, resulting in 100,000 I/O operations per second (IOPS) with an average latency of just 0.4 millisecond.

Solution Benefits

- Optimize your return on investment (ROI).
- Easily deploy and scale the solution.
- Get faster responses at a lower cost.
- Achieve a high number of I/O operations per second (IOPS) and low latency.

Highlights

Industry-Leading Performance and User Scalability
The Cisco UCS B420 M4 Blade Server platform enables customers to improve performance of all critical applications, reduce IT costs through consolidation, manage more data on less hardware, and make better business decisions in real time. Linear scalability in database performance was achieved as the number of concurrent users increased from 50 to 400.

Improved Database Productivity with Less Maintenance and Tuning
Overall productivity is increased because the issues inherent in spinning-disk solutions are eliminated. The solution decouples database and storage administration from database storage troubleshooting to optimize Oracle applications, leading to an improved customer experience.

Significant Reduction in Costs
Reduced storage infrastructure requirements result in a significant cost reduction.

Overview

Designed for demanding virtualization and database workloads, the Cisco UCS B420 M4 Blade Server (Figure 1) combines a large memory footprint with 4-socket scalability using the Intel Xeon processor E5-4600 v3 product family. The blade server supports 2133-MHz DDR4 memory and uses Cisco UCS virtual interface card (VIC) technology to achieve up to 160 Gbps of aggregate I/O bandwidth, all in a dense, full-width blade form factor.

The Cisco UCS B420 M4 maintains memory performance even as capacity grows, and the large power envelope of the Cisco UCS 5108 Blade Server Chassis means the Cisco UCS B420 M4 can handle up to 3 terabytes (TB) of memory without compromising CPU speed or core count. Up to four Cisco UCS B420 M4 servers can be installed in the Cisco UCS 5108 Blade Server Chassis.

2015 Cisco and/or its affiliates. All rights reserved. This document is Cisco Public.

Figure 1. Cisco UCS B420 M4 Blade Server Front View

Figure 2. Cisco UCS B420 M4 Blade Server Inside View

The Cisco UCS B420 M4 provides:
- Four Intel Xeon processor E5-4600 v3 series CPUs for up to 72 cores per server
- 48 DIMM slots providing 3 TB of 2133-MHz DDR4 memory
- Three mezzanine connectors enabling bandwidth of 160 Gbps
- Four SAS, SATA, or SSD hot-pluggable drive bays
- RAID 0, 1, 5, and 10, with optional 2-GB flash memory-backed write cache
- Up to four Cisco UCS B420 M4 Blade Servers per Cisco UCS 5108 Blade Server Chassis

The Cisco UCS B420 M4 server is well suited for demanding IT workloads, including:
- Large virtual server and virtual desktop workloads
- Memory-intensive database installations
- Cloud infrastructure
- Enterprise resource planning (ERP) and customer relationship management (CRM) applications
- Development and in-house applications

Fusion iomemory PX600

Fusion-io (a SanDisk company) builds server-resident PCI Express (PCIe) flash storage for applications requiring a high IOPS rate with low latency. For enterprise applications such as Oracle, Fusion iomemory PX600 delivers a significant improvement in performance by reducing latency with a persistent, reliable, high-performance, and high-capacity storage tier. By significantly improving performance, iomemory helps customers reduce infrastructure, power, and cooling costs compared to a traditional hard disk storage system, for a lower total cost of ownership (TCO).

The iomemory solution offers a persistent storage option: a card in a server can hold an entire database of less than 1300 GB in its flash memory, or just the performance-demanding structures of a larger database. Moving storage from the SAN onto the card, closer to the server, can significantly increase performance.

The iomemory PX600 provides:
- 1300 GB of multilevel cell (MLC) flash capacity
- 2.7 GBps of bandwidth (1-MB read operations)
- 1.7 GBps of bandwidth (1-MB write operations)
- 235,000 IOPS (512-byte random read operations)
- 370,000 IOPS (512-byte random write operations)
- 15 microseconds of write latency and 92 microseconds of read latency

Hardware supported:
- All Cisco UCS B-Series M4 blade servers

Software supported:
- Cisco UCS Manager 2.1
- Oracle 12c

Solution Configuration

Installation and configuration details for the solution are beyond the scope of this document. Here are the high-level steps for the solution:

1. Install Oracle Linux 6.5.
2. Download matching iomemory firmware and driver RPM packages from Cisco.com. Download additional utilities and RPM packages from the SanDisk support website (Fusion-io support site). Here is the list of installed packages:

    [root@oracle1 ~]# rpm -qa | grep fio
    fio-util-4.1.1.297-1.0.el6.x86_64
    fio-common-4.1.1.297-1.0.el6.x86_64
    fio-sysvinit-4.1.1.297-1.0.el6.x86_64
    fio-2.1.10-1.el6.rf.x86_64
    fio-preinstall-4.1.1.297-1.0.el6.x86_64
    [root@oracle1 ~]# rpm -qa iomemory*
    iomemory-vsl4-3.8.13-16.2.1.el6uek.x86_64-4.1.1.297-1.0.el6.x86_64

3. After the packages are installed, verify that the iomemory cards are detected in /dev (for example, /dev/fct0). Use the fio-format utility to format and attach iomemory cards.
4. Create OS partitions on both iomemory cards. The database solution tested here used four data partitions (270 GB each) and two log partitions (30 GB each) on each card.
5. Install and configure the Oracle ASM (oracleasm) RPM packages. Use asmtool to create and configure ASM volumes. Here is a sample command:

    /usr/sbin/asmtool -C -l /dev/oracleasm -n data1 -s /dev/fioa1 -a force=yes

Please refer to this document for additional configuration details: http://www.cisco.com/c/en/us/solutions/collateral/data-center-virtualization/unified-computing/whitepaper_c11-732623.html

Fusion iomemory PX600 Tests: Speeds and Feeds

System Tests

It is common practice to evaluate system performance before deploying any database application. System baseline performance testing is performed using common I/O calibration tools such as Oracle Orion and Linux FIO. These tools can generate I/O patterns that mimic the type of I/O operations performed by Oracle databases. The testing here used both Orion and Linux FIO to measure I/O performance before Oracle was installed.

Figure 3 shows I/O tests at different read and write percentages and the corresponding IOPS. Figure 4 shows the random-read IOPS and throughput tests for various block sizes exercised using the iomemory PX600.

Figure 3. Random IOPS for Various Read-Write Ratios and Latency Tests
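Returning to steps 4 and 5 above, the partition layout can be sanity-checked against the card capacity, and the sample asmtool command can be repeated for every partition. The following Python sketch is illustrative only: the second device name (/dev/fiob), the partition numbering (data on partitions 1 to 4, logs on 5 and 6), and all volume labels other than data1 are assumptions, not details from the tested configuration.

```python
# Figures taken from the text: 1300 GB per card, four 270 GB data
# partitions and two 30 GB log partitions on each card.
CARD_CAPACITY_GB = 1300
DATA_PARTITION_GB = 270
LOG_PARTITION_GB = 30
DATA_PARTITIONS_PER_CARD = 4
LOG_PARTITIONS_PER_CARD = 2


def used_per_card_gb():
    """Capacity consumed on one card by the data and log partitions."""
    return (DATA_PARTITIONS_PER_CARD * DATA_PARTITION_GB
            + LOG_PARTITIONS_PER_CARD * LOG_PARTITION_GB)


def asmtool_commands(devices=("/dev/fioa", "/dev/fiob")):
    """Build one asmtool command per partition, reusing the flags of the
    sample command. Device names beyond /dev/fioa, the partition
    numbering, and the labels after data1 are hypothetical."""
    commands = []
    data_index = 0
    log_index = 0
    partitions = DATA_PARTITIONS_PER_CARD + LOG_PARTITIONS_PER_CARD
    for device in devices:
        for part in range(1, partitions + 1):
            if part <= DATA_PARTITIONS_PER_CARD:
                data_index += 1
                label = f"data{data_index}"
            else:
                log_index += 1
                label = f"log{log_index}"
            commands.append(
                "/usr/sbin/asmtool -C -l /dev/oracleasm "
                f"-n {label} -s {device}{part} -a force=yes"
            )
    return commands


if __name__ == "__main__":
    used = used_per_card_gb()
    print(f"Used per card: {used} GB of {CARD_CAPACITY_GB} GB")
    print(f"Headroom per card: {CARD_CAPACITY_GB - used} GB")
    for command in asmtool_commands():
        print(command)
```

The script only prints the commands; it does not run them. Under these assumptions each card carries 1140 GB of partitions, which leaves headroom within the 1300 GB capacity.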

Figure 4. IOPS and Bandwidth Tests

These were the main performance results:
- 300,000 IOPS for pure random read operations (4-KB block size)
- 275,000 IOPS for pure random read operations (8-KB block size)
- 213,000 IOPS for mixed read and write operations (8-KB block size)
- 174,000 IOPS for pure random write operations (8-KB block size)
- 5.1 GBps throughput (1-MB read operations)
- 0.5 ms average latency (mixed read and write operations)

User Scalability Tests

For the database user scalability tests, an Oracle 12c single-instance database was loaded with 1.7 TB of order-entry transactions using the Swingbench testing tool. The database was configured with 96 GB of Oracle System Global Area (SGA; about 5 percent of the data set held in memory), and the Swingbench transactional load test was run with users scaling from 50 to 400. Each database test run consisted of capturing and analyzing test reports from the Swingbench testing tool, Oracle Automatic Workload Repository (AWR) reports, and Oracle Enterprise Manager reports, plus system performance metrics from the server and OS perspective.

As the database scaled from 50 to 400 concurrent users, the testing showed nearly linear scalability in transactions per minute and I/O throughput with little or no change in latency (Figure 5). As the number of users scaled beyond 400, the testing revealed typical data-concurrency-related wait events, which slightly reduced transactions per minute (TPM) and nominally increased latency. These changes were attributed to the greater number of users working on a relatively smaller data set at a very high speed. From captured statistics, the testing also verified that neither hardware saturation nor bottlenecks occurred.

For small and medium-size businesses, departmental databases, and quality assurance groups running either smaller databases or medium-size databases (less than 1700 GB), an entire Oracle database is shown to perform excellently on iomemory PX600 flash memory in a Cisco UCS B420 M4 (Figure 5).
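The system-test results above mix IOPS and bandwidth figures, and the SGA is quoted as a share of the data set. A short worked calculation makes the relationships explicit. This Python sketch uses only numbers from the text; the unit conventions (1 KB = 1024 bytes, 1 GB = 10^9 bytes, 1.7 TB = 1700 GB) are assumptions.

```python
BYTES_PER_KB = 1024          # assumption: binary kilobytes for block sizes
BYTES_PER_GB = 1_000_000_000  # assumption: decimal gigabytes for bandwidth


def bandwidth_gbps(iops, block_kb):
    """Bandwidth in GB/s implied by an IOPS rate at a given block size."""
    return iops * block_kb * BYTES_PER_KB / BYTES_PER_GB


def sga_fraction(sga_gb, dataset_gb):
    """Share of the data set that fits in the SGA."""
    return sga_gb / dataset_gb


if __name__ == "__main__":
    # (description, IOPS, block size in KB), all taken from the results list
    results = [
        ("random read, 4 KB", 300_000, 4),
        ("random read, 8 KB", 275_000, 8),
        ("mixed read/write, 8 KB", 213_000, 8),
        ("random write, 8 KB", 174_000, 8),
    ]
    for name, iops, block_kb in results:
        print(f"{name}: {bandwidth_gbps(iops, block_kb):.2f} GB/s")
    print(f"SGA share of data set: {sga_fraction(96, 1700):.1%}")
```

Under these assumptions the small-block tests imply well under the 5.1 GBps measured with 1-MB reads, which suggests they are bounded by operation rate rather than by bandwidth.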
Figure 5. Database User Scalability Test Results

Oracle AWR and Enterprise Manager Reports

Database performance was assessed using the Top-10 events AWR report (Figure 6) and the Enterprise Manager performance report (Figure 7).

Figure 6. Top-10 Wait Events from Oracle AWR Report with 400 Users

Figure 7. Oracle Enterprise Manager IOPS and Latency Charts

Note these important results in the performance reports:
- Database CPU accounted for 61.8 percent of database time, indicating effective use of CPU cycles for performing database transactions.
- 234 million single-block reads (the db file sequential read wait event; 8-KB random read operations) were serviced with less than 0.4-millisecond latency. These operations are mostly transaction-related index fetches, showing the random nature of the workload I/O profile.
- 58 million log file sync events (sequential write operations) were handled with 0.36-millisecond latency.
- Nearly 100,000 consistent and sustained IOPS were delivered over the entire test duration.

Oracle Work Versus the Wait Distribution Cycle

An important characteristic of an optimized solution is the delivery of high-performance results with effective use of system resources (CPU cycles) and the elimination of inefficiencies such as wait events and latency. Figure 8 shows the distribution of database time during database testing.

Figure 8. Distribution of Database Time Spent in Various Activities

The figure shows that most database time was spent in database CPU activities (61.8 percent). This is the actual work that the database performed in processing transactions. The next set of events that consumed database time consists of activities that need to be serviced quickly to help ensure high database transaction throughput:
- User I/O consumed 34.6 percent of database time. These are data requests that had to be fetched from storage to the Oracle engine to process transactions.
- Commit operations consumed 7.6 percent of database time. This activity is triggered when a user session commits a transaction and the contents of the log buffer have to be written to the redo log file to confirm to the user that the transaction is committed and fully secured.
- The remaining activities (system I/O, configuration, and other processes) contributed less than 1 percent of the total activity.

In a traditional SAN environment, a significant share of database time is spent in I/O processing, waiting for data from the storage system. This wait time reduces overall database CPU utilization and therefore lowers the number of transactions. The use of iomemory in this configuration testing helps ensure higher database CPU utilization, with a significant reduction in the time needed to service data requests, resulting in a highly optimized database solution.
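As a rough cross-check, the User I/O and Commit shares can be compared with the event counts and latencies reported from the AWR report (234 million single-block reads at under 0.4 millisecond, 58 million log file sync events at 0.36 millisecond). The Python sketch below is a back-of-the-envelope comparison, not part of the original analysis. It treats 0.4 millisecond as the read latency although the text gives it only as an upper bound, and it assumes these two events dominate their wait classes.

```python
# Event counts and latencies as quoted from the AWR report in the text.
READ_EVENTS = 234_000_000
READ_LATENCY_US = 400   # "less than 0.4 ms": treated here as an upper bound
SYNC_EVENTS = 58_000_000
SYNC_LATENCY_US = 360   # 0.36 ms

# Shares of database time quoted for the two wait classes.
USER_IO_SHARE = 34.6
COMMIT_SHARE = 7.6


def total_wait_seconds(events, latency_us):
    """Cumulative wait time across all sessions, in seconds."""
    return events * latency_us / 1_000_000


if __name__ == "__main__":
    read_wait = total_wait_seconds(READ_EVENTS, READ_LATENCY_US)
    sync_wait = total_wait_seconds(SYNC_EVENTS, SYNC_LATENCY_US)
    print(f"Single-block read wait: {read_wait:,.0f} s (upper bound)")
    print(f"Log file sync wait:     {sync_wait:,.0f} s")
    print(f"Ratio from event counts: {read_wait / sync_wait:.2f}")
    print(f"Ratio from time shares:  {USER_IO_SHARE / COMMIT_SHARE:.2f}")
```

The two ratios come out close to each other, which is consistent with the reported distribution. The agreement is only approximate because the wait classes can include other events.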

System Performance Metrics

Computing, storage, memory, and network performance metrics constitute the system performance metrics. Figure 9 shows the storage IOPS compared to the latency for the various user scalability tests. The chart shows that as database test users increased from 50 to 400, IOPS scaled linearly while still delivering very low latency. This behavior is critical to delivering a high-performance database solution, because it helps ensure quick response times even when database user counts spike up to the designed maximum. To maintain quick response times, storage systems should serve data requests at a high IOPS rate while maintaining consistently low latency. These two performance features define high-performance storage systems. The figure shows that the peak of 400 users, generating 68,000 read IOPS and 30,000 write IOPS, yields close to 100,000 IOPS at 0.4-millisecond latency.

Figure 9. IOPS Versus Latency for User Scalability Tests

To deliver high performance for any enterprise workload, CPU cycles must be used efficiently in user space by reducing system-level overhead and I/O wait cycles. Figure 10 shows the CPU performance graph for the peak of 400 users. Note the following points based on the CPU performance graph:
- Even with the peak of 400 users, less than 10 percent of the CPU cycle time was spent on I/O wait and system space cycles.
- The user CPU utilization is about 31 percent.
- The rest of the time (55 percent) is idle, leaving CPU cycles available for an additional workload, such as a second database.
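The peak figures quoted above for Figure 9 (68,000 read IOPS, 30,000 write IOPS, 0.4-millisecond latency) can be combined with Little's Law, which relates throughput and latency to the average number of requests in flight. The Python sketch below applies that general queueing relationship to the quoted numbers. It is not a measurement from the tests, and it assumes the 0.4-millisecond figure applies to reads and writes alike.

```python
# Peak figures quoted in the text for the 400-user test.
READ_IOPS = 68_000
WRITE_IOPS = 30_000
LATENCY_US = 400  # 0.4 ms, assumed to apply to both reads and writes
USERS = 400


def outstanding_ios(iops, latency_us):
    """Little's Law: mean requests in flight = arrival rate x mean latency."""
    return iops * latency_us / 1_000_000


if __name__ == "__main__":
    total_iops = READ_IOPS + WRITE_IOPS
    in_flight = outstanding_ios(total_iops, LATENCY_US)
    print(f"Total IOPS: {total_iops:,}")
    print(f"Average I/Os in flight: {in_flight:.1f}")
    print(f"Average I/Os in flight per user: {in_flight / USERS:.3f}")
```

Under these assumptions, fewer than 40 I/Os are in flight on average across 400 users, so a typical session spends only a small part of its time waiting on storage.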

Figure 10. CPU Utilization

Conclusion

The solution stack must be designed with balanced server, storage, and networking subsystems to deliver reliable, scalable high performance for enterprise database applications. The test results and performance metrics in this document show that a solution stack using Cisco UCS B420 M4 with Fusion iomemory PX600 achieves these performance and scalability benefits. The performance metrics also show that the solution leaves a significant share of CPU cycles available to accommodate additional databases, making it a high-performance consolidation solution for small to medium-size databases.

This high-performance consolidation solution is simple and easy to deploy at a lower cost than other popular SAN storage systems. These features also help significantly reduce maintenance (for example, from frequent mechanical disk failures), power, cooling, and rack space costs. You can also augment existing investments in traditional SANs by offloading specific hot database objects with demanding response-time requirements to the flash memory in Fusion iomemory for performance improvements, leaving the remaining cold data to be served from traditional SAN storage.

Printed in USA C22-734918-00 06/15