Technical Paper

Performance and Tuning Considerations for SAS on the Hitachi Virtual Storage Platform G1500 All-Flash Array

Release Information: Content Version 1.0, April 2018.

Trademarks and Patents: SAS Institute Inc., SAS Campus Drive, Cary, North Carolina 27513. SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration. Other brand and product names are registered trademarks or trademarks of their respective companies.

Statement of Usage: This document is provided for informational purposes. This document may contain approaches, techniques, and other information proprietary to SAS.

Contents

Introduction
VSP G1500 Performance Testing
    Test Bed Description
    Data and IO Throughput
    SAS File Systems
    Hardware Description
    VSP G1500 Test Host Configuration
    VSP G1500 Test Storage Configuration
Test Results
    Single Host Node Test Results
    Scaling from One Node to Four Nodes
    Scaling from Four Nodes to Eight Nodes
Hitachi and SAS Tuning Recommendations
    Host Tuning
    Hitachi Storage Tuning
Conclusion
Resources
Contact Information

Introduction

This paper presents the results of running the SAS mixed analytics workload on the Hitachi Virtual Storage Platform G1500 (VSP G1500) All-Flash array. The effort involved a flood test of one, four, and eight simultaneous x86 nodes running the SAS mixed analytics workload to determine scalability against the storage and uniformity of performance per node. This technical paper outlines performance test results obtained by SAS and provides general considerations for setting up and tuning the x86 Linux hosts and the VSP G1500 All-Flash array for SAS application performance. An overview of the testing is discussed first, including the purpose of the testing, detailed descriptions of the actual test bed and workload, and a description of the test hardware. Test results are included, accompanied by a list of tuning recommendations. General considerations, recommendations for implementation with SAS Foundation, and conclusions are also discussed.

VSP G1500 Performance Testing

Performance testing was conducted with host nodes attached via 16Gb Fibre Channel HBAs to a VSP G1500 array to deliver a relative measure of how well the array performed with IO-heavy workloads. Of particular interest was whether the VSP G1500 could easily manage SAS large-block, sequential IO patterns. In this section of the paper, we describe the performance tests, the hardware used for testing and comparison, and the test results.

Test Bed Description

The test bed chosen for flash testing was a SAS mixed analytics workload. This was a scaled workload of computation- and IO-oriented tests designed to measure concurrent, mixed job performance. The workload was composed of 19 individual SAS tests: 10 computation, two memory, and seven IO-intensive tests. Each test was composed of multiple steps. Some tests relied on existing data stores, and other tests (primarily computation tests) relied on generated data. The tests were chosen as a matrix of long-running and shorter-running tests, ranging in duration from approximately five minutes to one hour and 20 minutes. In some instances, the same test (running against replicated data streams) was run concurrently and/or back-to-back in a serial fashion to achieve an average of approximately 20 simultaneous streams of heavy IO, computation (fed by significant IO in many cases), and memory stress. In all, 77 tests were launched to achieve the 20-concurrent-test matrix.

Data and IO Throughput

The IO tests input an aggregate of approximately 300GB of data, and the computation tests input more than 120GB of data for a single instance of the SAS mixed analytics workload with 20 simultaneous tests on each node. Much more data is generated as a result of test-step activity and threaded kernel procedures such as SORT (for example, SORT makes three copies of the incoming file to be sorted). As stated, some of the same tests run concurrently using different data, and some are run back-to-back to produce a total average of 20 tests running concurrently. This raises the total IO throughput of the workload significantly.

In Graph 1 below, the workload peaks between 10 and 15GB/sec on average during the eight-node test. The test suite is highly active for about 40 minutes and then finishes with two low-impact, long-running trail-out jobs. This is a good, average SAS-shop throughput characteristic for nodes that each simulate the load of an individual SAS COMPUTE node. This throughput is cumulative across all three primary SAS file systems: SASDATA, SASWORK, and UTILLOC. Each node has its own dedicated SASDATA and SASWORK file systems; UTILLOC resides on SASWORK.

Graph 1. VSP G1500 Throughput (GB/sec) for 1- to 8-Node SAS Mixed Analytics Workload with 20 Simultaneous Tests Run

SAS File Systems

There were two primary file systems, both XFS, involved in the flash testing:

- SAS permanent data file space: SASDATA
- SAS working and utility data file space: SASWORK/UTILLOC

For this workload's code set, data, result space, working space, and utility space, the following space allocations were made:

- SASDATA: 4TB
- SASWORK (and UTILLOC): 4TB

This gives a general sense of the application's on-storage footprint. It is important to note that throughput, not capacity, is the key factor in configuring storage for SAS performance.

Hardware Description

The test bed was run against up to eight host nodes using the SAS mixed analytics workload with 20 simultaneous tests. The host and storage configurations are specified in the following sections.

VSP G1500 Test Host Configuration

Here are the specifications for the eight host server nodes:

- Host: Lenovo x3650 M5, Red Hat Enterprise Linux 7.2
- Kernel: Linux 3.10.0-327.36.3.el7.x86_64
- Memory: 256GB
- CPU: Intel Xeon CPU E5-2680 v3 @ 2.50GHz

Host tuning: Host tuning was accomplished via a tuned profile script. Tuning aspects included CPU performance, huge page settings, virtual memory management settings, block device settings, and so on. Here is the tuned profile used for testing:

    # create /usr/lib/tuned/sas-performance/tuned.conf containing:
    [cpu]
    force_latency=1
    governor=performance
    energy_perf_bias=performance
    min_perf_pct=100

    [vm]
    transparent_huge_pages=never

    [sysctl]
    kernel.sched_min_granularity_ns = 10000000
    kernel.sched_wakeup_granularity_ns = 15000000
    vm.dirty_ratio = 40
    vm.dirty_background_ratio = 10
    vm.swappiness = 10

    # select the sas-performance profile by running: tuned-adm profile sas-performance
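A brief sketch of installing and verifying the profile above on Red Hat Enterprise Linux 7 follows; the tuned-adm commands come from the listing, while the verification paths are standard kernel interfaces rather than part of the original paper:

    # Install and activate the tuned profile described above (RHEL 7).
    mkdir -p /usr/lib/tuned/sas-performance
    # (place the tuned.conf shown above in that directory)
    tuned-adm profile sas-performance
    tuned-adm active

    # Spot-check two of the settings the profile controls:
    cat /sys/kernel/mm/transparent_hugepage/enabled   # expect "[never]"
    sysctl vm.dirty_ratio vm.swappiness               # expect 40 and 10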
VSP G1500 Test Storage Configuration

Here are the specifications for the VSP G1500 All-Flash array:

- VSP G1500 with 2TB of cache
- Dual-controller chassis configuration
- Four VSD boards, each with two eight-core Intel Haswell 2.3GHz processors (128 cores total)
- High-performance back end with four back-end features
- 64 x 16Gbps front-end ports
- Software: SVOS (Storage Virtualization Operating System) only
- Drive configuration: nine 3.2TB FMD RAID 6 (14+2) RAID groups (460TB raw capacity, 403TB usable, base 2); a quick capacity check follows this list. The FMD drives were arranged in a single Hitachi Dynamic Provisioning (HDP) pool.
- Storage connection to the hosts consisted of 32 x 16Gbit Fibre Channel over dual Brocade switches (two FC connections per host)
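As a check of the capacity figures above, assuming each 14+2 RAID 6 group comprises 16 FMDs (14 data plus two parity), the arithmetic works out to the stated values:

\[
\underbrace{9 \times 16 \times 3.2\,\mathrm{TB}}_{\text{raw}} = 460.8\,\mathrm{TB}, \qquad
\underbrace{9 \times 14 \times 3.2\,\mathrm{TB}}_{\text{usable after }14+2\text{ parity}} = 403.2\,\mathrm{TB}
\]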
The FMD RAID groups were pooled into a single HDP pool, and DP volumes were presented to the hosts as follows: the two file systems, /SASDATA and /SASWORK (containing UTILLOC), were each built with Red Hat Enterprise Linux Logical Volume Manager from eight 500GB logical volumes. These volumes were mapped to two ports each, alternating over four port pairs, for a total of eight Fibre Channel ports per node. Each port pair was shared by two nodes; therefore, a total of 32 Fibre Channel ports was used.
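A minimal sketch of how one such file system might be assembled, assuming eight multipath devices presented from the array; the device, volume group, and logical volume names are hypothetical, and the 64K stripe size anticipates the Host Tuning section later in this paper:

    # Build a striped logical volume across eight array LUNs and format it as XFS.
    pvcreate /dev/mapper/mpath{a,b,c,d,e,f,g,h}
    vgcreate vg_sasdata /dev/mapper/mpath{a,b,c,d,e,f,g,h}

    # Stripe across all eight volumes; the 64K stripe size matches the
    # SAS BUFSIZE used in testing (see Host Tuning below).
    lvcreate --name lv_sasdata --stripes 8 --stripesize 64k --extents 100%FREE vg_sasdata

    mkfs.xfs /dev/vg_sasdata/lv_sasdata
    mkdir -p /SASDATA
    mount /dev/vg_sasdata/lv_sasdata /SASDATA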
Test Results

Single Host Node Test Results

The SAS mixed analytics workload was run in a quiet setting for the x86 system using the VSP G1500 on a single host node. Multiple runs were performed to standardize results. The previous tuning specifications were applied to the Red Hat Enterprise Linux 7.2 operating system. Work with your Hitachi engineer for appropriate tuning specifications for other operating systems or for the particular processors used in your system.

Table 1 shows the performance of the VSP G1500. The table shows the mean value of the CPU/Real Time ratio across all of the 77 tests submitted, along with summed Elapsed Real Time, User CPU Time, and System CPU Time in minutes.

Storage System: x86 with VSP G1500

Node     Mean CPU/Real Time Ratio   Elapsed Real Time (min)   User CPU Time (min)   System CPU Time (min)
Node1    1.11                       583                       615                   59

Table 1. Performance Using VSP G1500 All-Flash Array
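The ratio in the first data column can be read as the per-test ratio of total CPU time to elapsed time, averaged over the 77 submitted tests. A sketch of that computation, under one plausible reading of the table description (and the reason the mean ratio need not equal the ratio of the column sums):

\[
\text{Mean CPU/Real Time ratio} \;=\; \frac{1}{77}\sum_{i=1}^{77}\frac{\mathrm{UserCPU}_i + \mathrm{SystemCPU}_i}{\mathrm{RealTime}_i}
\]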
The second column shows the ratio of total CPU time (User + System CPU) to total Real Time. If the ratio is less than 1, then the CPU is spending time waiting on resources, usually IO. The VSP G1500 system delivered an excellent CPU/Real Time ratio of 1.11. The natural question is, "How can I get above a ratio of 1.0?" Because some SAS procedures are threaded, you can actually use more CPU cycles than wall-clock time (Real Time).

The third column shows the total elapsed Real Time in minutes, summed from each of the jobs in the workload. The VSP G1500, coupled with the fast Intel processors on the Lenovo compute node, executed the aggregate run time of the workload in approximately 583 minutes of total execution time. The primary takeaway from this test is that the VSP G1500 easily provided enough throughput (with extremely consistent low latency) to fully exploit this host improvement. Its performance with this accelerated IO demand still maintained a very healthy 1.11 CPU/Real Time ratio.

Scaling from One Node to Four Nodes

For a fuller flood test, the SAS mixed analytics workload was run concurrently in a quiet setting for the x86 system using the VSP G1500 on four physically separate but identical host nodes. Multiple runs were performed to standardize results. The previous tuning specifications were applied to the Red Hat Enterprise Linux 7.x operating system.

Table 2 shows the performance of the four host node environments attached to the VSP G1500. The table shows the mean value of the CPU/Real Time ratio across all of the 77 tests submitted on each node, along with summed User CPU Time and System CPU Time in minutes.

Storage System: x86 with VSP G1500

Node     Mean CPU/Real Time Ratio   Elapsed Real Time (min)   User CPU Time (min)   System CPU Time (min)
Node1    1.08                       630                       654                   65
Node2    1.08                       617                       635                   62
Node3    1.08                       610                       639                   63
Node4    1.07                       614                       632                   62

Table 2. Performance Using Four Nodes on VSP G1500 All-Flash Array

The second column shows the ratio of total CPU time (User + System CPU) to total Real Time. If the ratio is less than 1, then the CPU is spending time waiting on resources, usually IO. The VSP G1500 system delivered a very good average CPU/Real Time ratio of 1.08. The third column shows the total elapsed Real Time in minutes, summed from each of the jobs in the workload. The VSP G1500, coupled with the fast Intel processors on the Lenovo compute nodes, executed the aggregate run time of the workload in an average of 618 minutes per node and 2,471 minutes of total execution time for all four nodes.

The storage performance peaked at approximately 5GB/sec for Reads and 12GB/sec for Writes using a SAS 64K BUFSIZE on the four-node test. The VSP G1500 was able to easily scale to meet this accelerated and scaled throughput demand while providing a very healthy CPU/Real Time ratio per node. The workload is a mixed representation of what an average SAS shop might be executing at any given time. Due to workload differences, your mileage might vary.

Scaling from Four Nodes to Eight Nodes

For a still fuller flood test, the SAS mixed analytics workload was run concurrently in a quiet setting for the x86 system using the VSP G1500 on eight physically separate but identical host nodes. Multiple runs were performed to standardize results. The previous tuning specifications were applied to the Red Hat Enterprise Linux 7.x operating system.

Table 3 shows the performance of the eight host node environments attached to the VSP G1500. The table shows the mean value of the CPU/Real Time ratio across all of the 77 tests submitted on each node, along with summed User CPU Time and System CPU Time in minutes.

Storage System: x86 with VSP G1500

Node     Mean CPU/Real Time Ratio   Elapsed Real Time (min)   User CPU Time (min)   System CPU Time (min)
Node1    1.10                       591                       618                   60
Node2    1.10                       630                       677                   71
Node3    1.10                       621                       661                   68
Node4    1.02                       653                       641                   62
Node5    1.09                       618                       662                   65
Node6    1.09                       583                       610                   60
Node7    1.09                       588                       618                   58
Node8    1.08                       595                       622                   59

Table 3. Performance Using Eight Nodes on VSP G1500 All-Flash Array

The second column shows the ratio of total CPU time (User + System CPU) to total Real Time. If the ratio is less than 1, then the CPU is spending time waiting on resources, usually IO. The VSP G1500 system delivered a very good average CPU/Real Time ratio of 1.08. The third column shows the total elapsed Real Time in minutes, summed from each of the jobs in the workload. The VSP G1500, coupled with the fast Intel processors on the Lenovo compute nodes, executed the aggregate run time of the workload in an average of 609 minutes per node and 4,879 minutes of total execution time for all eight nodes.

The storage performance peaked at approximately 6GB/sec for Reads and 8GB/sec for Writes using a SAS 64K BUFSIZE on the eight-node test. The VSP G1500 was able to easily scale to meet this accelerated and scaled throughput demand while providing a very healthy CPU/Real Time ratio per node. The workload is a mixed representation of what an average SAS shop might be executing at any given time. Due to workload differences, your mileage might vary.

Hitachi and SAS Tuning Recommendations

Host Tuning

The Hitachi VSP G1500 All-Flash array can deliver significant performance for an intensive SAS IO workload. It is very helpful to use the SAS tuning guides for your operating system host to optimize server-side performance with Hitachi storage solutions, and to study and use as an example the host and multipath tuning listed in the Hardware Description section. In addition, pay close attention to LUN creation and arrangement and to LVM tuning. For more information about configuring SAS workloads for Red Hat Enterprise Linux systems, see http://support.sas.com/resources/papers/proceedings11/342794_optimizingsasonrhel6and7.pdf.

In addition, a SAS BUFSIZE option of 64K, along with a Red Hat Enterprise Linux logical volume stripe size of 64K, was used to achieve the best results from the VSP G1500 array. Testing with a lower BUFSIZE value might yield benefits on some IO tests but might require host and application tuning changes. A sketch of how these two settings pair up follows the next section.

Hitachi Storage Tuning

The default parameters were used for the storage configuration on the dynamic provisioning pool. The workload was evenly distributed on the front-end ports to allow for the required bandwidth to handle the large sequential workloads.
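As referenced under Host Tuning, the logical volume stripe size and the SAS BUFSIZE were both set to 64K. A minimal sketch of confirming the one and setting the other; the volume group and program names are illustrative, and BUFSIZE is a standard SAS system option:

    # Confirm the 64K stripe size on the SAS logical volumes (names are hypothetical).
    lvs -o lv_name,stripes,stripesize vg_sasdata

    # Launch a SAS job with a matching 64K buffer size for data set IO.
    sas -bufsize 64k myjob.sas

    # Or set it within SAS code for a session:
    #   options bufsize=64k;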

Conclusion

The Hitachi VSP G1500 All-Flash array proved to be extremely beneficial for scaled SAS workloads when used with newer, faster processing systems. In summary, the faster processors enable the compute layer to perform more operations per second, thus increasing the potential performance of the solution, but it is the consistently low response time of the underlying storage layer that allows this potential to be realized.

The VSP G1500 array is designed to be as straightforward as possible to operate, but to attain maximum performance it is crucial to work with your Hitachi engineer to plan, install, and tune the hosts for the environment. The guidelines listed in this paper are beneficial and recommended. Your individual experience might require additional guidance by Hitachi and SAS engineers, depending on your host system and workload characteristics.

Resources

SAS papers on performance, best practices, and tuning are available at http://support.sas.com/kb/42/197.html.

Contact Information

Name: Steve Gerety
Enterprise: Hitachi Vantara Inc.
Email: steve.gerety@hitachivantara.com

Name: Tony Brown
Enterprise: SAS Institute Inc.
Email: tony.brown@sas.com

Name: Jim Kuell
Enterprise: SAS Institute Inc.
Email: jim.kuell@sas.com

Name: Margaret Crevar
Enterprise: SAS Institute Inc.
Email: margaret.crevar@sas.com

To contact your local SAS office, please visit: sas.com/offices

SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration. Other brand and product names are trademarks of their respective companies. Copyright © 2014, SAS Institute Inc. All rights reserved.