Storage Performance 101. Ray Lucchesi, President Silverton Consulting, Inc.


SNIA Legal Notice: The material contained in this tutorial is copyrighted by the SNIA. Member companies and individuals may use this material in presentations and literature under the following conditions: any slide or slides used must be reproduced without modification, and the SNIA must be acknowledged as the source of any material used in the body of any document containing material from these presentations. This presentation is a project of the SNIA Education Committee.

Abstract: Storage Performance 101. This tutorial is an introduction to storage array performance tuning and should be a first step for anyone working to improve data storage performance. Most data centers are dealing with ever-increasing amounts of data storage. Although the need for storage seems insatiable, array performance typically plateaus or, worse, degrades post-installation. Storage performance tuning is a teachable, ongoing activity, and the vocabulary and activities are something any administrator should be able to master within a short time.

I/O Journey (diagram): application → O/S → host buffer → HBA (host); switch 1 → switch 2 (SAN); front-end → cache → backend → disk (disk subsystem).

Fast Disk I/O. A 64K-byte block I/O to a high-end disk takes about 6 msec: 3.3 msec seek + 2 msec rotation + 0.7 msec buffer transfer. Interface data transfer adds 0.16 msec @4Gb/s FC, 0.21 msec @3Gb/s SAS, or 0.32 msec @2Gb/s FC. Typical high-end drive specs: read seek times from 3.3 to 3.6 msec, write seek times from 3.8 to 4.0 msec, rotational speed 15KRPM, sustained data transfer from 93 to 125MB/s, capacity from 36 to 300GB.

Slow Disk I/O. A 64K-byte block I/O to a high-capacity/slow disk takes about 13 msec: 8 msec seek + 4.2 msec rotation + 0.8 msec buffer transfer. Interface data transfer adds 0.21 msec @3Gb/s SAS or 0.42 msec @1.5Gb/s SATA. Typical high-capacity drive specs: read seek times ~8.5 msec, write seek times ~9.5 msec, rotational speed 7.2KRPM, sustained data transfer from 65 to 72MB/s, capacity from 250 to 750GB.
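To make the arithmetic behind these two slides concrete, here is a minimal sketch (in Python, using the slides' representative figures, so an approximation rather than a measurement) that rebuilds the ~6 and ~13 msec service times from seek, rotation, and media transfer:

```python
# Estimate average service time for one disk I/O from its components.
# Inputs are the representative figures from the two slides above.

def io_time_ms(seek_ms, rpm, block_kb, media_mb_s):
    rotation_ms = 60_000.0 / rpm / 2.0  # average latency: half a revolution
    xfer_ms = block_kb / media_mb_s     # KB / (MB/s) comes out in msec
    return seek_ms + rotation_ms + xfer_ms

fast = io_time_ms(seek_ms=3.3, rpm=15_000, block_kb=64, media_mb_s=93)
slow = io_time_ms(seek_ms=8.0, rpm=7_200, block_kb=64, media_mb_s=72)
print(f"fast disk: {fast:.1f} msec, slow disk: {slow:.1f} msec")  # ~6, ~13

# Interface transfer adds roughly block_kb / (Gb/s * 100) msec more, since
# 8b/10b links carry about 100MB/s of payload per Gb/s of line rate:
# e.g. 64 / 400 = 0.16 msec on 4Gb/s FC.
```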

Cache I/O. A 64K-byte block cache I/O takes about 2.6 msec: 1.2 msec subsystem overhead + 0.2 msec data transfer (0.16 msec @4Gb/s FC) + 1.2 msec subsystem overhead. The ~2.4 msec of subsystem overhead (shown half in front and half at the end) must also be added to the bare disk I/O times above.
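Averaged over a workload, response time is just a blend of the cache-hit time above and the miss time (bare disk I/O plus overhead). A small sketch, using 2.6 msec for hits and 8.4 msec for fast-disk misses (6 msec + ~2.4 msec overhead):

```python
# Average response time as a hit/miss blend of the slide's 64KB figures.

def avg_response_ms(read_hit_ratio, hit_ms=2.6, miss_ms=8.4):
    return read_hit_ratio * hit_ms + (1.0 - read_hit_ratio) * miss_ms

for h in (0.5, 0.7, 0.9):
    print(f"{h:.0%} read hit -> {avg_response_ms(h):.2f} msec average")
# 50% -> 5.50, 70% -> 4.34, 90% -> 3.18
```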

Transfer Speed. Transfer rates:
- Fibre Channel: 4Gb/s, 2Gb/s, 1Gb/s; front-end or backend
- SCSI Ultra 320 (3.2Gb/s): front-end or backend
- Ethernet: 10Gb/s, 1Gb/s, 100Mb/s; front-end only
- SAS/SATA: 3Gb/s, 1.5Gb/s; backend only or direct attached storage

Enterprise Class Subsystem. Larger and better cache, more front-end & backend interfaces, more drive options; local and remote replication options; high availability; better throughput. Typical: cache size ~256GB, front-end from 64 to 224 FC interfaces, drive options from 73 to 500GB.

Midrange Class Subsystem. More drive options, but cache size, front-end, and backend limitations; typically fewer replication options, typically fewer availability options, typically better response time. Typical: backend SAS/SATA II and/or FC interfaces, front-end 8 FC interfaces per dual controller, cache from 1 to 16GB, drives from 73 to 750GB at 7.2 to 15KRPM.

JBODs (just a bunch of disks). Direct attached storage: SATA/SAS, SCSI Ultra 320, or FC-AL. RAID is either S/W or HBA based. Only the disk buffer and host buffer cache provide cache-like I/O.

Cache. A 64K-byte block cache I/O takes ~0.2 msec of transfer (0.16 msec @4Gb/s FC) plus 1.2 msec of subsystem overhead at each end. A cache hit is the fastest way to do I/O: 2.6 vs. 8.4 msec. A larger cache helps, but a write hit must later be de-staged to disk, which impacts sequential throughput; writing disk directly is faster for sequential work. Cache sophistication matters.

Drive Speed. Fast drives help miss and destage activity: 8.4 vs. 15.4 msec for a 64K-byte block (the ~6 and ~13 msec disk I/O times above plus ~2.4 msec subsystem overhead), assuming no other bottlenecks. High-capacity/slow drives seriously degrade miss/destage performance. This is mostly a response-time concern; a subsystem can mask the throughput impact when drives are not busy.

RAID and Striping.
- RAID-1, mirrored data: reads use the copy with the closest seek; writes go to both, with the second destaged later.
- RAID-4, 5, 6, DP, parity + data blocks: there is a parity-block write penalty. RAID 5, 6, & DP distribute the parity block(s); RAID 4 has a lone parity drive (a hot drive). RAID 6 & DP have two parity drives, RAID 5 has one.
- LUN striping: a LUN striped across multiple RAID groups (of the same type) eliminates hot RAID groups.
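Since each parity-RAID write fans out into extra backend operations, the penalty is easy to see by translating a host workload into backend I/Os. A sketch, assuming the commonly cited small-write costs (RAID-1: 2 writes; RAID-5: 2 reads + 2 writes; RAID-6/DP: 3 reads + 3 writes), which are not figures from the slide:

```python
# Translate a host read/write mix into backend disk operations.
WRITE_PENALTY = {"RAID-1": 2, "RAID-5": 4, "RAID-6": 6}  # ops per host write

def backend_iops(read_iops, write_iops, raid_level):
    return read_iops + write_iops * WRITE_PENALTY[raid_level]

for level in WRITE_PENALTY:
    print(level, backend_iops(read_iops=700, write_iops=300, raid_level=level))
# 1000 host IOPS (70/30 r/w) -> 1300, 1900, 2500 backend ops respectively
```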

I/O Balance. Spread LUN I/O activity across RAID groups, across front-end interfaces/controllers, and across backend interfaces. Watch the application/workload mix: toxic workloads reduce cache hits.

Cache Revisited. Cache read-ahead ensures follow-on sequential requests are served from cache: some subsystems compute the read-ahead amount in real time, others let you specify it (consider cache demand at the time of I/O). Cache read-to-write boundary: some subsystems have a hardware boundary, others let you specify it (sized on average or peak write workload).

Remote Replication. Remote replication mirrors data written on one subsystem to a remote subsystem. Synchronous: write performance degrades. Semi-synchronous: remote-site data lags the primary site. Asynchronous: data duplication is scheduled and is only consistent at the end of each activity cycle. Enterprise vs. midrange: cache use vs. backend use.

Transfer Size. For sequential I/O, the larger the better: most requests generate a seek, rotation, and transfer, so bigger transfers mean fewer I/Os per file. Each transfer adds ~2.4 msec of overhead, so fewer I/Os mean less overhead and better throughput. For random I/O, larger transfers don't work as well, since each random I/O processes only a small amount of data. Real workloads are always mixed.
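The throughput effect of transfer size falls straight out of that fixed per-I/O overhead. A sketch of effective sequential throughput versus transfer size, assuming a 200MB/s payload rate (2Gb/s FC) purely for illustration:

```python
# Effective sequential throughput when each I/O carries ~2.4 msec of
# fixed subsystem overhead plus its transfer time on a 200MB/s link.
OVERHEAD_MS, LINK_MB_S = 2.4, 200.0

def throughput_mb_s(xfer_kb):
    time_ms = OVERHEAD_MS + xfer_kb / LINK_MB_S  # KB / (MB/s) == msec
    return (xfer_kb / 1024.0) / (time_ms / 1000.0)

for kb in (4, 64, 256, 1024):
    print(f"{kb:4d}KB transfers -> {throughput_mb_s(kb):6.1f} MB/s")
# ~1.6, ~23, ~68, ~133 MB/s: larger transfers amortize the overhead
```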

Transfer Speed. Transfer speed impacts performance for large transfer sizes at the backend as well as the front-end. To transfer a 1MB block: 5 msec @2Gb/s, 3.3 msec @3Gb/s, 2.5 msec @4Gb/s.

Midrange Cache Mirroring. Each write adds a transfer between controllers; performance depends on the transfer size and the link speed between controllers.

Point-in-time (P-I-T) Copy. A P-I-T copy copies data locally for backup and test purposes. Copy-on-write needs cache, disk, and/or other resources for each write.

Front-end Limits. Front-end interfaces can limit performance. FC achieves ~90% of rated speed: 2Gb/s ≈ 180MB/s per FC link. iSCSI achieves ~50-85% of rated speed: 1Gb/s ≈ 50 to 85MB/s per iSCSI link. Network connectivity often dictates the number of front-end links, but performance requirements should be considered.
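Those per-link efficiencies make front-end sizing a one-line calculation. A sketch estimating the links needed for a target throughput, using the slide's efficiency figures (the 500MB/s target is an assumed example):

```python
import math

# Links needed for a target throughput; rated payload is roughly
# (Gb/s * 100) MB/s, derated by the link-type efficiency.
def links_needed(target_mb_s, link_gb_s, efficiency):
    per_link_mb_s = link_gb_s * 100.0 * efficiency
    return math.ceil(target_mb_s / per_link_mb_s)

print("2Gb/s FC @90%:   ", links_needed(500, 2.0, 0.90))  # 3 links
print("1Gb/s iSCSI @50%:", links_needed(500, 1.0, 0.50))  # 10 links
```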

Backend Limits. The number of backend FC or SAS/SATA links also limits I/O. Cache-miss activity translates into backend I/O, and write hits do too when destaged. FC switched vs. FC arbitrated loop (FC-AL): switched offers more throughput per drive than sharing across an FC-AL loop. SAS backends are point-to-point.

Drive Limits. The number and speed of drives can limit I/O performance: there is an upper limit to the number of I/Os one drive can do, and faster drives do more. Compute the maximum drive I/O rate and know the peak miss/destage workload to check the limits.
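That check is the same arithmetic as the earlier service-time sketch: one drive's random-I/O ceiling is roughly 1000 msec divided by its per-I/O time. A sketch, with the 2,000 IOPS peak miss/destage workload assumed for illustration:

```python
import math

# Drives needed so a peak miss/destage workload stays within the
# per-drive IOPS ceiling (service times are the 8.4/15.4 msec miss
# figures from the drive-speed slide).
def drives_needed(peak_backend_iops, svc_ms):
    per_drive_iops = 1000.0 / svc_ms
    return math.ceil(peak_backend_iops / per_drive_iops)

peak = 2_000  # assumed peak miss/destage IOPS
print("15KRPM FC drives:   ", drives_needed(peak, 8.4))   # 17
print("7.2KRPM SATA drives:", drives_needed(peak, 15.4))  # 31
```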

Pre-purchase Decisions.
- Drives (number and performance level): performance drives cost 50% more ($/GB).
- Interfaces, front-end and possibly backend (type, number, and transfer speed).
- Cache size and sophistication: roughly 2X the cache size for a 10% read-hit improvement.

Configuration Time.
- RAID level
- LUN striping
- Fixed cache parameters: look-ahead, cache mirroring, read-to-write partition boundary
- I/O balance: across LUNs, RAID groups, front-end & backend interfaces, controllers
- Subsystem partitioning: cache, interfaces, drives (RAID groups)

Host Side.
- HBA configuration should match the subsystem.
- Host transfer size should be greater than or equal to the subsystem's.
- Host buffer cache for file system I/O: write-back vs. write-through, syncs for write-back; may use all free host memory.
- Watch the interaction of database cache, host buffer cache, and subsystem cache.
- Multi-path I/O.

SAN Performance.
- Fan-in ratio: 5:1 to 15:1 server-to-storage ports
- Hop counts
- ISL and FC link oversubscription
- Locality
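Fan-in and ISL oversubscription are simple ratios worth checking in fabric designs; a sketch with all port counts assumed for illustration:

```python
# Fan-in (server ports per storage port) and ISL oversubscription
# (edge bandwidth funneled over inter-switch links), ports at equal speed.
server_ports, storage_ports = 48, 6   # assumed fabric port counts
edge_links, isl_links = 24, 2         # assumed switch-to-switch topology

print(f"fan-in {server_ports / storage_ports:.0f}:1")          # 8:1, in band
print(f"ISL oversubscription {edge_links / isl_links:.0f}:1")   # 12:1, high
```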

Email Server.
- Many email servers use multiple databases: one stores MAPI client data, one stores attachments (with pointers from the first), and one is a transaction log.
- Isolate each database to its own set of LUNs; the transaction log should be separate from the other two.
- There is email I/O beyond reading and writing mail: beware of BlackBerry and other push users.

Database I/O.
- Separate log files from table spaces, and indices from table spaces.
- For heavy sequential DB access use larger transfer sizes; for heavy random DB access use smaller transfer sizes.

Ongoing Workload Monitoring. What to look for, using OS, database, or subsystem-specific tools:
- Overall I/O activity to subsystem LUNs
- I/O balance over controllers, RAID groups, LUNs
- Read and write hit rates
- Sequential vs. random workload mix, and mix toxicity

Workload Monitoring Tools.

IOSTAT: iostat -xtc 5 2

                 extended disk statistics                  tty         cpu
disk   r/s  w/s  Kr/s  Kw/s  wait  actv  svc_t  %w  %b   tin tout  us sy wt id
sd0    2.6  3.0  20.7  22.7   0.1   0.2   59.2   6  19     0   84   3 85 11  0
sd1    4.2  1.0  33.5   8.0   0.0   0.2   47.2   2  23
sd2    0.0  0.0   0.0   0.0   0.0   0.0    0.0   0   0
sd3   10.2  1.6  51.4  12.8   0.1   0.3   31.2   3  31

SAR: /usr/bin/sar -d 15 4

AA-BB gummo A.08.06 E 9000/??? 02/04/92

17:20:36  device    %busy  avque  r+w/s  blks/s  avwait  avserv
17:20:51  disc2-1      33    1.1     16     103     1.4    20.7
          disc2-2      56    1.1     42      85     2.0    13.2
17:21:06  disc2-0       2    1.0      1       4     0.0    24.5
          disc2-1      33    2.2     16      83    24.4    20.5
          disc2-2      54    1.2     42      84     2.1    12.8
Average   disc2-0       2    1.0      1       4     0.0    29.3
          disc2-1      44    1.8     21     130    16.9    21.3
          disc2-2      45    1.2     34      68     2.0    13.2

Performance Automation. Some enterprise subsystems can automate performance tuning for you:
- LUN balancing: across RAID groups, across controllers/front-end interfaces
- Cache-hit maximization: read-ahead amount, read:write boundary partitioning
- Others

iSCSI vs. FC. Ethernet typically runs at 50-85% vs. FC at 90% of sustained rated capacity, and Ethernet is 1Gb/s vs. FC at 2-4Gb/s. There is processor overhead for the TCP/IP stack: compare TOE offload vs. HBA handling of FC protocol overhead. iSCSI hints: use iSCSI HBAs or server-class NICs rather than desktop NICs; consider jumbo frames, queue-depth level, and a separate storage LAN/VLAN. More hints on iSCSI storage deployment at http://www.demartek.com/

NFS/CIFS vs. Block I/O.
- NFS/CIFS performance is about the same as block I/O.
- The number of directory entries per mount point matters.
- Gateway vs. integrated system.
- Parallel vs. cluster vs. global file systems.
- There is no central repository for NetBench CIFS benchmark results.

What Price Performance?
- Drive cost differential: 50% more ($/GB) for faster drives.
- Enterprise vs. midrange cost differential: enterprise ~$30/GB, midrange ~$20/GB, entry ~$10/GB.
- Cache size differential: 100GBs or more for enterprise vs. 10GB or less for midrange.

Performance Benchmarks.
- For NFS results, raw data is available at http://www.spec.org/osg/sfs97r1/results/
- For block storage results, raw data is available at http://www.storageperformance.org/results/benchmark_results_all
- For summary charts and analysis of both NFS and block performance, see my Performance Results StorInt Dispatch, available at http://www.silvertonconsulting.com/

For More Information.
- Storage Performance Council (SPC), block I/O benchmarks: www.storageperformance.org
- Standard Performance Evaluation Corp. (SPEC), SFS NFS I/O benchmarks: www.spec.org
- Computer Measurement Group (CMG), more than just storage performance: www.cmg.org
- Storage Networking Industry Association (SNIA), standards with performance info: www.snia.org
- Silverton Consulting, StorInt Briefings & Dispatches, articles, presentations, and podcasts: www.silvertonconsulting.com

Q&A / Feedback. Please send any questions or comments on this presentation to SNIA: track-storage@snia.org. Many thanks to the following for their contributions to this tutorial: the SNIA Education Committee and Ray Lucchesi.

Background Information

Disk Array Terminology.
- SAN-attached disk arrays: enterprise class means big subsystems with cache, multiple front-end interfaces, and 10 to 100s of TB of disk; mid-range and entry level have smaller amounts of each.
- Just a bunch of disks (JBODs): internally attached disks.

Disk Terminology.
- Disk seek, in milliseconds (msec.)
- Disk rotational latency
- Disk data transfer
- Disk buffer

Cache Terminology.
- Cache read hit: a read request finds its data in cache.
- Cache write hit: a write request writes to cache instead of disk, and is later destaged to the backend.
- Cache miss: either a read or a write that uses disk to perform the I/O.
- Cache read-ahead: during sequential reads, reading ahead of where I/O requests data.

I/O Performance Terminology.
- Throughput: bytes transferred over time (MB/s or GB/s); also measured in I/O operations per second.
- Response time: average time to do an I/O (msec.), including seek, rotation, and transfer.
- Sequential workload: multi-block accesses in block-number sequence.
- Random workload: no discernible pattern to block-number accesses.

Acronyms.
FC: Fibre Channel
FC/AL: Fibre Channel arbitrated loop
GB/s: Gigabytes per second
Gb/s: Gigabits per second
HBA: Host bus adapter
I/O: Input/output request
iSCSI: IP SCSI
JBOD: Just a bunch of disks
KRPM: 1000 revolutions per minute
LUN: Logical unit number
MB/s: Megabytes per second
msec: 1/1000 of a second
P-I-T copy: Point-in-time copy
RAID: Redundant array of inexpensive disks
SAN: Storage area network
SAS: Serial attached SCSI
SATA: Serial ATA
Xfer: Transfer