Exploiting the full power of modern industry standard Linux systems with TSM
Stephan Peinkofer

TSM Performance Tuning
Exploiting the full power of modern industry standard Linux systems with TSM
Stephan Peinkofer, peinkofer@lrz.de

Agenda
- Network Performance
- Disk-Cache Performance
- Tape Performance
- Server Performance
- Lessons Learned
- Additional Resources

Network

The Problem with High-Speed Networks
- Current Ethernet technology can transfer up to 1.25 GB/s
- With default settings we cannot saturate even a single Gigabit link

Tuning Network Settings for Gigabit and Beyond
Utilizing (multi-)Gigabit links requires tuning of:
- TCP window size: how much can be sent/received before waiting for an ACK
- Maximum Transfer Unit (MTU): how much can be sent/received per Ethernet frame

TCP Window Size
$> cat /etc/sysctl.conf
net.ipv4.tcp_rmem = 4096 87389 4194304
net.ipv4.tcp_wmem = 4096 87389 4194304
net.core.rmem_max = 4194304
net.core.wmem_max = 4194304
- Sets a limit of 4 MB for the receive and send window
- The TSM option for the TCP window size has to be set to 2 MB on server and client
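
A minimal sketch of how these settings might be activated and checked on a running system. The dsmserv.opt location is the usual Linux default install path and the option name and unit follow the TSM 5.x documentation (TCPWINDOWSIZE is given in KB); adjust both to your installation.
$> sysctl -p                                # re-read /etc/sysctl.conf so the new limits take effect
$> sysctl net.core.rmem_max                 # verify: should now report 4194304
net.core.rmem_max = 4194304
$> grep -i tcpwindowsize /opt/tivoli/tsm/server/bin/dsmserv.opt
TCPWINDOWSIZE 2048                          # TSM expects KB, so 2048 = 2 MB; set the same value on the client in dsm.sys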

Maximum Transfer Unit
$> ifconfig ethX mtu XXXX
$> cat /etc/sysctl.conf
net.ipv4.ip_no_pmtu_disc = 0
- Set the MTU to the maximum supported size
- Enable path MTU discovery for communication with non-jumbo-frame hosts
- Only useful if every intermediate system supports jumbo frames
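
To make jumbo frames survive a reboot, the MTU also has to be set in the distribution's interface configuration. A sketch, assuming eth1 and an MTU of 9000 (the interface name, value, and file locations are examples and differ between SLES and Red Hat):
$> ifconfig eth1 mtu 9000                        # takes effect immediately, but is lost on reboot
$> echo "MTU='9000'" >> /etc/sysconfig/network/ifcfg-eth1        # SLES-style persistent setting
   (Red Hat: MTU=9000 in /etc/sysconfig/network-scripts/ifcfg-eth1)
$> ping -M do -s 8972 <peer>                     # 8972 bytes payload + 28 bytes headers = 9000; fails if a hop drops jumbo frames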

Measuring the Success IPERF was used to benchmark the network performance http://dast.nlanr.net/projects/iperf

Measuring the Success
Server: $> iperf -s -w 1M -f M
Client: $> iperf -c <server> -t 20 -w 1M -f M
------------------------------------------------------------
Client connecting to <server>, TCP port 5001
TCP window size: 2.00 MByte (WARNING: requested 1.00 MByte)
------------------------------------------------------------
[ 3] local <IP> port 36484 connected with <IP> port 5001
[ 3] 0.0-20.0 sec 10665 MBytes 533 MBytes/sec

Measuring the Success Influence of TCP Window size on a 10 Gbit Ethernet link

Some Thoughts on Bonding/Trunking
- Great for high availability
- Mostly not suitable for increasing performance
  - A single client can utilize a single link only
  - Multiple clients balance across the available links only if:
    - clients and server are in the same subnet, or
    - the balancing algorithm uses IP addresses (unlikely)
- We have to keep in mind that:
  - the switch is responsible for balancing incoming traffic
  - the server is responsible for balancing outgoing traffic

Alternatives to Bonding
- Use the next Ethernet generation
- Balance manually by using multiple IP addresses
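
One way to balance manually, as a sketch: point different client groups at different server interfaces via the TCPSERVERADDRESS client option. The stanza, path, and IP addresses below are examples only.
/opt/tivoli/tsm/client/ba/bin/dsm.sys on client group A (first server interface):
   SErvername        TSM1
   COMMMethod        TCPip
   TCPPort           1500
   TCPServeraddress  192.168.10.1
Client group B uses the same stanza but TCPServeraddress 192.168.10.2, so both
server links carry traffic without any bonding configuration.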

Disk-Storage Photo from Helmut Payer, gsicom

Main Factors for Good Disk-Cache Performance
- Stripe size
- Locality of disk accesses
- IO subsystem of the OS
- Number of FC links utilized in parallel

Stripe Size
Rule of thumb:
- Random IO => small stripe size
- Sequential IO => large stripe size
The TSM disk cache is rather a sequential IO workload
- Use a stripe size of 512 KB or larger
The TSM database is rather a random IO workload
- IBM recommends a stripe size of 256 KB

Locality of Disk Accesses
- How TSM uses disk-cache volumes cannot be influenced
- How the OS lays out the volumes can be influenced

Locality of Disk Accesses
TSM can allocate multiple disk volumes in parallel:
tsm> DEFINE VOLUME /stg/vol1.dsm FORMATSIZE=16G
ANR0984I PROCESS XX for DEFINE VOLUME started ...
...
tsm> DEFINE VOLUME /stg/vol4.dsm FORMATSIZE=16G
ANR0984I PROCESS XY for DEFINE VOLUME started ...
How the volumes are placed on disk depends on the file system

XFS
Allocates disk blocks when the file system buffer is flushed
(Diagram: Write(1) to Write(4) pass through the file system cache and are written to disk when the buffers are flushed)

EXT3
Allocates disk blocks when the data hits the file system buffer
(Diagram: Write(1) to Write(4) pass through the file system cache, flush buffers, and land on disk)
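
A quick way to inspect how a defined volume was actually laid out on disk is filefrag from e2fsprogs; the path below is the example volume from the earlier slide.
$> filefrag -v /stg/vol1.dsm              # lists the file's extents; many small extents indicate interleaved allocation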

Comparing EXT3 and XFS
- XFS has no problems with parallel allocation of disk volumes
- XFS has a slight weakness with re-write workloads
- On EXT3, volumes have to be defined one after another
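
On EXT3 the sequential definition can be sketched with the Wait parameter of DEFINE VOLUME (issued from an administrative client session), so each volume is fully formatted before the next one is started; the pool name and paths are examples.
tsm> DEFINE VOLUME BACKUPPOOL /stg/vol1.dsm FORMATSIZE=16G Wait=Yes
tsm> DEFINE VOLUME BACKUPPOOL /stg/vol2.dsm FORMATSIZE=16G Wait=Yes
tsm> DEFINE VOLUME BACKUPPOOL /stg/vol3.dsm FORMATSIZE=16G Wait=Yes
tsm> DEFINE VOLUME BACKUPPOOL /stg/vol4.dsm FORMATSIZE=16G Wait=Yes
With Wait=Yes each command returns only after formatting finishes, so the volumes end up laid out one after another on disk.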

Linux IO Subsystem
- Linux's IO subsystem is rapidly evolving
- More and more knobs to turn
- More and more complex to tune

Linux IO Subsystem
Current observation:
- Write performance is OK with default settings
- Read performance must be tuned by setting the read-ahead of the block device:
$> blockdev --setra <sectors> <device>
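
A sketch of checking and raising the read-ahead; blockdev reports and accepts the value in 512-byte sectors, and the device name and chosen value below are examples only.
$> blockdev --getra /dev/sdb              # current read-ahead in 512-byte sectors (the Linux default is 256 = 128 KB)
$> blockdev --setra 8192 /dev/sdb         # 8192 sectors = 4 MB read-ahead
The setting is not persistent, so it is typically re-applied from a boot script (e.g. boot.local or rc.local).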

IO Multipathing
- Typically more than one FC link is used to connect servers to storage, for HA reasons
- The available FC links can be used in parallel to gain optimal performance
- The IO-balancing algorithm depends on the IO-failover driver
- The configuration needed to exploit the performance benefit depends on that algorithm

IO Multipathing with QLogic Drivers
- The QLogic driver supports assigning individual LUNs to a specific FC link
- Performance per LUN is not increased
Resulting configuration:
- Use at least 2 LUNs per TSM instance and stripe them with software RAID 0
- Use multiple TSM instances per server with dedicated LUNs per instance
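
A sketch of the striping step with Linux software RAID (mdadm). The device names are examples, each member LUN is assumed to be bound to a different FC link via the QLogic driver configuration, and the 512 KB chunk follows the stripe-size recommendation above.
$> mdadm --create /dev/md0 --level=0 --raid-devices=2 --chunk=512 /dev/sdb /dev/sdc
$> mkfs.xfs /dev/md0                      # XFS, in line with the allocation behaviour discussed above
$> mount /dev/md0 /stg                    # mount point used by the TSM disk-cache volumes in the examples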

Measuring the Success IOZONE was used to benchmark disk performance http://www.iozone.org

Measuring the Success
Write file sequentially: $> iozone -s 10g -r 512k -t 1 -i 0 -w
Read file sequentially:  $> iozone -s 10g -r 512k -t 1 -i 1 -w
-s 10g      : amount to write/read is 10 GB
-r 512k     : size of the record to write/read is 512 KB
-t 1        : write/read 1 file in parallel
-i 0 / -i 1 : perform write / perform read
-w          : don't delete files after the benchmark

Comparison of Stripe Size
IBM FAStT900 with 6 SATA disks in a RAID 5 volume
Workload: single-file sequential read/write

EXT3 Block Allocation
IBM FAStT900 with 6 SATA disks in a RAID 5 volume
Workload: 12 parallel sequential reads

Comparison of Read-Ahead
STK FlexLine 380 with 7 FC disks in a RAID 5 volume

Tape-Storage

TSM Tape Performance
- No real influence on tape performance
- Barely saw 125 MB/s for more than a few seconds with Titanium drives
- TSM v5.3 on Linux does not yet seem to be ready for current high-end tapes
- Assumption: some buffers are too small
Photo from Sun Microsystems

Server Photo from Helmut Payer, gsicom

Main Factors of Server Performance
- PCI bus throughput
- Memory bandwidth
- Number of CPU cores
- Performance of a single CPU core

PCI Bus Throughput
- Data travels 4 times over the PCI bus => the PCI bus is the main bottleneck
- PCI-X barely achieves half of its theoretical throughput in typical TSM workloads
- PCI Express performs much better because of its switched topology
General rule: don't try to save money on the peripheral interconnect

Memory Bandwidth
- As long as direct IO is not used, data travels 4 times through memory
- Database operations rely on memory performance too

Number of CPU Cores
- TSM is a multi-threaded application
- The more CPU cores are available, the more work can be done in parallel

Lessons Learned

Tuning
Network:
- TCP window size: always
- MTU: if applicable
Disk:
- Read-ahead
- Define cache/DB/log volumes sequentially

Criteria for the Next Servers
- Have the fastest peripheral interconnect available
- Have 10 Gbit Ethernet
- Have at least 4-Gbit FC HBAs
- Have at least 4 CPU cores
- Have upper-class CPU core performance

Additional Resources
- IBM Tivoli Storage Manager Performance Tuning Guide v5.3
- IBM DS4000 Best Practices and Performance Tuning Guide

Thank you for your attention
Any questions?
Contact: peinkofer@lrz.de