Identifying and Eliminating Backup System Bottlenecks: Taking Your Existing Backup System to the Next Level

Similar documents
A Crash Course In Wide Area Data Replication. Jacob Farmer, CTO, Cambridge Computer

STORAGE CONSOLIDATION WITH IP STORAGE. David Dale, NetApp

STORAGE CONSOLIDATION WITH IP STORAGE. David Dale, NetApp

Trends in Data Protection and Restoration Technologies. Mike Fishman, EMC 2 Corporation

Disk-to-Disk backup. customer experience. Harald Burose. Architect Hewlett-Packard

Introduction to iscsi

Data Deduplication Methods for Achieving Data Efficiency

Dell Fluid Data solutions. Powerful self-optimized enterprise storage. Dell Compellent Storage Center: Designed for business results

Trends in Data Protection and Restoration Technologies. Jason Iehl, NetApp

Creating the Fastest Possible Backups Using VMware Consolidated Backup. A Design Blueprint

Trends in Data Protection CDP and VTL

ADVANCED DEDUPLICATION CONCEPTS. Thomas Rivera, BlueArc Gene Nagle, Exar

Hyper-converged Secondary Storage for Backup with Deduplication Q & A. The impact of data deduplication on the backup process

Backup and Recovery Best Practices

ADVANCED DATA REDUCTION CONCEPTS

Copyright 2010 EMC Corporation. Do not Copy - All Rights Reserved.

Veritas NetBackup on Cisco UCS S3260 Storage Server

Understanding Virtual System Data Protection

Application Recovery. Andreas Schwegmann / HP

Backup 2.0: Simply Better Data Protection

StorageCraft OneXafe and Veeam 9.5

Notes & Lessons Learned from a Field Engineer. Robert M. Smith, Microsoft

Chapter 11. SnapProtect Technology

Disk to Disk Data File Backup and Restore.

PASS4TEST. IT Certification Guaranteed, The Easy Way! We offer free update service for one year

Storage Virtualization II Effective Use of Virtualization - focusing on block virtualization -

Disk-Based Backup & Recovery: Making Sense of Your Options

Virtual Server Agent for VMware VMware VADP Virtualization Architecture

Cohesity Flash Protect for Pure FlashBlade: Simple, Scalable Data Protection

Tom Sas HP. Author: SNIA - Data Protection & Capacity Optimization (DPCO) Committee

Veritas NetBackup OpenStorage Solutions Guide for Disk

SNIA Discussion on iscsi, FCIP, and IFCP Page 1 of 7. IP storage: A review of iscsi, FCIP, ifcp

Netapp Exam NS0-510 NCIE-Backup & Recovery Implementation Engineer Exam Version: 7.0 [ Total Questions: 216 ]

IBM Spectrum Protect Version Introduction to Data Protection Solutions IBM

The storage challenges of virtualized environments

Restoration Technologies

Protecting Miscrosoft Hyper-V Environments

Symantec NetBackup Backup Planning and Performance Tuning Guide

Veeam and HP: Meet your backup data protection goals

DELL EMC DATA DOMAIN SISL SCALING ARCHITECTURE

StorageCraft OneBlox and Veeam 9.5 Expert Deployment Guide

Exam Name: Midrange Storage Technical Support V2

Course Documentation: Printable course material This material is copyrighted and is for the exclusive use of the student only

Storage Virtualization II. - focusing on block virtualization -

Title: CompTIA Storage+ Powered by SNIA

Nutanix Tech Note. Virtualizing Microsoft Applications on Web-Scale Infrastructure

How to Protect SAP HANA Applications with the Data Protection Suite

NetVault Backup Client and Server Sizing Guide 2.1

Chapter 10 Protecting Virtual Environments

NetVault Backup Client and Server Sizing Guide 3.0

Backup Exec 20.1 Tuning and Performance Guide

Executive Summary. The Need for Shared Storage. The Shared Storage Dilemma for the SMB. The SMB Answer - DroboElite. Enhancing your VMware Environment

IBM Tivoli Storage Manager Version Introduction to Data Protection Solutions IBM

Protect 500 VMs in 17 Minutes. A reference architecture for VMware Data Protection using CommVault Simpana IntelliSnap and IBM XIV Storage Arrays

Chapter 2 CommVault Data Management Concepts

Overview and Current Topics in Solid State Storage

Disk and Tape Backup Mechanisms

Symantec Backup Exec Blueprints

iscsi Technology Brief Storage Area Network using Gbit Ethernet The iscsi Standard

Backup and archiving need not to create headaches new pain relievers are around

Virtual Disaster Recovery

Cloud Optimized Performance: I/O-Intensive Workloads Using Flash-Based Storage

Zero Data Loss Recovery Appliance DOAG Konferenz 2014, Nürnberg

IBM Storage Software Strategy

Symantec Design of DP Solutions for UNIX using NBU 5.0. Download Full Version :

Code42 Defines its Critical Capabilities Methodology

Chapter 12 Backup and Recovery

Storage Protocol Analyzers: Not Just for R & D Anymore. Brandy Barton, Medusa Labs - Finisar

1 Quantum Corporation 1

Comparing File (NAS) and Block (SAN) Storage

EXAM - HP0-J67. Architecting Multi-site HP Storage Solutions. Buy Full Product.

Overview and Current Topics in Solid State Storage

iscsi Technology: A Convergence of Networking and Storage

A Promise Kept: Understanding the Monetary and Technical Benefits of STaaS Implementation. Mark Kaufman, Iron Mountain

Technology Insight Series

Protect enterprise data, achieve long-term data retention

Storage Area Network (SAN)

PRESENTATION TITLE GOES HERE

The Benefits of Solid State in Enterprise Storage Systems. David Dale, NetApp

WHY DO I NEED FALCONSTOR OPTIMIZED BACKUP & DEDUPLICATION?

Building Backup-to-Disk and Disaster Recovery Solutions with the ReadyDATA 5200

Nexsan Storage Optimizes Virtual Operating Environment for The Clark Enersen Partners

CompTIA Storage+ Powered by SNIA

WHITE PAPER: ENTERPRISE SOLUTIONS. Disk-Based Data Protection Achieving Faster Backups and Restores and Reducing Backup Windows

CONTENTS. 1. Introduction. 2. How To Store Data. 3. How To Access Data. 4. Manage Data Storage. 5. Benefits Of SAN. 6. Conclusion

PeerStorage Arrays Unequalled Storage Solutions

Chapter 11: File System Implementation. Objectives

Oracle Zero Data Loss Recovery Appliance (ZDLRA)

S SNIA Storage Networking Management & Administration

Chapter 10: Mass-Storage Systems

Deduplication File System & Course Review

Chapter 4 Data Movement Process

Protecting Hyper-V Environments

Boost your data protection with NetApp + Veeam. Schahin Golshani Technical Partner Enablement Manager, MENA

Next Generation Backup: Better ways to deal with rapid data growth and aging tape infrastructures

The Role of WAN Optimization in Cloud Infrastructures. Josh Tseng, Riverbed

S S SNIA Storage Networking Foundations

Cybernetics Virtual Tape Libraries Media Migration Manager Streamlines Flow of D2D2T Backup. April 2009

Data Protection for Virtualized Environments

Chapter 10: Mass-Storage Systems. Operating System Concepts 9 th Edition

Transcription:

Identifying and Eliminating Backup System Bottlenecks: Taking Your Existing Backup System to the Next Level Jacob Farmer, CTO Cambridge Computer

SNIA Legal Notice The material contained in this tutorial is copyrighted by the SNIA. Member companies and individuals may use this material in presentations and literature under the following conditions: Any slide or slides used must be reproduced without modification The SNIA must be acknowledged as source of any material used in the body of any document containing material from these presentations. This presentation is a project of the SNIA Education Committee. 2

Session Abstract Identifying and Eliminating Backup System Bottlenecks: Taking Your Existing Backup System to the Next Level This tutorial reveals the obvious and not-so-obvious bottlenecks found in enterprise backup systems and offers practical examples for applying the technologies described in the other Data Protection tutorials to achieve one's performance objectives. The goal of this session is to illustrate how one can take an existing backup system to the next level by integrating cutting edge technologies and low-cost disk. We start with the assumption that the end user has made a sizable investment in his/her enterprise backup system and is looking for a road map for affordable growth in both performance and capacity. We also assume that tape is here to stay (at least for now) and that the ultimate goal is to get data onto tape for off-site removal. Topics include accelerating LAN-based backups, achieving maximum performance from tape, disk staging with ordinary disk, deduplication, block-level differencing, and virtual tape. The take home message is that you cannot simply buy your way out of backup system headaches, you must design your way out. 3

Why are Backups Still Such a Pain? The infrastructure for backup does not grow at the rate of disk. Few backup systems are thoughtfully designed. The backup system is often a tack-on to a server or storage purchase, rather than a whole project unto itself. People typically do not grow their backup systems in tandem with their disk systems and applications. The backup system typically falls behind as new storage is added. Backup systems are typically not re-designed when other major changes take place All moves, add, and changes must be evaluated in terms of their impact on the backup system. 4

Where are the Backup System Bottlenecks? Network Bandwidth Storage Channel Bandwidth Front-End Bottlenecks Network I/O Processing Applications Exchange brick-level Files Systems Lots of small files CPU and RAM contention Storage Channels Disk Spindles LAN LAN Switches Backup Server I/O Processing Back-End Storage Devices 5

Pinpointing Bottlenecks It s not a perfect science The bottleneck varies from job to job Bottlenecks shift over the course of the night or even within a single backup job. Tools for spotting bottlenecks Backup system logs and reporting software Network and SAN monitoring tools Host monitoring tools Cross domain correlation Tools that correlate performance data from tape devices, backup software, SAN, LAN, and host. 6

Where to Begin? Break Down the Problem Design from Front to Back The most common mistake is to start with the back-end storage devices without regard to bottlenecks in front of the storage devices. Front-end (host side) Set RTO and RPO goals for each host, or at least for the important ones. Identify major challenges, e.g. shortcomings in hardware, high volume of small files, etc. Center Eliminate I/O processing bottlenecks. Back-end Can your backup devices (tape, disk, or optical) handle the various feeds or do they introduce a new bottleneck? 7

Front-End Challenges Large volumes How large is large? It s all relative to your environment and the equipment at your disposal. Large volumes with zillions of small files Causes a lot of disk thrashing when retrieving files Disproportionate amount of metadata compared to actual data. Secretly increases the capacity of the backup Taxes the backup server, which has to create log entries for each of these files. Slower hosts: legacy equipment, slow network connections, etc. 8

Eliminating Host-Side Bottlenecks Efficiency Incremental Forever Strategies Synthetic Fulls a/k/a Save Set Consolidation Granular Incrementals Backup only the bits and bytes that have changed Hierarchical Storage and ILM Solutions Brute Force Volume image backups Proxy backups Snapshot or split mirror and mount on the backup server or a dedicated backup client Replication to a dedicated backup client Multiplexing data streams with push agents 9

Efficiency: Why would you ever back something up more than once? If you want multiple backup copies, back it up once and then make copies of the backup. Incremental Forever After the initial full backup only changed files are backed up. Back end processes ensure that a full restore could be performed efficiently. Synthetic Full / Save Set Consolidation The most recent full and its subsequent incrementals are consolidated into a current full backup. Software must compensate for deliberate deletions. 10

Granular Incrementals Examples of sub-file-level incremental backups Block-level incremental (in the file system or volume) Transactional changes Object-level backups Commonality Factoring / De-duplication Instead of moving an entire changed file, only take action on the portions of the file that have changed. For instance, change the word dog to cat in your 100 page word document. Do you back up the full 100 pages, or just cat. Sub-file-level tracking of changes is the key to a variety of data protection technologies: Snaphots Data replication Continuous Backup 11

Efficiency: Hierarchical Storage and ILM Files, database records, and/or email that are not in active use are either deleted or migrated to a secondary storage system. Eliminates redundant backups. Allows for speedier recovery. Frees up backup system resources for other jobs. Saves on tape media and reduces capacity needs of back-end storage devices. What s the catch? Stub files and symbolic links can complicate things: Restoration processes Retention policies Indexing and e-discovery engines Backup performance might not improve as much as you might think. Watch out for meta-data processing bottlenecks. 12

Interleaved Tape from Multiplexing A B C D Results in: Roughly one in four blocks is relevant to the restore. With higher levels of parallelism, the problem is that much worse. Tape copy and restore operations suffer. 13

Eliminating Central Bottlenecks

Traditional Centralized Backup System The bottleneck is the I/O processing capacity of the backup server LAN SCSI bus Backup server 15

Dedicated High Speed Ethernet Backbone Problem: Still have I/O bottleneck on backup server. Backup server LAN High Speed LAN SCSI 16

SANs Enable the Backup Server to Seek Help from a Slave Server One or more computers can collect data from the network and send to tape. LAN 100% improvement in throughput!!! 17

Common Names for Slave Servers There is no industry standard term for slave server. Common names include: Storage Node Media Server Device Server Tape Server Smart Clients Media Agents Data Movers Others?? 18

SAN Backup without SAN Disk You don t need to have SAN disk to have SAN backup. For that matter, if often makes sense to do LAN backup, even if you have disk on a SAN. LAN-Free backup does not require disk to be on the SAN. SANs for backup only do not require fancy SAN infrastructure. FC-AL loop switches are okay. Non-redundant fabrics are okay. Older model switches are okay. Multi-port SAN routers can form entire SAN. 19

SAN Backup Without a SAN!!! Master Backup Server Slave Server You don t need a SAN to have SAN backup!!! Use dedicated drives and dedicated SCSI channels. Add Fibre Channel or iscsi functionality and dynamic drive sharing as a future upgrade. 20

SAN: Circumventing the Backup Servers Slave Server Continue to back up over LAN, and attach additional clients to the SAN over time. LAN Backup Server Application server sending data directly to storage device on SAN 21

Hybrid of LAN and SAN Backup LAN 22

SAN v. LAN Client: How to Choose? Just because is not a valid reason Don t use the SAN to back up just because you happen to have a SAN connection. Select SAN Backup When You have a host that can generate substantially more I/O than your network backup servers can handle. You have an aggressive RTO that cannot be met over the LAN. You have back end storage devices that can keep pace. Select LAN when LAN offers suitable performance. Focus your efforts on improving LAN backup performance. Consider staging backups to disk Consider multiplexing 23

Eliminating Back End Bottlenecks

Shoe Shining or Back-Hitching Performance problem caused by the mechanical operations of linear tape drives. The tape media is moving too fast to stop. When the data buffer empties. It keeps on going, then rewinds, then starts again. With erratic and slow data patterns it s like two steps forward, one step back. Larger drive buffers and throttling of the drive mechanism help compensate for media repositioning, but problem still persists in many situations. 25

Consequences of Shoe Shining Backup performance suffers. If you use multiplexing to stream the drives, then restore performance suffers. Tape copies, reclamation, and job consolidation take much longer to complete. Wear and tear on the drives and media. 26

You Can Have Too Many Drives Backup Server +/- 50MB/Sec Modern Tape Drives 80-160MB/Sec x 4 Drives 320-560MB/Sec LAN 27

Distribute The Tape Drives With fewer drives per host, you have a much better chance of getting the drives to stream. LAN 28

SAN Connections: Getting the Drives to Stream LAN 29

Tape Backup on the SAN: Some Things to Watch out For Not all hosts can push enough data to stream the drives directly. Slower hosts can tie up an expensive tape drive. These hosts should back up over the LAN. When you deploy new models of drives with higher shoe-shine thresholds, you might need to move some clients from the SAN back to the LAN. Some hosts might be able to go faster than the tape drives. It s a shame to let the tape drive slow them down. Slower hosts can tie up an expensive drive It s a shame to waste a drive on these hosts. 30

Staging Backups to Disk Disk arrays don t shoe-shine!!! As linear drives progress in throughput, it will be harder and harder to keep them streaming without first staging backups to disk. Similarly, fewer hosts will be able to deliver the throughput to justify direct communication with the tape device. Staging to disk will get the most out of your tape investment. 31

Disk is NOT the Panacea! Remember the I/O Bottleneck Backup Server +/-50MB/Sec Disk Array +/- 100MB/Sec. (perhaps much faster) LAN 32

Provisioning Secondary Disk: Try Not to Create a Headache Conventional disk staging either involves direct attached storage or SAN storage. Provisioning disk for backups can be a pain. Low cost disk devices might not have the same provisioning capabilities as high end disk. It is difficult to accurately predict needs. Utilization is a big challenge since disproportionately more disk is needed for full backups than incrementals. 33

Disk Staging and SAN Backup Together Great idea, but there are some catches Provisioning is a pain. You have just duplicated the provisioning problem you had for primary storage. Every time you provision primary storage, you have to provision secondary storage for disk staging. Sure SATA disk is cheap, but not if you waste it. Secondary disk needs are heavy during full backups and light for incremental backups. Okay for slave servers. Not great for SAN-enabled backup clients. Why? SAN attached clients using backup to disk will require some kind of sharing mechanism, so that a backup server or slave server can handle moving from disk to tape. 34

The Role of Shared Storage in Disk Staging App Server App Server Backup Server or Slave Server 35

SAN-Enabled Backup-To-Disk Requires a Sharing Mechanism Sophisticated volume manager integrated with backups. But few vendors offer this and it is complex to configure. Volume managers do not work across OS platforms. SAN file system that allows multiple hosts to read and write to the same physical disk. But SAN file systems are still somewhat esoteric. It would be overkill to deploy a SAN file system just for backups. CIFS/NFS Share But there could be some serious performance limitations. Virtual tape library This problem was solved long ago in the tape world. By emulating tape, VTLs take advantage of a proven mechanism. 36

Virtual Tape Library On a SAN Physical and Virtual Library LAN 37

Virtual Tape Library on a SAN Virtual Front-Ends the Physical LAN 38

Q&A / Feedback Please send any questions or comments on this presentation to SNIA: trackdatamgmt@snia.org Many thanks to the following individuals for their contributions to this tutorial. SNIA Education Committee Jacob Farmer 39