IM B09 Best Practices for Backup and Recovery of VMware - DRAFT v1
George Winter, Symantec Corporation, Technical Product Manager
David Hendrix, FedEx Corporation, Technical Principal
Abdul Rasheed, Symantec Corporation, Product Marketing Engineer
IM B09 - Best Practices for VMware Backup
Please Note
This information is about pre-release software. Any unreleased update to the product or other planned modification is subject to ongoing evaluation by Symantec and therefore subject to change. This information is provided without warranty of any kind, express or implied. Customers who purchase Symantec products should make their purchase decision based upon features that are currently available.
SYMANTEC VISION 2012
Agenda
- What's a VADP?
- VADP Backup Process
- Performance Baseline Testing
- Backup Application & VADP Perf Considerations
- Virtual Machine Restore Considerations
- FedEx VADP Implementation
- NetBackup Deduplication Appliance
- Q & A
What's a VADP?
vStorage API for Data Protection (VADP)
- A family of storage-related APIs created by VMware
  - vStorage API for Array Integration (VAAI)
  - vStorage API for Data Protection (VADP)
  - vStorage API for Site Recovery Manager (VASRM)
  - vStorage API for Multi-pathing (VAMP)
- Not a backup application; a true API
- Backup vendors use this API to access advanced VMware backup capabilities
- VADP is a replacement for (not an enhancement of) VCB
  - VCB still supported with vSphere 4
  - VCB no longer supported with vSphere 5
VADP - Provides Powerful Backup Features
- Changed Block Tracking (CBT)
  - True incremental backups
  - I consider this the most important enhancement
  - Image (vmdk) level incremental backup & restore
  - Some vendors support single file restores from CBT incrementals
- Incremental (CBT) backups are a key component of efficient backups
  - Less back-end storage required
  - Quicker backups
  - Less snapshot impact (more on this in a minute)
VADP Backup Process
vStorage API Backup Process (NetBackup 7 for VMware)
[Diagram: VMware backup host, ESX datastore, two ESX/ESXi servers hosting VM1-VM6 and their VMDKs, and a tape backup target]
1. Snapshot is created
2. VM data copied directly to tape (VTL, etc.)
3. VM snapshot released
NetBackup 7 for VMware Backup Data Path
Network backups (NBD)
- NFS or DAS fully supported
- No loss in backup or restore functionality
- Direct communication with ESX server required (DNS, etc.)
- ESX server directly impacted
Shared storage (SAN) configuration
- Fibre or iSCSI
- Near zero impact on ESX
- No loss in backup or restore functionality
- No communication with ESX required
Hotadd
[Diagram: backup impact / path highlighted across LAN, ESX, and datastore for each transport]
ESX Server Load: Client Backups vs. VADP
[Chart: ESX server load during standard client backups compared with NBU 7 for VMware SAN backups]
Performance Baseline Testing
Know Your Hardware!
- Determine maximum performance of the backup environment
- Tests are designed to simulate VADP backup paths (SAN, network)
- Verify that hardware is configured and working correctly
- Never assume hardware or software is working optimally until performance is verified
- Cisco UCS, VMware, NetBackup benchmark: http://www.symantec.com/business/products/whitepapers.jsp?pcid=pcat_business_cont&pvid=2_1
Backup Application & VADP Performance Settings & Considerations
What Happens During a Backup? A Detailed Look at the VADP Snapshot Process
During VM backups, VMware creates a temporary snapshot of the VM. The process is as follows:
1. VSS provider flushes OS buffers in the VM
2. Snapshot of the VM is taken (vmdk(s) are frozen) (SCSI reservation of LUN)
3. Redo log is created; all writes are redirected to the redo log
4. VM is backed up
5. Redo log data is applied to the original vmdk(s)
6. Snapshot is released; backup is complete
Why does this matter?
- Every one of these steps involves I/O
- Reducing random, simultaneous backups limits I/O impact
- Improves backup performance & reliability
Reducing Backup I/O Impact: Improve B/U Reliability
- Shorten the time the backup snapshot is open
- Back up during periods of low VM activity
- Use incremental backups (CBT) liberally
- Limit simultaneous backups per ESX / Datastore
- Configure the backup application for optimal performance
- Design backup policies to evenly balance load across ESX / Datastores
- Faster backups = shorter open snapshots
- Tune backup application buffers for optimal performance
Result:
- Snapshots more reliable
- Overall backup processing faster
- Higher level of backup success
Ensure Optimal Backup Performance: Number of simultaneous backups?
[Chart: backup throughput, aggregated backup speed (MB/s, axis 0 to 120) versus total number of concurrent VM guests (1, 2, 4, 8, 12, 15)]
- A single VADP backup stream typically won't saturate the backup path
- Max backup performance is achieved by creating simultaneous backup streams
- Design backups so that data is streamed across multiple VMware components (ESX/DS)
- Improve backup speeds, shorten backup window
VADP Performance Characteristics: Single ESX Server Creating Multiple Backup Streams
- Four streams over 8 Gb Fibre
- Per-stream throughput: 200, 175, 75, and 45 MB/sec
- Aggregate throughput = 495 MB/sec
- Creates significant backup load on ESXi Datastore
Based on Cisco/VMware/NetBackup benchmark testing
VADP Performance Characteristics: Four Separate ESX Servers Creating One Backup Stream / ESXi
- Four streams over 8 Gb Fibre
- Per-stream throughput: 200 MB/sec each
- Aggregate throughput = 800 MB/sec
- Creates less load per ESX server
Based on Cisco/VMware/NetBackup benchmark testing
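The two benchmark layouts can be compared with simple arithmetic. The Python sketch below only sums the per-stream figures quoted on the slides; the function and variable names are illustrative and are not part of NetBackup or VADP.

```python
# Per-stream throughput in MB/sec, as quoted on the benchmark slides.
single_esx_streams = [200, 175, 75, 45]   # four streams on one ESX server
four_esx_streams = [200, 200, 200, 200]   # one stream on each of four ESX servers


def aggregate_throughput(streams_mb_s):
    """Return the total MB/sec across all concurrent backup streams."""
    return sum(streams_mb_s)


single = aggregate_throughput(single_esx_streams)
spread = aggregate_throughput(four_esx_streams)

print(f"single ESX, 4 streams: {single} MB/sec")
print(f"4 ESX, 1 stream each:  {spread} MB/sec")
print(f"gain from spreading streams: {spread / single:.2f}x")
```

On the single server the third and fourth streams add far less than the first two, which is the argument for spreading streams across ESX servers rather than stacking them on one.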
VMware Intelligent Policy: Improve Backup Performance & Reliability
VIP Designed for Two Major Tasks
1) Automatically add and back up new and moved VMs
2) Automatically balance backups across the entire vSphere environment (Fibre or network)
VMs protected based on physical location
- ESX server
- ESX Datastore
- VMware cluster
VMs protected based on logical attributes
- vCenter folder
- Resource pool
- vApp
Without VIP: Backup Activity Unbalanced
[Diagram: four ESX servers (ESX 1 to ESX 4), each hosting VM1-VM3, sending backup load over the LAN to the backup server. Job counts shown: 1, 6, 0, and 8 jobs]
- Slower backups
- More backup failures
- Longer backup window
With VIP: Automatic Load Balancing
[Diagram: four ESX servers (ESX 1 to ESX 4), each hosting VM1-VM3, sending backup load over the LAN to the backup server. Each ESX server runs 2 jobs]
- Faster backups
- Improved reliability
- Less load on ESX
- New VMs automatically backed up
VMware Intelligent Policy Use Cases: Improve Backup Performance & Reliability
Problem: Equalize Backup Load in SAN Environment
- Manage SAN (iSCSI, shared storage) backups at the Datastore level
- Managing backups at the ESX server level does not balance load at the storage level
Solution:
- Protects every existing and new VM on any Datastore with PROD in its name
- Fully compliant with Storage vMotion
- Set the Datastore resource limit to balance backup load across all Datastores
SAN Backup Data Path
- Shared storage configuration
- Fibre or iSCSI
- Near zero impact on ESX
- No loss in backup or restore functionality
- No communication with ESX required
[Diagram: backup impact / path highlighted across LAN, ESX, and datastore]
SAN Backups - Equalize Load at Datastore
[Diagram: four ESX servers (ESX 1 to ESX 4) on UCS, each hosting VM1-VM3, connected over Fibre or iSCSI SAN to the backup server. 2 jobs per Datastore]
- Design resource limit setting for each Datastore to 2 (total = 8)
- Jobs evenly stream data across every SAN connection
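The effect of a per-Datastore resource limit can be sketched in a few lines of Python. This is a simplified model of the behavior described on the slide, not NetBackup's scheduler; `select_jobs`, the VM names, and the datastore names are all hypothetical.

```python
from collections import defaultdict


def select_jobs(pending_vms, running_per_resource, limit=2):
    """Pick queued VMs to start without exceeding `limit` active jobs per resource.

    pending_vms: list of (vm_name, resource_name) tuples in queue order.
    running_per_resource: dict mapping resource_name to jobs already active.
    """
    active = defaultdict(int, running_per_resource)
    started = []
    for vm, resource in pending_vms:
        if active[resource] < limit:
            active[resource] += 1
            started.append(vm)
    return started


# Hypothetical queue: eight VMs spread unevenly over four datastores.
queue = [
    ("vm1", "DS1"), ("vm2", "DS1"), ("vm3", "DS1"), ("vm4", "DS2"),
    ("vm5", "DS3"), ("vm6", "DS3"), ("vm7", "DS3"), ("vm8", "DS4"),
]
print(select_jobs(queue, {}))
# ['vm1', 'vm2', 'vm4', 'vm5', 'vm6', 'vm8']
```

With a limit of 2 and four datastores, at most 8 jobs run at once, matching the slide's total. The same function models network (NBD) backups if the resource key is the ESX server instead of the datastore.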
Problem: Network Backups - Equalize Load at ESX NIC
- All backup traffic goes over the network; no shared storage is available
- Don't want to saturate the ESX network interface
Solution:
- All powered-on VMs on any ESX server in Production will be protected
- Unimportant VMs in the DEVELOPMENT folder (only used for development) will be skipped
- Set the ESX resource limit to balance backup load across all ESX NICs
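The selection rule in this solution can be expressed as a simple filter. The sketch below is plain Python over a made-up inventory, not NetBackup query syntax; the field names (`powered_on`, `cluster`, `folder`) and the assumption that "Production" is a cluster name are illustrative only.

```python
# Hypothetical inventory records; a real policy would query vCenter instead.
inventory = [
    {"name": "web01", "powered_on": True, "cluster": "Production", "folder": "WEB"},
    {"name": "dev07", "powered_on": True, "cluster": "Production", "folder": "DEVELOPMENT"},
    {"name": "db02", "powered_on": False, "cluster": "Production", "folder": "DB"},
    {"name": "lab01", "powered_on": True, "cluster": "Lab", "folder": "WEB"},
]


def is_selected(vm):
    """Powered on, located in Production, and not in the DEVELOPMENT folder."""
    return (
        vm["powered_on"]
        and vm["cluster"] == "Production"
        and vm["folder"] != "DEVELOPMENT"
    )


selected = [vm["name"] for vm in inventory if is_selected(vm)]
print(selected)  # ['web01']
```

Because the rule is evaluated against the current inventory each time, a newly created VM that matches it is picked up without editing a client list.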
Network (NBD) Backup Data Path
- NFS or DAS fully supported
- No loss in backup or restore functionality
- Direct communication with ESX server required (DNS, etc.)
- ESX server directly impacted
[Diagram: backup impact / path highlighted across LAN, ESX, and datastore]
Network Backups - Equalize Load at ESX NIC
[Diagram: four ESX servers (ESX 1 to ESX 4) on UCS, each hosting VM1-VM3, connected over the LAN to the backup server. 2 jobs per ESX server]
- Design resource limit setting for each ESX server to 2 (total = 8)
- No ESX NIC ever saturated with backup traffic
So, How Fast Can I Back Up? Where's the beef?
Translate Performance Into Protection
Full backups (single VMware backup host, weekend B/U window = 60 hours, avg. VM size = 40GB):
- Test 1 (normal backup): 63 MB/sec = 340 VMs
- Test 2: 450 MB/sec = 2430 VMs
- Test 3: 600 MB/sec = 3240 VMs
Incremental backup (data change = 5%):
- 92 virtual machines backed up in 12 minutes!
- 460 VMs per hour
[Chart: total number of VMs protected per test, full backups (axis 0 to 3000) and incremental backup (axis 0 to 600)]
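The full-backup VM counts on this slide are consistent with throughput multiplied by the backup window, divided by the average VM size. A minimal Python sketch, assuming decimal units (1 GB = 1000 MB); the function name is made up for illustration:

```python
def vms_protected(throughput_mb_s, window_hours, avg_vm_gb):
    """Whole VMs that fit in the backup window at a sustained throughput.

    Assumes 1 GB = 1000 MB and that throughput holds for the entire window.
    """
    total_mb = throughput_mb_s * 3600 * window_hours
    return total_mb // (avg_vm_gb * 1000)


# Weekend window of 60 hours, average VM size of 40 GB (values from the slide).
for rate in (63, 450, 600):
    print(f"{rate} MB/sec -> {vms_protected(rate, 60, 40)} VMs")
# 63 MB/sec -> 340 VMs
# 450 MB/sec -> 2430 VMs
# 600 MB/sec -> 3240 VMs
```

Substituting your own measured baseline throughput, window, and average VM size gives a first estimate of how many VMs one backup host can protect with full backups.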
Virtual Machine Restore Considerations
General Restore Considerations
- The restore process involves more I/O than the backup process
  - Disks (vmdks) must first be created as the target of the restore
  - The type of vmdk can impact restore speed and the I/O required
- A single restore typically won't saturate the restore path
  - For DR, create simultaneous restore jobs
  - As with backup, balance restores across ESX or Datastore
- A slow vCenter can also cause restore perf issues
  - Optional: bypass vCenter by restoring directly to the ESX(i) server
  - Known to significantly improve restore perf in some cases
VMDK Type Can Impact Restore Speeds
thin
- Space not allocated during creation
- Space is supplied then zeroed out on demand
- Creation slow if vmdk turns out to be full
zeroedthick
- 100% of space allocated during creation
- Zeroed out on demand
- Can be faster than thin, especially if vmdk is nearly full
eagerzeroedthick
- 100% of space allocated during creation
- 100% of disk zeroed out during creation
- Could take a long time (and create lots of I/O) to complete the entire process
FedEx VADP Implementation
David Hendrix, FedEx Corporation, Technical Principal
FedEx Corporate Overview
- Founded in 1971
- Based in Memphis, TN
- 300,000+ employees
- Spread across 5 operating companies
- Shipping presence in over 200 countries
- 5 major data centers in US
- Additional data centers in EMEA, APAC
FedEx VMware Environment
DEV / Test
- 8400+ VMs (growing ~30/mo)
- 90+% Linux
- 500+ ESX servers
- 3 vCenter domains
- vSphere 4.1
- Backup methods: Guest, Array Snapshot, VADP
Production
- 4000+ VMs (growing ~170/mo)
- 90+% Linux
- 400+ ESX servers
- 6 vCenter domains
- vSphere 4.1
- Methods: Guest, VADP
Storage
- EMC, NetApp
- Block / NFS
FedEx Backup Environment
- Backup environment: NetBackup
- Need for both physical system and VM backups
- Physical environment
  - Array-based snapshots
  - RMAN (Oracle) backups
  - NDMP
- Backup targets
  - Data Domain
  - Tape
  - Goal is tapeless
- NetBackup 7.1 servers associated with VM backups: ~40
VM Protection Criteria
- Simple and scalable, minimal impact to ESX servers
- Disaster recovery: legal offsite requirements
  - Backup and restore locally
  - Offsite = replication and tape
- Occasional restore of entire VM
  - Restore entire environment if necessary
- Single file restore
  - Linux and Windows
  - Most common restore request
Test Environment
Backup Performance Testing
- Critical for meeting SLAs
  - 8 months testing performance/functionality
  - 8-hour backup window
  - RTO is based on tier definitions, normally within 4 hours
- Performance testing methodology
  - Multiple protocols: FCoE, iSCSI, NFS
  - CBT and mapped (single file restore) VMs mandatory
  - Transport: NBD (network) and SAN
  - Scale tests (# of ESX servers, # of VMs, # of datastores)
- Results of performance testing
  - 2-3TB/hr for a single backup host (limiting factor: target storage)
  - Overhead of vCenter
  - Media server CPU capacity
  - SAN transport method preferred
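As a rough illustration of what these results imply, the measured rate and the backup window give the data volume one backup host can move per night. The Python sketch below uses only the 2-3TB/hr and 8-hour figures from the slide; the function name is hypothetical, and the estimate ignores snapshot and vCenter overhead.

```python
def capacity_per_window_tb(rate_tb_per_hr, window_hours):
    """Upper bound on data moved by one backup host in a single window."""
    return rate_tb_per_hr * window_hours


low = capacity_per_window_tb(2, 8)
high = capacity_per_window_tb(3, 8)
print(f"One backup host, 8-hour window: {low} to {high} TB")
```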
Restore Functionality Testing
- Ultimate goal is to focus on restore functionality
- What can be restored?
- How fast can restores be processed? (single files indexed, searchable)
- How simple is the restore process?
  - Single files
  - Entire VM
Lessons Learned
vStorage API advantages
- Simplified client management
- Future enhancements
vStorage API challenges
- High I/O activity on snapshot rollback for applications that do not integrate well with VMware snapshots (WebLogic, Oracle in VMs, et al.)
- VAAI array support is important: it affects snapshot performance and the number of simultaneous snapshots
- Critical VMs are sensitive to any VMware activity: DRS, vMotion, VADP
NetBackup configuration suggestions
- Split backup hosts across vCenter servers
- Use the query (VIP) method for VM client selection
Moving Forward
- vStorage API vs. proprietary storage snapshots vs. guest deduplication
- Measure success as the rollout progresses
- 12-month VM count to exceed 15,000
NetBackup Appliance
Abdul Rasheed
What is NetBackup Appliance?
[Diagram: appliance stack; two layers are marked "New!" on the slide]
- Symantec Critical System Protection
- Built-in WAN Optimization
- NetBackup with Intelligent Deduplication
- Veritas Storage Foundation
- Symantec Hardened Operating System
- Optimized Hardware
- Redundant Storage
Symantec NetBackup Appliances
Why NetBackup Appliances?
Building a backup solution
- Hardware acquisition - updates
- Software acquisition - patches
- Integration
- Test
- Deploy
- Multiple consoles
- Multiple vendor relationships
NetBackup Appliances
- Simplified backup
- Lower OpEx
- Elimination of integration complexity
- Predictable performance
- One vendor
- Easier acquisition
- Single pane of glass management
Symantec NetBackup Appliances
NetBackup 5220
- Backup appliance with dedupe
- 2U to 8U form factor
- Master/Media server roles
- 4TB to 72TB usable capacity
NetBackup 5020
- Deduplication appliance
- 4U form factor
- Storage pool role
- 32TB; up to 192TB with 6 nodes
Both
- Deduplication at source and target
- Multiple connectivity options: Ethernet and Fibre Channel
- Transferable software licensing
- Replication at no charge
End-to-End Protection for the Virtualized Enterprise
- Protect up to 3,000 virtual machines in one appliance
- Direct vSphere backup
  - No proxy server required
- Equipped with Symantec V-Ray
  - Multiple types of recovery from a single backup pass: single file, applications, application objects, and DR image
- Protect through SAN or Ethernet
George Winter, Symantec Corporation
David Hendrix, FedEx Corporation
Abdul Rasheed, Symantec Corporation
Copyright 2012 Symantec Corporation. All rights reserved. Symantec and the Symantec Logo are trademarks or registered trademarks of Symantec Corporation or its affiliates in the U.S. and other countries. Other names may be trademarks of their respective owners. This document is provided for informational purposes only and is not intended as advertising. All warranties relating to the information in this document, either express or implied, are disclaimed to the maximum extent allowed by law. The information in this document is subject to change without notice.