EMC Data Domain for Archiving: Are You Kidding?
Bill Roth / Bob Spurzem
EMC

Agenda
- EMC Introduction
- Data Domain Enterprise Vault Integration
- Data Domain NetBackup Integration
- Q & A

EMC Introduction

Data Domain: Leadership and Innovation
A history of industry firsts
Timeline, 2003 to 2012:
- First deduplication NAS
- First deduplication volume replication
- First deduplication virtual tape library
- Largest deduplication array
- First deduplication directory replication
- Fastest backup controller
- Cascaded replication
- First long-term retention system for backup and archive
- First deduplication nearline storage
- First distributed processing
- First inline deduplication for compliant archiving

Deduplication Dramatically Reduces Storage Capacity Requirements
10 to 30 times less data stored versus fulls + incrementals with typical retention policies
[Chart: Data Stored versus Weeks in Use (1 to 20); series: Deduplication storage, Traditional storage]

Backup Data Reduction/Deduplication
Time series of large-enterprise implementation
Over three years, in-use rates for backup with deduplication have risen from 15% to 48%.
[Chart: stacked bars per survey period (2H '07, 2H '08, 1H '09, 2H '09, 1H '10, 1H '11); categories: In Use Now, In Pilot/Evaluation, In Near-term Plan, In Long-term Plan, Past Long-term Plan, Not in Plan]
Source: Wave 15 Storage Study Q2 2011, published 5/16/11, large-enterprise sample; 2H '07, n=151; 2H '08, n=127; 1H '09, n=147; 2H '09, n=182; 1H '10, n=146; 1H '11, n=31; TheInfoPro (www.theinfopro.com)

Data Domain Basics
Easy integration with existing environment
- Control Tier: backup and archive applications (EMC, Symantec, CommVault, IBM, HP, Veeam, Quest)
- Target Tier: DD890 appliance
  - Ethernet: CIFS, NFS, NDMP
  - Fibre Channel: Virtual Tape Library (VTL)
- Disaster Recovery Tier: DD890 appliance, fed by replication from the Target Tier

Data Deduplication: Technology Overview
Store more backups in a smaller footprint

Segments per backup (each letter is a data segment):
  Friday full backup:         A B C D A E F G
  Monday incremental:         A B H
  Tuesday incremental:        C B I
  Wednesday incremental:      E G J
  Thursday incremental:       A C K
  Second Friday full backup:  B C D E F L G H A
  Unique segments stored:     A B C D E F G H I J K L

Backup                   Logical data   Estimated reduction   Physical
Friday full              1 TB           2-4x                  250 GB
Monday incremental       50 GB          7-10x                 5 GB
Tuesday incremental      50 GB          7-10x                 5 GB
Wednesday incremental    50 GB          7-10x                 5 GB
Thursday incremental     50 GB          7-10x                 5 GB
Second Friday full       1 TB           50-60x                18 GB
TOTAL                    2.2 TB         7.6x                  288 GB

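The totals on this slide can be checked with a few lines of arithmetic. The sketch below is only an illustration: it assumes 1 TB = 1000 GB (which is what the 2.2 TB total implies) and treats each letter in the diagram as one equally sized segment, which is a simplification rather than how a real system segments data.

```python
# (backup name, segments shown on the slide, logical GB, physical GB)
# Assumption: 1 TB is counted as 1000 GB, matching the slide's 2.2 TB total.
backups = [
    ("Friday full",           "ABCDAEFG",  1000, 250),
    ("Monday incremental",    "ABH",         50,   5),
    ("Tuesday incremental",   "CBI",         50,   5),
    ("Wednesday incremental", "EGJ",         50,   5),
    ("Thursday incremental",  "ACK",         50,   5),
    ("Second Friday full",    "BCDEFLGHA", 1000,  18),
]

logical_gb = sum(b[2] for b in backups)
physical_gb = sum(b[3] for b in backups)
ratio = logical_gb / physical_gb

# Segment view: how many segments the backups reference versus
# how many distinct segments actually need to be stored.
referenced = sum(len(b[1]) for b in backups)
unique = set("".join(b[1] for b in backups))

print(f"logical {logical_gb} GB, physical {physical_gb} GB, reduction {ratio:.1f}x")
print(f"{referenced} segments referenced, {len(unique)} unique segments stored")
```

The second Friday full shows the largest reduction because almost every segment it contains was already stored by earlier backups.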
Retain: Store More for Longer with Less
Over one year of retention in 3U of Data Domain deduplication storage

Period       Backup date   Cumulative logical data   Estimated reduction   Physical
First full                 1 TB                      4x                    250 GB
Week 1       April 7       2.2 TB                    8x                    288 GB
Week 2       April 14      3.4 TB                    10x                   326 GB
Week 3       April 21      4.6 TB                    13x                   364 GB
Month 1      April 28      5.8 TB                    14x                   402 GB
Month 2      May 31        10.6 TB                   19x                   554 GB
Month 3      June 30       15.4 TB                   21x                   706 GB
TOTAL                      15.4 TB                   21x                   706 GB

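The cumulative figures in this table follow a simple pattern: after the first full, each week adds four 50 GB incrementals and one 1 TB full of logical data (1.2 TB), but only about 38 GB of physical storage. The sketch below reproduces the table under that reading. The weekly increments are derived from the slide's numbers, the mapping of May 31 and June 30 to weeks 8 and 12 is an assumption, and the function name is hypothetical.

```python
FIRST_FULL_LOGICAL_GB = 1000      # first full backup, 1 TB counted as 1000 GB
FIRST_FULL_PHYSICAL_GB = 250
WEEKLY_LOGICAL_GB = 4 * 50 + 1000  # four incrementals plus one full per week
WEEKLY_PHYSICAL_GB = 4 * 5 + 18    # physical growth for the same week

def cumulative(weeks):
    """Return (logical GB, physical GB, reduction ratio) after `weeks` weeks."""
    logical = FIRST_FULL_LOGICAL_GB + weeks * WEEKLY_LOGICAL_GB
    physical = FIRST_FULL_PHYSICAL_GB + weeks * WEEKLY_PHYSICAL_GB
    return logical, physical, logical / physical

# Weeks 1-4 match April 7-28; weeks 8 and 12 are assumed for May 31 and June 30.
for week in (1, 2, 3, 4, 8, 12):
    logical, physical, ratio = cumulative(week)
    print(f"week {week:2d}: {logical / 1000:.1f} TB logical, {physical} GB physical, {ratio:.1f}x")
```

The computed ratios are unrounded, so they can differ slightly from the whole-number reduction factors shown on the slide.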
Data Integrity: Data Invulnerability Architecture
- End-to-end data verification: checksum; deduplication, write to disk; verify
- Self-healing file system: cleaning of expired data; defrag; verify
- Other: RAID 6, NVRAM, snapshots
Write path: Generate checksum -> File system -> Deduplication -> Local compression -> RAID -> Verify data
End-to-end data verification:
- Verify the file system metadata integrity
- Verify user data integrity
- Verify stripe integrity

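The idea of end-to-end verification (generate a checksum on ingest, store the data, then read it back and verify) can be illustrated with a minimal sketch. This is not the Data Domain implementation: the `VerifiedStore` class is hypothetical, SHA-256 and zlib are stand-ins for whatever checksum and local compression the appliance uses, and the "disk" is an in-memory dictionary.

```python
import hashlib
import zlib

class VerifiedStore:
    """Toy store that checks every write by reading it back (illustrative only)."""

    def __init__(self):
        self._blocks = {}  # key -> (compressed bytes, checksum of the original data)

    def write(self, key, data):
        # Checksum is generated before the data passes through compression.
        checksum = hashlib.sha256(data).hexdigest()
        self._blocks[key] = (zlib.compress(data), checksum)
        # Read-after-write: verify what was stored, not what was received.
        self.verify(key)
        return checksum

    def verify(self, key):
        compressed, checksum = self._blocks[key]
        data = zlib.decompress(compressed)
        if hashlib.sha256(data).hexdigest() != checksum:
            raise IOError(f"checksum mismatch for {key!r}")
        return data

store = VerifiedStore()
store.write("saveset-1", b"archived message body")
print(store.verify("saveset-1"))
```

Because the checksum is computed on the original data and compared after decompression, corruption introduced at any stage in between would surface as a mismatch.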
Data Domain Infrastructure and Ecosystem
Supports a variety of workloads and data types
- Application servers: VMware, Microsoft Exchange, Microsoft SharePoint, Oracle, SAP, other
- Application storage (primary storage): NAS, SAN, DAS
- Midrange and mainframe: IBM i, EMC DLm1000
- Backup applications: Symantec NetBackup, Symantec Backup Exec
- Archive applications: Symantec Enterprise Vault
- Network replication over WAN for disaster recovery

EMC Data Domain Enterprise Vault Integration

Symantec Enterprise Vault Solution Highlights
- Support for e-mail, SharePoint, and file-system archiving (FSA)
- EV 8 introduced a new Data Domain storage type
  - Makes EV 8 aware that the DDR appliance provides compression, deduplication, and retention-lock features (as of SP3)
- Supports EV Collections
  - Recommended with DD to reduce total file count
  - New collection-focused integration guide published
- DD qualified on the Symantec EV support matrix
  - For both EV 8 and EV 9; EV 10 pending
- Archive data from Symantec Enterprise Vault can be co-located next to backup data on the same Data Domain platforms

Symantec Enterprise Vault: Recommended Method of Writing Data to DD
- Direct write of EV savesets to DD share
  - Many small files written (< 100K typical for email archiving, CIFS)
  - One email may produce several small files
  - DD may not be able to support the total number of files or the ingest rate
- Collections
  - Savesets are written to the DD share, then collected at a later date
  - Same performance issues as direct write, plus a re-read of all the data
- Collections with migration (recommended)
  - A multistep process with DD as the secondary storage
  - Collects the small savesets into .cab files (10 MB)
  - Moves collections to DD according to schedule
  - Cannot be used with Retention Lock

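To see why collections reduce the file count, consider a rough sketch that packs many small savesets into containers of up to 10 MB. This is not how Enterprise Vault builds its .cab files; `plan_collections` and the saveset names are hypothetical, and the greedy in-order packing is only one simple way to group files under a size limit.

```python
CAB_LIMIT_BYTES = 10 * 1024 * 1024  # 10 MB collection size from the slide

def plan_collections(savesets, limit=CAB_LIMIT_BYTES):
    """Group (name, size_in_bytes) pairs, in order, into collections under `limit`."""
    collections = []
    current, current_size = [], 0
    for name, size in savesets:
        # Start a new collection when adding this saveset would exceed the limit.
        if current and current_size + size > limit:
            collections.append(current)
            current, current_size = [], 0
        current.append(name)
        current_size += size
    if current:
        collections.append(current)
    return collections

# 250 savesets of 100 KB each, roughly the "many small files" case on the slide.
savesets = [(f"saveset-{i:04d}.dvs", 100 * 1024) for i in range(250)]
plan = plan_collections(savesets)
print(f"{len(savesets)} savesets -> {len(plan)} collections")
```

In this example the target share receives 3 files instead of 250, which is the effect the slide relies on to stay within file-count and ingest-rate limits.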
Enterprise Vault with Collections/Migration
- Sources: Exchange Server, File Server
- EV server (on the LAN): SQL Server; Directory, Vault Store Group, Vault Store; Audit, Monitoring, Reporting
- Primary storage, collection process:
  - Collects EV savesets into 10 MB .cab files on TEMP storage
  - Deletes original savesets
- Secondary storage (DD CIFS share), migration process:
  - .cab files moved to DD according to migration schedule

Symantec Enterprise Vault Integration Details: Set up temp storage for the Vault Store partition
Symantec Enterprise Vault Integration Details: Select device settings and choose collection software
Symantec Enterprise Vault Integration Details: Set collection schedule and size, select Migrate
Symantec Enterprise Vault Integration Details: Select migration software and schedule
Symantec Enterprise Vault Integration Details: Point to the DD CIFS share and enable Migrate all Files
EMC Data Domain NetBackup Integration

EMC Data Domain Integration
- Distributed Segment Processing
  - Executes on backup servers and Data Domain systems
  - Segments incoming backup streams
  - Fingerprints segments
  - Identifies unique segments
  - Unique segments are compressed
  - Compressed segments are written to the Data Domain system
- Advanced Load Balancing & Failover
  - Interface Group feature
- Optimized Duplication
  - Low Bandwidth Managed File Replication feature
  - Encrypted Managed File Replication feature

Distributed Segment Processing
- NetBackup Media Server: Segment (A B C D E B D) -> Fingerprint -> Compress
- Data Domain System: Unique? (A B C D E) -> Write

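The flow in this diagram can be sketched in a few lines: the media server fingerprints each segment, asks the target which fingerprints it does not already hold, and compresses and sends only those. Everything below is an illustrative assumption rather than the actual NetBackup or Data Domain interface: `DataDomainStub`, `filter_unique`, and `backup` are hypothetical names, SHA-1 and zlib are stand-ins, and segmentation itself is not modeled (the segments are given).

```python
import hashlib
import zlib

class DataDomainStub:
    """Hypothetical stand-in for the target system: stores segments by fingerprint."""

    def __init__(self):
        self.store = {}

    def filter_unique(self, fingerprints):
        # "Unique?" step: report which fingerprints are not stored yet.
        return {fp for fp in fingerprints if fp not in self.store}

    def write(self, fp, compressed):
        self.store[fp] = compressed

def fingerprint(segment):
    return hashlib.sha1(segment).hexdigest()

def backup(segments, system):
    """Media-server side: fingerprint, filter, compress, and send unique segments."""
    fps = [fingerprint(s) for s in segments]
    missing = system.filter_unique(fps)
    sent = 0
    for fp, seg in zip(fps, segments):
        if fp in missing:
            system.write(fp, zlib.compress(seg))
            missing.discard(fp)  # a repeat within the same stream is sent only once
            sent += 1
    return sent

dd = DataDomainStub()
stream = [b"A", b"B", b"C", b"D", b"E", b"B", b"D"]  # segments from the slide
print(backup(stream, dd), "of", len(stream), "segments sent")
```

Because only fingerprints cross the network for segments the target already holds, a later stream such as A B H would send a single new segment in this model.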
Optimized Duplication Managed File Replication: Low Bandwidth Replication
- The low-bandwidth Replicator option reduces WAN bandwidth utilization.
- This option is useful if optimized duplication is being performed over a low-bandwidth network (WAN) link.
- This option provides additional compression during data transfer and is recommended only for optimized duplication jobs that run over WAN links with less than 6 Mb/s of available bandwidth.

Advanced Load Balancing & Failover
[Diagram: multiple NetBackup Media Servers connected to one Data Domain system]
- Streams are load balanced across the network links in the interface group
- All backup servers connect to the same registered interface
- Transparent link failover for network failures

Advanced Load Balancing & Failover
[Diagram: NetBackup Media Servers connected through four NICs, one link failed]
- An operational link is used for an active job
- Active job: transparent failover to an operational link
- A failed link will not be used for new jobs

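The behavior described on these two slides (spread jobs across the links in a group, move active jobs off a failed link, and keep new jobs away from it) can be modeled with a small sketch. The `InterfaceGroup` class, its method names, and the least-loaded selection rule are assumptions made for illustration; the slides do not say how the product chooses a link.

```python
class InterfaceGroup:
    """Toy model of an interface group (illustrative, not the product's logic)."""

    def __init__(self, links):
        self.jobs = {link: set() for link in links}  # link -> active jobs
        self.failed = set()

    def _least_loaded(self):
        healthy = [l for l in self.jobs if l not in self.failed]
        if not healthy:
            raise RuntimeError("no operational link in the interface group")
        # Assumed policy: pick the operational link with the fewest active jobs.
        return min(healthy, key=lambda l: len(self.jobs[l]))

    def start_job(self, job):
        link = self._least_loaded()
        self.jobs[link].add(job)
        return link

    def fail_link(self, link):
        """Mark a link failed and move its active jobs to operational links."""
        self.failed.add(link)
        moved = {}
        for job in sorted(self.jobs[link]):
            target = self._least_loaded()
            self.jobs[target].add(job)
            moved[job] = target
        self.jobs[link].clear()
        return moved

group = InterfaceGroup(["nic1", "nic2", "nic3", "nic4"])
for n in range(1, 5):
    group.start_job(f"job{n}")
print(group.fail_link("nic2"))   # job2 is moved to an operational link
print(group.start_job("job5"))   # new jobs avoid the failed link
```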
Data Domain VTL Support for up to 64 concurrently active VTLs per Data Domain system Support for 1 to 256 tape drives per Data Domain system depending on the model number Up to 32000 slots per VTL Up to 64000 slots per Data Domain system Up to 100 CAPs (Cartridge Access Ports) per VTL Up to 2000 CAPs per Data Domain system Presentation Identifier Goes Here 27
Thank you!
Copyright 2012 Symantec Corporation. All rights reserved. Symantec and the Symantec Logo are trademarks or registered trademarks of Symantec Corporation or its affiliates in the U.S. and other countries. Other names may be trademarks of their respective owners. This document is provided for informational purposes only and is not intended as advertising. All warranties relating to the information in this document, either express or implied, are disclaimed to the maximum extent allowed by law. The information in this document is subject to change without notice.