Chapter 15 Technical Planning Methodology

This chapter focuses on a straightforward methodology for implementing storage policies. This approach allows you to design a simple storage policy architecture and then expand on the design to meet more complex requirements as needs arise. It focuses on technical protection methods using traditional libraries and Simpana deduplication, and it is a good approach for Simpana administrators just starting out or for basic environments without complex protection needs. For more detailed planning and design, see the Content Based Planning & Design Strategies chapter.

This section uses the following procedures to design storage policies:

- All agents will initially be configured with just a Default Subclient. Although this runs counter to the basic design methodology discussed in the previous chapter, experience shows that many environments use only the default subclient except in special circumstances. Adding custom subclients will be addressed after the initial storage policies are designed.
- Determine the minimum number of storage policies. This establishes the base starting point to get the CommCell environment up and running. For existing environments, this determines whether current policies are adequate or whether too many policies are in use.
- Determine how many secondary copies will be required. Secondary copies will be required for on-site and off-site disaster recovery, data recovery, and compliance requirements.
- Determine whether custom subclient configurations will be required. Once the base environment is running, determine whether all protection and restore requirements are being met, and adjust or create additional subclients as needed.

There will be three phases to designing storage policies:

1. Determine the basic policy design.
2. Determine additional copies that will be needed.
3. Determine if additional subclients will be used.

Determining the Basic Storage Policy Design

The first phase is to determine the minimum number of storage policies that will be required. In this phase only the storage policy and primary copy will be defined. This approach takes into consideration the following key aspects of storage policy design:

- Primary target location
- Primary retention requirements
- Data types being protected
- Deduplication configuration settings

Primary Target Location

Typically, when different libraries are being used to protect data, each library will have different storage policies defined. This approach simplifies the management of storage policies by helping the administrator know where the data is going. Using advanced features such as GridStor technology or Data Path Override, you can use fewer policies to meet primary target requirements, but this adds a level of complexity to an environment.

In the following illustration, two libraries are being used. Database data is writing to a tape library and file system data is writing to a disk library. Using two separate storage policies is the easiest method to implement this solution.

Multiple Locations

In a distributed CommCell architecture where different physical locations use local storage, different storage policies should be used. This avoids the potential for improper data path configurations within the policy copy, which could result in data being unintentionally moved over WAN connections. It also provides the ability to delegate control of local policies to administrators at each location without potentially granting them full control of all policies.

Primary Retention Requirements

If the primary retention requirements differ for backup data, a separate storage policy is recommended. It is possible to modify retention for a job in storage or to set a specific retention on a job schedule, but both of these options add complexity to an environment. If archive data is going to be managed by the same storage policy as backup data, different retention settings can be configured for standard backup retention and archive/compliance retention. Archive and backup data should only be combined in a primary copy if data is being protected to disk. If the data is going to tape, then all of the data will be retained on the tape until the longest retention setting is satisfied; in this case different storage policies should be used.

Data Types

Using different storage policies for different data types is an option that can add organization to the CommCell environment. This is not a required step for designing policies but is frequently used to divide data into policy groups such as an Exchange policy, SQL policy, Windows file policy, Linux file policy, and so on. The advantage of this method is better management and reporting. The disadvantage is that too many policies result in higher fragmentation of data.

Example 1: You are backing up 10 different data types and decide to use 10 different storage policies. You need to send data off site each day on tape. Since there are 10 different policies configured, at a minimum 10 tapes will have to be sent off site each day.

Example 2: You are using Simpana deduplication. By default each policy will create its own dedupe database and store. This will result in duplicate blocks being redundantly stored in the same library. You can overcome this obstacle with global deduplication policies, which are discussed in the next section.

Dissimilar Data Types to Tape

When backing up to tape media it is recommended to isolate different data types on separate media. This is especially true when considering database and file data. Since each storage policy logically divides protected data, using separate policies for the different data types ensures that each data type will use dedicated media. Typically, different retentions will also be used for the data types, so different policies may be required anyway.

Mixing dissimilar data types such as file and database data will also affect the chunk size. This can have a negative impact on performance as well as on the continuation of interrupted indexed operations. Indexed data such as file systems use a default chunk size of 4 GB when writing to tape; database jobs use a default size of 16 GB. If these data types share the same tape, the chunk size will be determined by the first data block written to the chunk. Using separate policies allows each job to use the appropriate chunk size and isolates each data type on separate media, as sketched below.
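The chunk-size behavior described above can be illustrated with a small sketch. This is a simplified model, not Simpana code: the 4 GB and 16 GB defaults come from the text above, while the function and the "first writer" framing are illustrative assumptions.

```python
# Simplified model of default tape chunk sizes per data type (values from
# the text above: 4 GB for indexed/file system data, 16 GB for databases).
# All names here are illustrative; this is not CommVault code.

INDEXED_CHUNK_GB = 4    # file system and other indexed agents
DATABASE_CHUNK_GB = 16  # database agents

def chunk_size_gb(first_writer: str) -> int:
    """Chunk size on a tape is set by the first job that writes to the chunk."""
    return DATABASE_CHUNK_GB if first_writer == "database" else INDEXED_CHUNK_GB

# One shared policy/tape: a database job wrote first, so file data is now
# forced into 16 GB chunks, hurting restart of interrupted indexed jobs.
print(chunk_size_gb(first_writer="database"))    # 16

# Separate policies isolate media, so each data type keeps its own default.
print(chunk_size_gb(first_writer="filesystem"))  # 4
```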

File data and database data are both writing to the same tape library. Although the same physical library is being used, using two separate storage policies will result in the dissimilar data types using their own dedicated media.

Determine Deduplication Requirements

Block Factor

When using Simpana deduplication, the dedupe block factor is a primary concern when developing storage policy strategies. The smaller the block size, the more entries are made in the dedupe database. Currently the database can scale to 500-750 million records. When determining block size, take into consideration the total volume of data being protected, which is relatively simple to estimate, and the estimated number of unique blocks, which is certainly not easy to estimate. The recommendations for block factor settings are as follows (a rough record-count estimate is sketched below):

- 128 KB: All object-level protection, virtual machines, and smaller databases.
- 128-512 KB: Current recommendation for database backups, depending on the size of all database data managed by the policy. For large databases it is recommended to engage CommVault Professional Services for proper deployment.
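To get a feel for the record-count tradeoff, the following back-of-the-envelope sketch divides the size of the deduplicated store by the block factor. It assumes one DDB record per unique block stored, which is a simplification for sizing intuition only; the 750 million record ceiling comes from the text above.

```python
# Rough estimate of dedupe database records: one record per unique block
# in the store. Simplified sizing intuition, not a CommVault formula.

DDB_RECORD_LIMIT = 750_000_000  # upper scaling bound cited in the text

def estimated_records(store_size_tb: float, block_kb: int) -> int:
    store_bytes = store_size_tb * 1024**4
    return int(store_bytes / (block_kb * 1024))

for block_kb in (128, 256, 512):
    recs = estimated_records(store_size_tb=60, block_kb=block_kb)
    pct = 100 * recs / DDB_RECORD_LIMIT
    print(f"{block_kb:>3} KB blocks, 60 TB store: ~{recs/1e6:,.0f}M records "
          f"({pct:.0f}% of the 750M ceiling)")

# At 128 KB blocks a 60 TB store already needs ~500M records, which matches
# the guidance that one DDB handles roughly 60-90 TB of protected storage.
```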

In this case, different storage policies should be configured for the different block factors. It is not recommended to use a single policy for all data when mixed data types are involved, since different data may not deduplicate well in mixed dedupe stores. Another factor that should be considered is how long the data will be retained. Longer retention results in larger databases. Since different data types typically have different retention settings, separate storage policies would be required to manage the data, so separate dedupe databases will be used. It is NOT recommended to use global deduplication for long retention or large-volume protection.

Dissimilar Data Types

Dissimilar data types that require different block settings should have separate policies. In some cases, data types that do use the same block size should still have their own dedicated policies. This is the case when considering databases, since different database applications typically organize and compress data using specific algorithms; using the same dedupe policy will not necessarily result in better deduplication ratios. Also, since databases can contain large amounts of data, using a single dedupe policy may result in large dedupe databases, which can lead to scalability issues. It is recommended to use different storage policies dedicated to different data types.

For applications that perform their own compression, it is typically recommended to disable Simpana compression in the dedupe policy copy. Application compression can lead to poor deduplication ratios if compressed blocks contain different data each time they are protected. Check with CommVault for current best practices when weighing Simpana compression against application compression with deduplication.

Deduplication Database Scaling

A single deduplication database can scale to 750 million records. Based on estimated dedupe efficiency, a single deduplication database can manage 60-90 TB of data in protected storage and 500 TB or more of total production data volume. Total volume is the size of the data times the number of cycles the data will be retained. In environments managing large amounts of data, it is recommended to use multiple storage policies, each with its own dedupe database, to provide the highest level of scalability. A quick sizing check is sketched below.

When determining the number of policies that will be needed in large environments, data growth projections should be considered. Although a single dedupe database may be able to manage all current data, if the data growth rate is expected to change significantly, you may find yourself scrambling to redesign your policies at the last minute to accommodate changes in your environment. This will have a negative effect on deduplication efficiency, especially when data is being retained for longer periods of time.
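The sketch below applies the sizing rules of thumb from this section: total volume equals data size times retained cycles, and a single DDB comfortably handles roughly 500 TB of total volume. The threshold and helper function are illustrative assumptions, not a CommVault formula.

```python
import math

# Rule of thumb from the text: one DDB manages ~500 TB of total production
# volume (data size x retained cycles). Figures are illustrative only.

DDB_TOTAL_VOLUME_TB = 500

def policies_needed(data_tb: float, cycles_retained: int) -> int:
    total_volume = data_tb * cycles_retained
    return max(1, math.ceil(total_volume / DDB_TOTAL_VOLUME_TB))

# 80 TB of production data retained for 4 cycles -> 320 TB total volume,
# still fine on one DDB; 160 TB over 8 cycles -> 1,280 TB, so plan for at
# least three policies/DDBs before growth is even considered.
print(policies_needed(80, 4))   # 1
print(policies_needed(160, 8))  # 3
```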

Although the current data sizes for the environment in the following illustration could be handled with a single dedupe database, the expected growth rate over the next three years makes it more efficient to use dedicated storage policies, Media Agents, and libraries. This provides scalability and enhanced performance by distributing data through multiple Media Agents to storage dedicated to each Media Agent.

Global Deduplication on the Primary Copy

If different retention settings are required for the primary copy but the disk location and block factor are the same, a global deduplication storage policy can be used to achieve a better deduplication ratio. Associating a global dedupe policy with a primary copy results in a single dedupe database and dedupe store being used across multiple copies. Because of this, consider the volume of data that will be protected and ensure the deduplication database will be able to scale to meet current and future data growth.

Global dedupe policies are mainly used for consolidating small amounts of data with different primary retention needs or for consolidating remote location data to a central location. Global deduplication policies should NOT be used across the board for everything in your datacenter. You can quickly grow out of the database's maximum size, which will then require a complete redesign of your storage policy structure.

Realize that policy copies attached to a global dedupe policy cannot be unattached. New policies will have to be created, and the old policies cannot be deleted until ALL data has aged from them.

In the illustration, three storage policy primary copies with different retentions are linked to a global deduplication storage policy. Blocks will be retained in a single store and one dedupe database will be used. A block is aged from disk once no data from any of the policies references it. In this scenario the total volume of data is 10 TB.

So what is your expected growth rate? General estimates for modern data centers are usually in the range of 15-30%; with today's quickly changing business environment, these numbers may be closer to 40% or more. Why is that? Traditional data growth projections are based on back-end storage growth, which is relatively easy to trend using historical data. A largely overlooked aspect is the inclusion of more end user data in the form of workstation backup and mobile user data backup, as well as companies moving to virtualize workstations. The traditional notion of growth rates may therefore be a thing of the past. Consider these sources of data growth in any storage design strategy; a compound-growth projection is sketched below.
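A simple compound-growth projection makes the point concrete. The 10 TB starting volume comes from the scenario above; the growth rates are the 15-30% and 40%+ ranges mentioned in the text, and the three-year horizon matches the illustration.

```python
# Compound growth of the scenario's 10 TB total volume over three years at
# the growth rates discussed above. Purely illustrative arithmetic.

def project(volume_tb: float, annual_growth: float, years: int) -> float:
    return volume_tb * (1 + annual_growth) ** years

start_tb = 10.0
for rate in (0.15, 0.30, 0.40):
    final = project(start_tb, rate, years=3)
    print(f"{rate:.0%} growth: {start_tb:.0f} TB -> {final:.1f} TB after 3 years")

# 15%: ~15.2 TB, 30%: ~22.0 TB, 40%: ~27.4 TB. A design sized only for
# today's 10 TB can be undersized well before the retention period ends.
```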

Scenario Part I: Determine Minimum Storage Policies

All of the factors mentioned above help determine how many storage policies will be required. Although they are discussed individually, they all interrelate. The easiest method to determine how many policies will be needed is to build a simple matrix chart in a spreadsheet and sort by various columns to see how like requirements can be grouped together. The following diagrams and charts represent a small datacenter and three satellite offices.

The basic protection matrix table shows the servers, locations, and agents used to protect data. In this scenario, to simplify configuration, only a Default Subclient will be used.

Server | Location | Agent | Subclient | Projected Data Size, 1 Year (GB) | Primary Target | Primary Retention (Days) | Dedupe Block Factor (KB)
FileSrv_A | MainDC | FSiDA | Default | 800 | Disk_1 | 30 | 128
Exch2010 | MainDC | FSiDA | Default | 15 | Disk_1 | 30 | 128
Exch2010 | MainDC | ExchDB | Default | 600 | Disk_1 | 30 | 256
SQL | MainDC | FSiDA | Default | 25 | Disk_1 | 30 | 128
SQL | MainDC | SQLiDA | Default | 5000 | Disk_1 | 14 | 256
Oracle | MainDC | FSiDA | Default | 25 | Disk_1 | 30 | 128
Oracle | MainDC | ORiDA | Default | 15000 | Disk_OR | 14 | 256
RemoteFS_1 | Pasadena | FSiDA | Default | 250 | Disk_2 | 30 | 128
RemoteFS_2 | Boston | FSiDA | Default | 125 | Disk_3 | 30 | 128
RemoteFS_3 | Dallas | FSiDA | Default | 400 | Disk_4 | 30 | 128

The spreadsheet is then sorted by primary target, then primary retention, then dedupe block factor. The result is a grouping in which the file system iDataAgents for Exch2010, SQL, Oracle, and FileSrv_A all have the same protection requirements. This result can be used to simplify storage policy design: grouping data with like protection requirements allows that data to be associated with a single storage policy. (A sketch of this sort-and-group step follows the second table.)

Server | Location | Agent | Subclient | Projected Data Size, 1 Year (GB) | Primary Target | Primary Retention (Days) | Dedupe Block Factor (KB)
SQL | MainDC | SQLiDA | Default | 5000 | Disk_1 | 14 | 256
Exch2010 | MainDC | FSiDA | Default | 15 | Disk_1 | 30 | 128
SQL | MainDC | FSiDA | Default | 25 | Disk_1 | 30 | 128
Oracle | MainDC | FSiDA | Default | 25 | Disk_1 | 30 | 128
FileSrv_A | MainDC | FSiDA | Default | 800 | Disk_1 | 30 | 128
Exch2010 | MainDC | ExchDB | Default | 600 | Disk_1 | 30 | 256
RemoteFS_1 | Pasadena | FSiDA | Default | 250 | Disk_2 | 30 | 128
RemoteFS_2 | Boston | FSiDA | Default | 125 | Disk_3 | 30 | 128
RemoteFS_3 | Dallas | FSiDA | Default | 400 | Disk_4 | 30 | 128
Oracle | MainDC | ORiDA | Default | 15000 | Disk_OR | 14 | 256
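The sort-and-group step is easy to automate. This sketch uses a few rows from the matrix above; the tuple key (target, retention, block factor) mirrors the spreadsheet sort, and each resulting group suggests one storage policy. The row layout and names are illustrative.

```python
from itertools import groupby

# A few rows from the protection matrix above: (server, agent, target,
# retention_days, block_kb). Sorting and grouping on (target, retention,
# block) mirrors the spreadsheet method described in the text.
rows = [
    ("FileSrv_A",  "FSiDA",  "Disk_1",  30, 128),
    ("Exch2010",   "FSiDA",  "Disk_1",  30, 128),
    ("Exch2010",   "ExchDB", "Disk_1",  30, 256),
    ("SQL",        "SQLiDA", "Disk_1",  14, 256),
    ("Oracle",     "ORiDA",  "Disk_OR", 14, 256),
    ("RemoteFS_1", "FSiDA",  "Disk_2",  30, 128),
]

key = lambda r: (r[2], r[3], r[4])  # (primary target, retention, block factor)
for (target, days, block), members in groupby(sorted(rows, key=key), key=key):
    clients = ", ".join(f"{srv}/{agent}" for srv, agent, *_ in members)
    print(f"Policy candidate [{target}, {days}d, {block}KB]: {clients}")
```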

Deduplication Calculations

Using the Simpana deduplication calculator, which is available from CommVault Professional Services, the following calculations are based on the total volume of data protected within each storage policy dedupe store. Three storage policies are being used, each managing a different data type. The block size is based on how long the data will be retained as well as on current and estimated data growth.

Storage Policy Recovery Window Requirements:

Data Set Type (SP Copy) | Recovery SLA (Weeks) | SP Segment Size (KB) | Full Copy Mode | Full Jobs over Retention | Inc Jobs over Retention | Total Recovery Points | Total Retention Time in SP Copy
Set (aggregate) | - | - | - | - | - | 23 Days | 26 Days
Oracle Databases | 2 wks | 256 KB | Weekly Full | 2 | 12 | 14 Days | 14 Days
SQL Databases | 2 wks | 256 KB | Weekly Full | 2 | 12 | 14 Days | 14 Days
File Data | 4 wks | 128 KB | Weekly Full | 4 | 26 | 30 Days | 28 Days

Full backup size and estimated block change rate are factored into the calculations. Depending on the data type, various base and subsequent data size reductions will be achieved.

Storage Policy Backup Size and Deduplication Assumptions:

Data Set Type (SP Copy) | Full Backup Size | Incremental Job Size | Inc:Full Ratio | Total Retained Backup Jobs / Cycle | Base Full Reduction (/job) | Seq-Full Reduction (/job) | Incr Reduction (/job)
Set (aggregate) | 1,051.500 TB | 87.200 TB | 9.6% | 5,831.1 TB | -57.2% | -94.1% | -61.9%
Oracle Databases | 15.000 TB | 1.000 TB | 6.7% | 45.4 TB | -50.0% | -90.0% | -60.0%
SQL Databases | 5.000 TB | 0.500 TB | 10.0% | 17.3 TB | -50.0% | -90.0% | -60.0%
File Data | 1.500 TB | 0.200 TB | 13.3% | 12.1 TB | -65.0% | -90.5% | -60.0%

Deduplication store estimations are based on all previous data entered into the calculator. Based on all entered factors, an estimated deduplication database size is given.

Storage Policy Dedupe Media Store Estimate:

Data Set Type (SP Copy) | Baseline Full Size (TB) | Seq-Full Delta Add Size (TB) | Disk Store Media Size / Retention Period | Dedupe Ratio / Store | Store Percentage Saved | Store Mix (BL / Delta) | DDB Size (GB)
Set (aggregate) | 489.267 TB | 57.934 TB | 1,282.47 TB | 4.5 : 1 | -78% | 38% / 62% | 3,467 GB
Oracle Databases | 8.100 TB | 1.620 TB | 14.40 TB | 3.2 : 1 | -68% | 56% / 44% | 17 GB
SQL Databases | 2.700 TB | 0.540 TB | 5.60 TB | 3.1 : 1 | -68% | 48% / 52% | 7 GB
File Data | 0.567 TB | 0.154 TB | 3.07 TB | 3.9 : 1 | -75% | 18% / 82% | 10 GB

The last part of the calculations shows the data store size and the number of storage policies required to protect the data.

DDB Partition Recommendations:

Data Set Type (SP Copy) | DDB Size (GB) | DDB Suggested Max Store Size (Profile) | Suggested Active Stores (#) | Outcome
Set (aggregate) | 3,467 GB | - | 22.0 | -
Oracle Databases | 17 GB | 193.1 TB | 1.0 | Good
SQL Databases | 7 GB | 193.1 TB | 1.0 | Good
File Data | 10 GB | 96.6 TB | 1.0 | Good

Storage Policy Design Chart

Storage policy design requirements are based on these key points:

- File data at the main datacenter will use the same policy.
- The SQL, Exchange, and Oracle database agents will have dedicated policies.
- Each remote location will have a dedicated storage policy.

Storage Policy Design

The diagrams and charts on the following pages show the physical and logical storage policy design solution for this scenario.

Data Protection Matrix Charts

The following chart shows the storage policies that will be required, the device streams, the deduplication settings, and the subclients associated with each policy. A global deduplication policy (DC_GlobDDFS) will be used when associating secondary copies for remote office data consolidation to the main data center.

Policy Name | Device Streams | Dedupe Block (KB) | Client-Side Dedupe | Subclients
DC_GlobDDFS (global dedupe policy) | - | - | - | -
MainDC_FSiDA/128k | 10 | 128 | YES | Exch2010-FSiDA-"Default", SQL-FSiDA-"Default", Oracle-FSiDA-"Default", FileSrv_A-FSiDA-"Default"
SQL_MainDC/256k | 10 | 256 | YES | SQL-SQLiDA-"Default"
Oracle_MainDC/256k | 10 | 256 | YES | Oracle-OracleiDA-"Default"
Exchange_MainDC/256k | 10 | 256 | YES | Exch2010-ExchiDA-"Default"
PasadenaRemote/128k | 10 | 128 | YES | RemoteFS_1-FSiDA-"Default"
BostonRemote/128k | 10 | 128 | YES | RemoteFS_2-FSiDA-"Default"
DallasRemote/128k | 10 | 128 | YES | RemoteFS_3-FSiDA-"Default"

The primary copy design uses a descriptive copy name that includes the library and retention settings. The MainDC_FSiDA policy is associated with the global deduplication policy. This allows secondary copy data blocks from remote sites to be deduplicated against the primary copy at the main data center. Since the data type at the main primary and remote secondary sites is the same, deduplication ratios will be efficient.

Policy Name | Primary Copy Name | Cycles | Days | Library | Global Dedupe | Media Agent
DC_GlobDDFS | - | - | - | DiskLib1 | - | -
MainDC_FSiDA/128k | PRI-Disk1/D=14 C=2 | 4 | 28 | DiskLib1 | DC_GlobDDFS | MA1
SQL_MainDC/256k | PRI-Disk1/D=7 C=1 | 2 | 14 | DiskLib1 | - | MA1
Oracle_MainDC/256k | Pri-Tape/D=7 C=1 | 2 | 14 | Disk_OR | - | OR/MA
Exchange_MainDC/256k | Pri-Disk1/D=30 C=4 | 4 | 28 | DiskLib1 | - | MA1
PasadenaRemote/128k | Pri-Disk2/D=30 C=4 | 4 | 28 | DiskLib2 | - | MA2
BostonRemote/128k | Pri-Disk3/D=30 C=4 | 4 | 28 | DiskLib3 | - | MA3
DallasRemote/128k | Pri-Disk4/D=30 C=4 | 4 | 28 | DiskLib4 | - | MA4

Storage Policy & Primary Copy Considerations

In addition to the basic configurations discussed here, there are several other options and features that should be considered:

Encryption: Inline encryption can be configured by enabling encryption in the client properties and then applying encryption to the subclient data. If using inline encryption with deduplication, separate storage policies must be used to separate encrypted data from non-encrypted data. Hardware encryption for LTO drives that support AES encryption can be enabled in the Data Path properties of the tape library.

Security: If user group security must be assigned to manage specific data, configure different storage policies and assign the appropriate groups rights to manage each policy. If policy-level Media Passwords are required for specific data types, define a separate storage policy and assign it the appropriate Media Password in the Advanced tab of the Storage Policy Properties.

GridStor: If multiple data paths are going to be used for the storage policy, add the paths in the Data Path tab of the primary copy. Multiple data paths can be configured to round-robin load balance or to fail over if resources are offline or busy. (Both behaviors are sketched after this feature list.)

Round-robin should be used when:
- Using a Dynamic Drive Sharing tape library with multiple Media Agents zoned to see the library.
- Using a shared disk library where multiple Media Agents are writing to NAS storage. This can be used for deduplicated or non-deduplicated libraries.

Failover to alternate paths should be used when:
- A primary library is being used (such as disk) but failover is desired if the primary library is offline.
- Using tape libraries with multiple drive pools or scratch pools. Failover can be configured to grab resources from a different pool if no resources are available in the primary data path.
- Writing to a deduplicated disk library in a SAN environment.

Multiplexing: When writing multiple streams to a tape library, multiplexing can be used to improve write performance by multiplexing multiple job streams into a single device stream. Multiplexing can improve backup performance but can have a negative effect on restore performance. Consult CommVault Online Documentation for more information on the proper settings for multiplexing.

SILO: If SILO storage is going to be used for certain protected data, separate policies must be created. Create storage policies for data that will be protected to SILO and other policies for data that does not require SILO storage.
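The difference between the two GridStor behaviors can be sketched as a small selector. This is a conceptual model only: the class, the path names, and the availability flags are hypothetical and not CommVault's implementation.

```python
from itertools import cycle

# Conceptual model of GridStor data path selection. Round-robin rotates
# across all online paths to load balance; failover always prefers the
# first (primary) path and only falls back when it is offline or busy.

class DataPathSelector:
    def __init__(self, paths):
        self.paths = paths       # ordered list of path names
        self._rr = cycle(paths)  # rotation state for round-robin

    def round_robin(self, online):
        """Return the next online path in rotation."""
        for _ in self.paths:
            path = next(self._rr)
            if online.get(path, False):
                return path
        raise RuntimeError("no data path available")

    def failover(self, online):
        """Return the first online path in priority order."""
        for path in self.paths:
            if online.get(path, False):
                return path
        raise RuntimeError("no data path available")

sel = DataPathSelector(["MA1->DiskLib1", "MA2->DiskLib1"])
status = {"MA1->DiskLib1": True, "MA2->DiskLib1": True}
print(sel.round_robin(status))  # MA1->DiskLib1
print(sel.round_robin(status))  # MA2->DiskLib1 (load balanced)

status["MA1->DiskLib1"] = False
print(sel.failover(status))     # MA2->DiskLib1 (used only as fallback)
```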

Determine Requirements for Additional Copies

Once the storage policies and primary copies have been configured, secondary copies for on-site and off-site protection are configured. There are three primary reasons for protecting data: disaster recovery, data recovery, and data preservation for compliance.

Disaster Recovery Copies

Disaster recovery is the primary reason for protecting data. The requirement to locate data at an off-site facility in case of a site disaster typically requires additional copies of data to be created. Although some organizations may send their primary copies off site, it is much preferred to keep primary copies on site for the most likely disaster scenarios, such as a disk or server crash, and to keep additional copies off site for larger disaster situations. The frequency with which data is sent off site determines the Recovery Point Objective (RPO) for major site disasters. If data is only sent off site once a week, seven days of data could potentially be lost. It is strongly recommended to send data off site as frequently as possible.

There are two secondary copy methods that can be used for off-site DR copies (their selection behavior is sketched below):

- Secondary synchronous copy is the preferred method for DR backups. A synchronous copy manages all full, incremental, differential, and transaction log backups. This provides complete point-in-time recovery to any point within the cycle. Synchronous copies should be used when sending data off site daily. They are also preferred even when sending data off site weekly, since they allow point-in-time recovery to any backup point within the cycle. For example, consider a file that was created on Monday and deleted on Wednesday. If full backups run on Friday and only the full is sent off site on Monday, the file will not be in the weekly off-site copy. If all jobs within the cycle are sent off site using a synchronous copy, then any data protected within the cycle can be recovered.
- Secondary selective copy creates additional copies of full backup jobs. It provides a point-in-time copy based on when the full was performed. If data is being sent off site weekly or monthly and point-in-time restores to any point within the cycle are not required, then selective copies can be used.

DR Copies to Disk

With Simpana deduplication and WAN bandwidth becoming considerably cheaper, secondary copies can be defined to manage off-site DR copies by writing deduplicated data to disk at a DR site. The feature used to maximize bandwidth usage and minimize disk consumption at the DR location is called DASH Copy. A DASH Copy is configured in the secondary copy's deduplication settings and results in only changed blocks being copied during an auxiliary copy operation.
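The Monday/Wednesday file example can be made concrete with a small simulation. The job list and helper functions are hypothetical; the point is simply which jobs each copy type picks up.

```python
# Jobs in one weekly cycle: (day, backup type). A synchronous copy picks up
# every job; a selective copy picks up only fulls. Names are illustrative.
cycle_jobs = [
    ("Fri", "full"),
    ("Mon", "incremental"),  # contains the file created on Monday...
    ("Tue", "incremental"),
    ("Wed", "incremental"),  # ...which is deleted on Wednesday
    ("Thu", "incremental"),
]

def synchronous_copy(jobs):
    return list(jobs)                           # all fulls, incs, diffs, logs

def selective_copy(jobs):
    return [j for j in jobs if j[1] == "full"]  # full backups only

print("Synchronous:", synchronous_copy(cycle_jobs))
print("Selective:  ", selective_copy(cycle_jobs))

# The Monday incremental (holding the short-lived file) is only present in
# the synchronous copy, which is why it allows any-point-in-cycle recovery.
```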

Data Recovery Copies

Data recovery relates to providing point-in-time recovery for a specified period of time. Where DR copies focus on the most recent data, data recovery may require restores from weeks or even months back in time. These point-in-time recoveries typically relate to user data recovery of deleted files, folders, or email messages. Since a user may create and delete data at any point, synchronous copies should be used for data recovery copies.

Data Preservation Copies

Data preservation or compliance copies are typically point-in-time full backup copies that will be retained for long periods. This allows point-in-time views, such as month end or quarter end, that can be used to archive data for compliance reasons. There are two methods that can be used to create point-in-time full backup copies (a tiering sketch follows this section):

- Selective copies
- Extended retention rules

Selective Copy

A selective copy selects specific full backup jobs based on a defined time or cycle interval such as month end, quarter end, or year end. A selective copy selectively copies source data to new media that is independently managed based on the selective copy retention. This allows specific data to be isolated on media to meet compliance requirements.

Extended Retention

Where a selective copy creates an additional copy of a full backup, extended retention rules apply different retention settings to an existing full backup. This does not require an additional copy of the data; instead it extends the retention of the existing job that meets the extended retention criteria, such as end of month or end of quarter. The advantage of this option is that since additional copies do not need to be made, less media may be required. The disadvantage is that if other jobs exist on the tape that do not meet the extended retention criteria, those jobs will still be preserved on the tape based on the longest retention setting. This can actually result in less efficient use of media. So where a selective copy can be used to isolate jobs on media, extended retention cannot. For this reason it is recommended to use separate schedules for jobs that will have extended retention applied and to select the options Start New Media and Mark Media Full. This will isolate the jobs on different media.

Selective / Extended Hybrid

Using selective copies and extended retention rules together can implement a Grandfather/Father/Son tape rotation while isolating required data on independent media. This allows jobs not required for longer retention periods to be kept on separate media from jobs that do require longer retention times.
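Here is a minimal sketch of Grandfather/Father/Son classification, assuming weekly fulls and calendar-based tiers. The tier rules and retention values are invented for illustration and are not taken from the product.

```python
from datetime import date, timedelta

# Classify weekly fulls into hypothetical GFS tiers: the last full of the
# quarter is the "grandfather", the last full of the month the "father",
# and every other weekly full the "son". Retentions are illustrative.
RETENTION_DAYS = {"son": 30, "father": 90, "grandfather": 365 * 5}

def tier(full_date: date) -> str:
    next_week = full_date + timedelta(days=7)
    if next_week.month != full_date.month:  # last full of the month
        quarter_end = full_date.month in (3, 6, 9, 12)
        return "grandfather" if quarter_end else "father"
    return "son"

# Weekly fulls every Friday across Q1; print each full's tier and retention.
d = date(2013, 1, 4)
while d < date(2013, 4, 1):
    t = tier(d)
    print(f"{d}  {t:<12} retain {RETENTION_DAYS[t]} days")
    d += timedelta(days=7)
```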

Global Deduplication on Secondary Copies

The most common implementation method for global deduplication storage policies is consolidating remote data to a central disk library. Secondary copies for remote storage policies can be associated with a global deduplication policy that has a data path at the main data center. This allows multiple remote locations to be consolidated into a single deduplication store at the main data center. Create any global deduplication policies prior to creating the secondary copies for the remote locations.

Scenario Part II: Adding Secondary Copies

The diagrams and charts on the following pages illustrate the solution for adding and configuring secondary copies.

Remote Location DR Copies Using a Global Deduplication Policy

The sorted matrix chart for secondary disaster recovery copies reveals that file system data from clients at the main datacenter can all be consolidated to the same secondary copy.

Server | Location | Agent | Subclient | Secondary DR Target | Type | Retention
SQL | MainDC | SQLiDA | Default | Tape_1 | Sel | D=30 C=4
Exch2010 | MainDC | FSiDA | Default | Tape_1 | Synch | D=30 C=4
SQL | MainDC | FSiDA | Default | Tape_1 | Synch | D=30 C=4
Oracle | MainDC | FSiDA | Default | Tape_1 | Synch | D=30 C=4
FileSrv_A | MainDC | FSiDA | Default | Tape_1 | Synch | D=30 C=4
Exch2010 | MainDC | ExchDB | Default | Tape_1 | Synch | D=30 C=4
RemoteFS_1 | Pasadena | FSiDA | Default | Disk_1 | Synch | D=60 C=8
RemoteFS_2 | Boston | FSiDA | Default | Disk_1 | Synch | D=60 C=8
RemoteFS_3 | Dallas | FSiDA | Default | Disk_1 | Synch | D=60 C=8
Oracle | MainDC | ORiDA | Default | Tape_1 | Sel | D=30 C=4

This section of the matrix chart shows the storage policy and secondary copy names. Note that the secondary copies for remote data are associated with the global deduplication policy.

Policy Name | Associated Subclients | Secondary Copy Name | Cycles | Days | Library | Global Dedupe | Subclients in Copy
MainDC_FSiDA/128k | Exch2010-FSiDA-"Default", SQL-FSiDA-"Default", Oracle-FSiDA-"Default", FileSrv_A-FSiDA-"Default" | SyncTape_DR | 4 | 30 | TapeLib/LTO4/4drive | - | All
SQL_MainDC/256k | SQL-SQLiDA-"Default" | SelTapeEOW_DR | 4 | 30 | TapeLib/LTO4/4drive | - | All
Oracle_MainDC/256k | Oracle-OracleiDA-"Default" | SelTapeEOW_DR | 4 | 30 | TapeLib/LTO4/4drive | - | All
Exchange_MainDC/256k | Exch2010-ExchiDA-"Default" | SelTapeEOW_DR | 4 | 30 | TapeLib/LTO4/4drive | - | All
PasadenaRemote/128k | RemoteFS_1-FSiDA-"Default" | SyncDskGDD_DR | 8 | 60 | DiskLib1 | YES | All
BostonRemote/128k | RemoteFS_2-FSiDA-"Default" | SyncDskGDD_DR | 8 | 60 | DiskLib1 | YES | All
DallasRemote/128k | RemoteFS_3-FSiDA-"Default" | SyncDskGDD_DR | 8 | 60 | DiskLib1 | YES | All

Since user data on the file server is required for a 180-day recovery period, a secondary data recovery copy will be created using a tape target.

A synchronous copy is created to ensure all user data is preserved for 180 days. It is important to note that although this step seems minor compared to the other configurations in this scenario, data recovery copies are a critical and often overlooked aspect of data protection.

Server | Location | Agent | Subclient | Secondary Data Recovery Target | Type | Retention (Days)
SQL | MainDC | SQLiDA | Default | - | - | -
Exch2010 | MainDC | FSiDA | Default | - | - | -
SQL | MainDC | FSiDA | Default | - | - | -
Oracle | MainDC | FSiDA | Default | - | - | -
FileSrv_A | MainDC | FSiDA | Default | Tape_1 | Synch | 180
Exch2010 | MainDC | ExchDB | Default | - | - | -
RemoteFS_1 | Pasadena | FSiDA | Default | - | - | -
RemoteFS_2 | Boston | FSiDA | Default | - | - | -
RemoteFS_3 | Dallas | FSiDA | Default | - | - | -
Oracle | MainDC | ORiDA | Default | - | - | -

Data preservation and compliance copies will be used to preserve email and database data using end-of-quarter copies to tape media.

Although all the secondary selective copies are being preserved with the same requirements, since the deduplication settings require different storage policies, the end-of-quarter copies will each require separate media. This is based on the rule that data cannot be combined when protected through different storage policies.

Server | Location | Agent | Subclient | Secondary Compliance Target | Type | Retention
SQL | MainDC | SQLiDA | Default | Tape_1 | Sel EOQ | 5 years
Exch2010 | MainDC | FSiDA | Default | - | - | -
SQL | MainDC | FSiDA | Default | - | - | -
Oracle | MainDC | FSiDA | Default | - | - | -
FileSrv_A | MainDC | FSiDA | Default | - | - | -
Exch2010 | MainDC | ExchDB | Default | Tape_1 | Sel EOQ | 5 years
RemoteFS_1 | Pasadena | FSiDA | Default | - | - | -
RemoteFS_2 | Boston | FSiDA | Default | - | - | -
RemoteFS_3 | Dallas | FSiDA | Default | - | - | -
Oracle | MainDC | ORiDA | Default | Tape_1 | Sel EOQ | 5 years

Secondary Copy Considerations

The following additional options can be configured for secondary copies:

Combine to Streams: This option consolidates multiple source streams into fewer destination streams. Use this setting to consolidate streams on media for more efficient use of tape media; it is not required when writing to a secondary disk location. Note that if multiple streams from different agents or servers are combined onto the same media, they cannot be recovered concurrently; separate recovery operations would have to be performed. If multiple streams belong to the same data set in the same agent, they can be recovered concurrently when using the Browse and Recovery option.

Multiplexing Source Streams: If the source location is a disk library with multiple mount paths, this option can be used to improve read performance from the disks when using the Combine to Streams option.

Encryption: Secondary copies can be encrypted using one of two methods. Hardware encryption using LTO4 or LTO5 drives can be enabled in the data path properties for the tape library. Software copy-based encryption uses Simpana's FIPS-certified encryption, which supports standard encryption methods to perform encryption during an auxiliary copy operation; this is configured in the Advanced tab of the secondary copy.

Source Copy: Determines the source location of data during an auxiliary copy operation. The default source is the primary copy. Secondary copies can also be specified as the source in the Copy Policy tab of the secondary copy.

Backup Selection: Specifies the effective start point on which auxiliary copies will be based. The default value is All Backups. For synchronous copies, all available data in the source location will be copied; this can be changed to a specific start date so that all source jobs on or after that date are copied. Since selective copies are time based (monthly, quarterly, yearly, and so on), the start point must be specified.

Inline Copy: Performs a primary backup and an auxiliary copy at the same time. This requires that the primary and secondary data paths both be accessible from the same Media Agent.

Parallel Copy: Performs multiple auxiliary copies at the same time. Any secondary copies with this option selected will run in parallel when an auxiliary copy starts. This requires both secondary data paths to be accessible from the same Media Agent.

Extended Retention: Can be configured to extend retention on existing full backups. Similar to selective copies, this option can be configured to select full backups at time intervals such as weekly, monthly, quarterly, or yearly. Up to three tiers can be defined, providing for a Grandfather/Father/Son tape rotation method.

Data Verification: Can be configured to verify chunk data on media after it has been written.

Erase Media: Selecting this option marks media to be erased after all data has aged. Erase media operations must be scheduled for each tape library. The erase media operation overwrites the OML header on the tape, making the data unrecoverable through the CommCell Console, Media Explorer, or catalog operations.

Determine Requirements for Additional Subclients

Up to this point, all storage policy design strategies have been based on each agent having only a Default Subclient configured. This approach is based on the observation that most CommVault administrators design their environments using just the default subclient as the source of all data for an agent. Defining custom subclients, however, allows for much greater flexibility in crafting a comprehensive protection strategy. User-defined subclients allow data to be explicitly defined within a subclient container, which can then be protected and managed independently within CommVault protected storage.

There are many advantages to using custom subclients. The following list highlights the primary ones:

- Better media management, by meeting protection needs based on specific content.
- Better performance, by using multiple streams within a subclient, protecting multiple subclients concurrently, or stagger-scheduling subclients over a longer time period.
- Custom configurations for specific data, such as open file handling, filtering, or pre/post process scripts.

Special Protection Needs

Data being protected by an iDataAgent is, by default, protected by the Default Subclient. Custom subclients can be explicitly defined to manage specific data such as a folder or database. Each subclient container can be managed independently in protected storage. This can reduce the amount of data that needs to be protected, by associating just the subclient's data with the storage policy copy that meets the protection requirements.

Example: A file server with 800 GB of data has a file share containing critical financial data that must be retained for 10 years. The folder containing the data can be defined in a separate subclient, and that subclient can be associated with a storage policy copy with a 10-year retention. The result: instead of keeping all 800 GB of data for 10 years, only the financial data is kept for the required period.

Performance Requirements

Each defined subclient will be an independent job and use independent streams when being protected. There are several reasons why this improves performance:

Multiple Stream Backups

A subclient can be configured to use multiple streams for supported agents. This is useful when data is being stored on a RAID array. To take advantage of RAID's fast read access, multiple streams can be used to improve the performance of data protection operations.

Multiple Subclients running concurrently will result in multi-stream data protection operations. This is especially useful when the application does not inherently support multi-stream backups, such as Exchange message-level backups or archives.

Stagger-Scheduled Backups

By creating separate subclients, you can stagger-schedule data protection operations. Instead of trying to get a full backup done in one night, different subclients can be scheduled to run full backups throughout the week or month, with incremental backups on the other days. This can be especially useful for virtual machine backups or Network Attached Storage with large file counts. (A simple stagger-assignment sketch follows this section.)

Special Attributes

Certain data may require special handling to ensure it is properly protected. By defining subclients specifically for that data, custom configurations can be set.

Filters

Filters can be applied through the Global Filter applet in Control Panel or locally at the subclient level. If specific folder locations require special filters, a dedicated subclient should be used. Define the subclient content as the location where the filters will be applied and configure local filters for that subclient. The option to use Global Filters can still be enabled, allowing the global and local filters to be combined. If global filters are being used but specific subclient data should not have certain filters applied, define that content in a separate subclient. Global filters can still be enabled for the subclient, but the exclusions list can be used to override the global filter settings for specific file/folder patterns.

Open File Handling

Open file handling using Microsoft VSS or CommVault QSnap can be used to ensure open files are protected. VSS is an available option for Windows 2003 or higher agents. Non-Windows agents can use CommVault QSnap to ensure open files are protected.

Pre/Post Scripts

Pre/post process scripts can be used to quiesce applications prior to protection. This is very useful when protecting proprietary database systems or for quiescing databases within virtual machines prior to using the Simpana Virtual Server Agent to snap and back up the virtual machine.
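A stagger schedule is easy to reason about as a round-robin assignment. The subclient names and weekly slots below are hypothetical; the sketch simply spreads full backups so only a fraction of the data runs a full on any given night.

```python
# Assign each subclient's full backup to a night of the week, round-robin,
# so fulls are spread out instead of all landing on one night. Incrementals
# would run on the remaining nights. Names and slots are illustrative.
nights = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]
subclients = [f"VM_Group_{i}" for i in range(1, 15)]  # e.g., 14 VM subclients

schedule: dict[str, list[str]] = {night: [] for night in nights}
for i, sc in enumerate(subclients):
    schedule[nights[i % len(nights)]].append(sc)

for night, scs in schedule.items():
    print(f"{night}: full backup of {', '.join(scs)}")
```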

Determine Requirements for Additional Storage Policies

Incremental Storage Policy

An Incremental Storage Policy links two policies together. The main policy manages all full backup jobs; the incremental policy manages all dependent jobs (incremental, differential, or logs). This is useful when the primary target for full backups needs to be different from that of the dependent jobs. Traditionally this has been used with database backups, where the full backup goes to tape and log backups go to disk. When performing log backups multiple times each day, replaying logs from disk during restore operations is considerably faster than replaying them from tape.

Microsoft SQL Log Storage Policy

MS-SQL subclients have a unique configuration where full and differential backups can be directed to one storage policy and log backups can be directed to a second policy. This is the same concept as Incremental Storage Policies, except that instead of linking the policies together, the two policies are defined in the Storage Device tab of the SQL subclient.

Legal Hold Policy

When using the Simpana Content Indexing and compliance search feature, auditors can perform content searches on end user data. The search results can be incorporated into a Legal Hold. By designating a storage policy as a Legal Hold policy, the auditor has the ability to associate selected items required for legal hold with designated Legal Hold policies. It is recommended to use dedicated Legal Hold policies when using this feature. Legal Hold storage policies can also be used with Content Director for records management policies. This allows content searches to be scheduled, with the results automatically copied into a designated Legal Hold policy.

Erase Data

Erase Data is a powerful tool that allows end users or Simpana administrators to granularly mark objects as unrecoverable within the CommCell environment. For object-level archiving such as files and email messages, if an end user deletes a stub, the corresponding object in CommVault protected storage can be marked as unrecoverable. Administrators can also browse or search for data through the CommCell Console and mark it as unrecoverable.

It is technically not possible to erase specific data from within a job. Erase Data works by logically marking the data unrecoverable: if a browse or find operation is conducted, the data will not appear. In order for this feature to be effective, any media managed by a storage policy with Erase Data enabled cannot be recovered through Media Explorer, Restore by Job, or catalog operations. It is important to note that enabling or disabling this feature cannot be applied retroactively to media that has already been written. If this option is enabled, all media managed by the policy can be recovered only through the CommCell Console. If it is not enabled, all data managed by the policy can also be recovered through Media Explorer, Restore by Job, or catalog operations.

If this feature is going to be used, it is recommended to use dedicated storage policies for all data that may require the Erase Data option. For data that is known not to require this option, disable the feature. Note: This option is enabled by default on all new storage policies created as of Simpana v9 SP3.

Group Security

If specific groups need rights to manage a storage policy, then it is recommended that different policies be created for each group. This is a very effective separation-of-powers method in larger, departmentalized organizations. Each department group can be granted management capabilities over its own storage policies.

Media Password

The Media Password is used when recovering data through Media Explorer or by cataloging media. When using hardware encryption, or Simpana software copy-based encryption with the Direct Media Access option set to Via Media Password, a media password is essential. By default the password is set for the entire CommCell environment in the System applet in Control Panel. Storage policy level media passwords can be set, which override the CommCell password settings. For a higher level of security, or if a department requires specific passwords, use the policy-level password setting, which is configured in the Advanced tab of the Storage Policy Properties.

Using Encryption with a Deduplication Policy

If client-side encryption is going to be used with deduplicated data, separate storage policies must be used to separate encrypted and non-encrypted data. The ability to encrypt deduplicated data is a powerful capability that is unique to Simpana software, because the encryption takes place after the block has been hashed and compared. Using encryption for deduplicated data is especially useful when backing up deduplicated data to cloud storage. The hash-then-encrypt ordering is sketched below.
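The ordering described above, hash and compare first and encrypt only what is new, can be sketched as follows. The block signature uses SHA-256 and the "encrypt" step is a stand-in placeholder; neither is Simpana's actual signature algorithm or cipher.

```python
import hashlib

# Conceptual client-side dedupe-then-encrypt pipeline: the block signature
# is computed on the plaintext block BEFORE encryption, so identical blocks
# still deduplicate even though only processed blocks leave the client.

ddb: dict[str, bytes] = {}  # signature -> stored (encrypted) block

def encrypt(block: bytes) -> bytes:
    # Placeholder stand-in for a real cipher; NOT actual encryption.
    return bytes(b ^ 0x5A for b in block)

def backup_block(block: bytes) -> str:
    sig = hashlib.sha256(block).hexdigest()  # hash plaintext first
    if sig not in ddb:                       # compare against the DDB
        ddb[sig] = encrypt(block)            # encrypt only new blocks
    return sig

backup_block(b"same 128KB block")
backup_block(b"same 128KB block")  # duplicate: nothing new encrypted/stored
print(len(ddb))                    # 1, so dedupe is preserved despite encryption

# Hashing after encryption would break this: a cipher with per-job keys or
# IVs would give identical blocks different signatures every time.
```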