Presented by: George Crump
President and Founder, Storage Switzerland
www.storagedecisions.com

Today's Goals
Achieve 100% Backup Success!
o The Data Protection Problem: Bad Foundation, New Problems
o Fixing the Foundation: A Strategy, Not Products
o Establishing Service Level Objectives, NOT Agreements
o Augmenting or Replacing Data Protection Solutions to Meet New Objectives
o Leave Today With A Clear Idea on Establishing A NEW Strategy!

About Storage Switzerland
- Analyst firm covering storage, cloud and virtualization markets
- Knowledge of these markets is gained through product testing, real world implementations and interactions with users and suppliers
- The results of this research are found in the articles, briefing reports, case studies and lab reports on www.storageswiss.com and www.searchstorage.com

About George Crump
- Founder of Storage Switzerland
- Designing advanced backup architectures since 1988
- Not a Backup Purist: Get The Job Done
- Read my full bio on the first page of your proceedings booklets!

George Crump is the founder of Storage Switzerland, the leading storage analyst focused on the subjects of big data, solid state storage, virtualization, cloud computing and data protection. He is widely recognized for his articles, whitepapers, and videos on such current approaches as all-flash arrays, deduplication, SSDs, software-defined storage, backup appliances, and storage networking. He has 25 years of experience designing storage solutions for data centers across the US.

Session 1
The Data Protection Problem
(Data protection is hard!)

The Data Protection Problem

Old Problems, Still A Problem
- Moving Data Across A Network
- Data capture requirements are increasing, which leads to a combination of snapshots, replication and backup
- Off-Site is a requirement and has to happen more frequently
- Recovery is measured in minutes, not hours or days

Agile Data Center Makes it Worse
- BYOD has placed user data everywhere
- Virtualization has significantly increased the number of VMs to be protected
- Unstructured data growth is out of control
- IT Staffing is Incredibly Thin

Which Leads To
- "Throw Hardware and Software at it" problem solving
  o Purpose built primary storage
  o Environment specific backup applications
- Backup architectures older than primary storage architectures
- Backup and Recovery Expectations Are Assumed

Mutual Mystification
- Mutual Mystification is Hazardous To Your Career
- User Expectations (assumptions) are higher than ever
  o Instant Recovery, Zero Downtime
  o All versions of all data kept forever (for free)
- Service Level Agreements are seldom set or adhered to

Developing a Strategy
Time for a Fresh Approach

The Service Level Agreement Problem
- Ask any user or application owner how much data they can afford to lose, and guess what the answer will be?

Set Objectives, Don't Broker Agreements
- Ask any user or application owner how much data they can afford to lose, and guess what the answer will be? NONE!
- Service Level Objectives are set by IT based on their known capabilities and intrinsic understanding of the environment

Components of a Service Level Objective
- Recovery Point Objective
- Recovery Time Objective
- Version Retention Objective (archive)
- Geographic Recovery Objective (Disaster Recovery)
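
One way to make the four components concrete is to record them per application. The following is a minimal Python sketch and not part of the original presentation: the class name, field names, and the retention values are hypothetical, and the 8-hour RPO/RTO is borrowed from the MS-SQL application in the 150K workshop scenario.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class ServiceLevelObjective:
    """Hypothetical record of one application's SLO (RPO + RTO + VRO + GRO)."""
    application: str
    rpo: timedelta            # Recovery Point Objective: maximum tolerable data loss
    rto: timedelta            # Recovery Time Objective: maximum time to be back online
    vro_versions: int         # Version Retention Objective: how many versions to keep
    vro_keep_for: timedelta   # Version Retention Objective: how long to keep them
    gro_cross_regional: bool  # Geographic Recovery Objective: cross-regional copy required?

    def describe(self) -> str:
        site = "cross-regional copy" if self.gro_cross_regional else "no off-site copy"
        return (f"{self.application}: RPO {self.rpo}, RTO {self.rto}, "
                f"{self.vro_versions} versions for {self.vro_keep_for.days} days, {site}")


# Retention values below are illustrative assumptions, not recommendations.
sql_slo = ServiceLevelObjective(
    application="MS-SQL application",
    rpo=timedelta(hours=8),
    rto=timedelta(hours=8),
    vro_versions=30,
    vro_keep_for=timedelta(days=365),
    gro_cross_regional=True,
)
print(sql_slo.describe())
```

Writing the objective down in one place per application is the point: IT states what it can deliver, and nothing is left to assumption.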

Developing a Strategy
Time for a Fresh Approach

RPO + RTO + VRO + GRO = Service Level Objective (SLO)

Results of a Service Level Objective Strategy
- 60% of SLOs are accepted with no changes needed!
- 25% of SLOs require some modification, minor upgrades
- 15% of SLOs need major work, but at least there are no assumptions about abilities
  o Augment with additional hardware and/or software
  o Replace with new enterprise solutions
- BUT: 100% of Mutual Mystification Eliminated!

Data Protection Tools
The Tools: Taking Care of That Last 15%

Data Protection Type           Primary Role
Enterprise backup              Foundational coverage
Application specific backup    Gap coverage
Archive applications           VRO & ad hoc copies
Snapshots                      Rapid RPO/RTO
Replication                    Rapid RPO/RTO & GRO

Data Protection Hardware       Primary Role
Tape                           Backup & archive target; long term storage
Backup disk                    Backup target; short term recovery
Secondary disk                 Archive and replication target
Tier 2 disk                    Cluster/replication failover

Where to Start?
- Pick Your Top 3 Apps
- Set objectives for each
- Select the right tool to meet each SLO
- Document which tools you will use and when
- Identify the tools needed to meet each SLO component

Session 2
Creating Your Data Protection Toolkit

The Components of a Service Level Objective
What is recovery point objective? (RPO)
- Re-key / re-load time
- Reduced by shortening the data protection interval
Ways to reduce
- Increase data capture events
- Online access to data
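
The relationship between capture frequency and RPO can be sketched with a little arithmetic. This is a simplified Python illustration, not something from the slides: the function names are hypothetical, capture events are assumed to be evenly spaced, and the time a capture itself takes is ignored.

```python
import math
from datetime import timedelta


def worst_case_data_loss(capture_interval: timedelta) -> timedelta:
    """A failure just before the next capture loses up to one full interval of changes."""
    return capture_interval


def meets_rpo(capture_interval: timedelta, rpo: timedelta) -> bool:
    """True if the worst-case loss for this capture interval stays within the RPO."""
    return worst_case_data_loss(capture_interval) <= rpo


def captures_per_day_needed(rpo: timedelta) -> int:
    """Minimum number of evenly spaced capture events per day to stay within the RPO."""
    if rpo <= timedelta(0):
        raise ValueError("RPO must be positive")
    return math.ceil(timedelta(days=1) / rpo)


# A nightly backup against an 8-hour RPO, and the capture counts for two RPOs.
print(meets_rpo(timedelta(hours=24), timedelta(hours=8)))
print(captures_per_day_needed(timedelta(hours=8)))
print(captures_per_day_needed(timedelta(minutes=30)))
```

Under these assumptions a nightly backup cannot satisfy an 8-hour RPO on its own, which is why tighter objectives push toward snapshots or replication.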

Meeting the Recovery Point Objective
Tools to lower RPO
- Standard backups
- Block level incremental (changed block tracking)
- Snapshots
- Replication to a standby server
- Active-active clustering

The Components of a Service Level Objective
What is recovery time objective? (RTO)
- Time it takes to bring application back online
- Part of the RTO is the RPO
- Factors
  o Time to transfer data
  o Time to do WRITEs vs. READs
- Reduce
  o Eliminate data movement
  o Speed data movement
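
The data movement factor in RTO can be estimated roughly. The Python sketch below is an illustration only: the function name and the throughput figures are hypothetical, and it assumes the restore is limited by the slower of reading from the backup target and writing to primary storage, with no other overhead.

```python
def estimate_restore_hours(data_gb: float, read_mb_s: float, write_mb_s: float) -> float:
    """Rough restore time: data size divided by the slower of read and write throughput.

    Uses decimal units (1 GB = 1000 MB) and ignores application restart time.
    """
    if data_gb < 0 or read_mb_s <= 0 or write_mb_s <= 0:
        raise ValueError("size must be non-negative and throughputs must be positive")
    effective_mb_s = min(read_mb_s, write_mb_s)
    return data_gb * 1000 / effective_mb_s / 3600


# Hypothetical example: 200 GB, backup target reads at 200 MB/s, primary writes at 100 MB/s.
hours = estimate_restore_hours(200, read_mb_s=200, write_mb_s=100)
print(f"Estimated data movement time: {hours:.2f} hours")
```

The estimate covers only the transfer portion of the RTO. Techniques that eliminate data movement avoid this term entirely, which is why they can recover faster than any speedup of the copy.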

Meeting the Recovery Time Objective
Tools To Lower RTO
- Changed block recovery
- Recovery in place
- Recovery in cloud
- Snapshots
- Replication to standby server

The Components of a Service Level Objective
What is version retention objective? (VRO)
- Number of versions of data that you will retain
  o Where?
  o How long?
- How fast will you need it?
  o Not as fast as RTO
  o But recall speed can vary
  o Focus on recall patterns, not data age

Meeting the Version Retention Objective
Tools To Improve The VRO
- Backup as archive (yes, backup)
- TapeNAS
- Disk archive
- The cloud

Meeting The Geographic Recovery Objective
What is geographic recovery objective? (GRO)
- GRO = Disaster recovery plan
- All GROs need a cross-regional copy of data
- Some SLOs may require a metro copy
  o Synchronized / more real time
- Recent copies only
  o Although the archive may exist in the same DR site, it is not the same thing

Meeting The Geographic Recovery Objective
Let's Talk About Distance!

Meeting the Geographic Recovery Objective
Tools to improve GRO
- Replication software on host or in VM
- Storage system replication
- Backup device replication
- Physical tape movement

Testing Your Toolkit
- Should you test?
- How will you test?
- How often will you test?
- Will the testing process work?
- Should this be a part of the SLO?
- Should you REPORT on your tests?

Managing to an SLO
In a perfect world, we want an application that will just report on SLO violations, not failed jobs. We are not there yet, but we are getting very close.
- Manual
- Vendor software
- Third party software

The SLO in Action: Major Healthcare Facility
Top 3 Applications
- Patient onboarding
- Patient billing
- Long-term retention
Top 3 Concerns
- Speed of recovery
- Capacity growth in long-term retention
- Data loss of medical images
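
Reporting on SLO violations rather than failed jobs can be approximated manually with a small check. The Python sketch below is hypothetical: the function name, the RPO values, and the timestamps are invented for illustration, with the application names taken from the healthcare example. It flags any application whose newest good copy is older than its RPO, regardless of how many individual jobs failed along the way.

```python
from datetime import datetime, timedelta


def rpo_violations(last_good_copy, rpo_by_app, now):
    """Return the applications whose most recent good copy is older than their RPO.

    An application with no recorded good copy is always reported.
    """
    violations = []
    for app, rpo in rpo_by_app.items():
        last = last_good_copy.get(app)
        if last is None or now - last > rpo:
            violations.append(app)
    return violations


# Hypothetical objectives and copy times.
now = datetime(2024, 1, 1, 12, 0)
rpo_by_app = {
    "Patient onboarding": timedelta(hours=1),
    "Patient billing": timedelta(hours=4),
    "Long-term retention": timedelta(hours=24),
}
last_good_copy = {
    "Patient onboarding": datetime(2024, 1, 1, 11, 30),  # 30 minutes old
    "Patient billing": datetime(2024, 1, 1, 2, 0),       # 10 hours old
    # no good copy recorded for "Long-term retention"
}
print(rpo_violations(last_good_copy, rpo_by_app, now))
```

A failed job that is followed by a successful retry inside the RPO window never appears in this report, which is the behavior the slide asks for.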

Session 3
Putting the Toolkit into Action (A Workshop)

Disaster Preparedness: Loss of Data Center
We'll plan for loss of data center

Why plan for loss of data center? Why not?
- Earthquake, flood, hurricane
- Airplane through building
- Terrorist attack
- Cyber attack

Disaster Preparedness
Description
- DR site situation
- Scenario cards
  o Budget / SLO
- Primary site situation
  o Varies by scenario
- Design
- Actual disaster

Loss of Data Center
DR site situation
- No equipment at remote site
- T3 connection
- WAN optimization solution tested, 50% improvement
- $75K investment
- Option to upgrade to OC3 or OC12

Loss of Data Center
Primary site
- 150K level
- 200K level
- 250K level
- 300K level

Loss of Data Center
Primary site - 150K level
- 10 TB of data to get off-site
- Data can be behind production data by no more than 24 hours
- There is one application based on MS-SQL that has an 8-hour RPO/RTO
  o Has 200 GB of data that changes 25% per day
- You do not have a remote location available to use as the DR site. You must leverage a Co-Lo, MSP, or CSP.

Loss of Data Center
Primary site - 200K level
- 20 TB of data to get off-site
- Data can be behind production data by no more than 12 hours
- One application based on Oracle running on AIX that has a 4-hour RPO/RTO
  o Has 1 TB of data that changes 10% per day
- There is a virtual environment that needs to be up and running in the DR site
- You have offices throughout the southern US to leverage as a DR site, but none of these offices have IT personnel

Loss of Data Center
Primary site - 250K level
- 40 TB of data to get off-site
- Data can be behind production data by no more than 8 hours
- There is one application based on four MS-SQL servers that has a 2-hour RPO/RTO
  o Has 1 TB of data that changes 25% per day
- There is a virtual environment that needs to be up and running in the DR site and has to be able to host a virtual Exchange instance and a virtual SharePoint instance
- You have offices with limited IT staffing in Calgary (Canada)

Loss of Data Center
Primary site - 300K level
- 60 TB of data to get off-site
- Data can be behind production data by no more than 4 hours
- There is one application based on one Linux-based Oracle server that has a 30-minute RPO/RTO
  o Has 2 TB of data that changes 25% per day
- There is a virtual environment that needs to be up and running in the DR site
- You have offices throughout the world, with data centers in Denver and London

Loss of Data Center
Design
- Where?
- Seeding?
- On-going data copy, based on scenario
- Mission-critical apps
- People?
- Testing?

Loss of Data Center
Disaster strikes
- Failover strategy
- People strategy
- Return strategy
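
When working through the seeding and on-going copy questions, it helps to estimate how long each scenario's data takes to cross the WAN. The Python sketch below is a back-of-the-envelope aid and not part of the workshop material. It uses the nominal line rates for T3 (44.736 Mbit/s), OC3 (155.52 Mbit/s) and OC12 (622.08 Mbit/s), assumes the tested "50% improvement" from WAN optimization means 1.5x effective throughput, ignores protocol overhead, and models only the daily change of the named application because the slides do not give a change rate for the full data set. All function and variable names are hypothetical.

```python
# Nominal line rates in megabits per second.
LINK_MBPS = {"T3": 44.736, "OC3": 155.52, "OC12": 622.08}

# Assumption: "50% improvement" is read as 1.5x effective throughput.
WAN_OPT_GAIN = 1.5

# (budget level, total data in TB, critical app data in TB, app daily change rate)
SCENARIOS = [
    ("150K", 10, 0.2, 0.25),
    ("200K", 20, 1.0, 0.10),
    ("250K", 40, 1.0, 0.25),
    ("300K", 60, 2.0, 0.25),
]


def transfer_hours(data_tb: float, link: str, wan_optimized: bool = True) -> float:
    """Hours to move data_tb (decimal terabytes) over the named link at full line rate."""
    rate_bps = LINK_MBPS[link] * 1e6
    if wan_optimized:
        rate_bps *= WAN_OPT_GAIN
    return data_tb * 1e12 * 8 / rate_bps / 3600


for level, total_tb, app_tb, change_rate in SCENARIOS:
    for link in LINK_MBPS:
        seed_days = transfer_hours(total_tb, link) / 24
        daily_hours = transfer_hours(app_tb * change_rate, link)
        print(f"{level} level over {link}: seeding {seed_days:.1f} days, "
              f"critical app daily change {daily_hours:.1f} hours")
```

Under these assumptions, seeding even the 10 TB scenario over the optimized T3 takes roughly two weeks of continuous transfer, so the seeding question usually has to be answered separately from the on-going copy.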

Loss of Data Center: What Would I Do?
Primary site
- 150K level
- 200K level
- 250K level
- 300K level

Thank You! Questions?
Keep in touch!
George Crump
President and Founder, Storage Switzerland
Email: gcrump@storage-switzerland.com
Website: www.storageswiss.com
Twitter: @storageswiss