Module 4 STORAGE NETWORK BACKUP & RECOVERY

Similar documents
Information Storage and Management TM Volume 2 of 2 Student Guide. EMC Education Services

Introduction to Business continuity Planning

Business Continuity and Disaster Recovery. Ed Crowley Ch 12

After the Attack. Business Continuity. Planning and Testing Steps. Disaster Recovery. Business Impact Analysis (BIA) Succession Planning

Security+ Guide to Network Security Fundamentals, Third Edition. Chapter 13 Business Continuity

Disaster Recovery Solutions for Oracle Database Standard Edition RAC. A Dbvisit White Paper By Anton Els

TOP REASONS TO CHOOSE DELL EMC OVER VEEAM

ECE Engineering Robust Server Software. Spring 2018

IT CONTINUITY, BACKUP AND RECOVERY POLICY

BUSINESS CONTINUITY: THE PROFIT SCENARIO

DISASTER RECOVERY PRIMER

How to Protect Your Small or Midsized Business with Proven, Simple, and Affordable VMware Virtualization

Archiving, Backup, and Recovery for Complete the Promise of Virtualisation Unified information management for enterprise Windows environments

ZYNSTRA TECHNICAL BRIEFING NOTE

Audit & Advisory Services. IT Disaster Recovery Audit 2015 Report Date January 28, 2015

Disaster Recovery and Business Continuity

IBM Tivoli Storage Manager 6

INFORMATION SECURITY- DISASTER RECOVERY

Data Sheet: High Availability Veritas Cluster Server from Symantec Reduce Application Downtime

10 Reasons Why Your DR Plan Won t Work

Backup vs. Business Continuity

Business Continuity Plan Executive Overview

EMC CLARiiON CX3-40. Reference Architecture. Enterprise Solutions for Microsoft Exchange 2007

Contingency Planning and Disaster Recovery

BME CLEARING s Business Continuity Policy

Florida State University

Disaster Recovery Planning: Is Your Plan in Place? Presented by: Steve Shofner, CISA, CGEIT

Continuity of Business

White Paper. Disaster Recovery in the Cloud

Rejuvenating BCM - Infrastructure. Business Continuity Awareness Week March 2009

Maximum Availability Architecture: Overview. An Oracle White Paper July 2002

UPS system failure. Cyber crime (DDoS ) Accidential/human error. Water, heat or CRAC failure. W eather related. Generator failure

Disaster Recovery Is A Business Strategy

High Availability and Disaster Recovery Solutions for Perforce

Projectplace: A Secure Project Collaboration Solution

Disaster Recovery Planning: Weighing your customer s options

Leveraging ITIL to improve Business Continuity and Availability. itsmf Conference 2009

EMC GLOBAL DATA PROTECTION INDEX KEY FINDINGS & RESULTS FOR HONG KONG

Introduction 1.1 SERVER-CENTRIC IT ARCHITECTURE AND ITS LIMITATIONS

InterSystems High Availability Solutions

EMC GLOBAL DATA PROTECTION INDEX STUDY KEY RESULTS & FINDINGS FOR THE USA

Information Systems. Data Protection Disaster recovery Backups

EMS TM Continuity

Veritas Cluster Server from Symantec

EMC GLOBAL DATA PROTECTION INDEX KEY FINDINGS & RESULTS FOR INDIA

IPMA State of Washington. Disaster Recovery in. State and Local. Governments

HP Designing and Implementing HP Enterprise Backup Solutions. Download Full Version :

EMC GLOBAL DATA PROTECTION INDEX KEY FINDINGS & RESULTS FOR UAE

EMC GLOBAL DATA PROTECTION INDEX KEY FINDINGS & RESULTS FOR INDONESIA

Take Back Lost Revenue by Activating Virtuozzo Storage Today

EMC GLOBAL DATA PROTECTION INDEX KEY FINDINGS & RESULTS FOR BRAZIL

HYBRID CLOUD BACKUP & DISASTER RECOVERY

EMC GLOBAL DATA PROTECTION INDEX KEY FINDINGS & RESULTS FOR ITALY

ITIL overview Service Delivery. Jaroslav Procházka

EMC GLOBAL DATA PROTECTION INDEX KEY FINDINGS & RESULTS FOR AMERICAS

EMC GLOBAL DATA PROTECTION INDEX KEY FINDINGS & RESULTS FOR APJ

The Problem. Business Continuity/ Disaster Recovery. Course Outline and Structure. The Problem The Coverage. Sean Gunasekera

CLOUDALLY EBOOK. Best Practices for Business Continuity

Business Continuity and Disaster Recovery Disaster-Proof Your Business

New England Data Camp v2.0 It is all about the data! Caregroup Healthcare System. Ayad Shammout Lead Technical DBA

TUFTS HEALTH PLAN CORPORATE CONTINUITY STRATEGY

Rediffmail Enterprise High Availability Architecture

The Data Protection Rule and Hybrid Cloud Backup

How to Protect SAP HANA Applications with the Data Protection Suite

Level 3 Certificate in Cloud Services (for the Level 3 Infrastructure Technician Apprenticeship) Cloud Services

IBM Storage Software Strategy

EMC Celerra CNS with CLARiiON Storage

White Paper: Backup vs. Business Continuity. Backup vs. Business Continuity: Using RTO to Better Plan for Your Business

IBM TS7700 grid solutions for business continuity

Technology Insight Series

PowerVault MD3 Storage Array Enterprise % Availability

Backups and archives: What s the scoop?

Copyright 2012 EMC Corporation. All rights reserved.

NWPPA2016. Disaster Recovery NWPPA Reno, NV Copyright 2016, IVOXY Consulting, LLC

Exam : A : Assess: Fundamentals of Applying Tivoli Storage Management V3. Title. Version : Demo

University Information Technology Data Backup and Recovery Policy

High Availability- Disaster Recovery 101

Real-time Protection for Microsoft Hyper-V

Protecting Microsoft Hyper-V 3.0 Environments with Arcserve

ECE Enterprise Storage Architecture. Fall 2018

High Availability through Warm-Standby Support in Sybase Replication Server A Whitepaper from Sybase, Inc.

Network Performance, Security and Reliability Assessment

EMC GLOBAL DATA PROTECTION INDEX KEY FINDINGS & RESULTS FOR TURKEY

Chapter 1. Storage Concepts. CommVault Concepts & Design Strategies:

Exam Name: Midrange Storage Technical Support V2

Breakthrough Data Recovery for IBM AIX Environments

Designing Data Protection Strategies for Lotus Domino

Microsoft E xchange 2010 on VMware

EXAMGOOD QUESTION & ANSWER. Accurate study guides High passing rate! Exam Good provides update free of charge in one year!

Titan SiliconServer for Oracle 9i

Global Headquarters: 5 Speen Street Framingham, MA USA P F

Business Continuity & Disaster Recovery

TSM Paper Replicating TSM

In this unit we are going to review a set of computer protection measures also known as countermeasures.

Designing Data Protection Strategies for Oracle Databases

How Does Failover Affect Your SLA? How Does Failover Affect Your SLA?

SOLUTION BRIEF RSA ARCHER BUSINESS RESILIENCY

Solution Pack. Managed Services Virtual Private Cloud Managed Database Service Selections and Prerequisites

Using Computer Associates BrightStor ARCserve Backup with Microsoft Data Protection Manager

Edge for All Business

Transcription:

Module 4 STORAGE NETWORK BACKUP & RECOVERY BC Terminology, BC Planning Lifecycle General Conditions for Backup, Recovery Considerations Network Backup, Services Performance Bottlenecks of Network Backup, Backup Clients, Back up file systems, Backup Databases, Next Generation Backup.

Introduction to Business Continuity (BC) Information Availability BC Terminology BC Planning Lifecycle

Business Continuity (BC) Business continuity encompasses planning and preparation to ensure that an organization can continue to operate in case of serious incidents or disasters and is able to recover to an operational state within a reasonably short period.

Business Continuity (BC) Business continuity (BC) is an integrated and enterprise wide process that includes all activities (internal and external to IT) that a business must perform to mitigate the impact of planned and unplanned downtime. It involves proactive measures, such as business impact analysis and risk assessments, data protection, and security, and reactive countermeasures, such as disaster recovery and restart, to be invoked in the event of a failure. The goal of a business continuity solution is to ensure the information availability required to conduct vital business operations.

Information Availability (IA) It refers to the ability of the infrastructure to function according to business expectations during its specified time of operation. Information availability ensures that people (employees, customers, suppliers, and partners) can access information whenever they need it. Information availability can be defined with the help of reliability, accessibility and timeliness.

Information Unavailability Various planned and unplanned incidents result in data unavailability: 1. Planned incidents include installation/ integration/ maintenance of new hardware, software upgrades, taking backups, application and data restores, facility operations (renovation and construction), and refresh/migration of the testing to the production environment. 2. Unplanned incidents include failure caused by database corruption, component failure, and human errors.

Another type of incident that may cause data unavailability is natural or man-made disasters such as flood, fire, earthquake, etc.

Fig: Cause of Information Unavailability

Measuring Information Availability 1. Mean Time Between Failure (MTBF): It is the average time available for a system or component to perform its normal operations between failures. 2. Mean Time To Repair (MTTR): It is the average time required to repair failed component. While calculating MTTR, it is assumed that the fault responsible for the failure is correctly identified

Measuring Information Availability Mean Time To Repair (MTTR): It includes the time required to do the following: detect the fault, mobilize the maintenance team, diagnose the fault, obtain the spare parts, repair, test, and resume normal operations.

Measuring Information Availability IA is the fraction of a time period that a system is in a condition to perform its intended function upon demand. It can be expressed in terms of system uptime and downtime and measured as the amount or percentage of system uptime: IA = system uptime / (system uptime + system downtime) In terms of MTBF and MTTR, IA could also be expressed as IA = MTBF / (MTBF + MTTR)

BC Terminology 1. Disaster recovery: This is the coordinated process of restoring systems, data, and the infrastructure required to support key ongoing business operations in the event of a disaster. It is the process of restoring a previous copy of the data and applying logs or other necessary processes to that copy to bring it to a known point of consistency. Once all recoveries are completed, the data is validated to ensure that it is correct.

BC Terminology 2. Disaster restart: This is the process of restarting business operations with mirrored consistent copies of data and applications. 3. Recovery-Point Objective (RPO): This is the point in time to which systems and data must be recovered after an outage. It defines the amount of data loss that a business can endure. A large RPO signifies high tolerance to information loss in a business.

BC Terminology 3. Recovery- Time Objective (RTO) : The time within which systems, applications, or functions must be recovered after an outage. It defines the amount of downtime that a business can endure and survive. Businesses can optimize disaster recovery plans after defining the RTO for a given data center or network.

BC Planning Lifecycle The BC planning lifecycle includes five stages : 1. Establishing objectives 2. Analyzing 3. Designing and developing 4. Implementing 5. Training, testing, assessing, and maintaining

Figure: BC planning lifecycle

BC Planning Lifecycle 1. Establishing objectives Determine BC requirements. Estimate the scope and budget to achieve requirements. Select a BC team by considering subject matter experts from all areas of the business, whether internal or external. Create BC policies.

BC Planning Lifecycle 2. Analyzing Collect information on data profiles, business processes, infrastructure support, dependencies, and frequency of using business infrastructure. Identify critical business needs and assign recovery priorities. Create a risk analysis for critical areas and mitigation strategies. Conduct a Business Impact Analysis (BIA). Create a cost and benefit analysis based on the consequences of data unavailability. Evaluate options

BC Planning Lifecycle 3. Designing and developing Define the team structure and assign individual roles and responsibilities. For example, different teams are formed for activities such as emergency response, damage assessment, and infrastructure and application recovery. Design data protection strategies and develop infrastructure. Develop contingency scenarios. Develop emergency response procedures. Detail recovery and restart procedures.

BC Planning Lifecycle 4. Implementing Implement risk management and mitigation procedures that include backup, replication, and management of resources. Prepare the Disaster Recovery (DR) sites that can be utilized if a disaster affects the primary data center. Implement redundancy for every resource in a data center to avoid single points of failure.

BC Planning Lifecycle 5. Training, testing, assessing, and maintaining Train the employees who are responsible for backup and replication of business-critical data on a regular basis or whenever there is a modification in the BC plan. Train employees on emergency response procedures when disasters are declared. Train the recovery team on recovery procedures based on contingency scenarios. Perform damage assessment processes and review recovery plans. Test the BC plan regularly to evaluate its performance and identify its limitations. Assess the performance reports and identify limitations. Update the BC plans and recovery/restart procedures to reflect regular changes within the data centre.

General Conditions for Backup Installed storage capacity doubles every 4-12 months depending upon the company requirement. The data set is thus often growing more quickly than the infrastructure in general (personnel, network capacity). Nowadays, business processes have to be adapted to changing requirements all the time. As business processes change, so the IT systems that support them also have to be adapted. As a result, the daily backup routine must be continuously adapted to the ever-changing IT infrastructure.

As a result of globalisation, the Internet and e- business, more and more data has to be available around the clock. Network backup can help us to get to grips with these problems.

Network Backup Services Network backup systems such as Arcserve (Computer Associates), NetBackup (Symantec/Veritas), Networker (EMC/Legato) and Tivoli Storage Manager (IBM) provide the following services: 1. Backup 2. Archive 3. Hierarchical Storage Management (HSM)

Network Backup Services Backup The main task of network backup systems is to back data up regularly. To this end, at least one up-to-date copy must be kept of all data, so that it can be restored after a hardware or application error ( file accidentally deleted or destroyed by editing, error in the database programming ).

Network Backup Services Archive The goal of archiving is to freeze a certain version of data so that precisely this version can be retrieved at a later date. For example, at the end of a project the data that was used can be archived on a backup server and then deleted from the local hard disk. This releases local disk space and accelerates the backup and restore processes, because only the data currently being worked on needs to be backed up or restored.

Network Backup Services Hierarchical Storage Management (HSM) HSM moves files that have not been accessed for a long time from the local disk to the backup server; only a directory entry remains in the local file server. The entry in the directory contains meta-information such as file name, owner, access rights, date of last modification and so on.

PERFORMANCE BOTTLENECKS OF NETWORK BACKUP 1. Application-specific performance bottlenecks 2. Performance bottlenecks due to server-centric IT architecture

1. Application-specific performance bottlenecks are all those bottlenecks that can be traced back to the network backup application. The main candidate is the metadata database. Almost every action in the network backup system is associated with one or more operations in the metadata database. If, for example, several versions of a file are backed up, an entry is made in the metadata database for each version.

1. Application-specific performance bottlenecks The backup of a file system with several hundreds of thousands of files can thus be associated with a whole range of database operations. A further candidate is the storage hierarchy: when copying the data from hard disk to tape the media manager has to load the data from the hard disk into the main memory via the I/O bus and the internal buses, only to forward it from there to the tape drive via the internal buses and I/O bus.

1. Application-specific performance bottlenecks This means that the buses can get clogged up during the copying of the data from hard disk to tape. The same applies to tape reclamation.

2. Performance bottlenecks due to servercentric IT architecture In a server-centric IT architecture storage devices only exist in relation to servers; access to storage devices always takes place via the computer to which the storage devices are connected. The performance bottlenecks described in the following apply for all applications that are operated in a server-centric IT architecture.

2. Performance bottlenecks due to servercentric IT architecture Let us assume that a backup client wants to back data up to the backup server Figure : In network backup, all data to be backed up must be passed through both computers. Possible performance bottlenecks are internal buses, CPU and the LAN.

2. Performance bottlenecks due to servercentric IT architecture The backup client loads the data to be backed up from the hard disk into the main memory of the application server via the SCSI bus, the PCI bus and the system bus, only to forward it from there to the network card via the system bus and the PCI bus. On the backup server the data must once again be passed through the buses twice.

2. Performance bottlenecks due to servercentric IT architecture During backup, therefore, the buses of the participating computers can become a bottleneck, particularly if the application server also has to bear the I/O load of the application or the backup server is supposed to support several simultaneous backup operations. The network card transfers the data to the backup server via TCP/IP and Ethernet. Previously the data exchange via TCP/IP was associated with a high CPU load. However, the CPU load caused by TCP/IP data traffic can be reduced using TCP/IP offload engines

BACKUP CLIENTS A platform-specific client (backup agent) is necessary for each platform to be backed up. The base client can back up and archive files and restores them if required. The term platform is used here to mean the various operating systems and the file systems that they support. Some base clients offer HSM for selected file systems. The backup of file systems takes place at file level as standard.

BACKUP CLIENTS This means that each changed file is completely re-transferred to the server and entered there in the metadata database. By using backup at volume level and at block level it is possible to change the granularity of the objects to be backed up. When backup is performed at volume level, a whole volume is backed up as an individual object on the backup server.

BACKUP CLIENTS Although this has the disadvantage that free areas, on which no data at all has been saved, are also backed up, only very few metadata database operations are necessary on the backup server and on the client side it is not necessary to spend a long time comparing which files have changed since the last backup. As a result, backup and restore operations can sometimes be performed more quickly at volume level than they can at file level. This is particularly true when restoring large file systems with a large number of small files.

BACKUP CLIENTS Backup on block level optimises backup for members of the external sales force, who only connect up to the company network now and then by means of a laptop via a dial-up line or the Internet. In this situation the performance bottleneck is the low transmission capacity between the backup server and the backup client.

BACKUP CLIENTS When backing up on block level the backup client additionally keeps a local copy of every file backed up. If a file has changed, it can establish which parts of the file have changed. The backup client sends only the changed data fragments (blocks) to the backup server. This can then reconstruct the complete file. Thus, when backing up on block level the quantity of data to be transmitted is reduced at the cost of storage space on the local hard disk.

BACKUP CLIENTS When backing up on block level the backup client additionally keeps a local copy of every file backed up. If a file has changed, it can establish which parts of the file have changed. The backup client sends only the changed data fragments (blocks) to the backup server. This can then reconstruct the complete file. Thus, when backing up on block level the quantity of data to be transmitted is reduced at the cost of storage space on the local hard disk.

BACKUP OF FILE SYSTEMS 1. Backup of file servers 2. Backup of file systems 3. Backup of NAS servers 4. The Network Data Management Protocol (NDMP)** (Note: For explanation refer Text Book1: page no. 288 to 291 )

BACKUP OF DATABASES 1. Functioning of database systems 2. Classical backup of databases 3. Next generation backup of databases (Note: For explanation refer Text Book1: page no. 299 to 303)

NEXT GENERATION BACKUP 1. Server-free backup 2. LAN-free backup 3. LAN-free backup with shared disk file systems 4. Backup using instant copies 5. Data protection using remote mirroring 6. Tape library sharing (Note: For explanation refer Text Book1: page no. 279 to 286 )