ECE Enterprise Storage Architecture. Fall 2018

Similar documents
ECE Engineering Robust Server Software. Spring 2018

Chapter 12 Backup and Recovery

Information Storage and Management TM Volume 2 of 2 Student Guide. EMC Education Services

Business Continuity and Disaster Recovery. Ed Crowley Ch 12

Content Addressed Storage (CAS)

Chapter 10 Protecting Virtual Environments

Trends in Data Protection and Restoration Technologies. Mike Fishman, EMC 2 Corporation

Understanding Virtual System Data Protection

IBM Spectrum Protect Version Introduction to Data Protection Solutions IBM

Exploring Options for Virtualized Disaster Recovery

Chapter 11. SnapProtect Technology

arcserve r16.5 Hybrid data protection

Disaster Recovery-to-the- Cloud Best Practices


IBM Tivoli Storage Manager Version Introduction to Data Protection Solutions IBM

High Availability- Disaster Recovery 101

Copyright 2010 EMC Corporation. Do not Copy - All Rights Reserved.

Exploring Options for Virtualized Disaster Recovery. Ranganath GK Solution Architect 6 th Nov 2008

HP Designing and Implementing HP Enterprise Backup Solutions. Download Full Version :

High Availability- Disaster Recovery 101

Simplify Backups. Dell PowerVault DL2000 Family

Application Recovery. Andreas Schwegmann / HP

Module 4 STORAGE NETWORK BACKUP & RECOVERY

EMC Data Domain for Archiving Are You Kidding?

How Symantec Backup solution helps you to recover from disasters?

1 Quantum Corporation 1

BC/DR Strategy with VMware

Backup Solution Testing on UCS B and C Series Servers for Small-Medium Range Customers (Disk to Tape) Acronis Backup Advanced Suite 11.

Virtualization with Arcserve Unified Data Protection

High Availability through Warm-Standby Support in Sybase Replication Server A Whitepaper from Sybase, Inc.

Protecting Microsoft Hyper-V 3.0 Environments with Arcserve

Chapter 1. Storage Concepts. CommVault Concepts & Design Strategies:

Microsoft E xchange 2010 on VMware

Using Computer Associates BrightStor ARCserve Backup with Microsoft Data Protection Manager

Nutanix Tech Note. Virtualizing Microsoft Applications on Web-Scale Infrastructure

Disaster Recovery Is A Business Strategy

Virtual Disaster Recovery

A Crash Course In Wide Area Data Replication. Jacob Farmer, CTO, Cambridge Computer

Replication is the process of creating an

Disaster Recovery Options

Copyright 2012 EMC Corporation. All rights reserved.

Disaster Happens; Don t Be Held

White Paper. A System for Archiving, Recovery, and Storage Optimization. Mimosa NearPoint for Microsoft

The Data Protection Rule and Hybrid Cloud Backup

A CommVault White Paper: Business Continuity: Architecture Design Guide

Microsoft SQL Server

SAP HANA. HA and DR Guide. Issue 03 Date HUAWEI TECHNOLOGIES CO., LTD.

Zero Data Loss Recovery Appliance DOAG Konferenz 2014, Nürnberg

RECOVERY SERIES BACKUP

The Data-Protection Playbook for All-flash Storage KEY CONSIDERATIONS FOR FLASH-OPTIMIZED DATA PROTECTION

Copyright 2012 EMC Corporation. All rights reserved.

VPLEX & RECOVERPOINT CONTINUOUS DATA PROTECTION AND AVAILABILITY FOR YOUR MOST CRITICAL DATA IDAN KENTOR

ZYNSTRA TECHNICAL BRIEFING NOTE

Archiving, Backup, and Recovery for Complete the Promise of Virtualisation Unified information management for enterprise Windows environments

The Microsoft Large Mailbox Vision

Backup Solution Testing on UCS B-Series Server for Small-Medium Range Customers (Disk to Tape) Acronis Backup Advanced Suite 11.5

EMC RecoverPoint. EMC RecoverPoint Support

Veritas Storage Foundation for Windows by Symantec

Trends in Data Protection and Restoration Technologies. Jason Iehl, NetApp

Vendor: EMC. Exam Code: E Exam Name: Cloud Infrastructure and Services Exam. Version: Demo

Boost your data protection with NetApp + Veeam. Schahin Golshani Technical Partner Enablement Manager, MENA

Protecting VMware vsphere/esx Environments with CA ARCserve

Real-time Protection for Microsoft Hyper-V

Configuration Guide for Veeam Backup & Replication with the HPE Hyper Converged 250 System

Server Fault Protection with NetApp Data ONTAP Edge-T

Protecting VMware vsphere/esx Environments with Arcserve

What s new. James De Clercq (RealDolmen) Timothy Dewin (Veeam Software)

VMware Mirage Getting Started Guide

vsan Disaster Recovery November 19, 2017

Enhanced Protection and Manageability of Virtual Servers Scalable Options for VMware Server and ESX Server

MOVING TOWARDS ZERO DOWNTIME FOR WINTEL Caddy Tan 21 September Leaders Have Vision visionsolutions.com 1

Accelerate the Journey to 100% Virtualization with EMC Backup and Recovery. Copyright 2010 EMC Corporation. All rights reserved.

Copyright 2012 EMC Corporation. All rights reserved.

Protecting Miscrosoft Hyper-V Environments

TSM Paper Replicating TSM

EMC RECOVERPOINT: ADDING APPLICATION RECOVERY TO VPLEX LOCAL AND METRO

Exam Name: Midrange Storage Technical Support V2

EMC CLARiiON CX3-40. Reference Architecture. Enterprise Solutions for Microsoft Exchange Enabled by MirrorView/S

Virtualizing disaster recovery helps ensure business resiliency while cutting operating costs.

INTRODUCTION TO XTREMIO METADATA-AWARE REPLICATION

Veeam and HP: Meet your backup data protection goals

Protecting Hyper-V Environments

Symantec Backup Exec Blueprints

PASS4TEST. IT Certification Guaranteed, The Easy Way! We offer free update service for one year

Dell Fluid Data solutions. Powerful self-optimized enterprise storage. Dell Compellent Storage Center: Designed for business results

VERITAS Volume Replicator. Successful Replication and Disaster Recovery

IBM SmartCloud Resilience offers cloud-based services to support a more rapid, reliable and cost-effective enterprise-wide resiliency.

50 TB. Traditional Storage + Data Protection Architecture. StorSimple Cloud-integrated Storage. Traditional CapEx: $375K Support: $75K per Year

Veritas Storage Foundation for Windows by Symantec

Offers easy management of all protected devices and data through a unified secure touchfriendly web-based management console.

Protecting Mission-Critical Application Environments The Top 5 Challenges and Solutions for Backup and Recovery

How CloudEndure Disaster Recovery Works

C H A P T E R Overview Figure 1-1 What is Disaster Recovery as a Service?

Executive Summary SOLE SOURCE JUSTIFICATION. Microsoft Integration

XtremIO Business Continuity & Disaster Recovery. Aharon Blitzer & Marco Abela XtremIO Product Management

VMware vsphere Clusters in Security Zones

Chapter 9 Protecting Client Data

10 Reasons Why Your DR Plan Won t Work

Chapter 3 `How a Storage Policy Works

Business Continuity & Disaster Recovery

Transcription:

ECE590-03 Enterprise Storage Architecture Fall 2018 Business Continuity: Disaster Recovery Tyler Bletsch Duke University Includes material adapted from the course Information Storage and Management v2 (modules 9-12), published by EMC corporation.

BC Terminologies 1 Disaster recovery Coordinated process of restoring systems, data, and infrastructure required to support business operations after a disaster occurs Restoring previous copy of data and applying logs to that copy to bring it to a known point of consistency Generally implies use of backup technology Disaster restart Process of restarting business operations with mirrored consistent copies of data and applications Generally implies use of replication technologies Module 9: Introduction to Business Continuity 2

BC Terminologies 2 Recovery-Point Objective (RPO) Point-in-time to which systems and data must be recovered after an outage Amount of data loss that a business can endure Recovery-Time Objective (RTO) Time within which systems and applications must be recovered after an outage Amount of downtime that a business can endure and survive Weeks Tape Backup Weeks Tape Restore Days Periodic Replication Days Disk Restore Hours Minutes Asynchronous Replication Hours Minutes Manual Migration Seconds Synchronous Replication Seconds Global Cluster Recovery-point objective Recovery-time objective Module 9: Introduction to Business Continuity 3

RPO vs RTO 4

BC Planning Lifecycle Establishing Objectives Training, Testing, Assessing, and Maintaining Analyzing Implementing Designing and Developing Module 9: Introduction to Business Continuity 5

Business Impact Analysis Identifies which business units and processes are essential to the survival of the business Estimates the cost of failure for each business process Calculates the maximum tolerable outage and defines RTO for each business process Businesses can prioritize and implement countermeasures to mitigate the likelihood of such disruptions Module 9: Introduction to Business Continuity 6

BC Technology Solutions Solutions that enable BC are: Resolving single points of failure Multipathing software Backup and replication Backup Local replication Remote replication Module 9: Introduction to Business Continuity 7

Single Points of Failure Single Points of Failure It refers to the failure of a component of a system that can terminate the availability of the entire system or IT service. Array port Client IP Switch Server FC Switch Storage Array Module 9: Introduction to Business Continuity 8

Resolving Single Points of Failure Client Redundant Paths Redundant Ports Redundant Arrays NIC NIC HBA HBA IP Redundant Network NIC Teaming NIC NIC HBA HBA Redundant FC Switches Production Storage Array Remote Storage Array Clustered Servers Module 9: Introduction to Business Continuity 9

Multipathing Software Recognizes and utilizes alternate I/O path to data Provides load balancing by distributing I/Os to all available, active paths: Improves I/O performance and data path utilization Intelligently manages the paths to a device by sending I/O down the optimal path: Based on the load balancing and failover policy setting for the device Module 9: Introduction to Business Continuity 10

Backup and Archive 11

Tyler s Immutable Rules Of Backup A BACKUP SOLUTION MUST: 1. Record changes to data over time If I just have the most recent copy, then I just have the most recently corrupted copy. RESULT: MIRRORING ISN T BACKUP!!!! 2. Have a copy at a separate physical location If all copies are in one place, then a simple fire or lightning event can destroy all copies 3. Must be automatic When you get busy, you ll forget, and busy people make the most important data 4. Require separate credentials to access If one compromised account can wipe primary and secondary, then that account is a single point of failure 5. Be unwritable by anyone except the backup software (which ideally should live in the restricted backup environment) If I can cd to a directory and change backups, then the same mistake/attack that killed the primary can kill the backup 6. Reliably report on progress and alert on failure I need to know if it stopped working or is about to stop working 7. Have periodic recovery tests to ensure the right data is being captured Prevent well it apparently hasn t been backing up properly all along, so we re screwed If you encounter backups that don t meet these rules, explain the potential dangers until they do! 12

What is Backup? Backup It is an additional copy of production data that is created and retained for the sole purpose of recovering lost or corrupted data. Organization also takes backup to comply with regulatory requirements Backups are performed to serve three purposes: Disaster recovery Operational recovery Archive Module 10: Backup and Archive 13

Backup Granularity Full Backup Su Su Su Su Su Incremental Backup Su M T W Th F S Su M T W Th F S Su M T W Th F S Su M T W Th F S Su Cumulative (Differential) Backup Su M T W Th F S Su M T W Th F S Su M T W Th F S Su M T W Th F S Su Amount of Data Backup Module 10: Backup and Archive 14

Restoring from Incremental Backup Monday Tuesday Wednesday Thursday Friday Files 1, 2, 3 File 4 Updated File 3 File 5 Files 1, 2, 3, 4, 5 Full Backup Incremental Incremental Incremental Production Less number of files to be backed up, therefore, it takes less time to backup and requires less storage space Longer restore because last full and all subsequent incremental backups must be applied Module 10: Backup and Archive 15

Restoring from Cumulative Backup Monday Tuesday Wednesday Thursday Friday Files 1, 2, 3 File 4 Files 4, 5 Files 4, 5, 6 Files 1, 2, 3, 4, 5, 6 Full Backup Cumulative Cumulative Cumulative Production More files to be backed up, therefore, it takes more time to backup and requires more storage space Faster restore because only the last full and the last cumulative backup must be applied Module 10: Backup and Archive 16

Backup Architecture Backup client Gathers the data that is to be backed up and send it to storage node Backup server Manages backup operations and maintains backup catalog Storage node Responsible for writing data to backup device Manages the backup device Backup Data Backup Client (Application Server) Backup Server Tracking Information Storage Node Backup Catalog Backup Data Backup Device Module 10: Backup and Archive 17

Understanding this traditional model A storage server could even be all three if index data is just kept with backup data Backup Server Backup Catalog Tracking Information This could be disk or tape Backup Data Backup Data Backup Client (Application Server) Storage Node Backup Device Your storage server could be both of these (assuming the onboard disks are your backup media) 18

Backup Operation Application Servers (Backup Clients) 1 Backup server initiates scheduled backup process. 2 3a 3b Backup server retrieves backup-related information from the backup catalog. Backup server instructs storage node to load backup media in backup device. Backup server instructs backup clients to send data to be backed up to storage node. 3b 4 4 5 Backup clients send data to storage node and update the backup catalog on the backup server. Storage node sends data to backup device. 1 2 3a 5 6 Storage node sends metadata and media information to backup server. 7 6 7 Backup server updates the backup catalog. EMC Backup Proven Server Professional Storage Node Backup Device Module 10: Backup and Archive 19

Recovery Operation Application Servers (Backup Clients) 1 Backup client requests backup server for data restore. 2 Backup server scans backup catalog to identify data to be restored and the client that will receive data. 3 Backup server instructs storage node to load backup media in backup device. 4 Data is then read and send to backup client. 1 4 5 Storage node sends restore metadata to backup server. 6 Backup server updates the backup catalog. 2 3 4 6 5 Backup Server Storage Node Backup Device Module 10: Backup and Archive 20

Backup Methods Two methods of backup, based on the state of the application when the backup is performed Hot or Online Application is up and running, with users accessing their data during backup Open file agent can be used to backup open files Cold or Offline Requires application to be shutdown during the backup process Bare-metal recovery OS, hardware, and application configurations are appropriately backed up for a full system recovery Server configuration backup (SCB) can also recover a server onto dissimilar hardware Module 10: Backup and Archive 21

Server Configuration Backup Creates and backs up server configuration profiles, based on user-defined schedules Profiles are used to configure the recovery server in case of production server failure Profiles include OS configurations, network configurations, security configurations, registry settings, application configurations Two types of profiles used Base profile Contains the key elements of the OS required to recover the server Extended profile Typically larger than base profile and contains all necessary information to rebuild application environment Module 10: Backup and Archive 22

Modern virtual environment note In a modern cluster of hypervisors, you don t worry so much about server configuration All servers are similar: they re just dumb hosts for the hypervisor Virtual machines are the true unit of backup in this case Figure from VMware docs here. 23

Key Backup/Restore Considerations Customer business needs determine: What are the restore requirements RPO & RTO? Which data needs to be backed up? How frequently should data be backed up? How long will it take to backup? How many copies to create? How long to retain backup copies? Location, size, and number of files? Module 10: Backup and Archive 24

Module 10: Backup and Archive Lesson 2: Backup Topologies and Backup in NAS Environment During this lesson the following topics are covered: Common backup topologies Backup in NAS environment Module 10: Backup and Archive 25

Direct-Attached Backup LAN Metadata Backup Data Backup Server Application Server/ Backup Client/ Storage Node Backup Device Module 10: Backup and Archive 26

LAN-based Backup Application Server/ Backup Client Backup Server Metadata LAN Backup Data Storage Node Backup Device Module 10: Backup and Archive 27

SAN-based Backup Backup Server LAN Metadata Application Server/ Backup Client FC SAN Backup Data Backup Device Storage Node Module 10: Backup and Archive 28

Mixed Backup Topology Application Server-2/ Backup Client Metadata LAN Metadata FC SAN Backup Data Backup Server Application Server-1/ Backup Client Backup Device Storage Node Module 10: Backup and Archive 29

Backup in NAS Environment Common backup implementations in a NAS environment are: Server-based backup Serverless backup NDMP 2-way backup NDMP 3-way backup Module 10: Backup and Archive 30

Server-based backup Storage Array Application Server/ Backup Client LAN NAS Head FC SAN Backup Data Backup Device Metadata Backup Server/ Storage Node Module 10: Backup and Archive 31

Serverless Backup Storage Array NAS Head Application Server LAN FC SAN Backup Data Backup Device Backup Server/ Storage Node/ Backup Client Module 10: Backup and Archive 32

NDMP 2-way Backup Backup Device Storage Array Backup Data LAN NAS Head FC SAN Application Server/ Backup Client Metadata Backup Server Module 10: Backup and Archive 33

NDMP 3-way Backup NAS Head FC SAN Application Server/ Backup Client Storage Array LAN Private LAN Backup Data Backup Server Metadata NAS Head FC SAN Backup Device Module 10: Backup and Archive 34

Backup consistency Assume live ( hot ) backup Is data crash-consistent, or can we do better? Quiesce: To make consistent at this time (quiescent). Tell the OS that you re about to take a snapshot, request quiescence OS flushes all buffers and commits the journal, pauses all IO, says OK Take snapshot Allow OS to resume Base the backup (which takes longer) off this snapshot Resulting backup is OS consistent Can also be application-aware Same as above, but you tell the application to quiesce Requires backup-aware applications (e.g. Microsoft SQL Server, Oracle database, etc.) Resulting backups are application consistent 35

Module 10: Backup and Archive Lesson 3: Backup Targets During this lesson the following topics are covered: Backup to Tape Backup to Disk Backup to Virtual Tape Module 10: Backup and Archive 36

Backup to Tape Traditionally low cost solution Tape drives are used to read/write data from/to a tape Sequential/linear access Multiple streaming to improve media performance Writes data from multiple streams on a single tape Limitation of tape Backup and recovery operations are slow due to sequential access Wear and tear of tape Shipping/handling challenges Controlled environment is required for tape storage Causes shoe shining effect or backhitching Module 10: Backup and Archive 37

Backup to Disk Enhanced overall backup and recovery performance Random access More reliable Can be accessed by multiple hosts simultaneously Disk Backup/Restore 24 Minutes Typical Scenario: 800 users, 75 MB mailbox 60 GB database Tape Backup/Restore 108 Minutes 0 10 20 30 40 50 60 70 80 90 100 110 120 Recovery Time in Minutes* Source: EMC Engineering and EMC IT Module 10: Backup and Archive 38

Backup to Virtual Tape Disks are emulated and presented as tapes to backup software Does not require any additional modules or changes in the legacy backup software Provides better single stream performance and reliability over physical tape Online and random disk access Provides faster backup and recovery Module 10: Backup and Archive 39

Virtual Tape Library Appliance Virtual Tape Library Backup Server/ Storage Node FC SAN LAN Emulation Engine Storage (LUNs) Backup Clients Module 10: Backup and Archive 40

Backup Target Comparison Tape Disk Virtual Tape Offsite Replication Capabilities No Yes Yes Reliability No inherent protection methods RAID, spare RAID, spare Performance Low High High Use Backup only Multiple (backup and production) Backup only Module 10: Backup and Archive 41

In defense of tape These slides omit a key features of tape that s the reason it s still not dead. You can stick a tape in a vault for 20 years and probably still read it. A tape can t have a head crash, bad bearing, or flaky controller board. Tape is crazy expensive compared to most other backup techniques, but if you need extreme archival capability, it s not wrong to use tape. 42

Module 10: Backup and Archive Lesson 4: Data Deduplication During this lesson the following topics are covered: Deduplication overview Deduplication methods Deduplication implementations Key benefits of deduplication Module 10: Backup and Archive 43

Module 10: Backup and Archive Lesson 5: Backup in Virtualized Environment During this lesson the following topics are covered: Traditional backup approach Image-based backup Module 10: Backup and Archive 44

Backup in Virtualized Environment Overview Backup options Traditional backup approach Image-based backup approach Module 10: Backup and Archive 45

Traditional Backup Approaches Backup agent on VM Requires installing a backup agent on each VM running on a hypervisor Can only backup virtual disk data Does not capture VM files such as VM swap file, configuration file Challenge in VM restore Backup agent on Hypervisor Requires installing backup agent only on hypervisor Backs up all the VM files Backup agent runs on each VM Backup agent runs on Hypervisor = Backup Agent Module 10: Backup and Archive 46

Mount Image-based Backup Creates a copy of the guest OS, its data, VM state, and configurations The backup is saved as a single file image Mounts image on a proxy server Offloads backup processing from the hypervisor Enables quick restoration of VM Application Server Storage Snapshots Proxy Server Backup Device Module 10: Backup and Archive 47

Module 10: Backup and Archive Lesson 6: Data Archive During this lesson the following topics are covered: Fixed content Data archive Archive solution architecture Module 10: Backup and Archive 48

Fixed Content Fixed content is growing at more than 90% annually Significant amount of newly created information falls into this category New regulations require retention and data protection Examples of Fixed Content Electronic Documents Contracts and claims Email attachments Financial spread sheets CAD/CAM designs Presentations Digital Records Documents Checks, securities trades Historical preservation Photographs Personal/professional Surveys Seismic, astronomic, geographic Rich Media Medical X-rays, MRIs, CT Scan Video News/media, movies Security surveillance Audio Voicemail Radio Module 10: Backup and Archive 49

Data Archive A repository where fixed content is stored Enables organizations retaining their data for an extended period of time in order to Meet regulatory compliance Plan new revenue strategies Archive can be implemented as Online Nearline Offline Module 10: Backup and Archive 50

Challenges of Traditional Archiving Solutions Both tape and optical are susceptible to wear and tear Involve operational, management, and maintenance overhead Have no intelligence to identify duplicate data Same content could be archived many times Inadequate for long-term preservation (years-decades) Unable to provide online and fast access to fixed content Module 10: Backup and Archive 51

Content Addressed Storage An Archival Solution Disk-based storage that has emerged as an alternative to traditional archiving solutions Provides online accessibility to archive data Enables organization to meet the required SLAs Provides features that are required for storing archive data Content authenticity and content integrity Location independence Single-instance storage Retention enforcement Data protection Module 10: Backup and Archive 52

Archiving Solution Architecture File Server Archiving Agent Email Server Archiving Agent Archiving Server Archiving Storage Device Module 10: Backup and Archive 53

Use Case: Email Archiving Moves the emails from primary to archive storage, based on policy Saves space on primary storage Enables to retain emails in the archive for longer period to meet regulatory requirements Gives end users virtually unlimited mailbox space File archiving is another use case that benefits from an archival solution Module 10: Backup and Archive 54

Local replication 55

What is Replication? Replication It is a process of creating an exact copy (replica) of data. Replication can be classified as Local replication Replicating data within the same array or data center Remote replication Replicating data at remote site REPLICATION Source Replica (Target) Module 11: Local Replication 56

Uses of Local Replica Alternate source for backup Fast recovery Decision support activities Testing platform Data Migration Module 11: Local Replication 57

Why local replication? Remember my rules? Local replication is useful: Can have lower RPO/RTO Can be cheaper May be sufficient for non-critical workloads where data loss is survivable Local replication is useful but not sufficient for business-critical workloads! 58

Replica Characteristics Recoverability/Restartability Replica should be able to restore data on the source device Restart business operation from replica Consistency Replica must be consistent with the source Choice of replica tie back into RPO Point-in-Time (PIT) Non-zero RPO Continuous Near-zero RPO Module 11: Local Replication 59

Understanding Consistency Consistency ensures the usability of replica Consistency can be achieved in various ways for file system and database Offline Online File System Unmount file system Flushing host buffers Database Shutdown database a)using dependent write I/O principle b)holding I/Os to source before creating replica Module 11: Local Replication 60

Remote replication 61

What is Remote Replication? Process of creating replicas at remote sites Addresses risk associated with regionally driven outages Modes of remote replication Synchronous Asynchronous REPLICATION Storage Array Source site Storage Array Remote site Module 12: Remote Replication 62

Synchronous Replication 1 A write is committed to both source and remote replica before it is acknowledged to the host Ensures source and replica have identical data at all times Maintains write ordering Provides near-zero RPO Host 1 4 3 Source 2 Target at Remote Site Data Write Data Acknowledgment Module 12: Remote Replication 63

Synchronous Replication 2 Response time depends on bandwidth and distance Requires bandwidth more than the maximum write workload Typically deployed for distance less than 200 km (125 miles) between two sites Max Writes MB/s Required bandwidth Typical workload Time Module 12: Remote Replication 64

Asynchronous Replication 1 A write is committed to the source and immediately acknowledged to the host Data is buffered at the source and transmitted to the remote site later Finite RPO Host 1 2 4 Source 3 Replica will be behind the source by a finite amount Target Data Write Data Acknowledgment Module 12: Remote Replication 65

Asynchronous Replication 2 RPO depends on size of buffer and available network bandwidth Requires bandwidth equal to or greater than average write workload Sufficient buffer capacity should be provisioned Can be deployed over long distances Writes MB/s Average Typical workload Required bandwidth Time Module 12: Remote Replication 66

Host-based Remote Replication Replication is performed by host-based software LVM-based replication All writes to the source volume group are replicated to the target volume group by the LVM Can be synchronous or asynchronous Log shipping Commonly used in a database environment All relevant components of source and target databases are synchronized prior to the start of replication Transactions to source database are captured in logs and periodically transferred to remote host Module 12: Remote Replication 67

Storage Array-based Remote Replication 1 Replication is performed by array-operating environment Three replication methods: synchronous, asynchronous, and disk buffered Synchronous Writes are committed to both source and replica before it is acknowledged to host Asynchronous Writes are committed to source and immediately acknowledged to host Data is buffered at source and transmitted to remote site later Module 12: Remote Replication 68

Storage Array-based Remote Replication 2 Disk-buffered Source Device Local Replica Local Replica Remote Replica Production Host Source Array Target Array Production host writes data to source device. A consistent PIT local replica of the source device is created. Data from local replica is transmitted to the remote replica at target. Optionally a PIT local replica of the remote replica on the target is created. Module 12: Remote Replication 69

Three-site Replication Data from source site is replicated to two remote sites Replication is synchronous to one of the remote sites and asynchronous or disk buffered to the other remote site Mitigates the risk in two site replication No DR protection after source or remote site failure Implemented in two ways: Cascade/multihop Triangle/multitarget Module 12: Remote Replication 70

Three-site Replication: Cascade/Multihop Synchronous + Disk Buffered Source Device Remote Replica Synchronous Disk Buffered Local Replica Remote Replica Source Site Bunker Site Remote Site Synchronous + Asynchronous Source Device Remote Replica Synchronous Asynchronous Source Site Bunker Site Remote Site Remote Replica Module 12: Remote Replication 71

Three-site Replication: Triangle/Multitarget Remote Replica Source Device Bunker Site Asynchronous with Differential Resynchronization Source Site Remote Replica Remote Site Module 12: Remote Replication 72

Summary Disaster Recovery (DR) exists to handle cases where High Availability (HA) redundancy is overwhelmed For data, the key is backups; for compute, it s secondary compute servers Backup isn t just mirroring! Rules: 1. Record changes to data over time 2. Have a copy at a separate physical location 3. Must be automatic 4. Require separate credentials to access 5. Be unwritable by anyone except the backup software (which ideally should live in the restricted backup environment) 6. Reliably report on progress and alert on failure 7. Have periodic recovery tests to ensure the right data is being captured Can do replication locally (for low cost, low RTO/RPO) and/or remotely (true DR, RTO/RPO proportional to cost) 73