Technical Consideration in Disaster Recovery
  Daniel J. Morris, Business Solutions Consultant
  November 2008
  GLOBAL CAPABILITY. PERSONAL ACCOUNTABILITY.
  © 2008 Verizon. All Rights Reserved.
"And to think... those wimps at the power company use straps and cleats to get up this high!"
"Jack stands? Hah! Who needs 'em?"
I'm sure this guy still wonders why he got fired over this!
Step 1: Remove shoes
Step 2: Place metal ladder in water
Step 3: Begin using power tools while standing barefoot on metal ladder in water
Topics
  The Business Drivers of BCDR
  Protecting the Information: Remote Mirroring
  Network Impact: Topology & Protocol Considerations
  The Verizon Storage Practice: Putting the Pieces Together
The Business Drivers
Business Continuance & Disaster Recovery are Different
  Business Continuance
    Focus is on avoiding disruption
    Deploy local and remote technology
    Zero downtime can be achieved
    Develop contingency plans
    Procedures to maintain business functions
    Train, Test, and Maintain
  Disaster Recovery
    Focus is on recovering from disruption
    Deploy primarily remote technology
    Close to zero downtime can be achieved
    Develop recovery plans
    Procedures to recover business functions
    Train, Test, and Maintain
Business Considerations: Starts with Leadership!!
  POLICY
  RISK MANAGEMENT
Technologies Dictated By RPO/RTO
  [Chart: Hours of Lost Transactions (RPO) and Hours Required to Resume Business (RTO) on an axis from -24 to 84 hours, plotted against Cost Per Month]
  Technologies, in chart order: Full Volume Tape Backup, Nightly Tape Vaulting, Database Journaling, Consistent Recovery Restart, Asynchronous Point in Time Copy, Continuous Asynchronous, Synchronous Mirror
  Cost Per Month labels: 20K, 30K, 40K, 60K, 90K, 150K, 250K
  Recovery timeline: Transactions Not Captured, Declaration, Data Retrieval, Transit, System Restore, IPL & Network, Database Restore, Transaction Recreation
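
The recovery timeline on this slide suggests that RTO is the sum of sequential phases, from declaration through transaction recreation. A minimal sketch of that arithmetic follows. The phase names come from the slide; every hour value, and the function names estimate_rto_hours and meets_objectives, are hypothetical placeholders chosen for illustration.

```python
# Recovery phases named on the slide. The hour values are hypothetical
# placeholders, not figures from the presentation.
RECOVERY_PHASES_HOURS = {
    "Declaration": 2.0,
    "Data Retrieval": 4.0,
    "Transit": 6.0,
    "System Restore": 8.0,
    "IPL & Network": 4.0,
    "Database Restore": 12.0,
    "Transaction Recreation": 12.0,
}


def estimate_rto_hours(phases):
    """Sum sequential recovery phases to estimate hours required to resume business."""
    return sum(phases.values())


def meets_objectives(rpo_hours, rto_hours, target_rpo_hours, target_rto_hours):
    """Return True when both the data-loss window and the recovery time fit the targets."""
    return rpo_hours <= target_rpo_hours and rto_hours <= target_rto_hours


if __name__ == "__main__":
    rto = estimate_rto_hours(RECOVERY_PHASES_HOURS)
    print(f"Estimated RTO: {rto:.0f} hours")
    print("Meets 24h RPO / 72h RTO:", meets_objectives(24, rto, 24, 72))
```

Swapping in measured phase durations from a recovery test shows quickly which phase dominates the RTO and therefore which technology change would buy the most time.
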
Protecting Information
Tiered Storage: The First Step in Protecting Data
  Tier: Tier 1 | Tier 2 | Tier 3 | Tier 4 | Tier 5
  Availability (unplanned downtime): Seconds to minutes | Minutes to hours | Hours | Hours | Hours to days
  Performance (workload): Dynamic workload, highest transaction volume | High performance for constant workloads | Moderate performance, primarily read access | Internet performance, primarily read access | N/A
  Recovery point: Seconds | Seconds to minutes | Minutes to hours | Up to 24 hours | Up to 72 hours
  Storage technology: High-end remote replication | High-end/midrange Fibre Channel disk, local replication | ATA, low-cost Fibre Channel | CAS | Tape
Protection Level Distinctions
  PARAMETER: BACK-UP | ARCHIVE | MIRRORING
  DATA TYPE: Secondary Copy | Primary Copy | Secondary Copy
  RETENTION DURATION: Long Term, Overwritten Data | Long Term Retention | Short Term Retention
  DATA ACCESS LAYER: File Level Access | Block and/or File* | Block Level
  RPO/RTO CHARACTERISTIC: Long RPO/RTO* | UNPROTECTED | Typically Short RPO/RTO
  CONTROLLER MECHANISM: Appliance, VTL, Library | Host, Appliance, Various Platforms | Host, Appliance, Array
  MEDIA: Tape/Disk/CD | Tape/Disk/CD | Disk
RTO/RPO Drivers for Remote Protection Schemas
  RTOs/RPOs coupled w/ Geographic Diversity dictate:
    Level Of Protection
    Level Of Network Requirements
    Level Of Application Requirements
  TIER 1: SYNCH MIRROR
  TIER 2: ASYNCH MIRROR
  TIER 3: BACKUP & OFF Tape
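
The three tiers on this slide can be read as a simple selection rule driven by the recovery point objective. The sketch below is one hedged way to express it. The slide gives no numeric boundary between the asynchronous mirror and backup tiers, so ASYNC_MAX_RPO_SECONDS is a hypothetical cut-off, and the names Protection and choose_protection are illustrative only.

```python
from enum import Enum


class Protection(Enum):
    SYNC_MIRROR = "Tier 1: synchronous mirror"
    ASYNC_MIRROR = "Tier 2: asynchronous mirror"
    BACKUP_TAPE = "Tier 3: backup and tape"


# Hypothetical cut-off: the slide does not state where Tier 2 ends and Tier 3 begins.
ASYNC_MAX_RPO_SECONDS = 4 * 3600


def choose_protection(rpo_seconds):
    """Pick a protection tier from the acceptable data loss, in seconds."""
    if rpo_seconds < 0:
        raise ValueError("RPO cannot be negative")
    if rpo_seconds == 0:
        # Zero data loss requires every write to be mirrored before it is acknowledged.
        return Protection.SYNC_MIRROR
    if rpo_seconds <= ASYNC_MAX_RPO_SECONDS:
        return Protection.ASYNC_MIRROR
    return Protection.BACKUP_TAPE


if __name__ == "__main__":
    for rpo in (0, 300, 24 * 3600):
        print(rpo, "->", choose_protection(rpo).value)
```

A real policy would also weigh distance and application requirements, which the slide lists alongside RPO and RTO; this sketch covers the RPO input only.
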
Mirroring Considerations: Storage Systems
  Controller Type: HOST BASED, APPLIANCE BASED, ARRAY BASED & TYPE
  Mirroring Methodology: ASYNCHRONOUS, SYNCHRONOUS, BLENDED, BI-DIRECTIONAL, MULTI-DIRECTIONAL
  Storage Topology: NAS, CAS, SAN
  ROUTING PROTOCOL: FCIP, FCoE, iSCSI, TCP/IP, GFP/SONET
  DISTANCE LATENCY THRESHOLDS (Application)
  NETWORK TOPOLOGY: DETERMINISTIC, NON-DETERMINISTIC
  BANDWIDTH CONSIDERATIONS
  DATA CHANGE RATES
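
The distance latency thresholds listed here matter most for synchronous mirroring, where each write waits for the remote site to acknowledge it. A rough sketch of the propagation arithmetic follows. It assumes light travels about 200,000 km/s in fiber (roughly 5 microseconds per km one way) and ignores switch, protocol, and array overhead. The function names and the one-round-trip default are assumptions for illustration, not figures from the slides.

```python
# Approximate propagation speed in optical fiber: ~200,000 km/s, i.e. 200 km per ms.
FIBER_KM_PER_MS = 200.0


def round_trip_ms(distance_km, round_trips_per_write=1):
    """Propagation delay added to each synchronous write, in milliseconds."""
    return 2 * distance_km / FIBER_KM_PER_MS * round_trips_per_write


def max_sync_distance_km(latency_budget_ms, round_trips_per_write=1):
    """Longest fiber path that keeps propagation delay within the application's budget."""
    return latency_budget_ms * FIBER_KM_PER_MS / (2 * round_trips_per_write)


if __name__ == "__main__":
    print(f"100 km adds about {round_trip_ms(100):.1f} ms per write")
    print(f"A 1 ms budget allows about {max_sync_distance_km(1.0):.0f} km")
```

Some replication protocols need more than one round trip per write; passing round_trips_per_write=2 halves the usable distance for the same budget.
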
Putting Tiered Info Into Context
Protection Schemas by Service Level Policy
  PLAN: Service Levels and Business Requirements (Information Protection Services)
    Tiered availability
      Acceptable data loss: 0 seconds | Seconds to minutes | Hours | >24 hours
      Business application availability: Minutes | Minutes | Hours | >24 hours
      Business disruption: Very low | Low | Medium | High
  CORE PLATFORMS: Replication Alternatives, Design, and Technology Portfolio
    Symmetrix: SRDF/S, SRDF/A, SRDF/AR, SRDF/DM
    CLARiiON: MirrorView/S, MirrorView/A, SAN Copy
    RecoverPoint: RecoverPoint CDP and CRR
    EMC Centera: EMC Centera Archive Replicator
    Celerra: Celerra Replicator
    Server clustering and replication: AutoStart, VMware, RepliStor, OnCourse
    Backup and recovery: Backup to disk (CLARiiON Disk Library, CLARiiON CX and CX3 UltraScale series (SAN), NS Series (LAN), NetWorker)
  NETWORK CONNECTIVITY
  RECOVERY PROVIDERS/FACILITIES
  Build: Integration and Plan Development Services
  Manage: Residency Services
Complete Tiered Disaster Recovery Protection
  [Diagram: Production Site and Standby Site]
SCENARIO 1: Tier One Replication Requirements
SRDF Family Multi-Site Protection Options
  SRDF/Star
    Multi-site protection
    Includes SRDF/A link between two remote sites to continue protection if a site fails
    [Diagram: Source, Near Site, Far Site; links labeled SRDF/S, SRDF/A, SRDF/A]
  Cascaded SRDF (New)
    Multi-site protection
    SRDF/S between Source and Near Site; SRDF/A between Near Site and Far Site
    Eliminates need for BCV cycling at Near Site; improves recovery-point objectives at Far Site
    [Diagram: Source, Near Site, Far Site; links labeled SRDF/S, SRDF/A]
  Concurrent SRDF
    Multi-site protection leveraging a single source and concurrently replicating to two remote sites
    [Diagram: Source, Near Site, Far Site; links labeled SRDF/S, SRDF/A]
SCENARIO 2: Mid Tier Array Based Replication
CLARiiON Remote-Replication Family
  MirrorView/Synchronous
    RPO: Zero seconds
    Both images identical
    Limited distance
    High network bandwidth
    One primary to one or two secondaries
    [Diagram: Primary to Secondary over limited distance, steps 1 to 4]
  MirrorView/Asynchronous
    RPO: 30 minutes to hours
    Target updated periodically
    Unlimited distance
    Restartable copy on secondary if session fails
    Optimized for low network bandwidth (consumes 100 Mb/s maximum)
    One primary to one secondary
    [Diagram: Primary to Secondary over unlimited distance, steps 1 to 5]
  SAN Copy
    RPO: Hours to days
    Data mobility between tiers, CLARiiON, and qualified third-party arrays
    Disaster recovery with application coordination for restartable copy on secondary site
    Available as incremental or full modes
    Unlimited distance
    One source to multiple destinations (up to 100)
    [Diagram: Primary to Secondary over unlimited distance, steps 1 to 3]
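
For periodic (asynchronous) updates, the achievable RPO depends on how long each batch of changed data takes to cross the link. The sketch below works through that arithmetic using the 100 Mb/s ceiling quoted for MirrorView/Asynchronous. It assumes decimal units (1 GB = 8e9 bits), a fully available link, and no compression or protocol overhead; the function names are hypothetical.

```python
def transfer_seconds(changed_gb, link_mbps=100.0):
    """Seconds needed to ship a batch of changed data over the replication link."""
    bits = changed_gb * 8e9  # decimal gigabytes to bits
    return bits / (link_mbps * 1e6)


def required_mbps(changed_gb_per_hour):
    """Sustained link rate needed to keep up with a given hourly data change rate."""
    return changed_gb_per_hour * 8e9 / 3600 / 1e6


if __name__ == "__main__":
    secs = transfer_seconds(45)
    print(f"45 GB of changes at 100 Mb/s takes {secs / 60:.0f} minutes")
    print(f"Keeping up with 45 GB/hour needs {required_mbps(45):.0f} Mb/s")
```

If the change rate exceeds what the link can carry, the secondary falls further behind with every cycle, so the measured data change rate should be compared against this figure before settling on an RPO.
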
SCENARIO 3: Bandwidth Constrained Replication
RecoverPoint for Heterogeneous Environments
  Application response time
  Application-consistent recovery
  Corruption protection
  Existing infrastructure
  Communications cost
  Disaster-recovery testing
  Heterogeneous storage
  [Diagram: local site and remote site, each with Oracle, Exchange, and SQL hosts, a SAN, and IBM, HDS, HP, EMC, and STK storage]
HA Clustering: Marrying Remote Servers to Remote Storage
  [Graphic: "SYSTEM FAILURE...ATTEMPTING TO REBOOT NTFS ERROR 14CXX011322"]
  Born out of Grid Computing
  Create loosely or tightly coupled systems
  High Availability Clustering provides a heartbeat between server nodes
  Typically requires exacting specs/configs between servers
  Geo-Clustering, coupled with resilient high-speed networks and disk mirroring, is the ultimate Business Continuity solution
  AKA: Stretch, Metro, Dispersed, or Extended Clustering
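
The heartbeat between server nodes mentioned on this slide can be illustrated with a small timeout-based failure detector. This is a generic sketch, not the mechanism of any specific clustering product: the class name HeartbeatMonitor, the timeout value, and the node names are all hypothetical, and timestamps are passed in explicitly so the logic is easy to follow.

```python
class HeartbeatMonitor:
    """Track the last heartbeat per node and flag nodes that have gone quiet."""

    def __init__(self, timeout_seconds):
        self.timeout_seconds = timeout_seconds
        self.last_seen = {}

    def record(self, node, now):
        """Note that a heartbeat from `node` arrived at time `now` (seconds)."""
        self.last_seen[node] = now

    def failed_nodes(self, now):
        """Return nodes whose last heartbeat is older than the timeout."""
        return sorted(
            node
            for node, seen in self.last_seen.items()
            if now - seen > self.timeout_seconds
        )


if __name__ == "__main__":
    monitor = HeartbeatMonitor(timeout_seconds=5)
    monitor.record("production", now=0)
    monitor.record("standby", now=0)
    monitor.record("production", now=4)  # standby stays silent
    print("Suspected failures:", monitor.failed_nodes(now=6))
```

Over long distances the timeout has to allow for network latency and jitter, otherwise a slow link can be mistaken for a failed node and trigger an unnecessary failover.
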
SCENARIO 1: Clustering w/ SRDF
EMC AutoStart for SRDF/S and SRDF/A
  Zero data-loss business-restart solution with SRDF/S
  Controlled-data-loss business-restart solution with SRDF/A
  AutoStart failover can be automatic or automated (requires operator)
  Business restart automates application restart on top of a disaster-restartable copy of data
  AutoStart ensures R2s are in a consistent state prior to restart
    Verifies that no invalid tracks are owed to R2 prior to bringing applications online with SRDF/S
  Requires DMX with Enginuity 5670 or higher
  AutoStart performs dynamic swap with Enginuity 5x71 and EMC Solutions Enabler V6.2 or higher
  [Diagram: Production site and Secondary site connected by heartbeat over IP; R1 devices replicate to R2 devices across N-1 nodes]
SCENARIO 2: EMC RecoverPoint with VMware Site Recovery Manager
  Replicate VMware VMFS (Virtual Machine File System) across heterogeneous storage
  Compress data, optimize bandwidth (up to 10 times)
  Protect and recover a single virtual machine or the entire VMware ESX Server
  Protect virtual environments with local and/or remote point-in-time recovery
  [Diagram: PRODUCTION and DISASTER RECOVERY sites, each with Virtual Center, VMware Site Recovery Manager (SRM) with SRA, virtual machines (APP/OS) on VMware Infrastructure servers, and heterogeneous storage; the sites are linked by EMC RecoverPoint through the EMC RecoverPoint Adapter for VMware Site Recovery Manager]
Network Impact
Technology Considerations: Network
  Protocol: Typical Usage & Native Distance | Maximum Throughput
  ESCON: Mainframe; droop after 9km | 200Mbps
  FICON: Mainframe (FC + ESCON); droop after 120km* (Apps Issue) | 400Mbps
  FCP (FCoS): Open Systems; Chan Xtenders; 10km w/ single mode LC; VARIES | Up to 10Gbps
  FCIP: Single dedicated tunnel over Ethernet (Noisy) | GigE Speeds (Shared up to 10Gbps)
  iSCSI: Direct map of SCSI to IP; subordinate to network* | GigE Speeds (Shared up to 10Gbps)
  iFCP: Encapsulated & routable FCP; map to SCSI to IP | GigE Speeds (Shared up to 10Gbps)
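
The protocol comparison above can be encoded as data and filtered against a replication requirement. The sketch below is a minimal illustration. The throughput figures are the slide's own, with "GigE speeds" taken as a nominal 1,000 Mbps, which is an assumption; the runs_over_ip flag is an added annotation based on FCIP, iSCSI, and iFCP being IP-based transports; the names Protocol and candidates are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Protocol:
    name: str
    max_mbps: int        # maximum throughput from the slide (GigE assumed to be 1,000 Mbps)
    runs_over_ip: bool   # added annotation: True for the IP-based transports


PROTOCOLS = [
    Protocol("ESCON", 200, False),
    Protocol("FICON", 400, False),
    Protocol("FCP", 10_000, False),
    Protocol("FCIP", 1_000, True),
    Protocol("iSCSI", 1_000, True),
    Protocol("iFCP", 1_000, True),
]


def candidates(required_mbps, needs_ip=False):
    """List protocols fast enough for the requirement, optionally IP-based only."""
    return [
        p.name
        for p in PROTOCOLS
        if p.max_mbps >= required_mbps and (p.runs_over_ip or not needs_ip)
    ]


if __name__ == "__main__":
    print("300 Mbps, any transport:", candidates(300))
    print("300 Mbps, IP network only:", candidates(300, needs_ip=True))
```

Throughput is only one filter; the distance limits in the same table would need to be checked separately for each site pair.
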
Case 1: Use VPN to connect to DCs
  Pros:
    Bandwidth flexibility (1M to 1G)
    Easier upgrades
    Ethernet everywhere
    Ease of adding/removing nodes
    Any-to-any communication or specific point-to-point links
  Challenges:
    No support for storage standards FC, ESCON, FICON
    No SLAs for latency and jitter
    Data rate typically limited to 1G
  [Diagram: Ethernet Access links into a Metro (National) VPN, Layer 2 or Layer 3]
Case 2: Inter-DC Link Ring Options
  Pros:
    Greater bandwidth (up to 440G)
    Supports all sorts of interfaces: Ethernet, SONET, FC, FICON, etc.
    Protected service
    Fixed latency and no jitter
  Challenges:
    Cost
    Only metro deployment
  [Diagram label: Metro (National) VPN, Layer 2 or Layer 3]
VZ Storage Practice
Look at the full picture
  There is a tendency to address business needs for BCDR one item at a time.
  Most requests are intended to address an immediate need and are not part of an overall design.
  The result is a less-than-optimal design.
Building to the Full Picture
  [Diagram labels: Remote Clients; Remote Apps/Servers; WAE/WAAS Infrastructure Collapse; WAE MGR; WAE DEV; LAN; MPLS/PIP NETWORK; SAN; DMX-4 ARRAY; STANDBY ARRAY (Target); NS-NAS FILERS; NAS Gateway to DMX4 plus Centera; Centera Active Archive for Filesystems; Archive Files; Disk Xtender; POLICY; BUDGET; RPO & RTOs]
Positioning the Right SLA Tiers between Information, Application & Networks
  [Diagram labels: Objectives; ILM; FW; Network Topologies; Optimization; BCDR; SES Switch; CPA/IP-MPLS Backbone; Service Edge Router; SONET DWDM Optical Backbone; OCn UNI; DS1 UNI; DS3 UNI; Fast Packet ATM Cloud; Fast Packet Frame Relay Cloud; Class 5 Switch; NAS; BACK UP; ARCHIVAL]
The California Strategic Sourcing Initiative (CSSI)
  DGS Contract #: 1S-05 05-70-10 (Open Systems Hardware, Software & Services)
  DGS Contract #: 1S-05 05-70-11 (Mainframe Systems Hardware, Software & Services)
  Features and Benefits:
    Competitively bid contracts
    Reliable storage solutions
    Pre-negotiated rates for EMC solutions
    In alignment with Integrated IT Governance Approach
    Guaranteed Small Business Participation
    EMC-accredited pre-sales engineering support
    No cap on contract/order value
    Allows use of design/build approach
    Reduced time and cost in procurement
    No requirement to use traditional RFP, RFQ or FSR process
  CSSI WEBSITE: http://verizon.ca.ssicatalog.com/desktopdefault.aspx
Contact Info
  Daniel Morris
  11080 White Rock Road, Rancho Cordova, CA
  Email: daniel.j.morris@verizonbusiness.com
  Office: 916-779-5695
  Cell: 916-803-0478
Thank you for your time!