Universal Storage Consistency of DASD and Virtual Tape


Universal Storage Consistency of DASD and Virtual Tape
Jim Erdahl, U.S. Bank
August 14, 2013
Session Number 13848

AGENDA
Context: mainframe tape and DLm
Motivation for DLm8000
DLm8000 implementation
GDDR automation for DLm8000

Connotation of tape
Tape no longer refers to physical tape media; it is a different tier of storage (non-SMS).
Characteristics:
Serial access
Sequential stream of data
Special catalog and retention controls (CA-1)
Lower cost than DASD
Limited number of drives (256 per VTE)
No space allocation constructs (5GB x 255 volumes = 1,275GB of compressed data)

Characterization of mainframe tape data
Categories: Archive / Retrieval, Operational files, Backup / Recovery
Profile at US Bank:
At least 60% of overall tape data is archive, including HSM ML2, CA View reports, and CMOD statements / images; this data is retrieved extensively.
No more than 25% of tape data is associated with backups, including image copies and archive logs.
Tape data is critical!

DLm Origin / Primer

Origin | Key Component | Role
Bus-Tech MDL | VTEs | Tape drive emulation, FICON interface, compression, tape library portal
EMC NAS (Celerra / CLARiiON) | Data Movers | File system sharing over an IP network / replication
EMC NAS (Celerra / CLARiiON) | SATA disks | Data storage

Tape files reside in file systems; the name of the file is the Volser. Multiple file systems are defined within a tape library to provide sufficient volsers and capacity headroom. A single tape library is typically created per Sysplex and is mounted on specified VTE drives.
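To make the file-system mapping concrete, here is a minimal sketch of how a mount request could resolve a Volser to its backing file. The mount point, directory layout, and helper name are assumptions for illustration, not DLm internals.

```python
import os

# Assumed layout: one directory per file system under a tape library mount
# point; each tape volume is a single file named after its Volser.
TAPE_LIBRARY_ROOT = "/tapelib/PLEX1"      # hypothetical mount point; one library per Sysplex
FILE_SYSTEMS = ["fs01", "fs02", "fs03"]   # multiple file systems for Volser and capacity headroom


def path_for_volser(volser: str) -> str:
    """Locate the file backing a tape Volser by searching each file system."""
    for fs in FILE_SYSTEMS:
        candidate = os.path.join(TAPE_LIBRARY_ROOT, fs, volser)
        if os.path.exists(candidate):
            return candidate
    raise FileNotFoundError(f"Volser {volser} not found in {TAPE_LIBRARY_ROOT}")


# Example: a mount of Volser 'A00123' would resolve to a path such as
# /tapelib/PLEX1/fs02/A00123, which a VTE then presents to z/OS as a tape volume.
```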

DLm Requirements
DLm implemented in 2008 for a technology refresh in conjunction with a Data Center migration, providing:
Consistently good performance
Low maintenance
High scalability and redundancy

Original US Bank Storage Configuration / Replication
Diagram: DC1 (Primary) holds tape and DASD; DASD is replicated synchronously to DC2 (Alternate) and asynchronously to DC3 (DR); tape is replicated asynchronously from DC1 to DC3. Either DC1 or DC2 is the primary site for DASD.

DLm Experience at US Bank
Notable metrics:
Over 50,000 mounts per day
99.9% of all tape mounts fulfilled in less than 1 second
4.5 to 1 compression
Over 720 terabytes of usable capacity
Peak host read / write rate of 1,200 megabytes / second
Async replication to DC3 (DR site) within several minutes
Tape is becoming more critical. If life is so good, what is the motivation for DLm8000?

Problem #1 - Disaster Recovery Scenario
The missing tape problem at DC3 (out-of-region DR site):
Tape replication lags behind DASD replication (RPO measured in minutes versus seconds).
In an out-of-region disaster declaration, tens and maybe hundreds of tape files closed immediately before the disaster have not completely replicated.
But these files are defined in catalogs replicated on DASD (TMS, ICF catalogs, HSM CDSs, IMS RECON, DB2 BSDS, etc.).
Hence, there are critical data inconsistencies between tape and DASD at DC3 (DR site).

Problem #1 (continued)
During a Disaster Recovery at DC3:
1. What if HSM recalls fail because of missing ML2 tape data?
2. What if the DBAs cannot perform database recoveries because archive logs are missing?
3. What if business and customer data archived to tape is missing?
4. How does this impact overall recovery time (RTO)?
5. Are Disaster Recovery capabilities adequate if tapes are missing?

Problem #2 - Local Resiliency Scenario
Local DASD resiliency is not sufficient:
Three-site DASD configuration with synchronous replication between DC1 and DC2 and asynchronous replication to DC3.
Two-site tape configuration with asynchronous replication from DC1 to DC3.
In the event of a DASD catastrophe at the primary site, a local DASD failover is performed non-disruptively with zero data loss, and asynchronous replication is re-instated to DC3.
But what if a catastrophe occurs to tape storage at DC1? Mainframe processing will come to a grinding halt. A disruptive recovery can be performed at DC3, but this is a last resort.
REMEMBER: Tape is becoming more essential.

What are the options?
To solve problems #1 and #2, why not convert all tape allocations to DASD?
The cost is prohibitive, but space allocation is the impediment:
SMS and Data Classes are not adequate.
Massive JCL conversion is required to stipulate space allocation.
To solve problems #1 and #2, why not synchronize tape and DASD replication / failover instead?

EMC has the technologies, but...
1. Can DLm support a Symmetrix backend? Yes
2. Can SRDF/S and SRDF/A handle enormous tape workloads without impacts? Yes
3. Can we achieve universal data consistency between DASD and tape at DC2 and DC3? Yes
4. Can GDDR manage an SRDF/STAR configuration with Autoswap which includes the tape infrastructure? Yes

Creation of DLm8000
US Bank tape resiliency requirements were articulated in executive briefings and engineering round tables.
EMC solicited requirements from other customers and created a business case.
EMC performed a proof of concept validating overall functionality, including failover processes.
EMC built a full-scale configuration based on US Bank's capacity requirements and performed final validation of replication, performance, and failover capabilities.
GDDR automation was designed and developed with significant collaboration across product divisions.
US Bank began implementation in June 2012.

DLm8000 - what is under the hood?

Product Line | Key Component | Role | Management Interface
MDL 6000 | VTEs | Tape drive emulation, FICON interface, compression, tape library portal | ACPs
VNX VG8 | Data Movers | File system sharing over an IP network | Control Stations
VMAX | SATA FBA drives / SRDF | Data storage / replication | Gatekeepers

Migration to the DLm8000
Diagram: VTEs (MDL6000s) attached to both the old backend (NS80s, replicated with Celerra Replicator) and the new backend (VG8 & VMAX, replicated with SRDF/S and SRDF/A).
1. Configured new backend at all three sites
2. Set up SRDF/A and SRDF/S
3. Defined file systems on new backend
4. Partitioned old and new file systems with DLm storage classes
5. Updated scratch synonyms to control scratch allocations by storage class
6. Deployed outboard DLm migration utility to copy tape files
7. Maintained dual replication from old backend and new backend during the migration
8. Incorporated tape into GDDR along with DASD (ConGroup, MSC, STAR, etc.)

DLm Migration Utility Process Overview
The storage administrator creates a list of Volsers and starts the migration utility script on a VTE, with the Volser list and parameter specifications.
For each Volser, the migration utility performs the following actions:
creates the target file in the specified storage class
locks the source and target files
copies the source to the target and performs a checksum
renames the source and target files so that the target file is live
unlocks the files
An average throughput of 25 megabytes per second was achieved per VTE script, with no impact to SRDF replication and no host impact other than an occasional mount delay while a Volser is locked (max copy time < 80 minutes for a 20GB tape).
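The per-Volser flow above can be sketched roughly as follows. This is an illustrative outline only, assuming a simple one-file-per-Volser layout, a lock-file convention, and an MD5 checksum; none of the names or mechanics are those of the actual DLm migration utility.

```python
import hashlib
import shutil
from pathlib import Path


def _checksum(path: Path, chunk: int = 1 << 20) -> str:
    """Stream a checksum so large tape images are never read into memory at once."""
    digest = hashlib.md5()
    with path.open("rb") as f:
        for block in iter(lambda: f.read(chunk), b""):
            digest.update(block)
    return digest.hexdigest()


def migrate_volser(volser: str, old_fs: Path, new_fs: Path) -> None:
    """Copy one Volser from the old backend file system to the new storage class.

    Mirrors the described sequence: create the target, lock, copy and verify
    with a checksum, rename so the target becomes live, then unlock.
    The '.lock', '.migrating', and '.migrated' suffixes are assumptions.
    """
    source = old_fs / volser
    target = new_fs / (volser + ".migrating")       # target created in the new storage class

    lock = old_fs / (volser + ".lock")              # lock: mounts wait while this exists
    lock.touch(exist_ok=False)
    try:
        shutil.copyfile(source, target)             # copy source to target
        if _checksum(source) != _checksum(target):  # verify the copy
            target.unlink()
            raise IOError(f"Checksum mismatch for Volser {volser}")

        target.rename(new_fs / volser)              # rename: the new copy is now live
        source.rename(old_fs / (volser + ".migrated"))
    finally:
        lock.unlink()                               # unlock; mount requests proceed again


def migrate_list(volser_list, old_fs: str, new_fs: str) -> None:
    """Drive the migration from a storage-administrator-supplied Volser list."""
    for volser in volser_list:
        migrate_volser(volser, Path(old_fs), Path(new_fs))
```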

Final US Bank Storage Configuration / Replication
Diagram: DC1 and DC2 each hold tape and DASD, with synchronous replication between them and asynchronous replication to DC3 (DR site), which also holds tape and DASD. ConGroup protects the integrity of the SRDF/S leg; MSC coordinates overall SRDF/A cycle switching. Either DC1 or DC2 is the primary site for tape and DASD.

Chart: SRDF/A RPO in seconds, before and after MSC consolidation, plotted over a 10-minute interval. Series: Before - DASD (5 DMX4s); Before - Tape (1 VMAX); After - DASD and Tape.

How do we monitor / control all of this?
Geographically Dispersed Disaster Restart (GDDR) scripts were updated for DLm8000:
Recover at DC3
Test from BCVs at DC3 and DC2
Restart SRDF/A replication to DC3
Restart SRDF/S replication between DC1 and DC2
Planned Autoswap between DC1 and DC2
Unplanned Autoswap between DC1 and DC2

GDDR automation for a planned storage swap between DC1 and DC2 (high-level steps)
Once the tape workload is quiesced, the GDDR script is initiated:
1. Tape drives varied offline
2. Tape configuration disabled at the swap-from site
3. DASD Autoswap and SRDF swap (R1s not ready, R2s ready and R/W)
4. Failover of Data Movers to the swap-to site
5. VTEs started at the swap-to site
6. Tape drives varied online and tape processing resumes
7. Recovery / resumption of SRDF/S, SRDF/A, and STAR
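The ordering of these steps can be expressed as a small orchestration sketch. Every function below is a hypothetical placeholder that simply prints what the corresponding GDDR-driven action would do; none of these are real GDDR commands or interfaces.

```python
# Hypothetical stand-ins for GDDR-driven actions; shown only to make the
# ordering of the planned-swap steps explicit.

def _step(msg: str) -> None:
    print(msg)  # placeholder for the real automation action


def vary_tape_drives(site: str, state: str) -> None:
    _step(f"Vary tape drives {state} at {site}")


def disable_tape_configuration(site: str) -> None:
    _step(f"Disable tape configuration at {site}")


def autoswap_dasd_and_swap_srdf(swap_from: str, swap_to: str) -> None:
    _step(f"DASD Autoswap and SRDF swap: {swap_from} R1s not ready, {swap_to} R2s ready and R/W")


def fail_over_data_movers(site: str) -> None:
    _step(f"Fail over Data Movers to {site}")


def start_vtes(site: str) -> None:
    _step(f"Start VTEs at {site}")


def resume_replication(legs) -> None:
    _step("Recover / resume " + ", ".join(legs))


def planned_storage_swap(swap_from: str = "DC1", swap_to: str = "DC2") -> None:
    """High-level step ordering once the tape workload has been quiesced."""
    vary_tape_drives(swap_from, "offline")              # 1
    disable_tape_configuration(swap_from)               # 2
    autoswap_dasd_and_swap_srdf(swap_from, swap_to)     # 3
    fail_over_data_movers(swap_to)                      # 4
    start_vtes(swap_to)                                 # 5
    vary_tape_drives(swap_to, "online")                 # 6 - tape processing resumes
    resume_replication(["SRDF/S", "SRDF/A", "STAR"])    # 7


if __name__ == "__main__":
    planned_storage_swap()
```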

Unplanned storage swap between DC1 and DC2
Loss of access to CKD devices triggers DASD Autoswap and SRDF swap.
In-flight tape processing fails because there is no tape Autoswap functionality.
Failed in-flight tape jobs need to be restarted after the tape swap.
No data loss for closed tape files (or sync points).
Note: a tape infrastructure issue does not trigger an unplanned swap.
The GDDR recovery script is automatically triggered after an unplanned swap.

GDDR script to recover from an unplanned swap (high-level steps)
1. Tape drives varied offline
2. Failover of Data Movers to the swap-to site
3. VTEs started at the swap-to site
4. Tape drives varied online and tape processing resumes
5. Recovery / resumption of SRDF/A
6. Recovery / resumption of SRDF/S and STAR, initiated manually
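In the same hypothetical placeholder style as the planned-swap sketch, the recovery flow differs mainly in that SRDF/A resumption is automated while SRDF/S and STAR resumption waits for an operator decision, which the sketch below models as a manual confirmation prompt.

```python
# Hypothetical stand-ins only; none of these are GDDR commands or interfaces.

def _step(msg: str) -> None:
    print(msg)  # placeholder for the real automation action


def recover_from_unplanned_swap(swap_to: str = "DC2") -> None:
    """High-level recovery ordering after an unplanned swap."""
    _step("Vary tape drives offline")                          # 1
    _step(f"Fail over Data Movers to {swap_to}")               # 2
    _step(f"Start VTEs at {swap_to}")                          # 3
    _step("Vary tape drives online; tape processing resumes")  # 4
    _step("Recover / resume SRDF/A")                           # 5 (automated)
    # 6. SRDF/S and STAR resumption is not automatic; it is initiated manually.
    if input("Resume SRDF/S and STAR now? [y/N] ").strip().lower() == "y":
        _step("Recover / resume SRDF/S and STAR")


if __name__ == "__main__":
    recover_from_unplanned_swap()
```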

GDDR Value at US Bank
Provides sophisticated automation for monitoring and management
Handles coordination of comprehensive EMC software and hardware stack
Provides recovery for critical events such as unplanned swaps and SRDF/A outages
Minimizes requirements for internally developed automation and procedures
Indispensable for a multi-site, high availability configuration

DLm8000 Value Proposition
High Availability
Robust storage platform
Synchronous and asynchronous replication without impacts
Universal data consistency between tape and DASD at DC2 and DC3, enabled with ConGroup and MSC
GDDR automation to manage overall storage replication, failover, and recovery
