IBM SONAS business continuity solution with SONAS asynchronous replication and VMware vSphere 4.1: A technical report


Udayasuryan Kodoly
IBM Systems and Technology Group ISV Enablement
August 2011

Table of contents

Abstract
Executive summary
  Intended audience
  Scope
Prerequisites
IBM SONAS DR solution key components overview
  VMware vSphere
  IBM SONAS
  Asynchronous replication
Tiers of protection
  Data center protection
  Synchronous file system replication
  Regional protection
  Asynchronous replication in a single direction
  Asynchronous replication in two active directions
IBM SONAS DR solution architecture and end-to-end asynchronous replication configuration
  Material list for IBM SONAS DR solution setup
  IBM SONAS setup
  VMware vSphere 4.1 setup
  Asynchronous replication considerations for IBM SONAS DR solution
  Establishing the asynchronous replication relationship from Site 1 to Site 2
  Site 2 (IBM SONAS Core) target side configuration
  Site 1 (IBM SONAS Edge) source side configuration
  Starting asynchronous replication on Site 1
  Incremental asynchronous replication at Site 1
  Scheduling an established Site 1 to Site 2 asynchronous replication task
  Establishing the asynchronous replication relationship from Site 2 to Site 1
  Site 1 (IBM SONAS Edge) target side configuration
  Site 2 (IBM SONAS Core) source side configuration
  Starting asynchronous replication on Site 2
  Incremental asynchronous replication on Site 2
  Scheduling asynchronous replication on Site 2
  IBM SONAS DR solution file tree replica considerations
IBM SONAS business continuance planning and testing
  Recovering the disaster
  Recovery from an extended Site 1 outage
Summary
Appendix A: Glossary
Appendix B: Materials used in the lab setup
Appendix C: Resources
Appendix D: High-level VMware PowerCLI script to register VM
About the author
Trademarks and special notices

Abstract

This technical paper discusses disaster recovery in terms of levels of data protection for a VMware environment that uses IBM Scale Out Network Attached Storage (SONAS) as its primary storage. These levels of protection are based on customer needs and can be considered independently. Protection can be at the data center level, at the regional level, or a combination of both. This technical report provides an example of how asynchronous replication can be combined with VMware vSphere high availability features, such as VMware High Availability (HA) and VMware Fault Tolerance (FT), and with VMware vSphere PowerCLI scripting to provide an effective disaster recovery solution for enterprise IT virtual infrastructure.

Executive summary

As enterprise businesses have become increasingly dependent on data and on access to that data, disaster recovery (DR) has become a major concern. Many enterprise businesses are turning to virtualization as a cost-effective way to solve their DR challenges. Because VMware is a leader in the virtual infrastructure arena, this technical report discusses how you can combine VMware vSphere with IBM SONAS to provide a reliable DR solution for extremely scalable enterprise IT virtual infrastructure. According to many recent surveys, a high percentage of enterprise IT organizations use virtualization for consolidation and for DR. Combining VMware vSphere and IBM SONAS creates a robust DR solution that is highly available, extremely scalable, and able to evolve as the needs of the enterprise IT virtual infrastructure evolve. As more enterprise businesses look to virtualization as a means to cost-effectively simplify their server environments, they realize the importance of a highly available and extremely scalable storage environment with DR capabilities.
Levels of protection might be needed in the data center, at a regional level, or both. This technical report provides a detailed plan for setting up and testing both IBM SONAS and VMware vSphere (and for providing the necessary levels of protection for both products) in an enterprise IT virtualized environment. It outlines the hardware and software used, the installation and configuration steps performed, and the operational scenarios. It is intended to be used as a reference architecture for specific customer scenarios.

Intended audience

This technical report is intended for:
- Customers and prospects looking to implement an effective DR solution, by integrating VMware vSphere 4.1 and IBM SONAS, for an enterprise IT virtual environment that demands extreme scalability and high availability
- Users and management seeking information on implementing a DR solution for a highly available and massively scalable enterprise IT virtual infrastructure

Scope

This technical report provides:
- A detailed DR solution implementation, integrating IBM SONAS and VMware vSphere 4.1, for enterprise-quality IT infrastructure that demands the highest availability and scalability
- A detailed design and implementation guide, with configuration best practices
- Reproducible test results that simulate common failure scenarios resulting from operational problems and unplanned outages

This technical report does not:
- Discuss any performance impact and analysis from a user perspective
- Replace any official manuals and documents from IBM and VMware on the products used in the solution

Prerequisites

This technical paper assumes the following prerequisites:
- Basic knowledge of VMware virtualization technologies and products:
  - VMware vCenter Server 4.1
  - VMware vSphere 4.1
- Basic knowledge of the IBM SONAS system
- An IBM SONAS system running SONAS version 1.2 GA or higher

IBM SONAS DR solution key components overview

This section briefly describes the key components used in the IBM SONAS DR solution.

VMware vSphere

VMware vSphere provides enterprise-class virtualization that increases the utilization of servers and other resources. It improves performance, increases security, and minimizes system downtime, reducing the cost and complexity of delivering enterprise services. By leveraging existing technology, VMware enables the roll-out of new applications with less risk and lower platform costs. VMware vSphere is a feature-rich suite that delivers the production-proven efficiency, availability, and dynamic management needed to create a responsive data center. The suite includes:
- VMware ESX/ESXi Server: Platform for virtualizing servers, storage, and networking
- VMware vCenter Server: Centralized management, automation, and optimization for IT infrastructure
- VMware HA: Cost-effective high availability for virtual machines
- VMware FT: Continuous availability and reduced downtime for business-critical VMs
- VMware VMotion: Live migration of VMs without service interruption
- VMware Storage VMotion: Migration of a running VM's disk files from one IBM SONAS NFS data store to another data store on the same ESX host
- VMware Distributed Resource Scheduler (DRS): Dynamic balancing and allocation of resources for VMs
- VMware vSphere PowerCLI: A Microsoft Windows PowerShell interface to the vSphere API

IBM SONAS

The IBM SONAS storage solution can help enterprise organizations consolidate and manage data affordably, reduce crowded floor space, and reduce the management expense associated with administering an excessive number of disparate storage systems. SONAS provides a global namespace that enables the storage infrastructure to scale to extreme amounts of data, from terabytes (TB) to petabytes (PB). The IBM SONAS asynchronous replication feature is used in this IBM SONAS DR solution.

Asynchronous replication

Asynchronous replication allows file systems to be replicated from a source SONAS system to destination SONAS systems located across metro or geographical distances. The ability to continue operations in the face of a regional disaster is handled through the asynchronous replication provided by the SONAS system. Asynchronous replication allows one or more file systems within a SONAS file name space to be defined for replication to another SONAS system over the customer network infrastructure. Files that have been created, modified, or deleted at the primary location are carried forward to the remote system at each invocation of the asynchronous replication. Refer to Figure 1.

Figure 1: IBM SONAS asynchronous replication

The asynchronous replication feature of IBM SONAS can be configured to meet an appropriate recovery point objective (RPO). The feature enables IBM SONAS to support multiple uses, including disaster recovery, data distribution, remote access, data migration, and data replication. The main considerations for using the asynchronous replication feature with IBM SONAS are:
- The need to protect against component and system failures, site failures, and natural disasters
- The cost of the secondary site and network infrastructure
- The complexity of the deployment, failover, and recovery processes
- Meeting RPOs and recovery time objectives (RTOs)

Tiers of protection

Combining VMware vSphere and IBM SONAS technologies offers a unique value proposition. The combination resolves a number of customer challenges from both a server and a storage perspective, especially the extreme scalability and high availability requirements of enterprise IT virtual infrastructure. In addition, having both technologies makes a tiered disaster recovery environment possible. VMware vSphere offers DR capabilities within a data center from a host perspective, through the industry-proven VMware HA and VMware FT features, and IBM SONAS offers storage DR both within a data center and at a regional level. Refer to Figure 2.

Figure 2: IBM SONAS at data center and regional levels

A tiered storage architecture can also increase return on investment (ROI), because it uses hardware from both sites. In a typical DR architecture, the storage infrastructure at the primary site is heavily used, whereas the infrastructure at the secondary site sits idle. The secondary site is typically used as a standby site, and its hardware is rarely used for anything else. This is true of any infrastructure that is not based on VMware vSphere and IBM SONAS.

Using a combined VMware vSphere and IBM SONAS DR architecture, you can create a tiered storage architecture in which the primary site continues to be used as it currently is, while the secondary site's storage architecture is also used for tiered applications such as test/development, or for critical actions such as testing the DR capabilities of the architecture. VMware vSphere allows such utilization at the alternate site because the administrator can start a copy of a virtual machine (VM) on any server. The encapsulation of the VM's environment into files makes this possible.

Data center protection

VMware vSphere provides inherent HA at several levels. VMware HA delivers the availability needed by many applications running in virtual machines, independent of the operating system and the application running in it. VMware HA provides uniform, cost-effective failover protection against hardware and operating system failures within the enterprise virtual IT infrastructure. VMware FT provides continuous availability to business-critical applications in the event of server failures by creating a live shadow instance of a virtual machine that runs in virtual lockstep with the primary instance. VMware FT allows instantaneous failover between the two instances in the event of a hardware failure, eliminating even the smallest chance of data loss or disruption.

IBM SONAS is designed to address the new storage challenges posed by the ongoing explosion of data. IBM recognizes that a critical component of future enterprise storage is a scale-out architecture that takes advantage of industry trends to create a truly efficient and responsive storage environment, eliminating the waste created by the proliferation of scale-up systems and providing a best-in-class platform for silos of traditional network-attached storage (NAS) file servers (filers) with unparalleled competitive performance. SONAS provides a storage platform for global access to your business-critical data. That data can be secured with both information protection and business continuity solutions, giving you a high level of business continuity assurance.

Within a data center, SONAS provides local synchronous file system replication and a point-in-time copy (file system-level snapshot) feature. Standard SONAS product features can provide protection and quick recovery in the event of a data center disaster such as a power outage, an environmental failure (for example, air conditioning), a hardware failure, or human error. Human error includes unintentional erasure of data and mistakes in data protection procedures. A single SONAS storage system can protect against the types of failure shown in Table 1.

Failure | Protection
Failure of power supply, storage controller, disk storage expansion unit, storage node, and interface node | Built-in redundancy
Single- or dual-disk failure | RAID-6
Accidental erasure or destruction of data | File system-level snapshot copies
Storage enclosure outage or storage pod failure | Local synchronous intra-cluster replication

Table 1: IBM SONAS data center level protection table

Synchronous file system replication

Synchronous file system replication is implemented within a single SONAS cluster, so it is defined as intra-cluster replication. It protects against the total loss of a whole storage building block or storage pod, and it is implemented by writing all data blocks to two storage building blocks that belong to two separate failure groups. Currently, synchronous replication applies to an entire file system. Refer to Figure 3.

Figure 3: IBM SONAS synchronous file system replication

Synchronous file system replication provides the following benefits:
- Ensures data protection against human error, disk failures, and so on
- Ensures minimal downtime when disk failures occur, with no data loss for business-critical applications
- Meets increased service-level agreements (SLAs) by reducing planned downtime

Synchronous replication configuration is outside the scope of this technical report. Contact IBM support for help with configuring synchronous replication appropriately with IBM SONAS.

Regional protection

One of the main benefits of virtualization for disaster recovery is that the recovery process is independent of the recovery hardware. VMs encapsulate the complete environment, including data, application, operating system, Basic Input/Output System (BIOS), and virtualized hardware. Applications can be restored to any hardware that runs a virtualization platform, without concern for differences in the underlying hardware; the physical-world limitation of having to restore to an identical platform does not apply. Because VMware vSphere VMs are hardware independent, enterprises do not need identical hardware at their DR sites, which can significantly reduce the cost and complexity of regional DR.

VMware vSphere enterprise customers actively take advantage of VMware consolidation benefits for their production and staging servers. These consolidation benefits are even greater for the failover hardware, because enterprises can consolidate the servers at the primary data center onto fewer physical servers at their disaster recovery centers.

Another benefit of VMs that eases the complexity of DR is the flexible networking of VMware vSphere. Because VMware vSphere handles virtual LANs (VLANs) on its vNetwork Distributed Switch (vDS), an entire complex network environment can be isolated, contained, or migrated very easily with minimal setup at the DR site.

The ability to continue operations in the face of a regional disaster is handled through the asynchronous replication provided by the SONAS system. Asynchronous replication allows one or more file systems within a SONAS system to be replicated to another SONAS system over the customer network infrastructure. Files that have been created, modified, or deleted at the primary location are carried forward to the remote system at each invocation of the asynchronous replication. Asynchronous replication can be configured in the following two ways:
- Asynchronous replication of one or more file systems in a single direction
- Asynchronous replication of multiple file systems in each direction

Asynchronous replication in a single direction

Asynchronous replication in a single direction is configured as a one-to-one relationship in which one site is considered the source of the data and the other is the target. Refer to Figure 4. The replica of the file system at the remote target location is intended to be used in read-only mode until a disaster or a source file system outage occurs. During a file system failure recovery operation, failback is accomplished by defining the replication relationship for this file system from the original target back to the original source.

Figure 4: Single directional asynchronous replication configuration

Asynchronous replication in two active directions

In a modern enterprise IT virtual infrastructure, the SONAS systems at both sites host a primary file system on which active VMs reside. In this scenario, the SONAS system at each site is used for production I/O, in addition to being the target mirror for the other SONAS system's file structure. The replication can run in both directions, so that each SONAS system has its own primary file system in addition to a replica of the other system's primary file system. Alternatively, both SONAS systems can have their own primary file system while only one holds the mirror of the other. Refer to Figure 5. The file system replicas on both SONAS systems are intended to be used in read-only mode until a disaster occurs or the active file system of either SONAS system goes down. During a file system failure recovery operation, failback is accomplished by defining the replication relationship from the original target back to the original source.
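The two-direction layout and the failback rule described above can be pictured with a small model. The following Python sketch is illustrative only: the class name, the site labels, and the file tree paths are hypothetical and are not SONAS objects or commands. It records one replication direction per file tree and reverses a relationship the way failback is described (original target back to original source).

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class ReplicationRelationship:
    """One direction of asynchronous replication for one file tree (hypothetical model)."""
    source_site: str
    target_site: str
    file_tree: str

    def reversed_for_failback(self) -> "ReplicationRelationship":
        # Failback: define the relationship from the original target
        # back to the original source.
        return replace(self, source_site=self.target_site,
                       target_site=self.source_site)


def is_two_active_directions(relationships: List[ReplicationRelationship]) -> bool:
    """True when some pair of sites replicates to each other."""
    directions = {(r.source_site, r.target_site) for r in relationships}
    return any((target, source) in directions for source, target in directions)


# Hypothetical file tree paths, one primary file tree per site.
site1_to_site2 = ReplicationRelationship("Site 1", "Site 2", "/ibm/fs1/site1_vms")
site2_to_site1 = ReplicationRelationship("Site 2", "Site 1", "/ibm/fs1/site2_vms")

print(is_two_active_directions([site1_to_site2, site2_to_site1]))
print(site1_to_site2.reversed_for_failback())
```

Keeping each direction as its own record mirrors the text: the two directions are configured independently, and failback only swaps the roles of one relationship.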

Figure 5: Asynchronous replication in two active directions

Asynchronous replication allows enterprises to periodically replicate the changed portions of the production VM images in a source file tree (file system directory) to the target SONAS site. The replicated VMs in the target file system can be brought up programmatically at the remote site, on any available ESX/ESXi host, by using VMware vSphere PowerCLI.
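To make the idea of carrying forward only what changed more concrete, here is a minimal Python sketch of delta detection between two scans of a file tree. It is an illustration under stated assumptions, not the SONAS implementation: the scan format (path mapped to size and modification time), the function name, and the sample paths are all hypothetical, and the sketch works at whole-file granularity, whereas the report describes replication of the changed portions of files.

```python
from typing import Dict, NamedTuple, Set


class FileState(NamedTuple):
    size: int      # bytes
    mtime: float   # modification time, seconds since the epoch


class Delta(NamedTuple):
    created: Set[str]
    modified: Set[str]
    deleted: Set[str]


def compute_delta(previous: Dict[str, FileState],
                  current: Dict[str, FileState]) -> Delta:
    """Compare the scan taken at the last replication with the current scan."""
    prev_paths = set(previous)
    curr_paths = set(current)
    created = curr_paths - prev_paths
    deleted = prev_paths - curr_paths
    # A file present in both scans counts as modified if its size or mtime differs.
    modified = {p for p in prev_paths & curr_paths if previous[p] != current[p]}
    return Delta(created, modified, deleted)


# Hypothetical VM files in a primary file tree.
last_scan = {
    "vm01/vm01.vmdk": FileState(40_000, 100.0),
    "vm01/vm01.vmx": FileState(3_000, 100.0),
    "vm02/vm02.vmdk": FileState(20_000, 100.0),
}
this_scan = {
    "vm01/vm01.vmdk": FileState(41_000, 250.0),   # grew since the last run
    "vm01/vm01.vmx": FileState(3_000, 100.0),     # unchanged
    "vm03/vm03.vmdk": FileState(10_000, 240.0),   # new VM
}

delta = compute_delta(last_scan, this_scan)
print(sorted(delta.created), sorted(delta.modified), sorted(delta.deleted))
```

Unchanged files never appear in the delta, which is why each periodic invocation only has to move the work accumulated since the previous one.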

IBM SONAS DR solution architecture and end-to-end asynchronous replication configuration

Figure 6: IBM SONAS DR solution physical architecture

Figure 6 illustrates the architecture of the IBM SONAS DR solution for enterprise IT virtual infrastructure. The solution is made up of VMware ESX/ESXi hosts, VMware vCenter servers, and IBM SONAS storage. Each site in this solution architecture is active, with VMware ESX/ESXi hosts running Microsoft Windows and Linux VMs. The VMs of each site reside on the primary file tree of the SONAS system at that site.
- Primary file tree: File system directory planned to host live VMs on the source SONAS system
- Replicated file tree: Replica of the primary file tree on the target SONAS system

The VMware ESX/ESXi hosts at each site access the primary file tree (SONAS file system directory) using the NFS protocol. The hosts are configured in a VMware HA cluster, and VMware FT is enabled on business-critical VMs. VMware vCenter is configured with VMware vCenter Server Heartbeat.

This solution is provided through the seamless integration of two technologies working in two different layers:
- VMware HA and FT technologies for business continuance at the server level
- IBM SONAS asynchronous replication for business continuance at the storage level

In this solution, the IBM SONAS asynchronous replication feature periodically replicates the primary file tree of each site to the target SONAS system (belonging to the other site).

Material list for IBM SONAS DR solution setup

Table 2 lists the software and hardware used in this solution architecture.

Infrastructure components | Vendor | Quantity | Details
Servers running VMware ESX / ESXi | Any | | For more information, refer to the VMware compatibility guide at: http://www.vmware.com/resources/compatibility/search.php Example: ibm.com/systems/x/hardware/rack/x3650m3/specs.html
IBM SONAS storage system | IBM | 2 | http://publib.boulder.ibm.com/infocenter/sonasic/sonas1ic/index.jsp?topic=/com.ibm.sonas.doc/sonas_r_publications.html
Switch | Any | 2x10 | For more information, refer to Cisco Nexus 5000 at: ibm.com/systems/networking/hardware/ethernet/c-type/nexus/specifications.html
Network adapter (per ESX server and SONAS interface nodes) | Broadcom | Any | Broadcom NetXtreme II 57710 10 GbE; Broadcom NetXtreme II BCM 5709 1000Base-T
Software | IBM | | IBM SONAS 1.2 or higher
Software | IBM | | IBM SONAS asynchronous replication
Software | VMware | | VMware vSphere 4.1 (ESX / ESXi) or higher
Software | VMware | | VMware vCenter Server 4.1 or higher

Table 2: Solution hardware and software requirements for primary and secondary sites

IBM SONAS setup

The interface node of the SONAS system is connected with 10 GbE (Gigabit Ethernet). The management node of the SONAS system is connected with 1 GbE. Consider the 80/20 rule when planning the primary file system on which live VMs are deployed. For additional information about the 80/20 rule, refer to the technical report titled "Effective implementation of VMware vSphere 4.1 with IBM SONAS" at: http://public.dhe.ibm.com/partnerworld/pub/whitepaper/1b59a.pdf

Recommendation: For business-critical VMs, consider using high-performance serial-attached SCSI (SAS) disk drives for the primary file system.

In this test setup, each site has one primary file system on which active VMs reside. The primary file system is configured with SAS disk drives in RAID 6.

VMware vSphere 4.1 setup

In the test setup, the VMware ESX hosts at both sites are configured in a VMware HA cluster. For more information about VMware HA configuration with IBM SONAS, refer to the technical report at: http://public.dhe.ibm.com/partnerworld/pub/whitepaper/1a8b2.pdf

The VMware ESX/ESXi hosts are connected with 10 GbE. VMware vCenter Server is installed on the Microsoft Windows Server 2008 R2 operating system and is configured with vCenter Server Heartbeat. VMware vCenter Server Heartbeat can be configured in a virtual to virtual (V2V), physical to virtual (P2V), or physical to physical (P2P) architecture. For more information about VMware vCenter Server Heartbeat configuration, refer to the quick start guide from VMware at: http://www.vmware.com/pdf/vcenter-server-heartbeat-63-u1-quick-start-guide.pdf

The VMware ESX/ESXi hosts configured in VMware HA use the vDS feature and an NFS data store to access the primary file tree (file system directory) located on the SONAS system.

Figure 7 shows the complete test setup configuration.

Figure 7: Solution test setup configuration

Figure 8 shows the complete Site 1 (IBM SONAS Edge) test setup.

Figure 8: Site 1 (IBM SONAS Edge) setup diagram

Figure 9 shows the complete Site 2 (IBM SONAS Core) test setup.

Figure 9: Site 2 (IBM SONAS Core) setup diagram

Asynchronous replication considerations for IBM SONAS DR solution

In this solution, the SONAS systems at both sites host active VMs for the VMware ESX/ESXi hosts. In the test setup, the SONAS storage system at each site has a primary file system on which production VMs reside. The primary file system of the SONAS system at each site is configured to replicate periodically, using the asynchronous replication feature, to the target SONAS storage system at the other site.

During the asynchronous replication process, the SONAS system detects only the modified files in the source file system and moves only the changed contents of each file to the target destination to create the exact replica. The main considerations for correctly configuring the asynchronous replication relationship between the SONAS systems at the two sites are:
- The management and interface nodes of the source SONAS system must be able to communicate over the network with the management and interface nodes of the target SONAS system.
- The target SONAS file system must be large enough, with enough free space to hold the replica of the source SONAS file system plus the overhead needed to accommodate snapshots.
- Sufficient network bandwidth is required to replicate all of the file system delta changes with a latency that meets RPO needs during peak utilization.
- TCP port 1081 is required on the source and target SONAS systems for the configuration process to establish secure communications from the target management node to the source management node using Secure Shell (SSH).
- TCP port 22 is required on the source and target SONAS systems for asynchronous replication to use SSH to transfer encrypted file changes from the source SONAS management and interface nodes to the target management and interface nodes.
- For replication in both directions, or for potential failback after a recovery, the ports should be open in both directions.

The following sections provide the detailed steps for configuring the Site 1 and Site 2 SONAS systems in this IBM SONAS DR solution to participate in asynchronous replication in two active directions. Refer to the "Asynchronous replication in two active directions" section for more information. The asynchronous replication configuration was performed on the management node of each SONAS system involved in the IBM SONAS DR solution.
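Several of the considerations above lend themselves to a quick check before the replication relationship is configured. The following Python sketch is a hedged example and not an IBM tool: the function names, the 20% snapshot overhead, and the 0.7 link efficiency factor are assumptions chosen for illustration, and the checks would be run from a host that can reach the SONAS nodes. Only the two TCP port numbers come from the list above.

```python
import socket

# TCP ports named in the considerations above.
REPLICATION_PORTS = (22, 1081)


def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def target_has_room(source_used_bytes: int, target_free_bytes: int,
                    snapshot_overhead: float = 0.2) -> bool:
    """Check free space for the replica plus an assumed snapshot overhead."""
    return target_free_bytes >= source_used_bytes * (1 + snapshot_overhead)


def transfer_seconds(delta_bytes: int, link_mbps: float,
                     efficiency: float = 0.7) -> float:
    """Estimate the time to move one replication delta over the link.

    efficiency is an assumed fraction of the nominal link rate that is
    actually usable (protocol overhead, competing traffic).
    """
    usable_bits_per_second = link_mbps * 1_000_000 * efficiency
    return delta_bytes * 8 / usable_bits_per_second


def meets_rpo(delta_bytes: int, link_mbps: float, rpo_seconds: float,
              efficiency: float = 0.7) -> bool:
    """True if the peak delta can be replicated within the RPO window."""
    return transfer_seconds(delta_bytes, link_mbps, efficiency) <= rpo_seconds


# Example with hypothetical numbers: a 50 GB peak delta, a 1000 Mbps link,
# and a one-hour RPO.
print(meets_rpo(50 * 10**9, 1000, 3600))

# Port checks would be run against each management and interface node, for
# example: all(port_open("target-mgmt.example.com", p) for p in REPLICATION_PORTS)
```

For replication in two active directions, the same checks apply from each site toward the other, because each SONAS system acts as both a source and a target.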

Establishing the asynchronous replication relationship from Site 1 to Site 2

Initially in the test setup, the asynchronous replication relationship is established from Site 1 to Site 2.

Source = Site 1 (IBM SONAS Edge)
Target = Site 2 (IBM SONAS Core)

Figure 10 shows exactly how the primary file tree structure is replicated from the source, Site 1 (IBM SONAS Edge), to Site 2 (IBM SONAS Core).

Figure 10: Initial asynchronous replication relationship establishment (Site 1 to Site 2)

Site 2 (IBM SONAS Core) target side configuration

The configuration of the asynchronous relationship starts on Site 2 (target SONAS system). In the test setup, Site 2 is initially considered the target side. The configuration of the asynchronous relationship for Site 2 (target SONAS system) consists of two steps:

Step 1: In the test setup, first define on Site 2 (target SONAS system) its relationship to Site 1 (source SONAS system). The cfgrepl SONAS CLI command is used to establish the Site 2 (target SONAS system) relationship with Site 1 (source SONAS system).

Note: For this configuration of the target side, only the source parameter and its value are required. The value of the source parameter is the public IP address of the Site 1 (source SONAS system) management node, as shown in Figure 11.

Figure 11: Asynchronous replication configuration to establish the Site 1 relationship on Site 2 SONAS

Step 2: The final step on Site 2 (target SONAS management node) is to configure the target file system or file tree (file system mount point) path that receives data from Site 1 (source SONAS cluster). This is done with the mkrepltarget SONAS CLI command. You can find more information on the mkrepltarget SONAS CLI command at: http://publib.boulder.ibm.com/infocenter/sonasic/sonas1ic/index.jsp?topic=/com.ibm.sonas.doc/manpages/mkrepltarget.html

Note: The sourceclusterid parameter of the mkrepltarget SONAS CLI command identifies the source SONAS cluster ID. The source cluster ID can be obtained with the lscluster SONAS CLI command issued on Site 1 (source SONAS system).

In the test setup, Site 1 (source SONAS system) and Site 2 (target SONAS system) have exactly the same file system layout. Therefore, the same file tree structure is defined for the Site 1 (source SONAS system) file tree replica. Refer to Figure 10 for the test setup replication file tree structure.

Figure 12 shows the test setup configuration that uses the mkrepltarget SONAS CLI command on Site 2 (target SONAS system) to set the target file tree mount point as the receiver of the data from Site 1 (source SONAS system). The lsrepltarget SONAS CLI command then lists the source cluster ID and the target file path that have been enabled on Site 2 (target SONAS system).

Figure 12: Configuring the Site 2 (target SONAS system) file tree replica path to receive Site 1 (source SONAS system) file tree

Site 1 (IBM SONAS Edge) source side configuration

The asynchronous replication configuration on Site 1 (source SONAS system) is very similar to that on Site 2 (target SONAS system). The configuration of asynchronous replication for Site 1 (source SONAS system) consists of two steps.

Step 1: In this solution, the first step is to define the relationship between Site 1 (source SONAS system) and Site 2 (target SONAS system) using the cfgrepl SONAS CLI command. You can find more information on the cfgrepl SONAS CLI command at: http://publib.boulder.ibm.com/infocenter/sonasic/sonas1ic/index.jsp?topic=/com.ibm.sonas.doc/manpages/cfgrepl.html

- The target parameter must be specified, supplying the public IP address of the management node of the target SONAS system.
- The n parameter asks the system to select the given number of interface nodes on Site 1 (source SONAS system) and Site 2 (target SONAS system) to be used for asynchronous replication.
- The pairs parameter allows the selection of the specific Site 1:Site 2 (source:target) node pairs used for asynchronous replication. Each pair is specified by a Site 1 (source SONAS system) interface node name together with a Site 2 (target SONAS system) interface node IP address. This configuration instructs the system to send replication data from the specified Site 1 interface node to the specified Site 2 interface node.

Refer to Figure 13.
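The source:target pairing described for the pairs parameter can be prepared and sanity-checked before it is entered on the command line. The following Python sketch is a hypothetical helper, not part of the SONAS CLI: the node names and the 192.0.2.x documentation addresses are placeholders, and the name:ip,name:ip rendering is an assumed notation for illustration only. Consult the cfgrepl manual page for the syntax that the command actually accepts.

```python
import ipaddress
from typing import List, Tuple


def build_node_pairs(source_nodes: List[str], target_ips: List[str]) -> List[Tuple[str, str]]:
    """Pair each source interface node name with one target interface node IP.

    Raises ValueError if the two lists differ in length or if any
    target address is not a syntactically valid IP address.
    """
    if len(source_nodes) != len(target_ips):
        raise ValueError("each source node needs exactly one target IP address")
    pairs = []
    for name, ip in zip(source_nodes, target_ips):
        ipaddress.ip_address(ip)  # raises ValueError on a malformed address
        pairs.append((name, ip))
    return pairs


def render_pairs(pairs: List[Tuple[str, str]]) -> str:
    """Render the pairs in an assumed source:target notation (illustrative only)."""
    return ",".join(f"{name}:{ip}" for name, ip in pairs)


if __name__ == "__main__":
    # Hypothetical Site 1 node names and Site 2 addresses.
    pairs = build_node_pairs(["int001st001", "int002st001"],
                             ["192.0.2.11", "192.0.2.12"])
    print(render_pairs(pairs))
```

Validating the list up front catches a mistyped address or an unpaired node before the replication relationship is defined, which is cheaper than diagnosing a replication cycle that cannot reach its target node.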

Figure 13: Asynchronous replication data flow configuration diagram

In the test setup, the Site 2 IP addresses shown in Figure 14 are configured for asynchronous replication.

Figure 14: Site 2 (target SONAS system) interface node mapping for asynchronous replication on Site 1

Figure 15 shows the test setup using the cfgrepl SONAS CLI command to configure the flow of asynchronous replication data from the source SONAS system (Site 1).

Figure 15: Asynchronous replication of data flow configuration between Site 1 and Site 2

Figure 16 shows the use of the lsreplcfg SONAS CLI command to display the asynchronous replication relationship configuration on Site 1.

Figure 16: Displaying the relationship defined on the Site 1 source SONAS system

Step 2: The final step in the Site 1 (source SONAS system) asynchronous replication configuration is to tell Site 1 (source SONAS system) which of its primary source file systems should be replicated to the Site 2 target file system. The cfgreplfs SONAS CLI command is used to establish this relationship. You can find more information on the cfgreplfs SONAS CLI command at: http://publib.boulder.ibm.com/infocenter/sonasic/sonas1ic/index.jsp?topic=/com.ibm.sonas.doc/manpages/cfgreplfs.html

In the test setup, the Site 1 primary file system on which the active VMs reside is gpfs1. Site 2 provides its gpfs1 file system to receive the Site 1 primary file tree replica. The target cluster ID is obtained by issuing the lscluster SONAS CLI command on Site 2 (target SONAS cluster).

Figure 17 shows the final Site 1 asynchronous replication configuration, which specifies that the Site 1 (source SONAS system) primary file system or file tree is replicated to the Site 2 (target SONAS system) file system or file tree replica.

Figure 17: cfgreplfs SONAS CLI command configuration on Site 1

Starting asynchronous replication on Site 1

A defined asynchronous replication cycle is started with the startrepl SONAS CLI command. This command initiates the replication process for the Site 1 primary file tree (/ibm/gpfs1/vsphere-site-1). You can find more information on the startrepl command at: http://publib.boulder.ibm.com/infocenter/sonasic/sonas1ic/index.jsp?topic=/com.ibm.sonas.doc/manpages/startrepl.html

Note: For the very first replication, use the fullsync option to request that all files in the file system be replicated.

Figure 18 shows Site 1 in the test setup initiating asynchronous replication using the startrepl SONAS CLI command.

Figure 18: Initiating asynchronous replication using startrepl SONAS CLI command at Site 1

Incremental asynchronous replication at Site 1

Incremental replication on Site 1 has been started, as shown in Figure 19.

Figure 19: Asynchronous replication incremental replication using startrepl SONAS CLI command at Site 1

Scheduling an established Site 1 to Site 2 asynchronous replication task

To schedule replication of the normal delta changes in the Site 1 primary file tree to the Site 2 file tree replica, use either the Site 1 SONAS GUI or the mkrepltask SONAS CLI command. You can find more information at: http://publib.boulder.ibm.com/infocenter/sonasic/sonas1ic/index.jsp?topic=/com.ibm.sonas.doc/mng_t_task_create_arepl.html

Figure 20 shows the test setup in which replication of normal delta changes in the Site 1 primary file tree is configured to run every hour.

Figure 20: Test setup using mkrepltask SONAS CLI command, periodic replication configuration on Site 1

Establishing the asynchronous replication relationship from Site 2 to Site 1

This section describes the quick steps to establish the asynchronous replication relationship from Site 2 to Site 1.

Source = Site 2 (IBM SONAS Core)
Target = Site 1 (IBM SONAS Edge)

Figure 21 shows the exact primary file tree structure that is replicated from the source, Site 2 (IBM SONAS Core), to Site 1 (IBM SONAS Edge).

Figure 21: Asynchronous replication relationship establishment (Site 2 to Site 1)

Site 1 (IBM SONAS Edge) target side configuration

This section summarizes the test setup steps used to configure the Site 1 target side of the SONAS asynchronous replication relationship.

Step 1: Figure 22 shows Site 1 (target SONAS system) defining its relationship to Site 2 (source SONAS system).

Figure 22: Asynchronous replication configuration to establish the Site 2 relationship on Site 1 SONAS

Step 2: Figure 23 shows the test setup configuration that uses the mkrepltarget SONAS CLI command on Site 1 (target SONAS system) to set the target file system mount point as the receiver of the data from Site 2 (source SONAS system).

Figure 23: Configuring the Site 1 (target SONAS system) file tree replica path to receive Site 2 (source SONAS system) file tree

Site 2 (IBM SONAS Core) source side configuration

Step 1: In the test setup, the Site 1 IP addresses shown in Figure 24 are configured for asynchronous replication.

Figure 24: Site 1 (target SONAS system) interface node mapping for asynchronous replication on Site 2

Figure 25 shows the test setup in which the cfgrepl SONAS CLI command is used to configure the flow of asynchronous replication data from the source SONAS system (Site 2).

Figure 25: Asynchronous replication of data flow configuration between Site 2 and Site 1

Step 2: Figure 26 shows the final Site 2 asynchronous replication configuration, which specifies that the Site 2 (source SONAS system) primary file system or file tree is replicated to the Site 1 (target SONAS system) file system or file tree replica.

Figure 26: cfgreplfs SONAS CLI command configuration on Site 2

Starting asynchronous replication on Site 2

Figure 27 shows the start of asynchronous replication on Site 2.

Figure 27: Starting asynchronous replication on Site 2

Incremental asynchronous replication on Site 2

Incremental replication on Site 2 is started, as shown in Figure 28.

Figure 28: Asynchronous replication incremental replication using startrepl SONAS CLI command at Site 2

Scheduling asynchronous replication on Site 2

Figure 29 shows the test setup in which replication of normal delta changes in the Site 2 primary file tree is configured to run every hour.

Figure 29: Test setup using mkrepltask SONAS CLI command, periodic replication configuration on Site 2

IBM SONAS DR solution file tree replica considerations

The VM image files (.vmdk) can be tested at the remote site by creating read-only shares against the file tree replica of the DR SONAS system through the mkexport SONAS CLI command. This allows verification of, and read-only (ro) access to, the files to ensure that they are readily available if a disaster occurs at the production site. You can find more information on the mkexport SONAS CLI command at: http://publib.boulder.ibm.com/infocenter/sonasic/sonas1ic/index.jsp?topic=/com.ibm.sonas.doc/manpages/mkexport.html

The shares at the DR SONAS system should be accessed only for reading, to prevent modifications or deletions in the file tree replica at the target SONAS system. Asynchronous replication tracks only the changes at the source SONAS system; therefore, changes made at the target SONAS system might not be detected during incremental asynchronous replication updates. If there is a suspicion that the VM image files (.vmdk) have been deleted or modified at the target SONAS system independently of the source system, perform an asynchronous replication with the fullsync option, which detects discrepancies between the two systems by auditing the entire source and target file trees. Refer to the Starting asynchronous replication on Site 1 or Starting asynchronous replication on Site 2 sections for more information.
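The idea of auditing two file trees for discrepancies, as described above for the fullsync option, can be illustrated on ordinary directories. The following Python sketch is not how SONAS implements fullsync, which this report does not describe; it is a minimal stand-alone comparison, with hypothetical function names, that hashes every file under two locally mounted directory roots and reports files that are missing, extra, or different on the target side.

```python
import hashlib
import os
from typing import Dict, List


def tree_digest(root: str) -> Dict[str, str]:
    """Map each file path (relative to root) to the SHA-256 of its contents."""
    digests = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            sha = hashlib.sha256()
            with open(full_path, "rb") as handle:
                # Read in 1 MiB chunks so large image files do not fill memory.
                for chunk in iter(lambda: handle.read(1 << 20), b""):
                    sha.update(chunk)
            digests[os.path.relpath(full_path, root)] = sha.hexdigest()
    return digests


def audit(source_root: str, target_root: str) -> Dict[str, List[str]]:
    """Compare two file trees and list the discrepancies found on the target."""
    src = tree_digest(source_root)
    dst = tree_digest(target_root)
    common = set(src) & set(dst)
    return {
        "missing_on_target": sorted(set(src) - set(dst)),
        "extra_on_target": sorted(set(dst) - set(src)),
        "content_differs": sorted(p for p in common if src[p] != dst[p]),
    }
```

Hashing complete .vmdk files is expensive, so a comparison like this is suited to occasional spot checks of a read-only export rather than to routine use; the incremental replication cycle remains the normal mechanism for keeping the replica current.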

IBM SONAS business continuance planning and testing

Disaster recovery testing is an interdisciplinary practice used to create and validate a rehearsed logistical plan for how an organization recovers and restores partially or completely interrupted critical functions within a predetermined time after a disaster or extended disruption. This logistical plan is commonly referred to as the business continuity plan. An effective business continuity plan balances business needs against cost while considering all the risk factors. It is beyond the scope of this paper to detail all the aspects of building a master business continuity plan. It is nevertheless important to stress the value of frequent testing of the disaster recovery plan: as the old saying goes, a disaster recovery plan is only as good as its last successful test.

The overall steps for enabling the recovery site involve the following major tasks:

- Defining shares / exports to the file tree replica
- Continuing production operation against the remote system

In the test setup, an outage of the Site 1 (IBM SONAS Edge) SONAS system has been simulated, as shown in Figure 30.

Figure 30: Site 1 SONAS edge system failure

Table 3 shows the methods to simulate the Site 1 (IBM SONAS Edge) SONAS system failure.

Action: Actions to simulate the complete Site 1 outage
Description: The following options can be used to simulate a Site 1 (IBM SONAS Edge) primary file system outage. In the test setup, all of the following options were performed to simulate the Site 1 outage.
- Remove power from the source management node.
- Remove the interface nodes whose IP addresses are used to mount the NFS data store on the VMware ESX Servers from vCenter on Site 1, as well as the interface nodes configured for asynchronous data replication.
- Manually remove the disks that belong to the primary file system tree from the storage enclosures. Caution: Consult IBM support before performing this action to simulate a primary file system outage.
- Remove the NFS export to the primary file system tree where the active VMs reside. Note: Removing the NFS export to the primary file system tree is the most effective, least complicated, and easiest method to simulate the site outage.

Action: IBM SONAS events
Description:
SEVERE int005st001 124027792389569359 nfsd nfssvc: unable to bind UPD socket: errno 98 (Address already in use)
SEVERE int005st001 124027792389569359 nfsd nfssvc: Setting version failed: errno 16 (Device or resource busy)
SEVERE int005st001 124027792389569359 syslog rpc.statd[1661106]: Caught signal 15, unregistering and exiting.
SEVERE int005st001 124027792389569359 ctdbd Trying to restart NFS service
SEVERE strg001st001 124027792389569359 cim EFSSA0122C The state MULTIPATH - mpath changed (Disk went down) 401E0302
SEVERE ds002st001 124027792389569359 cim EFSSA0137C Received an SNMP-Storage message: Pool state changed to inoperative. There are not enough members in the storage pool for any data operations to occur.
WARNING ds002st001 124027792389569359 cim EFSSA0136W Received an SNMP-Storage message: A member of a storage pool has changed its status to missing.
WARNING ds002st001 124027792389569359 cim EFSSA0136W Received an SNMP-Storage message: Pool state changed to no redundancy. A storage pool has lost one or more members but can still accept I/O operations.
WARNING ds002st001 124027792389569359 cim EFSSA0136W Received an SNMP-Storage message: A device has been removed from a disk slot.
WARNING ds001st001 124027792389569359 cim EFSSA0136W Received an SNMP-Storage message: A device has been removed from a disk slot.
WARNING ds001st001 124027792389569359 cim EFSSA0136W Received an SNMP-Storage message: A device has been removed from a disk slot.

Action: vCenter events
Description: After a brief period, vCenter indicates that the Site 1 VMs are inaccessible. vCenter also indicates the NFS data store outage on the Site 1 ESX Servers. Refer to Figure 31.

Figure 31: vCenter shows the VM and NFS data store outage

Table 3: Site 1 (IBM SONAS Edge) SONAS system failure simulation methods

Recovering the disaster

In the test setup, the team deliberately disabled the SONAS system of Site 1 (IBM SONAS Edge) to simulate an extended Site 1 outage. The Site 1 ESX Servers in the VMware HA configuration lost the NFS data store connection to the SONAS system, and the VMs subsequently became inaccessible. This scenario is considered an extended outage of the IBM SONAS Edge system. Table 4 shows the list of recovery methods.

Site 2 (IBM SONAS Core) preparation for recovery

Perform the following steps to prepare the Site 2 (IBM SONAS Core) system.

1. Using the rmrepltarget command, remove the target path of the Site 1 primary file tree path replica. You can find more information on the rmrepltarget SONAS CLI command at: http://publib.boulder.ibm.com/infocenter/sonasic/sonas1ic/index.jsp?topic=/com.ibm.sonas.doc/manpages/cfgrepl.html
   Sample Site 2 usage:
   rmrepltarget /ibm/gpfs1/vsphere-site-1 -c sonasisvc3.storage.tucson.ibm.com
2. Create an NFS export (R/W) on the Site 2 file tree replica using the mkexport SONAS CLI command.

You can find more information on the mkexport SONAS CLI command at: http://publib.boulder.ibm.com/infocenter/sonasic/sonas1ic/index.jsp?topic=/com.ibm.sonas.doc/manpages/mkexport.html
If the file tree replica already has a read-only (R/O) NFS export defined, use the chexport SONAS CLI command to change it to read/write (R/W) so that the VMware ESX hosts at Site 1 have full access. You can find more information on the chexport SONAS CLI command at: http://publib.boulder.ibm.com/infocenter/sonasic/sonas1ic/index.jsp?topic=/com.ibm.sonas.doc/manpages/chexport.html

VMware vCenter

1. Mount the Site 2 file tree replica NFS data store on the ESX hosts configured in the Site 1 VMware HA cluster. You can find more information on VMware HA configuration with IBM SONAS at: http://public.dhe.ibm.com/partnerworld/pub/whitepaper/1a8b2.pdf
   Refer to Figure 32.

Figure 32: Site 2 NFS data store mount on Site 1 ESX Server

2. Using a PowerCLI script, recover the VMs from the NFS data store mounted from the Site 2 file tree replica. For this method, use the sample PowerCLI script listed in Appendix D: High-level VMware PowerCLI script to register VM.

Adding VMs through PowerCLI cmdlets

Use the following steps and cmdlets to recover the VMs from Site 2.

1. Connect to the relevant vCenter instance.
   Connect-VIServer vcenterinstance -Protocol https -User admin -Password pass
2. Scan each data store in sequence and add the discovered virtual machines to the inventory on the desired vSphere host:
   dir vmstores:\vcenterinstancename@443\datacentername\*\*\*.vmx | % {New-VM -VMHost vspherehostname -VMFilePath $_.DatastoreFullPath}
   Where \*\*\* can be replaced by any string or pattern (including wildcards) matching: \datastorename\foldername\vmxfilename
3. Switch on the VMs.
   Get-VM -Name * | Start-VM
4. Answer the virtual machine prompt regarding msg.uuid.altered:
   Get-VM -Name * | Get-VMQuestion | Set-VMQuestion -Option "I moved it" -Confirm:$false

At this point, all VMs should be switched on.

Table 4: Recovery methods (failover) to Site 2 (IBM SONAS Core) SONAS system

Recovery from an extended Site 1 outage

This scenario assumes that the SONAS primary file system of Site 1 was down for an extended outage. During the outage, the ESX / ESXi hosts within Site 1 mounted the appropriate NFS data store from the Site 2 file system replica, and the VMs on the file system replica are up and running. The SONAS primary file system of Site 1 has since been recovered, with either none or only a subset of its VM images damaged or lost. Table 5 shows the appropriate steps for recovering the SONAS primary file system of Site 1.

Step: Define the asynchronous replication from Site 2 to the Site 1
Description: To replicate the file system replica of Site 2 back to Site 1, the asynchronous relationship must be defined from Site 2 to Site 1. Refer to the Establishing the asynchronous replication relationship from Site 2 to Site 1 section for an appropriate example.
Test setup configuration:
Source SONAS = Site 2 (IBM SONAS Core)
Source file system = Site 2 file system replica (/ibm/gpfs1/vsphere-site1)
Target SONAS = Site 1 (IBM SONAS Edge)
Target file system = Site 1 primary file system (/ibm/gpfs1/vsphere-site1)

Step: Replicate file system replica (primary) from Site 2 to the Site 1
Description: Situation: The primary file system used in Site 1 is still present with all the VM images as they were at the time of the outage, or some of the file system VM images were lost. In either case, it is important to resynchronize the current Site 2 recovery file system back to the primary file system of Site 1. The first synchronization from Site 2 to Site 1 should be performed with the fullsync option of the startrepl SONAS CLI command. This option ensures that every file in Site 2 is verified against the Site 1 SONAS system for consistency. Although every file is checked, only files that are new or modified are actually transmitted.

Step: Verify the asynchronous relationship from the Site 1 to Site 2
Description: To resume normal replication from the Site 1 SONAS system to the Site 2 SONAS system, the asynchronous relationship should be verified with the cfgrepl and cfgreplfs commands on Site 1 and the lsrepltarget command on Site 2. If the asynchronous relationship is not correct, redefine the asynchronous replication using the configuration CLI commands.

Step: Ensure Site 1 is current and resume replication from Site 1 to Site 2
Description: If a large number of VMs were modified at Site 2, replicating the VM images back to Site 1 can take considerable time. Modifications on Site 2 can continue while the replication takes place. Staged replications might be required to bring Site 1 to the point where operations against Site 2 can be suspended. Perform a final asynchronous replication to Site 1 to make it current, and then reestablish the NFS data store of the Site 1 ESX / ESXi hosts on the primary file system of Site 1. Use the PowerCLI script to restart the VMs.

Table 5: Steps to recover from the extended Site 1 outage

The asynchronous relationship can then resume in the direction from Site 1 to Site 2.
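The staged replications mentioned in Table 5 shrink the outstanding delta pass by pass, because each pass only has to carry the changes that accumulated at Site 2 while the previous pass was running. The following Python sketch models that behavior under simplifying assumptions that do not come from the test setup: a constant change rate at Site 2, a constant replication throughput toward Site 1, and hypothetical numbers and function names throughout.

```python
from typing import List, Tuple


def staged_passes(initial_delta_gib: float, change_rate_gib_per_h: float,
                  throughput_gib_per_h: float, cutover_gib: float,
                  max_passes: int = 20) -> Tuple[List[float], float]:
    """Simulate staged failback replications from Site 2 to Site 1.

    Returns the delta sizes (GiB) shipped by each staged pass and the
    remaining delta left for the final replication, which is performed
    after operations against Site 2 have been suspended.
    """
    if throughput_gib_per_h <= 0:
        raise ValueError("throughput must be positive")
    if change_rate_gib_per_h >= throughput_gib_per_h:
        # Changes accumulate at least as fast as they are shipped.
        raise ValueError("staged replication never converges at this change rate")
    shipped = []
    delta = initial_delta_gib
    while delta > cutover_gib and len(shipped) < max_passes:
        shipped.append(delta)
        hours = delta / throughput_gib_per_h       # duration of this pass
        delta = change_rate_gib_per_h * hours      # changes made meanwhile at Site 2
    return shipped, delta


if __name__ == "__main__":
    # Hypothetical failback: 500 GiB changed during the outage, 10 GiB/h of
    # ongoing changes, 100 GiB/h replication throughput, cut over below 1 GiB.
    passes, final_delta = staged_passes(500, 10, 100, 1)
    print("staged passes (GiB):", passes)
    print("final delta (GiB):", final_delta)
```

With these example numbers each pass is one tenth the size of the previous one, so the suspension of operations at Site 2 only needs to cover the short final replication rather than the full resynchronization.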

Summary

The integrated solution of VMware HA, VMware FT, and IBM SONAS technologies provides a simple, best-in-class, highly available, and extremely scalable enterprise virtual infrastructure solution for planned and unplanned downtime in virtual data center environments.

The IBM SONAS business continuity solution for enterprise IT virtual infrastructure is enabled by tight integration of VMware vSphere 4.1 features, VMware PowerCLI, and IBM SONAS asynchronous replication features. This solution offers customers the ability to implement a highly available and extremely scalable enterprise IT virtual infrastructure to support their mission-critical applications and to ensure business continuity with a reliable disaster recovery solution.

This paper is not intended to be a definitive implementation or solutions guide for enterprise virtual infrastructure solutions in VMware vSphere 4.1 with IBM SONAS. Many factors related to specific customer environments are not addressed in this paper. Contact IBM to speak with one of the IBM virtualization solutions experts about any deployment requirement.

Appendix A: Glossary

IBM Scale Out Network Attached Storage (SONAS) - Built on IBM high-performance computing experience and based on IBM General Parallel File System (IBM GPFS), this scale out network-attached storage (NAS) solution provides the performance, clustered scalability, high availability, and functionality that are essential to meet strategic Petabyte Age and cloud-storage requirements.

Asynchronous replication - Allows replication of file systems across long distances or to low-performance, high-capacity storage subsystems.

Synchronous file system replication - A feature provided by the IBM SONAS file system.

VMware vSphere - Formerly developed as VMware Infrastructure 4, VMware vSphere is VMware's first cloud operating system that can manage large pools of virtualized computing infrastructure, including software and hardware.

VMware High Availability (HA) - Provides easy-to-use, cost-effective high availability for applications running in virtual machines.

VMware Fault Tolerance (FT) - Provides continuous availability for applications in the event of server failures by creating a live shadow instance of a virtual machine that is in virtual lockstep with the primary instance.

VMware ESX / ESXi - Bare-metal embedded hypervisors. They are VMware's enterprise software hypervisors for servers and run directly on server hardware without requiring an additional underlying operating system.

VMware vCenter Server - Delivers centralized management, operational automation, resource optimization, and high availability to IT environments.