IBM Power Systems Technical University October 18 22, 2010 Las Vegas, NV Session Title: Designing a PowerHA SystemMirror for AIX Disaster Recovery Solution Session ID: HA18 (AIX) Speaker Name: Michael Herrera 2010 IBM Corporation
Best Practices for Designing a PowerHA Enterprise Edition Solution on AIX Michael Herrera (mherrera@us.ibm.com) Advanced Technical Skills (ATS) Certified IT Specialist + Workload-Optimizing Systems
Agenda Available Offerings Campus Disaster Recovery vs. Extended Distance What you get with Enterprise Edition Expected Fallover Behaviors Summary 3
Tiers of Disaster Recovery
PowerHA SM Enterprise Edition: HA & DR solutions from IBM for your mission-critical AIX applications (PowerHA Enterprise Edition fits in at Tier 7)
Tier 7 - Highly automated, business-wide, integrated solution (examples: GDPS/PPRC/VTS P2P, AIX PowerHA Enterprise Edition, OS/400 HABP...) - zero or near-zero data recreation
Tier 6 - Storage mirroring (examples: XRC, Metro & Global Mirror, VTS Peer to Peer) - minutes to hours of data recreation
Tier 5 - Software two-site, two-phase commit (transaction integrity) - up to 24 hours of data recreation
Tier 4 - Batch/online database shadowing & journaling, point-in-time disk copy (FlashCopy), TSM-DRM - up to 24 hours of data recreation
Tier 3 - Electronic vaulting, TSM**, tape - 24 to 48 hours of data recreation
Tier 2 - PTAM*, hot site, TSM**
Tier 1 - PTAM*
Tolerance ranges from applications with low tolerance to outage (upper tiers) to applications very tolerant of outage (lower tiers); recovery times span 15 min, 1-4 hr, 4-8 hr, 8-12 hr, 12-16 hr, 24 hr, to days.
Tiers based on SHARE definitions. *PTAM = Pickup Truck Access Method with Tape  **TSM = Tivoli Storage Manager  GDPS = Geographically Dispersed Parallel Sysplex 4
HACMP is now PowerHA SystemMirror for AIX! A 20-year track record in high availability for AIX
Current release: 7.1.0.X, available on AIX 6.1 TL06 & 7.1
Packaging changes: Standard Edition - local availability; Enterprise Edition - local & disaster recovery (version 7.1 of the Enterprise Edition will not be released until 2011)
Licensing changes: Small, Medium, Large server class
Product lifecycle:
  Version                       Release Date     End of Support Date
  HACMP 5.4.1                   Nov 6, 2007      Sept 2011
  PowerHA 5.5.0                 Nov 14, 2008     N/A
  PowerHA SystemMirror 6.1.0    Oct 20, 2009     N/A
  PowerHA SystemMirror 7.1.0    Sept 10, 2010    N/A
* These dates are subject to change per Announcement Flash 5
PowerHA SystemMirror Version 6.1 Editions for AIX (7.1 Enterprise Edition N/A till 2011)
Standard Edition (high-level features):
- Centralized management: C-SPOC & SMIT management interfaces
- AIX event/error management
- Integrated heartbeat & integrated disk heartbeat
- PowerHA DLPAR HA management
- Smart Assists
- Cluster resource management, shared storage management, cluster verification framework
Enterprise Edition (adds):
- Multi-site HA management
- PowerHA GLVM async mode & GLVM deployment wizard
- IBM Metro Mirror support & IBM Global Mirror support*
- EMC SRDF sync/async
- Hitachi TrueCopy & Global Replicator*
* Hitachi & Global Mirror functionality is only available in 6.1.0.3
Highlights: new editions to optimize software value capture; Standard Edition targeted at datacenter HA; Enterprise Edition targeted at multi-site HA/DR; tiered pricing structure (Small/Med/Large) 6
High Availability & DR: Drawing the Line - different perspectives on the protection and replication of data
Campus Style DR:
- Cross Site LVM Mirroring - AIX LVM mirrors
- SVC Split I/O VDisk Mirroring - SVC VDisk functionality
- Metro Mirror or SRDF* - disk-based replication (*to manage disk-level replication the Enterprise Edition is required)
Extended Distance Offerings:
- Metro Mirror & Global Mirror - SVC, DS6K, DS8K, ESS800
- EMC SRDF - DMX3, DMX4, VMAX
- Hitachi TrueCopy & Global Replicator - USP V, USP VM
- GLVM (sync/async) - IP-based replication
Remote Site / Data Center 7
Local High Availability vs. Disaster Recovery
How far can I stretch a local cluster? How far can my storage be shared? (distance limitations ~10 km or 6 miles)
Options between Storage Enclosure 1 and Storage Enclosure 2: LVM mirroring, disk replication, VDisk mirroring
Considerations:
- Network connectivity: subnetting & potential latency
- Storage infrastructure: can you merge fabrics and present LUNs from either location across the campus?
- Desired resiliency:
  - LVM mirroring across storage subsystems - both copies are accessible
  - Storage-level replication - only the active copy is available
  - VDisk mirroring (SAN Volume Controller) - single logical copy mirrored on the back end 8
Campus Style DR: Cross Site LVM Mirroring - leverage AIX logical volume mirroring
Distance limitations (synchronous):
- Direct SAN links: up to 15 km, LVM mirrors between FC switches at each site
- DWDM, CWDM or other SAN extenders: longer reaches (e.g. 120-300 km), with distance limited by the latency effect on performance 9
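Because cross-site LVM mirroring is synchronous, every mirrored write waits on the inter-site round trip. A back-of-envelope illustration (not from the slides): light in fiber travels roughly 200,000 km/s, about 5 microseconds per km one way, so distance alone adds about 10 microseconds per km to each write, before switch and protocol overhead:

```shell
# Propagation-delay-only estimate for a synchronous mirrored write.
# Assumes ~200,000 km/s in fiber (~5 us per km one way); real latency is higher.
for km in 15 120 300; do
  awk -v d="$km" 'BEGIN { printf "%4d km: ~%d us added per write (round trip)\n", d, d * 10 }'
done
```

At 300 km the propagation delay alone is around 3 ms per write, which is why the deck treats ~100+ km configurations as performance-limited rather than hard-limited.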
Campus DR: Cross Site LVM vs. Storage Replication
Considerations: Standard Edition vs. Enterprise Edition
Disk replication: a common replication mechanism across platforms
Performance differences: host-based LVM mirroring vs. disk replication - see the white paper "Cross Site Mirroring Performance Implications": http://www-03.ibm.com/support/techdocs/atsmastr.nsf/webindex/wp101269
Choices: Cross Site LVM Mirroring, VDisk Mirroring (split I/O group), Metro Mirroring 10
PowerHA & Logical Volume Mirroring - considerations: what do the volume groups look like?
Example: datavg with logical volumes spread across hdisks; LV copy 1 (primary) on the local storage subsystem, LV copy 2 (secondary) on the remote storage subsystem
New in AIX 6.1 - Mirror Pools:
- Intended for asynchronous GLVM
- Address issues with extending logical volumes and spanning copies
New DR Redbook: Exploiting PowerHA SystemMirror Enterprise Edition - scenario for Cross Site LVM with Mirror Pools 11
AIX 6.1 & Mirror Pools (SMIT Panels & CLI)
Benefits: prevent spanning copies; a requirement for async GLVM
Other potential uses: Cross Site LVM configurations, synchronous GLVM
* This is the reason there is no asynchronous GLVM on AIX 5.3 and why it was not retrofitted
* C-SPOC does not currently allow you to create the logical volume via its menus
* The workaround is to create the logical volume using smit mklv and then continue creating the file system via C-SPOC 12
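The workaround can be sketched as a configuration fragment from the command line. This is a hedged illustration: the volume group, pool, LV, and disk names are invented, and the flags should be verified against your AIX 6.1 level before use:

```shell
# Assign each site's disks to a named mirror pool (names are placeholders)
chpv -p siteA_pool hdisk2        # LUNs from the local storage subsystem
chpv -p siteA_pool hdisk3
chpv -p siteB_pool hdisk4        # LUNs from the remote storage subsystem
chpv -p siteB_pool hdisk5

# Create the mirrored LV outside of C-SPOC, pinning one copy per pool
mklv -y datalv -t jfs2 -c 2 -p copy1=siteA_pool -p copy2=siteB_pool datavg 64

# Then continue in C-SPOC to create the file system on the existing LV,
# so the new file system definition is still propagated to all cluster nodes.
```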
Infrastructure Considerations
Site A and Site B connected via LAN and via SAN over DWDM; Node A and Node B share SITEAMETROVG (8 x 50 GB LUNs)
Important: Identify & eliminate single points of failure! 13
Infrastructure Considerations
Site A and Site B connected via XD_rs232, XD_IP (WAN) and net_ether_0 (LAN) networks, plus SAN over DWDM
Disk heartbeat VGs span the sites: ECM VG diskhb_vg1 (1 GB, PVID 000fe4111f25a1d1, hdisk2/hdisk3) and ECM VG diskhb_vg2 (1 GB, PVID 000fe4112f998235, hdisk3/hdisk4), alongside SITEAMETROVG (8 x 50 GB LUNs)
Important: Identify single points of failure & design the solution around them 14
XD_rs232 networks and Serial over Ethernet
Converted rs232 using rs422/rs485: using true serial requires rs422/rs485 converters; distances up to 1.2 km at 19,200 bps or 5 km (~3.1 miles) at 9,600 bps
Converted rs232 using fiber optics: fiber optic modems or multiplexors; distances of 20-100 km (~12-62 miles), but must conform to the vendor's specifications to avoid signal loss; offered by companies like Black Box and TC Communications
Serial over Ethernet: this option provides the greatest distance by not defining any hard limitations, but it is based on TCP/IP, which is one of the components this type of network is designed to isolate; several vendors are available online 15
PowerHA SystemMirror: Prominent Client Issues
Cluster subnet requirements - how do clients connect?
- IPAT across sites (site-specific service IPs)
- Context switch via external devices (e.g. F5, www.f5.com)
- Static IPs / node-bound service IP (manual reconnect)
- DNS change (consider TTL - Time to Live)
Resource Group A: Startup: Online on Home Node Only; Fallover: Fallover to Next Node in List; Fallback: Never Fallback; Site Policy: Prefer Primary Site; Nodes: NodeA, NodeB; Service IPs: service_ip1, service_ip2; Volume Groups: datavg; Application Server: AppA
[Diagram: XD_rs232_net_0 and XD_IP_net_0 between Bldg A and Bldg B; en2 base 10.10.10.100 / service_ip1 10.10.10.120 and en2 base 13.10.10.100 / service_ip2 13.10.10.120; disk_hb_net_0 and disk_hb_net_1 on 1 GB LUNs; cluster data on 30 GB LUNs at both buildings] 16
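For the DNS-change option, the record's TTL bounds how long clients can keep connecting to the old site's address after a change. A hedged sketch (the hostname and the output shown are illustrative, not from the session):

```shell
# Inspect the cached lifetime of the service record before a planned move
dig +noall +answer appserver.example.com
#   appserver.example.com.  300  IN  A  10.10.10.120
# A TTL of 300 s means clients may hold the old address for up to 5 minutes
# after the DNS change; lower the TTL ahead of a planned site fallover.
```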
PowerHA Extended DR Solution Progression - building blocks for success
Example topology: SVC_Site A and SVC_Site B joined by net_ether_01, xd_ip, net_diskhb_01 and net_diskhb_02; nodes A1/A2 and B1/B2 on their local SANs, with PPRC links between the SVC clusters
HA first, then DR 17
What are customers doing? (Manual vs. Automated) - longer distances require more robust solutions
- Local clustering with replication under the covers: Metro / Global Mirror, SRDF/A, Hitachi TrueCopy & Global Replicator, Oracle Data Guard, DB2 HADR
- Replicated volumes to an inactive cluster: standalone GLVM IP replication (GLVM is available in the base AIX 5.3 & 6.1 media); the Enterprise Edition is required for automation
- Fully automated solution: PowerHA SystemMirror Enterprise Edition - additional offerings in the works! 18
PowerHA SystemMirror Storage Replication Integration - Enterprise Edition storage replication offerings
Characteristics:
- Distance limitations: synchronous or asynchronous
- Supported replication: Metro & Global Mirror, SRDF, Hitachi TrueCopy & Global Replicator
How it works: the cluster will redirect the replication depending on where the resources are being hosted. The Enterprise Edition adds additional cluster panels to define and store the relationships for the replicated volumes, and a CLI is enabled for each replication offering to communicate directly with the storage enclosures and perform a role reversal in the event of a fallover (source LUNs at Site A, target LUNs at Site B, IP communication between sites)
Considerations: DS8700 Global Mirror, EMC SRDF & Hitachi TrueCopy require PowerHA 6.1+ 19
SVC Version 5 Interoperability Matrix - storage-level virtualization for your Enterprise needs
[Flattened interoperability diagram; highlights:]
- Hosts (1024 per cluster): IBM AIX, IBM i 6.1, IBM z/VSE, Linux (Intel/Power/zLinux - RHEL, SUSE 11), Microsoft Windows / Hyper-V, VMware vSphere 4, Sun Solaris, HP-UX 11i, Tru64, OpenVMS, SGI IRIX, Novell NetWare, Apple Mac OS, IBM BladeCenter, IBM N series Gateway, NetApp V-Series, IBM TS7650G
- Back-end storage: IBM DS3400, DS4000, DS5020, DS3950, DS6000, DS8000; IBM XIV; DCS9550/DCS9900; IBM N series; Hitachi Lightning, Thunder, TagmaStore, AMS 2100/2300/2500, WMS, USP; HP MA, EMA, MSA 2000, XP, EVA 6400/8400; EMC CLARiiON CX4-960, Symmetrix; Sun StorageTek; NetApp FAS; NEC iStorage; Fujitsu Eternus 8000/4000/2000/1200; Bull StoreWay
- Functions: point-in-time copy (full volume, copy-on-write, 256 targets, incremental, cascaded, reverse, space-efficient, FlashCopy Mgr), continuous copy (Metro/Global Mirror, Multiple Cluster Mirror), space-efficient virtual disks, virtual disk mirroring, native iSCSI, SSD, 8 Gbps SAN fabric, Entry Edition software
For the most current, and more detailed, information please visit ibm.com/storage/svc and click on Interoperability. 20
Enterprise Edition Disk Replication Integration - so what are you paying for?
- cluster.xd.license - Enterprise license
- cluster.es.pprc.rte, cluster.es.pprc.cmds, cluster.msg.en_us.pprc - direct PPRC management
- cluster.es.spprc.rte, cluster.es.spprc.cmds, cluster.es.msg.en_us.pprc - DSCLI management
- cluster.es.sr.rte, cluster.es.sr.cmds, cluster.msg.en_us.sr - EMC SRDF
- cluster.es.tc.rte, cluster.es.tc.cmds, cluster.msg.en_us.tc - Hitachi TrueCopy & Global Replicator
Qualified & supported DR configurations - IBM Development & AIX Software Support, teaming with EMC & Hitachi under a cooperative service agreement
Install all filesets or only what you need - note that Enterprise verification takes longer, so don't install what you are not using
Filesets are in addition to base replication solution requirements 21
Geographic Logical Volume Mirroring (GLVM) - the Enterprise Edition integrates with this IP replication offering
How it works: drivers make the remote disks appear as if they were local over the WAN, allowing LVM mirrors between local and remote disks. Asynchronous replication requires the use of AIO cache logical volumes and Mirror Pools, available only in AIX 6.1 and above.
GLVM code is available in the AIX base media: AIX 5.3 - synchronous replication; AIX 6.1 - synchronous & asynchronous replication
PowerHA SystemMirror Enterprise Edition provides SMIT panels to define and manage all configuration information and automates the management of the replication in the event of a fallover (source LUNs at Site A, target LUNs at Site B, IP I/O replication and IP communication between sites)
* Storage can be dissimilar subsystems at either location
Find more details in the new DR Redbook SG24-7841-00 22
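Under the covers, GLVM pairs an rpvserver device at the remote site with an hdisk-like rpvclient device at the local site. A minimal sketch, assuming placeholder PVIDs and addresses (in practice the smit rpvserver / rpvclient panels, or the PowerHA GLVM wizard, build these devices for you):

```shell
# Remote site: export the physical volume as a Remote Physical Volume server
mkdev -c rpvserver -s rpvserver -t rpvstype \
      -a rpvs_pvid=000fe4111f25a1d1 -a client_addr=10.10.10.100 -a auto_online=n

# Local site: the exported PV now surfaces as an rpvclient "disk" device
mkdev -c disk -s remote_disk -t rpvclient \
      -a pvid=000fe4111f25a1d1 -a server_addr=13.10.10.100 -a io_timeout=180

# From here ordinary LVM applies: add the new remote hdisk to the VG and
# mirror against it, e.g. extendvg datavg hdiskN ; mirrorvg datavg hdiskN
```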
AIX & Geographic Logical Volume Mirroring Filesets:
Enterprise License & Integration Filesets:
- cluster.xd.license
- cluster.xd.glvm
- cluster.doc.en_us.glvm
- cluster.msg.en_us.glvm
Geographic Logical Volume Mirroring (available on AIX media):
- glvm.rpv.client
- glvm.rpv.server
- glvm.rpv.util
- glvm.rpv.man.en_us
- glvm.rpv.msg.en_us 23
PowerHA SystemMirror & AIX 6.1 - Asynchronous GLVM
Vegas conference session: Implementing PowerHA SystemMirror Enterprise Edition for Asynchronous GLVM - double-session lab, Wednesday, Bill Miller 24
New in PowerHA SM 6.1 - GLVM Configuration Wizard: assists in the creation of a synchronous GLVM cluster

GLVM Cluster Configuration Assistant
Type or select values in entry fields. Press Enter AFTER making all desired changes.
                                              [Entry Fields]
* Communication Path to Takeover Node         []  +
* Application Server Name                     []
* Application Server Start Script             []
* Application Server Stop Script              []
HACMP can keep an IP address highly available: consider specifying Service IP labels and Persistent IP labels for your nodes.
  Service IP Label                            []  +
  Persistent IP for Local Node                []  +
  Persistent IP for Takeover Node             []  +

Generates the following HACMP configuration:
- Cluster name: <user supplied application name>_cluster
- 2 HACMP sites: "sitea" and "siteb"
- 2 HACMP nodes, one per site (hostname used for node name)
- Single XD_data network: IP-alias enabled, includes all inter-connected network interfaces, persistent IP address for each node (optional for single-interface networks)
- One resource group: inter-site management policy Prefer Primary Site, includes all the GMVGs created by the wizard, the application server, and one or more service IPs 25
Site Management Policies & Dependencies
The Enterprise Edition appends inter-site management policies beyond the resource group node list: Prefer Primary Site, Online on Either Site, Online on Both Sites
Standard Edition allows site definitions (e.g. Cross Site LVM configs)
RG Dependencies - Online on Same Site:
- Will group RGs into a set
- rg_move would move the set, not an individual resource group
- The software will prevent removal of an RG without removing the dependency first 26
Failure Detection Rate & Disaster Recovery
IP-based networks & serial networks: most customers using local HA have these by default
XD-type networks have slower failure detection rates
* PowerHA SystemMirror 7.1 has a self-tuning FDR with IP multicasting
* There is no Enterprise Edition available for the 7.1 2010 release 27
PowerHA SM Enterprise: Fallover Recovery Action
Two policies available: AUTO (default) or MANUAL
Expected fallover behaviors: MANUAL only prevents a fallover based on the status of the replicated volumes at the time of node failure. If the replication consistency groups reflect a consistent state, a fallover will still take place.
* The example shows the SVC menu, but the same option exists for all replication options 28
Manual Recovery - Special Instructions
In a scenario where the MANUAL recovery action was selected and a fallover did not occur because the storage relationships were inconsistent, the resource groups will go into an ERROR state and special instructions will be printed to the hacmp.out file 29
DLPAR & Disaster Recovery Processing Flow - how many licenses do you need?
1. Read requirements and activate LPARs
2. Start PowerHA
3. Release resources (fallover or rg_move)
4. Site fallover or movement of resources to the secondary site
Example profiles: base LPAR profile Min 1 / Desired 1 / Max 2; application server Min 1 / Desired 2 / Max 2
[Diagram: primary site with HMC-driven DLPAR on System A (Oracle DB) and System B (Oracle standby DB), secondary site with HMC-driven DLPAR on System C (standby Oracle DB); each partition grows or shrinks by 1 CPU as resources are released and acquired across the sites] 30
Enterprise Edition Command Line Interface - additional commands available in the Enterprise Edition:
DS Metro & Global Mirror relationships:
- /usr/es/sbin/cluster/pprc/spprc/cmds/cllscss
- /usr/es/sbin/cluster/pprc/spprc/cmds/cllsspprc
- /usr/es/sbin/cluster/pprc/spprc/cmds/cllsdss
SAN Volume Controller Metro & Global Mirror relationships:
- /usr/es/sbin/cluster/svcpprc/cllssvc
- /usr/es/sbin/cluster/svcpprc/cllssvcpprc
- /usr/es/sbin/cluster/svcpprc/cllsrelationship
GLVM resources & statistics:
- /usr/sbin/rpvstat
- /usr/sbin/gmvgstat
EMC SRDF relationships:
- /usr/es/sbin/cluster/sr/cmds/cllssr
Hitachi TrueCopy relationships:
- /usr/es/sbin/cluster/tc/cmds/cllstc
Knowing these will help you identify & manage the configuration; various usage examples are in the new Enterprise Edition Redbook 31
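When you inherit one of these clusters, a quick read-only sweep of the relevant CLIs shows what is defined and how replication is doing. A hedged example for an SVC-based configuration with GLVM in play (paths are from the slide; output formats vary by release):

```shell
# SVC PPRC definitions known to the cluster
/usr/es/sbin/cluster/svcpprc/cllssvc             # SVC clusters
/usr/es/sbin/cluster/svcpprc/cllssvcpprc         # replicated resource definitions
/usr/es/sbin/cluster/svcpprc/cllsrelationship    # Metro/Global Mirror relationships

# GLVM health, if geographically mirrored VGs are configured
/usr/sbin/rpvstat                                # RPV client I/O statistics
/usr/sbin/gmvgstat -t                            # per-GMVG PV/RPV totals and sync state
```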
Cluster Test Tool & Enterprise Edition
A utility available in the base-level code: automated test tool and custom test plans
The Enterprise Edition appends additional tests that can be included in custom test plans 32
Enterprise Edition: Component Failures & Outcomes
Failures may not always occur in an orderly fashion (i.e. a rolling disaster); in an ideal scenario the entire site goes down
Traditional failures:
- Server / LPAR failure - standard cluster behavior
- Storage subsystem failure (remember AUTO vs. MANUAL)
- Selective fallover behavior on quorum loss will result in movement of the RG
Most risky: communication links between the sites fail
- Tested in the Redbook by bringing down XD_IP network interfaces; results will vary based on the storage replication type
- Results: the standby site will acquire resources and redirect the relationship; the primary loses write access to the disks, commands hang, and this might result in a system crash
* Note: environments in the same network segment could experience duplicate IP ERROR messages
Intermittent failure (even worse):
- Links come back up and then log GS_DOM_MERGE_ERR (halt of the standby site)
- The entire cluster is now down, since access to the LUNs is not available on the primary site 33
Reference Diagram for Failure Scenario
* Note that there is only one network passing heartbeats between the sites
* The replication type was not specified, but this was probably an SVC Metro Mirror configuration based on the names of the states
* The arrows should really point in the other direction for the replication after the failure
Avoiding a partitioned cluster: more XD_IP networks, serial over Ethernet, diskhb networks over the SAN
Future considerations: quorum server 34
Recovery from Partitioned Cluster: Recommendations
Things to check:
- State of the cluster nodes (connectivity, HMC, state of interfaces, error report)
- State of heartbeat communication paths (e.g. lssrc -ls topsvcs)
- Consistency of the replicated volumes (the CLI will vary by replication type)
- Status of the data
What do you do to recover?
- Identify the cause ASAP; beware of intermittent failures
- Consider bringing down all nodes on one site (avoid a cluster-initiated halt); a hard reset might be the best approach, as a graceful stop might hang attempting to release individual resources (e.g. unmount or varyoff with no access to the volumes)
- Check consistency of the data - every application will be different
- Reintegrate nodes into the cluster accordingly; consider a verify & sync before reintegration 35
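The checks above can be condensed into a first-pass sweep of commands. This is a hedged sketch (run as root on each surviving node; expect output details to vary by code level):

```shell
lssrc -ls topsvcs                          # heartbeat rings: look for missed/defective members
lssrc -ls grpsvcs                          # group services domain state
/usr/es/sbin/cluster/utilities/clRGinfo    # which RGs are ONLINE or in ERROR, and where
errpt | more                               # GS_* / TS_* error labels hint at partition vs. merge
lsvg -o                                    # VGs varied on at this node: who really owns the data?
```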
When to use each Replication Option - major factors:
- Distance between sites: campus DR or extended distance?
- Infrastructure & available bandwidth
- What type of storage is currently being used? Same storage type at both locations?
- Requirement to use the CLI for management of relationships
- SLA requirements - is HA still required after a site fallover?
- What is the true requirement for automated fallover: Recovery Time Objective (RTO) and Recovery Point Objective (RPO)
Extended Distance Offerings: Introduction to PowerHA SystemMirror for AIX Enterprise Edition - HA20 (AIX), Thursday & Friday, Shawn Bodily 36
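For the asynchronous options (async GLVM, Global Mirror), the achievable RPO is governed by how far the replication link can fall behind the write rate. A back-of-envelope sizing check (all numbers are illustrative assumptions, not from the session):

```shell
# Peak write rate 40 MB/s replicated over a 20 MB/s effective link for a
# 10-minute burst: the shortfall accumulates as backlog the link must drain.
awk -v write=40 -v link=20 -v burst=600 'BEGIN {
    backlog = (write - link) * burst     # MB queued when the burst ends
    drain   = backlog / link             # seconds until the sites converge again
    printf "backlog: %d MB, worst-case RPO exposure: ~%d s\n", backlog, drain
}'
```

If the backlog regularly exceeds what the async mechanism can buffer (e.g. the GLVM AIO cache LV), the link rather than the software becomes the real RPO constraint.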
Enterprise Edition: General Recommendations
Clustering, replication and high availability solutions are not a replacement for backups (mksysbs, FlashCopies, snapshots)
Testing DR solutions is the only way to guarantee they will work:
- Testing should be performed at least once or twice a year
- It will help to identify any other components required outside of the cluster
- The recovery plan should be well documented & reside at both locations
Leverage cluster functions to ensure success:
- C-SPOC user functions guarantee that users are propagated to all cluster nodes
- User password cluster management functions ensure that changes are also updated on all cluster nodes 37
PowerHA SM Enterprise Edition: Value Proposition - difference from Standard Edition
- Automates the IP- or disk-based replication mechanism
- Stretch clusters / campus-style DR: distance based on how far you can extend shared storage; why pay more for campus DR? - use Cross Site LVM
- Automated fallover, with a MANUAL fallover option (based on the state of the disks); an Enterprise cluster will automatically trigger a fallover - to disable this, alter the startup scripts at the DR location
- Ease of management: one-time configuration; the location of the RG determines the direction of replication
- Installation, planning, maintenance & expected behaviors documented in the new DR Redbook SG24-7841 38
Questions? Thank you for your time! 39
Additional Resources
New - Disaster Recovery Redbook SG24-7841 - Exploiting PowerHA SystemMirror Enterprise Edition for AIX: http://www.redbooks.ibm.com/abstracts/sg247841.html?open
New - RedGuide: High Availability and Disaster Recovery Planning: Next-Generation Solutions for Multi-server IBM Power Systems Environments: http://www.redbooks.ibm.com/abstracts/redp4669.html?open
Online documentation: http://www-03.ibm.com/systems/p/library/hacmp_docs.html
PowerHA SystemMirror marketing page: http://www-03.ibm.com/systems/p/ha/
PowerHA SystemMirror wiki page: http://www-941.ibm.com/collaboration/wiki/display/wikiptype/high+availability
PowerHA SystemMirror (HACMP) Redbooks: http://www.redbooks.ibm.com/cgi-bin/searchsite.cgi?query=hacmp 40