MIMIX. Version 7.0 MIMIX Global Operations 5250


Published: September 2010

Copyrights, Trademarks, and Notices

Contents

Who this book is for
What is in this book
The MIMIX documentation set

Chapter 1 Clustering Introduction
    What is clustering?
    Clustering terminology

Chapter 2 The MIMIX Global Solution
    Simplified cluster configuration and management
    Cluster management from any system
    Application groups to enhance and simplify switching
    Customized automation scripting for switch processing
    Additional monitoring capability
    Simplified cluster administrative domain configuration
    Support for data protection technologies
    Journaling-based replication
    Switchable independent ASPs
    Geographic mirroring
    Mirrored SAN environments
    Requirements and considerations

Chapter 3 Clustering Overview
    Components of the IBM clustering framework
    Cluster resource services
    Support for logical replication
    Support for switchable device resources
    Support for resilient operational environments
    Clustering concepts
    Recovery domain
    Cluster events
    Failover
    Switchover
    Partition
    Rejoin
    System distress messages
    Resilient applications

Chapter 4 Introduction
    Common terms used throughout this document
    Effect of data group sets on controlling logical replication
    Data group set examples

Chapter 5 Status
    Checking application group status
    Resolving problems reported in the Monitors field
    Resolving problems reported in the Notifications field
    Resolving problems reported in Status columns
    Resolving a procedure status problem
    Resolving an *ATTN status for an application group
    Resolving other common status values for an application group
    Status for Work with Node Entries
    Status for Work with Data Resource Group Entries

Chapter 6 Working with status of procedures and steps
    Displaying status of procedures
    Displaying status of the last run of all procedures
    Displaying available status history of procedure runs
    Resolving problems with procedure status
    Responding to a procedure in *MSGW status
    Resolving a *FAILED or *CANCELED procedure status
    Displaying status of steps within a procedure run
    Resolving problems with step status
    Responding to a step with a *MSGW status
    Resolving *CANCEL or *FAILED step statuses
    Changing status of a procedure
    Running a procedure of type *USER
    Canceling a procedure

Chapter 7 Basic Operations
    Starting MIMIX
    Ending MIMIX for non-dedicated backup processing

Chapter 8 Advanced Operations
    Choosing how to end MIMIX for restricted state processing or an IPL
    Ending MIMIX as part of an application outage
    Ending MIMIX as part of an environment outage
    Ending MIMIX as part of an isolated outage
    Verifying the sequence of the recovery domain
    Changing the sequence of backup nodes
    Examples of changing the backup sequence

Chapter 9 Switching
    Switch processing
    Performing a planned switch
    Responding to an unplanned switch
    Monitoring a switch in progress
    Monitoring data groups during a switch
    Check data groups for a specific application group

Chapter 10 System Maintenance
    Installing MIMIX service packs
    Installing service packs for instances in the system database
    Installing service packs for instances in a switchable independent ASP
    MIMIX configuration under clustering
    System definitions
    Transfer definitions
    Data group definitions

Index

Who this book is for

The MIMIX Global Operations book is for administrators and operators in an IBM i clustering environment who either use the basic clustering support provided within MIMIX or who use MIMIX Global to integrate cluster management with MIMIX logical replication or supported hardware-based replication techniques.

What is in this book

The MIMIX Global Operations book describes basic clustering concepts and identifies supported clustering implementations. It provides operational procedures for addressing status issues as well as for normal operations such as starting, ending, or switching replication.

The MIMIX documentation set

The following documents about MIMIX products are available:

Using License Manager
This book describes software requirements, system security, and other planning considerations for installing MIMIX software and software fixes. The preferred way to obtain license keys and install software is by using AutoValidate and the MIMIX Installation Wizard. However, if you cannot use them, this book provides instructions for obtaining licenses and installing software from a 5250 emulator. This book also describes how to use the additional security functions from Vision Solutions which are available for MIMIX products and commands through License Manager. Also, to support compatible previous releases, this book includes requirements and troubleshooting information for MIMIX Availability Manager.

MIMIX Administrator Reference
This book provides detailed conceptual, configuration, and programming information for MIMIX Enterprise and MIMIX Professional. It includes checklists for setting up several common configurations, information for planning what to replicate, and detailed advanced configuration topics for custom needs. It also identifies what information can be returned in outfiles if used in automation.

MIMIX Global Operations
This book provides high level concepts and operational procedures for MIMIX Global users in an IBM i cluster environment. This book focuses on addressing problems reported in status and basic operational procedures such as starting, ending, and switching.

MIMIX Operations
This book provides high level concepts and operational procedures for managing your high availability environment using MIMIX Enterprise or MIMIX Professional from a 5250 emulator. This book focuses on tasks typically performed by an operator, such as checking status, starting or stopping replication, performing audits, and basic problem resolution.

Using MIMIX Monitor
This book describes how to use the MIMIX Monitor user and programming interfaces available with MIMIX Enterprise or MIMIX Professional. This book also includes programming information about MIMIX Model Switch Framework and support for hardware switching.

Using MIMIX Promoter
This book describes how to use MIMIX commands for copying and reorganizing active files. MIMIX Promoter is available with MIMIX Enterprise only.

MIMIX for IBM WebSphere MQ
This book identifies requirements for the MIMIX for MQ feature which supports replication in IBM WebSphere MQ environments. This book describes how to configure MIMIX for this environment and how to perform the initial synchronization and initial startup. Once configured and started, all other operations are performed as described in the MIMIX Operations book.

CHAPTER 1 Clustering Introduction

IBM System i clustering integrates the functionality of reliable hardware, high availability software, and applications software into a robust, highly available, and resilient computing environment. MIMIX Global adds significant value to clustering environments by providing cluster management that focuses on applications, integrates multiple data protection technologies, and provides enhanced switching capabilities.

This chapter defines clustering and its commonly used terminology. The chapter "The MIMIX Global Solution" provides details of the value added by MIMIX Global. The chapter "Clustering Overview" provides background information and a conceptual overview of System i clustering.

What is clustering?

Clustering is architected to facilitate continuous availability. Whether you experience a system outage, a site loss, or need planned downtime for system maintenance, access to the functions provided on a clustered system can be switched over to one or more other systems that contain a current copy of the critical application data.

The fundamental concept of clustering is that of resilient resources: data, processes, applications, and devices that can be recovered if the system on which they exist fails. With clustering, resources are located on more than one system. If the system which is the primary access point for a particular set of resilient resources should fail, a pre-defined backup system becomes the primary access point.

System i clustering is designed so that each of the necessary components can work together to provide continuous availability for data and applications. The primary components of clustering, shown in Figure 1, are:

Cluster framework - functions and programming interfaces provided by IBM i and additional licensed programs
Resilient applications - software that is made resilient through exit programs which enable it to be controlled in a clustering environment, created with assistance from a high availability business partner (HABP)
Resilient data - middleware software for a variety of high availability technologies and the exit programs which enable it to be controlled in a clustering environment, provided by Vision Solutions or IBM
Cluster management - tools and user interfaces provided by IBM and by Vision Solutions

Figure 1. Components of a System i cluster solution

Clustering terminology

The following list introduces IBM terminology used for clustering.

A cluster is a collection of interconnected complete computers that work together as a single unified computing resource. The cluster is made up of one or more cluster nodes and is identified by a name of 10 or fewer characters.

A cluster node is any system or logical partition (LPAR) that is a member of a cluster. System i clustering supports up to 128 nodes in a cluster, but each node can be defined to only one cluster at a time. The set of nodes that are defined to a cluster is referred to as the cluster membership list.

Cluster communication uses the TCP/IP protocol to provide communication paths between each node in the cluster. A node must be connected to the cluster using an IP network in order to communicate with other nodes in the cluster. IBM recommends that you establish a dedicated path that is not shared by users or other network traffic. For simplicity, the name of a node is often the same name as the host or system name. This name is then mapped to an 8-character cluster node identifier that is associated with one or more Internet Protocol (IP) addresses that represent a system.

Cluster resources are the resources that are required to be highly available by your business and are available to the nodes within a cluster. Cluster resources can be either moved or replicated to one or more nodes within a cluster. Examples include applications, data libraries, devices, and disk units. Resources are identified in cluster resource groups and controlled through cluster resource group exit programs.

A cluster resource group (CRG) is an IBM i system object that identifies a collection of cluster resources to be monitored and managed as a single unit. Each CRG defines the relationship between the nodes associated with those resources in a recovery domain. The recovery domain determines the role of each node in the CRG as well as the degree to which each node can participate in events such as synchronizing or performing a recovery action. Several types of CRGs are available. Each of the following CRG types is designed for a specific type of cluster resource: application, data, device, and peer.

Each CRG has a CRG exit program that is called on each active node in the CRG's recovery domain in response to a cluster event. The exit program manages cluster events for the environment established by the CRG. All possible cluster events have a pre-determined response in the exit program code.
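The naming and membership limits in the terminology above can be illustrated with a small sketch. This is not part of MIMIX or IBM i; the `Cluster` class and `validate` function are hypothetical names used only to restate the rules (a cluster name of 10 or fewer characters, up to 128 nodes, a node defined to only one cluster at a time, and a node identifier associated with one or more IP addresses). The sketch also assumes the node identifier may be shorter than 8 characters.

```python
from dataclasses import dataclass, field

MAX_CLUSTER_NAME_LEN = 10  # cluster name: 10 or fewer characters
MAX_NODES = 128            # up to 128 nodes in a cluster
MAX_NODE_ID_LEN = 8        # assumption: identifier may be up to 8 characters


@dataclass
class Cluster:
    """Hypothetical model of a cluster and its membership list."""
    name: str
    # node identifier -> IP addresses that represent the system
    nodes: dict = field(default_factory=dict)


def validate(cluster, other_clusters=()):
    """Return a list of rule violations; an empty list means no violations."""
    problems = []
    if not 1 <= len(cluster.name) <= MAX_CLUSTER_NAME_LEN:
        problems.append(f"cluster name {cluster.name!r} must be 1-10 characters")
    if not 1 <= len(cluster.nodes) <= MAX_NODES:
        problems.append("membership list must contain 1-128 nodes")
    for node_id, addresses in cluster.nodes.items():
        if not 1 <= len(node_id) <= MAX_NODE_ID_LEN:
            problems.append(f"node identifier {node_id!r} must be 1-8 characters")
        if not addresses:
            problems.append(f"node {node_id!r} needs at least one IP address")
        # A node can be defined to only one cluster at a time.
        for other in other_clusters:
            if node_id in other.nodes:
                problems.append(f"node {node_id!r} already belongs to {other.name!r}")
    return problems
```

For example, `validate(Cluster("PRODCLU", {"NODEA": ["10.0.0.1"]}))` returns an empty list, while a node that also appears in another cluster's membership list is reported.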

CHAPTER 2 The MIMIX Global Solution

MIMIX Global leverages the architected relationships within IBM System i clustering and Vision expertise with high availability solutions to provide maximum protection and maximum flexibility to clustered environments.

MIMIX Global offers cluster management that raises the focus of your interaction with a cluster to the level of applications. This adds significant value by enabling the following:

A single point for monitoring nodes and controlled recovery from failover events
Support for and coordinated control of multiple data protection technologies, including hybrid environments
A single button switch that coordinates switching your applications in conjunction with the data protection technologies in use
Integration of customized scripts to automate actions during switching
Automation that detects problems that would affect your recovery time objectives (RTOs) before you need to switch
Simplified configuration for the cluster administrative domain
Seamless integration of cluster management with high availability solutions from Vision Solutions

Simplified cluster configuration and management

Unique features of cluster management capabilities provided by MIMIX Global include the ability to manage the cluster from any system, simplified configuration and control in complex environments, application level switching, and automation to detect potential problems before switching.

Cluster management from any system

Simply stated, multi-management support enables MIMIX Global as well as MIMIX replication products to be managed from more than one node in the cluster. Multi-management support also enables other significant benefits.

A key benefit for clustering is that multi-management support ensures that configuration information about data to be replicated is kept synchronized across nodes in the cluster. Even though a node may not be actively participating in data replication, it can easily assume the role of a backup node or the primary node because its replication configuration remains current.

Multi-management support can simplify the configuration, operation, and maintenance of multi-node data replication environments. For example, many data replication products are designed to function between two nodes, a production and a backup. Without multi-management support, a classic one-to-many network with three nodes would typically require multiple instances of a replication product to provide high availability and disaster recovery for the production system. Each instance would require separate configuration, operation, and maintenance. In the event that the production system is not available, each instance would have to be switched separately and management of the switching operation would require significant manual operations and planning.

Multi-management support can significantly reduce, and may even eliminate, the need for multiple instances of the replication product. Multi-management support enables MIMIX Global to manage complex data replication configurations within a cluster along with all other cluster management operations from a single instance.

Application groups to enhance and simplify switching

MIMIX Global provides the application group construct to enhance the ability to group and control resources in a way that maintains relationships between resources when responding to cluster events.

An application group identifies an application, including the application name, release level, its IP takeover address, an exit program to control the various actions, and the data associated with the application. An application group also includes the node entries that define the recovery domain for the application.

The most significant benefit of application groups is that they provide the ability to control switching of all data replication associated with an application in a single switch request. When a cluster event occurs, the application group provides a coordinated response for all of its associated resources. Application groups also integrate and simplify cluster management and data replication management.

Application groups can also be used when there is no application, only data that needs to be resilient.
Customized automation scripting for switch processing

MIMIX Global can include customized scripts within the exit programs called by application groups to automate processes, enabling faster switch times. Certified consultants can create customized scripts for switch processing that automate a sequence of actions to be performed by application or data exit programs. Error handling for the result of each action can be included in the script. Exit programs shipped with MIMIX Global use this scripting capability to control starting, stopping, and switching replication from application groups.

Additional monitoring capability

MIMIX Global provides the following additional monitoring capabilities:

Potential independent ASP overflow conditions - MIMIX Global provides improved capability for detecting potential independent ASP overflow conditions that put your high availability solution at risk due to insufficient storage. If an independent ASP overflows, data may be lost and applications may no longer function.
Consolidated application attention status - An application group status monitor is created when a MIMIX Global application group is created and notifies you of any conditions that require attention (*ATTN) for the application group.
Conditions that cause prolonged switching when hardware is used as the HA technology - A switch delay monitor notifies you when conditions exist that may require action to prevent a prolonged switch that could jeopardize recovery time objectives (RTOs).
New objects not in the cluster administrative domain - The cluster administrative domain monitor periodically checks for new objects to be added to the IBM cluster administrative domain. The monitor adds a monitored resource entry to the administrative domain for each object that it finds.

Simplified cluster administrative domain configuration

MIMIX Global provides the ability to use generics when configuring the cluster administrative domain. MIMIX Global can also automatically remove monitored resource entries for deleted objects that have been flagged as failed.

Support for data protection technologies

IBM i clustering support is not aware of whether data is resilient. Instead, clustering provides interfaces through which other software products that provide data protection and resiliency can integrate with clustering support. Data resiliency is established through a data CRG by the synchronization and replication of data and objects from the primary node to a backup node in the recovery domain.

The IBM System i supports the following data protection technologies for use with clustering:

Journaling
Switchable independent ASPs
Mirrored independent ASPs and mirrored external storage (SAN-based) functions

MIMIX Global fully supports all of these technologies through exit programs and provides coordination of cluster activities in hybrid environments which include combinations of data technologies. In addition, MIMIX Global seamlessly integrates cluster management with high availability solutions from Vision Solutions.
Journaling-based replication

Journal-based replication occurs at the object level. The operating system tracks changes to objects in user journals and the system journal. High availability software products from Vision Solutions read the journals in order to replicate the changes and apply them to a backup system.

Journal-based data replication can take advantage of robust functionality and enhancements, including remote journaling, user journal support for non-database object types, and minimized journal data.

Figure 2. Logical replication using remote journaling

MIMIX Global provides data CRG exit programs which interface with Vision's high availability software products for data replication. MIMIX Global integrates its cluster management capabilities with journal-based replication solutions to ensure that the backup node has all data and other information necessary to run critical production applications and jobs following a failover or switchover.

Data on backup systems can be made available for other activities, such as queries and saves. Because data is ready at the journal level, switches are fast. MIMIX Global simplifies control of broadcast-to-many environments and is not limited by distance for geographic dispersion of systems.

Switchable independent ASPs

When data replication is performed by switching independent auxiliary storage pools (ASPs), a single copy of the data exists. A device CRG identifies a collection of disks that can be varied on and off independently of the system ASP and basic ASPs (SYSBAS) in the event of a failover or switchover.

One scenario using switchable independent ASPs is to switch a disk pool between LPARs on the same physical system (Figure 3).

Figure 3. LPAR implementation of switchable disk pool

In environments which implement switchable independent ASPs, MIMIX Global provides device CRG exit programs and manages switching. MIMIX Global also provides additional monitoring for and notification of conditions that may require action in order to prevent a prolonged switch.

Switchable independent ASP support combined with logical replication software from Vision Solutions provides enhanced high availability and disaster recovery.

Geographic mirroring

Geographic mirroring is an independent ASP solution in which the IBM i operating system replicates data from independent ASPs at the memory page level (Figure 4). The independent ASP copy on the backup node is available for use after a detach operation stops its participation in the replication process. Geographic mirroring supports integrated environments (AIX, Linux, Windows).

Synchronous operation with a 10 mile maximum distance is recommended. Because a failure could result in data loss for heavily used objects, journaling is also recommended.

In environments which implement geographic mirroring, MIMIX Global provides device CRG exit programs and manages switching. MIMIX Global also provides additional monitoring for and notification of conditions that may require action in order to prevent a prolonged switch.

Geographic mirroring support combined with logical replication software from Vision Solutions provides enhanced high availability and disaster recovery.

Figure 4. Geographic mirroring

Mirrored SAN environments

Storage area network (SAN) environments perform replication at the disk sector level between two storage servers. Peer to Peer Remote Copy (PPRC) provides a real-time copy on the backup storage server.

Note: A basic SAN environment without the use of PPRC mirroring software is not a cluster-enabled environment. Basic SAN is a disaster recovery solution that does not require independent ASPs. If it is implemented within a cluster, an abnormal IPL is required on a failover.

Metro mirroring (Figure 5) and global mirroring (Figure 6) are SAN implementations which integrate IBM TotalStorage PPRC functions with independent ASPs and clustering. Both solutions require the 5761-HAS, iHASM licensed program as well as licensed programs for Metro or Global PPRC software.

For metro mirroring, synchronous replication with a 300 kilometer maximum distance is recommended.

Figure 5. Metro mirroring example

Global mirroring supports asynchronous replication over unlimited distances.

Figure 6. Global mirroring example

In both solutions, journaling is recommended to aid in recovery.

MIMIX Global supports all storage solutions. When combined with logical replication software from Vision Solutions, MIMIX Global provides disaster recovery for data that is not within an independent ASP as well as additional monitoring for and notification of conditions that may require action in order to prevent a prolonged switch.

Requirements and considerations

The following requirements must be met for clustering:

Clustering requires that the Internet Daemon (INETD) server is running on all nodes at all times. Consider the following:
    The Internet Daemon uses ports 5550 and
    The Internet Daemon should be configured to automatically start with TCP/IP on all cluster nodes. This should be done through iSeries Navigator during cluster configuration.
    The Internet Daemon server will not start on any node where the QUSER user profile has *ALLOBJ special authority.
The value of the network attribute ALWADDCLU (Allow Add Cluster) must be set to *ANY on all nodes that will be added to the cluster configuration.
The system value QMLTTHDACN (Multithreaded job action) cannot be set to a value of 3 on any node in the cluster.
Any address to be used for IP takeover must be available for use on every switchable node within the cluster. Reserve the IP addresses but do not create them in advance. The operating system will create them during cluster configuration. Once the addresses are created, they should not be configured to start automatically.
The user profile used to run CRG exit programs must exist on all nodes in the recovery domain for the CRG and must have *IOSYSCFG special authority.
MIMIX Global requires a valid access code. If you do not have a valid *MIMIXCLU code, contact your Vision Solutions Sales Representative.
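Several of the requirements above are simple per-node conditions, so they lend themselves to a pre-configuration checklist. The following is a minimal sketch of such a checklist, not a MIMIX or IBM i interface. The function name `check_node` and every dictionary key are hypothetical; the sketch assumes the values were gathered from each node by some other means.

```python
def check_node(node):
    """Return the unmet clustering requirements for one node's settings.

    `node` is a hypothetical dictionary of values collected from the node;
    the key names are illustrative assumptions, not real API fields.
    """
    problems = []
    if not node.get("inetd_running", False):
        problems.append("INETD server is not running")
    if "*ALLOBJ" in node.get("quser_special_authorities", ()):
        problems.append("QUSER has *ALLOBJ; the INETD server will not start")
    if node.get("ALWADDCLU") != "*ANY":
        problems.append("network attribute ALWADDCLU must be *ANY")
    if str(node.get("QMLTTHDACN")) == "3":
        problems.append("system value QMLTTHDACN cannot be 3")
    if "*IOSYSCFG" not in node.get("exit_user_special_authorities", ()):
        problems.append("CRG exit program user profile needs *IOSYSCFG")
    return problems
```

Running the check against every node that will join the cluster, and configuring the cluster only when each result is an empty list, mirrors the "on all nodes" wording of the requirements.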

CHAPTER 3 Clustering Overview

This chapter provides background information and a conceptual overview of System i clustering.

Components of the IBM clustering framework

The IBM i operating system provides the underlying architecture, functions, and services necessary for clustering. Since its debut, clustering support has expanded to provide additional support for switchable independent ASPs, operational environments, and storage area network (SAN) technologies through separately priced options and licensed programs.

Table 1. IBM software support for clustering

Product: 5722-SS1 or 5761-SS1, IBM i - Base
Provides:
    Core clustering functions via cluster resource services
    Enables management of maintenance switchover and failure recovery at the application level
    Node monitoring and system failure detection

Product: 5722-SS1 or 5761-SS1, option 41 - HA Switchable Resources
Provides:
    Support for switchable independent ASPs and using Management Central GUI in iSeries Navigator

Product: 5761-HAS, IBM System i High Availability Solutions Manager (HASM). Available on IBM i 6.1 or higher.
Provides:
    Support for mirrored storage-based data replication technologies such as metro and global mirroring (see note 1)
    Most CL commands for cluster control
    GUI interfaces within IBM Systems Director Navigator for i5/OS for Cluster Resource Services and HASM

Note 1. Additional IBM software is required for implementations which include metro or global mirroring.

Cluster resource services

IBM Cluster Resource Services is part of the base operating system. Cluster resource services provides the integrated services and application programming interfaces (APIs) necessary to create and manage a cluster. This includes:

Heartbeat monitoring - Heartbeat monitoring ensures that each node (system) in the cluster is active. At regular intervals, each active node in the cluster conveys that it is active by sending a signal to its adjacent nodes. Each node expects an acknowledgment to the heartbeat it sent out as well as an incoming heartbeat from the adjacent node. If a node misses sending a heartbeat for a predetermined number of consecutive heartbeats, a heartbeat failure is signaled. Cluster resource services determines what event to initiate after considering the role of the failing node and whether the failure can be confirmed by a distress message. If the failure cannot be confirmed, cluster resource services will partition the cluster.

Reliable messaging - The reliable messaging function keeps track of all nodes within a cluster and ensures that all nodes have consistent information about the state of cluster resources. Any status change for a node is broadcast along with a reason code. Retry and timeout values determine how many times a message can be sent to a node before signaling a failure or partition event. More time is allowed on remote networks.

Switchover administration - Cluster resource services maintains the hierarchy of each node when a switchover or failover occurs. The hierarchy, called the recovery domain, determines which node assumes the role of the primary node.

Distributed activities - Distributed activities provide the synchronization of actions across the nodes, or a subset of nodes, in a cluster to ensure that all of the nodes affected by the action are involved and that results are consistently reflected across the cluster.

Parallel jobs - A set of parallel jobs is used to control the cluster and the resources defined to the cluster, perform user and exit program requests, and interact with subsystems for highly available applications.

APIs - Application programming interfaces (APIs) provide the ability to create clusters, add or remove nodes, and create and manage the system objects which identify groups of cluster resources.

IP address takeover - The IP address takeover function allows access to an application or device without regard to the system on which the application is running or to where the device is varied on. A floating IP address is switched from the primary node to a backup node without requiring the re-configuration of clients. IP takeover is a key component in providing application resiliency and device resiliency.

Resiliency support - Application, data, and device resiliency depend upon cluster resource group (*CRG) system objects. Cluster resource services provides the ability for users and programs to allocate resources to and manage these objects.

The characteristics of messaging and heartbeat monitoring can be adjusted to match the performance of the network.

Support for logical replication

System i clustering does not support logical replication directly and is not aware of whether data is resilient or not. Middleware software from high availability business partners (HABPs) can be used in a clustering environment to provide data resiliency. Clustering uses data CRGs as the means to interact with software products that perform logical replication using journaling techniques supported by the IBM i operating system.
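The heartbeat monitoring behavior described under Cluster resource services can be sketched as a small state machine. This is an illustrative model only, not the IBM implementation. The class name, the state names, and the default threshold of three missed heartbeats are assumptions; the text says only that the number is predetermined. The sketch also ignores the role of the failing node, which cluster resource services considers when choosing the event to initiate.

```python
from dataclasses import dataclass


@dataclass
class HeartbeatMonitor:
    """Hypothetical tracker for one adjacent node's heartbeats."""
    missed_limit: int = 3  # assumed value for the predetermined threshold
    missed: int = 0        # consecutive heartbeats missed so far

    def record(self, heartbeat_received, distress_message_received=False):
        """Process one heartbeat interval and return the resulting state."""
        if heartbeat_received:
            self.missed = 0
            return "active"
        self.missed += 1
        if self.missed < self.missed_limit:
            return "suspect"
        # A heartbeat failure is signaled. A distress message confirms that
        # the node failed; without confirmation the cluster is partitioned.
        return "node_failure" if distress_message_received else "partition"
```

Modeling the unconfirmed case as a partition rather than a failure reflects the text: when a node stops responding and nothing confirms that it is down, the cluster cannot tell a failed node from a failed network path.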

Support for switchable device resources

IBM i option 41 (HA Switchable Resources) provides the ability to manage the information necessary to switch access to independent auxiliary storage pools (independent ASPs) from one node to another through device domains. Independent ASPs are those numbered from 33 to 255. IBM i option 41 must be installed with a valid license key before this support can be used with a device CRG.

A device domain is a subset of nodes in a cluster that share device resources, or the logical resources associated with the devices, and which can participate in a switching action. All nodes in a device domain need information about the included resources so that no conflicts occur when the devices are switched. For example, for a collection of switched disks, the independent disk pool identification, disk unit assignments, and virtual address assignments must be unique across the entire device domain.

A cluster node can belong to only one device domain. A node must be defined as a member of a device domain before it can be added to the recovery domain for a device CRG. All nodes in a recovery domain for a device CRG must be in the same device domain. Nodes can be added to and removed from device domains as needed.

Figure 7 shows an example of a device domain within a cluster. A device CRG identifies nodes A and B as the domain which can share an independent disk pool. Node C is not part of the device domain. The devices in the disk pool are accessible (varied on) on node A. When a switchover occurs, the devices are varied off on node A, then made available (varied on) on node B.

Figure 7. Device domain example.

Support for resilient operational environments

Clustering supports the use of a cluster administrative domain to maintain a consistent operational environment across nodes in a cluster. Applications often require specific system settings or other environmental conditions collectively known as an operational environment. This may include configuration parameters or data, user profiles, job descriptions, as well as system values, network attributes, system environment variables, and subsystem descriptions. Within a high availability environment, the operational environment must be the same on every node where an application can run or store its data.

The resources identified to a cluster administrative domain, called monitored resource entries (MREs), are also identified in an associated peer CRG. The cluster administrative domain monitors the resources for changes and synchronizes any changes across the active domain. Once the domain is created, normal CRG functions are used to manage it. Each node can be defined in only one cluster administrative domain within the cluster.

Figure 8. Cluster administration domain example

Figure 8 shows an example of a four-node cluster with a cluster administrative domain. Each node is an LPAR, with two LPARs in each system. LPAR 1 is the normal production system. The node roles shown are those of the peer CRG associated with the domain. Node roles for application CRGs and data CRGs used in the normal production environment are not shown.

Clustering concepts

This topic describes significant constructs which clustering uses to identify and control cluster resources. These constructs and the concepts associated with them are fundamental to any clustering discussions and appear in user interfaces.

Recovery domain

Each CRG defines its own recovery domain. The recovery domain identifies the current role and preferred role of each node within the CRG. While the current role of a node may change, the preferred role is identified when the CRG is created. The recovery domain also determines the order in which nodes can become the primary access point in the event of an outage. Nodes can have the following roles:

Primary - The node is the primary access point for the resources associated with the CRG. Only one primary node is allowed. For an application CRG, this is the node where the application is currently running. For a data CRG, this node contains the principal copy of the resources. For a device CRG, this node is the current owner of the devices in the CRG. The primary role is not supported for a peer CRG.

Backup - A backup node will take over the role of the primary access point for resources associated with the CRG in the event of an outage on the primary node. A backup node either contains a copy of the resources that is kept current by replication software, or contains the IP takeover address and activation instructions necessary to access the resources. The recovery domain determines the sequence in which the exit program will attempt to activate backup nodes during a switchover or failover. This role is not supported for peer CRGs.

Replicate - A replicate node contains a copy of the resources associated with the CRG, but cannot participate in a switchover or failover. Replicate nodes are optional. They are reserved for those systems which are either not powerful enough to host applications, are used for queries and reports only, or are perhaps used as a data warehouse server. For peer CRGs, nodes defined as replicate represent an inactive access point.

While any CRG type can be defined and managed through MIMIX Global, data CRGs are associated with replication activities.

Cluster events

Clustering identifies over twenty cluster events that affect the ability of a node to participate in a cluster. On each node, cluster resource services monitors for and detects these events, to which the CRG exit programs on all nodes respond. The current and preferred roles of a node determine how cluster resource services and CRG exit programs on each node respond to the event.
The following events have a significant effect on maintaining availability: failover, switchover, and partition.

Failover

A failover occurs when cluster resource services responds to a failure of a primary node by switching the access point for cluster resources normally accessed from that node to the first available backup node. For each CRG in which the failing node is the primary node, the access point to the CRG resources is switched to a backup node according to the recovery domain. If the first backup node is not available, the next backup node in the recovery domain is used. When multiple CRGs are involved in a failover, device CRGs are processed first, data CRGs second, and application CRGs last.

If the cause of the failover is resolved, the failover can be cancelled. The CRG message queue provides the mechanism to cancel a failover.
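The failover rules above can be sketched in a few lines of Python. This is only an illustration of the ordering described in the text, not how cluster resource services is implemented; the `Crg` class, the `plan_failover` function, and the CRG names are hypothetical.

```python
from dataclasses import dataclass

# Processing order during a failover, as described above:
# device CRGs first, data CRGs second, application CRGs last.
CRG_TYPE_ORDER = {"device": 0, "data": 1, "application": 2}


@dataclass
class Crg:
    """Hypothetical, simplified view of a CRG and its recovery domain."""
    name: str
    crg_type: str        # "device", "data", or "application"
    primary: str         # node that is currently the primary access point
    backups: list        # backup nodes, in recovery domain order


def plan_failover(crgs, failed_node, available_nodes):
    """Return (CRG name, new primary) pairs in the order they are processed.

    The new primary is the first backup node in the recovery domain that is
    available, or None when no backup node is available.
    """
    affected = [c for c in crgs if c.primary == failed_node]
    affected.sort(key=lambda c: CRG_TYPE_ORDER[c.crg_type])
    plan = []
    for crg in affected:
        new_primary = next((n for n in crg.backups if n in available_nodes), None)
        plan.append((crg.name, new_primary))
    return plan


crgs = [
    Crg("APP1", "application", "A", ["B", "C"]),
    Crg("DATA1", "data", "A", ["B", "C"]),
    Crg("DEV1", "device", "A", ["B"]),
    Crg("DATA2", "data", "B", ["A"]),   # node A is not primary here
]

# Node A fails while node B is also unavailable: the next backup is used.
print(plan_failover(crgs, "A", {"C"}))
```

In this sketch, DATA2 is unaffected because the failing node is not its primary node, and DEV1 has no available backup because its recovery domain lists only node B.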

Switchover

A switchover differs from a failover in how the request is initiated. A switchover is a user request, via a program or the cluster manager interface, to switch the primary access point for resources in a CRG from the specified node to a backup node. The backup node to use is determined by the current recovery domain. Switchovers are typically requested in order to perform system maintenance, such as applying program temporary fixes (PTFs), installing a new release, upgrading the system, or to test the switching process. The relationships between CRGs must be considered when specifying the order in which to switch over multiple CRGs.

Figure 9. Simple switchover example in a two-node cluster, illustrating IP takeover for applications

Partition

A partition occurs when a node loses contact with other nodes and cluster resource services cannot confirm that the node failed. When a partitioned state is in effect, cluster resource services restricts some of the actions that can be performed by CRG exit programs within the partition. Partitions are typically caused by communications problems or by a system failure that was not confirmed by a distress message.

Rejoin

Rejoin is the term used for the process of a node becoming an active member of a cluster after having been a non-participating member. A rejoin ensures that CRGs (specifically, the *CRG object) are identical on all active recovery domain nodes.

System distress messages

On each node, the operating system will broadcast a distress message when it detects that the system is about to fail or is being shut down. Cluster resource services will respond to a distress message broadcast on the cluster communications network. Examples of when a node would initiate a distress message include, but are not limited to, ending all subsystems or those in which cluster jobs or exit programs reside, when a delayed power down is requested from the Hardware Management Console (HMC), or when the UPS battery is drained while operating on backup power.

If the failing node is a primary node, cluster resource services will initiate a failover. If the node is not the primary node, the node is ended and no longer participates in the cluster until user action is taken.

Resilient applications

Resilient applications are those which have the ability to be automatically ended on one node, switched to another node, and started without requiring manual reconfiguration by users. Application resiliency is based on IP address takeover and depends on the use of an application CRG.

The application must be able to recognize the loss of the Internet Protocol (IP) connection between the client and the server. During a switchover, the client application must be aware that the IP connection will be temporarily unavailable and must retry access rather than ending. During a failover, the application must recognize that the IP connection is not available and respond to the error condition by ending normally. Application resilience enables better utilization of data resilience.

To be considered resilient, an application must also provide the following:

- An application CRG exit program that handles cluster events.
- Automated data areas that identify information necessary to set up a resilient environment for the application and its associated data.
- One or more object specifier files (OSFs) that identify the objects associated with the application that must be made resilient.
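The client-side retry behavior described for a switchover can be illustrated with a minimal Python sketch. It assumes a client that reaches the application through the takeover IP address; `call_with_retry` and `FlakyServer` are hypothetical names used only for illustration, and the retry count and delay are arbitrary.

```python
import time


def call_with_retry(connect, attempts=5, delay_seconds=0.0):
    """Call connect() until it succeeds or the attempts are used up.

    A lost connection is treated as temporary (as during a switchover),
    so the client retries instead of ending on the first error.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return connect()
        except ConnectionError as err:
            last_error = err
            if attempt < attempts:
                time.sleep(delay_seconds)
    # Every attempt failed: report the error so the client can end normally.
    raise last_error


class FlakyServer:
    """Hypothetical stand-in for a server whose IP address is being taken over."""

    def __init__(self, failures):
        self.failures = failures
        self.calls = 0

    def connect(self):
        self.calls += 1
        if self.calls <= self.failures:
            raise ConnectionError("takeover IP address not yet active")
        return "connected"


server = FlakyServer(failures=2)
print(call_with_retry(server.connect, attempts=5))
```

When every attempt fails, the sketch re-raises the last error, which corresponds to the failover case in which the application recognizes that the connection is unavailable and ends normally.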

CHAPTER 4 Introduction

This document has two main purposes:

1. To document basic operational guidelines and procedures
2. To provide detailed documentation specific to using MIMIX Global

Where applicable, this document includes detailed operational, audit, and switching procedures for your availability solution. These procedures are best-practice recommendations that result from customer feedback to Certified MIMIX Consultants on the Vision Solutions services team.

Common terms used throughout this document

The following terms are used in this document:

backup node - A backup node will take over the role of the primary access point for resources associated with the cluster resource group (CRG) in the event of an outage on the primary node. A backup node contains a copy of the resources that is kept current by replication software. The recovery domain determines the sequence in which the exit program will attempt to activate backup nodes during a switchover or failover. The role of backup node is not supported for peer CRGs.

cluster - A cluster is a collection of interconnected complete computers that work together as a single unified computing resource. The cluster is made up of one or more cluster nodes and is identified by a name of 10 or fewer characters.

cluster resources - Cluster resources are the resources that are required to be highly available by your business and are available to the nodes within a cluster. Cluster resources can be either moved or replicated to one or more nodes within a cluster. Examples include applications, data libraries, devices, and disk units. Resources are identified in cluster resource groups and controlled through cluster resource group exit programs.

cluster resource group (CRG) - A cluster resource group is an IBM i system object that identifies a collection of cluster resources to be monitored and managed as a single unit.
Each CRG defines the relationship between the nodes associated with those resources in a recovery domain that determines the role of each node in the CRG as well as the degree to which each can participate in events such as synchronizing or performing a recovery action. Several types of CRGs are available. Each of the following CRG types is designed for a specific type of cluster resource: application, data, device, and peer.

CRG exit program - Each CRG has a CRG exit program that is called on each active node in the CRG's recovery domain in response to a cluster event. The exit program manages cluster events for the environment established by the CRG. All possible cluster events have a pre-determined response in the exit program code.

Broadcast replication - A broadcast replication configuration consists of three or more nodes where a single source node feeds two or more target nodes. For example, in a three-node broadcast replication, system A is the source node to both system B and system C.

Cascade replication - A cascade replication configuration consists of three or more nodes in series. For example, a three-node cascade replication starts with system A as the source node for system B. System B is the source node for system C.

Data group set - A data group set is the total number of data groups needed to enable replication between all nodes in a cluster. The first part of the three-part name of each data group in the set is the same. Not all of the data groups in the set will be active at the same time.

Node - A node refers to one of two or more logical system definitions that make up a valid replication instance. For non-LPAR systems, a node represents the entire system footprint. For LPAR systems, a node represents one of the LPARs.

Peer node - A node identified as peer has no order within the recovery domain. The peer role is only supported by peer CRGs. The access point to the resources in the peer CRG is controlled by the cluster management application.

Primary node - This node is the primary access point for the resources associated with a CRG. Only one primary node is allowed. For an application CRG, this is the node where the application is currently running. For a data CRG, this node is the source of data for the resources to be replicated. For a device CRG, this node is the current owner of the devices in the CRG. The role of primary node is not supported for a peer CRG.

Recovery domain - A recovery domain identifies the current role of each node within the CRG. The recovery domain also determines the order in which nodes can become the primary access point in the event of an outage. Each CRG defines its own recovery domain.
Replicate node - A replicate node contains a copy of the resources associated with the CRG, but cannot participate in a switchover or failover. Replicate nodes are optional. They are reserved for those systems which are either not powerful enough to host applications, are used for queries and reports only, or are perhaps used as a data warehouse server. For peer CRGs, nodes defined as replicate represent an inactive access point.

Replication instance - A replication instance refers to a group of nodes that make up your replication environment.

Simple replication - A simple replication configuration consists of two nodes, a source node (primary) and a target node (backup).

Effect of data group sets on controlling logical replication

By definition, a data group is a MIMIX construct used to control the logical replication of data between two nodes (systems). Clustering environments usually involve more than two nodes. In a clustering environment with three or more nodes, multiple data groups must be configured to ensure that data can flow between any nodes in the cluster. The total number of data groups needed to enable replication between all nodes in a cluster is known as the data group set.

In clusters with three or more nodes, at least one data group within the data group set is disabled at any given time. The data groups associated with the current primary node are enabled and data groups associated with only backup nodes are disabled to ensure that data from the primary node can be replicated to only the expected nodes. Disabled data groups are associated with backup nodes.

Only one user journal can be identified as the source of replication for a data group. Replicating from a second journal requires a second data group. Similarly, in a clustering environment, each source user journal is associated with a data group in a data group set. Replication from multiple source journals on a node requires multiple data group sets.

When starting or ending logical replication in a clustering environment, it may be necessary to invoke more than one command request to ensure that all of the selected processes for all data groups on a node have been addressed. This is because the three-part name of a data group definition only identifies systems, not system roles within the data group or node roles within the cluster. MIMIX can determine the role of a specified system within a data group but it cannot determine whether what you specify will select all processes for all data groups on a specific node.
You can either determine the source system of each data group that includes the node you want and tailor your command requests accordingly, or you can adopt the practice of always invoking two requests which specify the data group definition as follows:

First request: DGDFN (*ALL *ALL node)
Second request: DGDFN (*ALL node *ALL)

Data group set examples

The following examples illustrate how a data group set affects procedures for data group activity.

Data group set example - Table 2 shows a three-node cluster that is configured to replicate from two unique source journals. Two sets of data groups are required, one for each journal. Each data group set contains the necessary data groups to allow any node in the cluster to become the primary node. Table 2 illustrates these concepts:

- Since only one node can be the primary node, data groups defined between nodes which do not include the primary node must be disabled. In this example, one data group in each set must be disabled at any given time.
- The primary node cannot be determined from just the three-part name of a data group definition. The three-part name only identifies systems, not system roles within the data group or node roles within the cluster.

Table 2. Example of a data group set.

Cluster Nodes (diagram not reproduced)

Data Group Sets

Name   System1   System2
DG1    A         B
DG1    A         C
DG1    B         C

Name   System1   System2
DG2    B         A
DG2    C         A
DG2    C         B

Working with target-side only replication processes example - When resolving data group issues, at times it may be appropriate to start or end only the processes which run on the affected node. The key concept to remember is to consider the entire node, not just a data group.

While MIMIX commands for starting and ending data groups (STRDG and ENDDG) permit specifying only source or target processes, these commands are not node-aware; that is, they cannot ensure that the specified processes will be acted upon on all data groups on a specific node. Within a data group, the STRDG or ENDDG command can determine replication roles of each system (source or target) by evaluating the data group's data source parameter (DTASRC), but the commands are not aware of the cluster node roles (primary, backup, or replicate). Therefore, you may need to request the STRDG or ENDDG command multiple times to achieve the expected results.

Consider a cluster which has the data group set identified in Table 2. For this example, node A is the primary node and you want to take action on replication processes on node B. Data groups DG1 B C and DG2 C B are disabled.

Table 3 illustrates that it is better to issue multiple command requests than to not have all the appropriate processes which run on a node selected. Each row shows a variation for specifying the data group definition (DGDFN) on a request scoped to only target processes PRC(*ALLTGT). When only one command is used, neither row alone is sufficient to ensure that all target processes on node B are ended. The Result column illustrates the variations in results due to which system is considered source by the data group.
It also illustrates that the cluster role of primary node does not necessarily correlate to the data group role of source system. At times, such as following a switch, the primary node role and data group source role are not the same system. Using two commands corresponding to the two rows in Table 3 will ensure that all of the appropriate processes are selected for action.
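The reasoning behind the two-request practice can be checked with a short Python sketch that uses the data groups from Table 2. It models only how a three-part name with *ALL wildcards selects data group definitions by position, which is an assumption made for illustration; it does not model what STRDG or ENDDG do with disabled data groups or with source and target processes. The function names are hypothetical.

```python
# Data group definitions from Table 2 as (name, system1, system2).
TABLE_2 = [
    ("DG1", "A", "B"), ("DG1", "A", "C"), ("DG1", "B", "C"),
    ("DG2", "B", "A"), ("DG2", "C", "A"), ("DG2", "C", "B"),
]


def disabled_data_groups(data_groups, primary):
    """Data groups that do not include the primary node must be disabled."""
    return [dg for dg in data_groups if primary not in dg[1:]]


def matches(pattern, data_group):
    """Assumed positional match: *ALL matches any value in that position."""
    return all(p == "*ALL" or p == v for p, v in zip(pattern, data_group))


def select_for_node(data_groups, node):
    """Apply the two requests: DGDFN(*ALL *ALL node) and DGDFN(*ALL node *ALL)."""
    first = [dg for dg in data_groups if matches(("*ALL", "*ALL", node), dg)]
    second = [dg for dg in data_groups if matches(("*ALL", node, "*ALL"), dg)]
    return first, second


# With node A as primary, one data group in each set is disabled.
print(disabled_data_groups(TABLE_2, "A"))

# Neither request alone names every data group that includes node B.
first, second = select_for_node(TABLE_2, "B")
print(first)
print(second)
```

Under these assumptions, the first request names DG1 A B and DG2 C B, and the second names DG1 B C and DG2 B A. Only the two requests together cover all four data group definitions that include node B.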


IBM TotalStorage Enterprise Storage Server Model 800 A high-performance disk storage solution for systems across the enterprise IBM TotalStorage Enterprise Storage Server Model 800 e-business on demand The move to e-business on demand presents companies

More information

IBM DB2 Analytics Accelerator High Availability and Disaster Recovery

IBM DB2 Analytics Accelerator High Availability and Disaster Recovery Redpaper Patric Becker Frank Neumann IBM Analytics Accelerator High Availability and Disaster Recovery Introduction With the introduction of IBM Analytics Accelerator, IBM enhanced for z/os capabilities

More information

Carbonite Availability. Technical overview

Carbonite Availability. Technical overview Carbonite Availability Technical overview Table of contents Executive summary The availability imperative...3 True real-time replication More efficient and better protection... 4 Robust protection Reliably

More information

ForeScout CounterACT Resiliency Solutions

ForeScout CounterACT Resiliency Solutions ForeScout CounterACT Resiliency Solutions User Guide CounterACT Version 7.0.0 About CounterACT Resiliency Solutions Table of Contents About CounterACT Resiliency Solutions... 5 Comparison of Resiliency

More information

VMware HA: Overview & Technical Best Practices

VMware HA: Overview & Technical Best Practices VMware HA: Overview & Technical Best Practices Updated 8/10/2007 What is Business Continuity? Business Continuity = Always-on uninterrupted availability of business systems and applications Business Continuity

More information

EMC DATA PROTECTION, FAILOVER AND FAILBACK, AND RESOURCE REPURPOSING IN A PHYSICAL SECURITY ENVIRONMENT

EMC DATA PROTECTION, FAILOVER AND FAILBACK, AND RESOURCE REPURPOSING IN A PHYSICAL SECURITY ENVIRONMENT White Paper EMC DATA PROTECTION, FAILOVER AND FAILBACK, AND RESOURCE REPURPOSING IN A PHYSICAL SECURITY ENVIRONMENT Genetec Omnicast, EMC VPLEX, Symmetrix VMAX, CLARiiON Provide seamless local or metropolitan

More information

IBM Tivoli Storage Manager for AIX Version Installation Guide IBM

IBM Tivoli Storage Manager for AIX Version Installation Guide IBM IBM Tivoli Storage Manager for AIX Version 7.1.3 Installation Guide IBM IBM Tivoli Storage Manager for AIX Version 7.1.3 Installation Guide IBM Note: Before you use this information and the product it

More information

HUAWEI OceanStor Enterprise Unified Storage System. HyperReplication Technical White Paper. Issue 01. Date HUAWEI TECHNOLOGIES CO., LTD.

HUAWEI OceanStor Enterprise Unified Storage System. HyperReplication Technical White Paper. Issue 01. Date HUAWEI TECHNOLOGIES CO., LTD. HUAWEI OceanStor Enterprise Unified Storage System HyperReplication Technical White Paper Issue 01 Date 2014-03-20 HUAWEI TECHNOLOGIES CO., LTD. 2014. All rights reserved. No part of this document may

More information

VERITAS Volume Replicator. Successful Replication and Disaster Recovery

VERITAS Volume Replicator. Successful Replication and Disaster Recovery VERITAS Volume Replicator Successful Replication and Disaster Recovery V E R I T A S W H I T E P A P E R Table of Contents Introduction.................................................................................1

More information

High Availability Guide for Distributed Systems

High Availability Guide for Distributed Systems IBM Tivoli Monitoring Version 6.3 Fix Pack 2 High Availability Guide for Distributed Systems SC22-5455-01 IBM Tivoli Monitoring Version 6.3 Fix Pack 2 High Availability Guide for Distributed Systems SC22-5455-01

More information

IBM GDPS V3.3: Improving disaster recovery capabilities to help ensure a highly available, resilient business environment

IBM GDPS V3.3: Improving disaster recovery capabilities to help ensure a highly available, resilient business environment Marketing Announcement February 14, 2006 IBM GDPS V3.3: Improving disaster recovery capabilities to help ensure a highly available, resilient business environment Overview GDPS is IBM s premier continuous

More information

VMware Site Recovery Technical Overview First Published On: Last Updated On:

VMware Site Recovery Technical Overview First Published On: Last Updated On: VMware Site Recovery Technical Overview First Published On: 11-28-2017 Last Updated On: 11-29-2017 1 Table of Contents 1. VMware Site Recovery Technical Overview 1.1.Introduction 1.2.Overview 1.3.Use Cases

More information

Are AGs A Good Fit For Your Database? Doug Purnell

Are AGs A Good Fit For Your Database? Doug Purnell Are AGs A Good Fit For Your Database? Doug Purnell About Me DBA for Elon University Co-leader for WinstonSalem BI User Group All things Nikon Photography Bring on the BBQ! Goals Understand HA & DR Types

More information

EMC Celerra CNS with CLARiiON Storage

EMC Celerra CNS with CLARiiON Storage DATA SHEET EMC Celerra CNS with CLARiiON Storage Reach new heights of availability and scalability with EMC Celerra Clustered Network Server (CNS) and CLARiiON storage Consolidating and sharing information

More information

Huawei OceanStor ReplicationDirector Software Technical White Paper HUAWEI TECHNOLOGIES CO., LTD. Issue 01. Date

Huawei OceanStor ReplicationDirector Software Technical White Paper HUAWEI TECHNOLOGIES CO., LTD. Issue 01. Date Huawei OceanStor Software Issue 01 Date 2015-01-17 HUAWEI TECHNOLOGIES CO., LTD. 2015. All rights reserved. No part of this document may be reproduced or transmitted in any form or by any means without

More information

IBM i Version 7.3. Systems management Disk management IBM

IBM i Version 7.3. Systems management Disk management IBM IBM i Version 7.3 Systems management Disk management IBM IBM i Version 7.3 Systems management Disk management IBM Note Before using this information and the product it supports, read the information in

More information

What's in this guide... 4 Documents related to NetBackup in highly available environments... 5

What's in this guide... 4 Documents related to NetBackup in highly available environments... 5 Contents Chapter 1 About in this guide... 4 What's in this guide... 4 Documents related to NetBackup in highly available environments... 5 Chapter 2 NetBackup protection against single points of failure...

More information

System p. Partitioning with the Integrated Virtualization Manager

System p. Partitioning with the Integrated Virtualization Manager System p Partitioning with the Integrated Virtualization Manager System p Partitioning with the Integrated Virtualization Manager Note Before using this information and the product it supports, read the

More information

Red Hat Enterprise Virtualization (RHEV) Backups by SEP

Red Hat Enterprise Virtualization (RHEV) Backups by SEP Red Hat Enterprise Virtualization (RHEV) Backups by SEP info@sepusa.com www.sepusa.com Table of Contents INTRODUCTION AND OVERVIEW AGENT BASED BACKUP IMAGE LEVEL BACKUP VIA RHEV API RHEV BACKUP WITH SEP

More information

IBM Europe Announcement ZP , dated April 8, 2008

IBM Europe Announcement ZP , dated April 8, 2008 IBM Europe Announcement ZP08-0185, dated April 8, 2008 IBM TotalStorage Productivity Center for Replication for System z V3.4 delivers enhanced management and new high availability features for IBM System

More information

System i and System p. Service provider information Reference information

System i and System p. Service provider information Reference information System i and System p Service provider information Reference information System i and System p Service provider information Reference information Note Before using this information and the product it

More information

MQ High Availability and Disaster Recovery Implementation scenarios

MQ High Availability and Disaster Recovery Implementation scenarios MQ High Availability and Disaster Recovery Implementation scenarios Sandeep Chellingi Head of Hybrid Cloud Integration Prolifics Agenda MQ Availability Message Availability Service Availability HA vs DR

More information

IBM Tivoli Storage Manager for HP-UX Version Installation Guide IBM

IBM Tivoli Storage Manager for HP-UX Version Installation Guide IBM IBM Tivoli Storage Manager for HP-UX Version 7.1.4 Installation Guide IBM IBM Tivoli Storage Manager for HP-UX Version 7.1.4 Installation Guide IBM Note: Before you use this information and the product

More information

IBM System Storage DS5020 Express

IBM System Storage DS5020 Express IBM DS5020 Express Manage growth, complexity, and risk with scalable, high-performance storage Highlights Mixed host interfaces support (FC/iSCSI) enables SAN tiering Balanced performance well-suited for

More information

EView/400i Management for HP BSM. Operations Manager i

EView/400i Management for HP BSM. Operations Manager i EView/400i Management for HP BSM Operations Manager i Concepts Guide Software Version: 7.00 July 2015 Legal Notices Warranty EView Technology makes no warranty of any kind with regard to this document,

More information

Understanding high availability with WebSphere MQ

Understanding high availability with WebSphere MQ Mark Hiscock Software Engineer IBM Hursley Park Lab United Kingdom Simon Gormley Software Engineer IBM Hursley Park Lab United Kingdom May 11, 2005 Copyright International Business Machines Corporation

More information

Veritas NetBackup OpenStorage Solutions Guide for Disk

Veritas NetBackup OpenStorage Solutions Guide for Disk Veritas NetBackup OpenStorage Solutions Guide for Disk UNIX, Windows, Linux Release 8.0 Veritas NetBackup OpenStorage Solutions Guide for Disk Legal Notice Copyright 2016 Veritas Technologies LLC. All

More information

ForeScout CounterACT. Resiliency Solutions. CounterACT Version 8.0

ForeScout CounterACT. Resiliency Solutions. CounterACT Version 8.0 ForeScout CounterACT Resiliency Solutions CounterACT Version 8.0 Table of Contents About ForeScout Resiliency Solutions... 4 Comparison of Resiliency Solutions for Appliances... 5 Choosing the Right Solution

More information

SM B10: Rethink Disaster Recovery: Replication and Backup Are Not Enough

SM B10: Rethink Disaster Recovery: Replication and Backup Are Not Enough SM B10: Rethink Disaster Recovery: Replication and Backup Are Not Enough Paul Belk Director, Product Management Mike Weiss Staples Ranga Rajagopalan Principal Product Manager Tsunami Hurricane Philippines

More information

QuickStart Guide vcenter Server Heartbeat 5.5 Update 1 EN

QuickStart Guide vcenter Server Heartbeat 5.5 Update 1 EN vcenter Server Heartbeat 5.5 Update 1 EN-000205-00 You can find the most up-to-date technical documentation on the VMware Web site at: http://www.vmware.com/support/ The VMware Web site also provides the

More information

Step into the future. HP Storage Summit Converged storage for the next era of IT

Step into the future. HP Storage Summit Converged storage for the next era of IT HP Storage Summit 2013 Step into the future Converged storage for the next era of IT 1 HP Storage Summit 2013 Step into the future Converged storage for the next era of IT Karen van Warmerdam HP XP Product

More information

Virtualization And High Availability. Howard Chow Microsoft MVP

Virtualization And High Availability. Howard Chow Microsoft MVP Virtualization And High Availability Howard Chow Microsoft MVP Session Objectives And Agenda Virtualization and High Availability Types of high availability enabled by virtualization Enabling a highly

More information

Grid Computing with Voyager

Grid Computing with Voyager Grid Computing with Voyager By Saikumar Dubugunta Recursion Software, Inc. September 28, 2005 TABLE OF CONTENTS Introduction... 1 Using Voyager for Grid Computing... 2 Voyager Core Components... 3 Code

More information

Microsoft SQL Server Fix Pack 15. Reference IBM

Microsoft SQL Server Fix Pack 15. Reference IBM Microsoft SQL Server 6.3.1 Fix Pack 15 Reference IBM Microsoft SQL Server 6.3.1 Fix Pack 15 Reference IBM Note Before using this information and the product it supports, read the information in Notices

More information

Exploiting IT Log Analytics to Find and Fix Problems Before They Become Outages

Exploiting IT Log Analytics to Find and Fix Problems Before They Become Outages Exploiting IT Log Analytics to Find and Fix Problems Before They Become Outages Clyde Richardson (richarcl@us.ibm.com) Technical Sales Specialist Sarah Knowles (seli@us.ibm.com) Strategy and Portfolio

More information

TECHNICAL ADDENDUM 01

TECHNICAL ADDENDUM 01 TECHNICAL ADDENDUM 01 What Does An HA Environment Look Like? An HA environment will have a Source system that the database changes will be captured on and generate local journal entries. The journal entries

More information

Course 6231A: Maintaining a Microsoft SQL Server 2008 Database

Course 6231A: Maintaining a Microsoft SQL Server 2008 Database Course 6231A: Maintaining a Microsoft SQL Server 2008 Database OVERVIEW About this Course Elements of this syllabus are subject to change. This five-day instructor-led course provides students with the

More information

DISK LIBRARY FOR MAINFRAME

DISK LIBRARY FOR MAINFRAME DISK LIBRARY FOR MAINFRAME Geographically Dispersed Disaster Restart Tape ABSTRACT Disk Library for mainframe is Dell EMC s industry leading virtual tape library for mainframes. Geographically Dispersed

More information

IBM i Version 7 Release 3. Availability High availability overview IBM

IBM i Version 7 Release 3. Availability High availability overview IBM IBM i Version 7 Release 3 Availability High availability overview IBM IBM i Version 7 Release 3 Availability High availability overview IBM Note Before using this information and the product it supports,

More information

VMware Mirage Getting Started Guide

VMware Mirage Getting Started Guide Mirage 5.8 This document supports the version of each product listed and supports all subsequent versions until the document is replaced by a new edition. To check for more recent editions of this document,

More information

W H I T E P A P E R : T E C H N I C AL. Symantec High Availability Solution for Oracle Enterprise Manager Grid Control 11g and Cloud Control 12c

W H I T E P A P E R : T E C H N I C AL. Symantec High Availability Solution for Oracle Enterprise Manager Grid Control 11g and Cloud Control 12c W H I T E P A P E R : T E C H N I C AL Symantec High Availability Solution for Oracle Enterprise Manager Grid Control 11g and Cloud Control 12c Table of Contents Symantec s solution for ensuring high availability

More information

VMware Mirage Getting Started Guide

VMware Mirage Getting Started Guide Mirage 5.0 This document supports the version of each product listed and supports all subsequent versions until the document is replaced by a new edition. To check for more recent editions of this document,

More information

Course 6231A: Maintaining a Microsoft SQL Server 2008 Database

Course 6231A: Maintaining a Microsoft SQL Server 2008 Database Course 6231A: Maintaining a Microsoft SQL Server 2008 Database About this Course This five-day instructor-led course provides students with the knowledge and skills to maintain a Microsoft SQL Server 2008

More information

DELL EMC UNITY: REPLICATION TECHNOLOGIES

DELL EMC UNITY: REPLICATION TECHNOLOGIES DELL EMC UNITY: REPLICATION TECHNOLOGIES A Detailed Review ABSTRACT This white paper explains the replication solutions for Dell EMC Unity systems. This paper outlines the native and non-native options

More information

Replicator Disaster Recovery Best Practices

Replicator Disaster Recovery Best Practices Replicator Disaster Recovery Best Practices VERSION 7.4.0 June 21, 2017 Scenario Guide Article 1120504-01 www.metalogix.com info@metalogix.com 202.609.9100 Copyright International GmbH, 2002-2017 All rights

More information

Using VERITAS Volume Replicator for Disaster Recovery of a SQL Server Application Note

Using VERITAS Volume Replicator for Disaster Recovery of a SQL Server Application Note Using VERITAS Volume Replicator for Disaster Recovery of a SQL Server Application Note February 2002 30-000632-011 Disclaimer The information contained in this publication is subject to change without

More information

HP StoreVirtual Storage Multi-Site Configuration Guide

HP StoreVirtual Storage Multi-Site Configuration Guide HP StoreVirtual Storage Multi-Site Configuration Guide Abstract This guide contains detailed instructions for designing and implementing the Multi-Site SAN features of the LeftHand OS. The Multi-Site SAN

More information

New IBM i Technologies and Wine Make Backup and Recovery Better. Debbie Saugen Director, Business Continuity Services

New IBM i Technologies and Wine Make Backup and Recovery Better. Debbie Saugen Director, Business Continuity Services New IBM i Technologies and Wine Make Backup and Recovery Better Debbie Saugen Director, Business Continuity Services debbie.saugen@helpsystems.com About the Speaker Debbie Saugen is recognized worldwide

More information

Lotus Sametime 3.x for iseries. Performance and Scaling

Lotus Sametime 3.x for iseries. Performance and Scaling Lotus Sametime 3.x for iseries Performance and Scaling Contents Introduction... 1 Sametime Workloads... 2 Instant messaging and awareness.. 3 emeeting (Data only)... 4 emeeting (Data plus A/V)... 8 Sametime

More information

HP StoreVirtual Storage Multi-Site Configuration Guide

HP StoreVirtual Storage Multi-Site Configuration Guide HP StoreVirtual Storage Multi-Site Configuration Guide Abstract This guide contains detailed instructions for designing and implementing the Multi-Site SAN features of the LeftHand OS. The Multi-Site SAN

More information

CLUSTERING. What is Clustering?

CLUSTERING. What is Clustering? What is Clustering? CLUSTERING A cluster is a group of independent computer systems, referred to as nodes, working together as a unified computing resource. A cluster provides a single name for clients

More information

IBM System Storage DS6800

IBM System Storage DS6800 Enterprise-class storage in a small, scalable package IBM System Storage DS6800 Highlights Designed to deliver Designed to provide over enterprise-class functionality, 1600 MBps performance for with open

More information

Veritas NetBackup for Microsoft SQL Server Administrator's Guide

Veritas NetBackup for Microsoft SQL Server Administrator's Guide Veritas NetBackup for Microsoft SQL Server Administrator's Guide for Windows Release 8.1.1 Veritas NetBackup for Microsoft SQL Server Administrator's Guide Last updated: 2018-04-10 Document version:netbackup

More information

Veritas Storage Foundation for Windows by Symantec

Veritas Storage Foundation for Windows by Symantec Veritas Storage Foundation for Windows by Symantec Advanced online storage management Veritas Storage Foundation 5.1 for Windows brings advanced online storage management to Microsoft Windows Server environments,

More information

SRM 8.1 Technical Overview First Published On: Last Updated On:

SRM 8.1 Technical Overview First Published On: Last Updated On: First Published On: 12-23-2016 Last Updated On: 04-17-2018 1 Table of Contents 1. Introduction 1.1.Overview 1.2.Terminology 2. Architectural Overview 2.1.Overview 3. Use Cases 3.1.Overview 3.2.Disaster

More information

IBM Geographically Dispersed Resiliency for Power Systems. Version Release Notes IBM

IBM Geographically Dispersed Resiliency for Power Systems. Version Release Notes IBM IBM Geographically Dispersed Resiliency for Power Systems Version 1.2.0.0 Release Notes IBM IBM Geographically Dispersed Resiliency for Power Systems Version 1.2.0.0 Release Notes IBM Note Before using

More information

An Oracle White Paper May Oracle VM 3: Overview of Disaster Recovery Solutions

An Oracle White Paper May Oracle VM 3: Overview of Disaster Recovery Solutions An Oracle White Paper May 2014 Oracle VM 3: Overview of Disaster Recovery Solutions Contents Introduction... 1 Overview of DR Solutions with Oracle VM... 2 Choose your DR solution path... 2 Continuous

More information

MySQL Cluster Ed 2. Duration: 4 Days

MySQL Cluster Ed 2. Duration: 4 Days Oracle University Contact Us: +65 6501 2328 MySQL Cluster Ed 2 Duration: 4 Days What you will learn This MySQL Cluster training teaches you how to install and configure a real-time database cluster at

More information

Power Systems High Availability & Disaster Recovery

Power Systems High Availability & Disaster Recovery Power Systems High Availability & Disaster Recovery Solutions Comparison of various HA & DR solutions for Power Systems Authors: Carl Burnett, Joe Cropper, Ravi Shankar Table of Contents 1 Abstract...

More information