InterSystems High Availability Solutions

Similar documents
Routing EDIFACT Documents in Productions

HIGH AVAILABILITY STRATEGIES

Important Announcement: Substantial Upcoming Enhancement to Mirroring. Change Required for Sites Currently Using IsOtherNodeDown^ZMIRROR

Using SOAP and Web Services with Caché

Using ODBC with InterSystems IRIS

August Oracle - GoldenGate Statement of Direction

Replicator Disaster Recovery Best Practices

How To Make Databases on SUSE Linux Enterprise Server Highly Available Mike Friesenegger

The Right Choice for DR: Data Guard, Stretch Clusters, or Remote Mirroring. Ashish Ray Group Product Manager Oracle Corporation

High Availability and Disaster Recovery Solutions for Perforce

Eliminate Idle Redundancy with Oracle Active Data Guard

Balancing RTO, RPO, and budget. Table of Contents. White Paper Seven steps to disaster recovery nirvana for wholesale distributors

IBM TS7700 grid solutions for business continuity

High Availability through Warm-Standby Support in Sybase Replication Server A Whitepaper from Sybase, Inc.

Documentation Accessibility. Access to Oracle Support

SAP HANA. HA and DR Guide. Issue 03 Date HUAWEI TECHNOLOGIES CO., LTD.

InterSystems IRIS Error Reference

Oracle Fail Safe. Release for Microsoft Windows E

Maximum Availability Architecture: Overview. An Oracle White Paper July 2002

Downtime Prevention Buyer s Guide. 6 QUESTIONS to help you choose the right availability protection for your applications

Oracle E-Business Availability Options. Solution Series for Oracle: 2 of 5

VERITAS Volume Replicator. Successful Replication and Disaster Recovery

FCUBS GridLink Datasource Configuration Oracle FLEXCUBE Universal Banking Release [May] [2018]

Copyright 1998, 2009, Oracle and/or its affiliates. All rights reserved.

Disaster Recovery Guide

Module 4 STORAGE NETWORK BACKUP & RECOVERY

EMC VPLEX with Quantum Stornext

Address new markets with new services

Business Continuity and Disaster Recovery. Ed Crowley Ch 12

EMC CLARiiON CX3-40. Reference Architecture. Enterprise Solutions for Microsoft Exchange Enabled by MirrorView/S

Microsoft E xchange 2010 on VMware

ECE Engineering Robust Server Software. Spring 2018

Maximum Availability Architecture. Oracle Best Practices For High Availability

Protecting Mission-Critical Workloads with VMware Fault Tolerance W H I T E P A P E R

Virtualization And High Availability. Howard Chow Microsoft MVP

Epicor ERP Cloud Services Specification Multi-Tenant and Dedicated Tenant Cloud Services (Updated July 31, 2017)

WHITE PAPER: ENTERPRISE SOLUTIONS

FUJITSU Storage ETERNUS AF series and ETERNUS DX S4/S3 series Non-Stop Storage Reference Architecture Configuration Guide

vsphere Availability Guide

How everrun Works. An overview of the everrun Architecture

EMC Integrated Infrastructure for VMware. Business Continuity

Protecting Mission-Critical Application Environments The Top 5 Challenges and Solutions for Backup and Recovery

Asigra Cloud Backup Provides Comprehensive Virtual Machine Data Protection Including Replication

Oracle Retail MICROS Stores2 Functional Document Stores2 for Portugal Disaster Recovery Release

HP OpenView Storage Data Protector A.05.10

Tableau Server 9.0 High Availability:

Veritas Storage Foundation for Windows by Symantec

WHY BUILDING SECURITY SYSTEMS NEED CONTINUOUS AVAILABILITY

Benefits of an Exclusive Multimaster Deployment of Oracle Directory Server Enterprise Edition

Using EonStor DS Series iscsi-host storage systems with VMware vsphere 5.x

Veritas Storage Foundation for Windows by Symantec

High Availability- Disaster Recovery 101

Veritas Storage Foundation and High Availability Solutions Microsoft Clustering Solutions Guide for Microsoft SQL 2008

High-Availability Practice of ZTE Cloud-Based Core Network

Release for Microsoft Windows

Understanding high availability with WebSphere MQ

Oracle MaxRep for SAN. Configuration Sizing Guide. Part Number E release November

Using VERITAS Volume Replicator for Disaster Recovery of a SQL Server Application Note

VERITAS Storage Foundation 4.0 TM for Databases

High Availability and Disaster Recovery features in Microsoft Exchange Server 2007 SP1

StarWind Virtual SAN Windows Geo-Clustering: SQL Server

New England Data Camp v2.0 It is all about the data! Caregroup Healthcare System. Ayad Shammout Lead Technical DBA

Version is the follow-on release after version 8.1, featuring:

Security+ Guide to Network Security Fundamentals, Third Edition. Chapter 13 Business Continuity

A Carrier-Grade Cloud Phone System

Veritas Storage Foundation and High Availability Solutions HA and Disaster Recovery Solutions Guide for Enterprise Vault

BrightStor ARCserve Backup for Windows

HUAWEI OceanStor Enterprise Unified Storage System. HyperReplication Technical White Paper. Issue 01. Date HUAWEI TECHNOLOGIES CO., LTD.

An Oracle White Paper May Oracle VM 3: Overview of Disaster Recovery Solutions

INTRODUCING VERITAS BACKUP EXEC SUITE

Real-time Protection for Microsoft Hyper-V

High Availability- Disaster Recovery 101

Protecting Microsoft Hyper-V 3.0 Environments with Arcserve

EMC VPLEX Geo with Quantum StorNext

Disaster Recovery Solution Achieved by EXPRESSCLUSTER

Data Sheet: High Availability Veritas Cluster Server from Symantec Reduce Application Downtime

Sun Virtual Desktop Infrastructure. Update Guide for Version 3.1

COMPLETE ONLINE MICROSOFT SQL SERVER DATA PROTECTION

Alfresco 2.1. Backup and High Availability Guide

Massive Scalability With InterSystems IRIS Data Platform

VERITAS Volume Replicator Successful Replication and Disaster Recovery

Understanding Virtual System Data Protection

White paper High Availability - Solutions and Implementations

SQL Server Virtualization 201

Technical Note. Dell/EMC Solutions for Microsoft SQL Server 2005 Always On Technologies. Abstract

Oracle Enterprise Manager

Stretched Clusters & VMware Site Recovery Manager December 15, 2017

IBM Geographically Dispersed Resiliency for Power Systems. Version Release Notes IBM

Proficy* HMI/SCADA - ifix LAN R EDUNDANCY

Copyright 2015 EMC Corporation. All rights reserved. Published in the USA.

Boost your data protection with NetApp + Veeam. Schahin Golshani Technical Partner Enablement Manager, MENA

Veritas Storage Foundation for Windows by Symantec

Storwize/IBM Technical Validation Report Performance Verification

Document Sub Title. Yotpo. Technical Overview 07/18/ Yotpo

Oracle Enterprise Manager Ops Center. Introduction. Creating Oracle Solaris 11 Zones Guide 12c Release 1 ( )

Protecting remote site data SvSAN clustering - failure scenarios

HIGH-AVAILABILITY & D/R OPTIONS FOR MICROSOFT SQL SERVER

Advanced Architecture Design for Cloud-Based Disaster Recovery WHITE PAPER

OUR CUSTOMER TERMS CLOUD SERVICES - INFRASTRUCTURE

Maximum Availability Architecture on Dell PowerEdge Servers and Dell/EMC Storage over Wide Area Networks

Transcription:

InterSystems High Availability Solutions Version 2018.1.1 2018-08-13 InterSystems Corporation 1 Memorial Drive Cambridge MA 02142 www.intersystems.com

InterSystems High Availability Solutions InterSystems IRIS Data Platform Version 2018.1.1 2018-08-13 Copyright 2018 InterSystems Corporation All rights reserved. InterSystems, InterSystems Caché, InterSystems Ensemble, InterSystems HealthShare, HealthShare, InterSystems TrakCare, TrakCare, InterSystems DeepSee, and DeepSee are registered trademarks of InterSystems Corporation. InterSystems IRIS Data Platform, InterSystems IRIS, InterSystems iknow, Zen, and Caché Server Pages are trademarks of InterSystems Corporation. All other brand or product names used herein are trademarks or registered trademarks of their respective companies or organizations. This document contains trade secret and confidential information which is the property of InterSystems Corporation, One Memorial Drive, Cambridge, MA 02142, or its affiliates, and is furnished for the sole purpose of the operation and maintenance of the products of InterSystems Corporation. No part of this publication is to be used for any other purpose, and this publication is not to be reproduced, copied, disclosed, transmitted, stored in a retrieval system or translated into any human or computer language, in any form, by any means, in whole or in part, without the express prior written consent of InterSystems Corporation. The copying, use and disposition of this document and the software programs described herein is prohibited except to the limited extent set forth in the standard software license agreement(s) of InterSystems Corporation covering such programs and related documentation. InterSystems Corporation makes no representations and warranties concerning such software programs other than those set forth in such standard software license agreement(s). In addition, the liability of InterSystems Corporation for any losses or damages relating to or arising out of the use of such software programs is limited in the manner set forth in such standard software license agreement(s). THE FOREGOING IS A GENERAL SUMMARY OF THE RESTRICTIONS AND LIMITATIONS IMPOSED BY INTERSYSTEMS CORPORATION ON THE USE OF, AND LIABILITY ARISING FROM, ITS COMPUTER SOFTWARE. FOR COMPLETE INFORMATION REFERENCE SHOULD BE MADE TO THE STANDARD SOFTWARE LICENSE AGREEMENT(S) OF INTERSYSTEMS CORPORATION, COPIES OF WHICH WILL BE MADE AVAILABLE UPON REQUEST. InterSystems Corporation disclaims responsibility for errors which may appear in this document, and it reserves the right, in its sole discretion and without notice, to make substitutions and modifications in the products and practices described in this document. For Support questions about any InterSystems products, contact: InterSystems Worldwide Response Center (WRC) Tel: +1-617-621-0700 Tel: +44 (0) 844 854 2917 Email: support@intersystems.com

Table of Contents InterSystems High Availability Solutions... 1 1 Issues in InterSystems IRIS HA Solutions... 1 2 No HA Solution... 2 3 OS-Level Cluster HA... 2 4 Virtualization Platform HA... 3 5 InterSystems IRIS Mirroring... 3 6 HA Solutions Feature Comparison... 4 7 Using Distributed Caching with a Failover Strategy... 5 List of Figures Figure 1: Failover Cluster Configuration... 2 Figure 2: Failover in a Virtual Environment... 3 Figure 3: InterSystems IRIS Mirror... 4 InterSystems High Availability Solutions iii

InterSystems High Availability Solutions High availability (HA) refers to the goal of keeping a system or application operational and available to users a very high percentage of the time, minimizing both planned and unplanned down time. InterSystems IRIS provides its own HA solution, and easily integrates with common HA solutions supplied by operating system providers. The primary mechanism for maintaining high system availability is called failover. Under this approach, a failed primary system is replaced by a backup system; that is, production fails over to the backup system. Many HA configurations also provide mechanisms for disaster recovery (DR), which is the resumption of system availability when HA mechanisms have been unable to keep the system available. This article briefly discusses these general strategies for achieving HA with InterSystems IRIS-based applications: No HA Solution OS-Level Cluster HA Virtualization Platform HA InterSystems IRIS Mirroring This article also discusses issues in InterSystems IRIS HA solutions, provides an HA solutions feature comparison, and discusses using distributed caching with a failover strategy. 1 Issues in InterSystems IRIS HA Solutions Bear in mind the following two significant issues when evaluating potential HA solutions for your InterSystems IRIS systems: Shared storage An important principle of HA architecture is the avoidance of single points of failure. Most HA solutions rely on a shared storage component; which represents just such a risk; if the storage fails, it is impossible to keep the system available. Storage-level redundancy can mitigate this risk to an extent, but can also carry forward some types of data corruption. InterSystems IRIS mirroring, on the other hand, uses logical data replication between fully independent primary and backup storage, which eliminates the single point of failure problem entirely and avoids carrying forward most types of corruption. When using a solution other than mirroring, therefore, a single storage failure can be disastrous. For this reason, disk redundancy, database journaling as described in the Journaling chapter of the Data Integrity Guide, and good backup procedures, as described in the Backup and Restore chapter of the Data Integrity Guide, must always be part of your approach, as they are vital to mitigating the consequences of disk failure. InterSystems IRIS upgrades Many HA solutions allow for planned down time of a given component system without interrupting overall availability. Most, however, require significant down time to upgrade the production InterSystems IRIS instance. InterSystems IRIS mirroring, however, allows for minimum downtime InterSystems IRIS upgrade when application code, classes, and routines are kept in separate databases from application data. InterSystems High Availability Solutions 1

No HA Solution HA solutions other than mirroring therefore require careful planning of down time windows for InterSystems IRIS upgrades, or any other maintenance requiring InterSystems IRIS shutdown. 2 No HA Solution The structural and logical integrity of your InterSystems IRIS database is always protected from production system failure by the built-in features described in the Introduction to Data Integrity chapter of the Data Integrity Guide: write image journaling, database journaling, and transaction processing. With no HA solution in place, however, a failure can result in significant down time, depending on the cause of the failure and your ability to isolate and resolve it. For many applications that are not business-critical, this risk may be acceptable. Customers that adopt this approach share the following traits: Clear and detailed operational recovery procedures, including journaling and backup and restore. Disk redundancy (RAID and/or disk mirroring). Ability to replace hardware quickly. 24/7 maintenance contracts with all vendors. Management acceptance and application user tolerance of moderate downtime caused by failures. 3 OS-Level Cluster HA A common HA solution provided at the operating system level is the failover cluster, in which the primary production system is supplemented by a (typically identical) standby system, with shared storage and a cluster IP address that follows the active member. In the event of a production system failure, the standby assumes the production workload, taking over the programs and services formerly running on the failed primary. The standby must be capable of handling normal production workloads for as long as it may take to restore the failed primary. Optionally, the standby can become the primary, with the failed primary becoming the standby once it is restored. InterSystems IRIS is designed to easily integrate with the failover cluster technologies of supported platforms (as described in InterSystems Supported Platforms). An InterSystems IRIS instance is installed on the cluster s shared storage device so that both cluster members recognize it, then added to the cluster configuration so it will be restarted automatically on the standby as part of failover. On restart after failover, the system automatically performs the normal startup recovery, maintaining structural and logical integrity exactly as if InterSystems IRIS had been restarted on the failed system. Multiple InterSystems IRIS instances can be installed on a single cluster if desired. Figure 1: Failover Cluster Configuration 2 InterSystems High Availability Solutions

Virtualization Platform HA 4 Virtualization Platform HA Virtualization platforms generally provide HA capabilities, which typically monitor the status of both the guest operating system and the hardware it is running on. On the failure of either, the virtualization platform automatically restarts the failed virtual machine, on alternate hardware as needed. When the InterSystems IRIS instance restarts, it automatically performs the normal startup recovery, maintaining structural and logical integrity as if InterSystems IRIS had been restarted on a physical server. Figure 2: Failover in a Virtual Environment Virtualization HA has the advantage of being built into the virtualization platform infrastructure, and thus can require very little effort to configure, in some cases none at all. In addition, virtualization platforms allow the planned relocation of virtual machines to alternate hardware for maintenance purposes, enabling upgrade of physical servers, for example, without any down time. 5 InterSystems IRIS Mirroring InterSystems IRIS mirroring with automatic failover takes a different approach to HA, relying on logical data replication between fully independent systems to avoid the single point of failure risk ofy shared storage and ensure that production can immediately fail over to an alternate InterSystems IRIS instance in almost all failure scenarios system, storage, and network. In an InterSystems IRIS mirror, one InterSystems IRIS instance, called the primary failover member, provides access to the production databases. Another instance on a separate host, called the backup failover member, communicates synchronously with the primary, retrieving its journal records, acknowledging their receipt, and applying them to its own copies of the same databases. In this way, both the primary and the backup always know whether the backup has the most recent journal files from the primary, and can therefore precisely synchronize its databases with those on the primary. When this is the case, the mirror can quickly and automatically fail over to the backup in the event of a primary outage with no loss of data. A third system, the arbiter, helps the backup determine whether it should take over when the primary becomes unresponsive. A mechanism such as a virtual IP address shared by the failover members or a distributed cache cluster redirects application connections to the new primary. The failover process takes just seconds; many users won t even notice it happening. And because the backup has its own copies of the databases, even a total failure of the primary and its storage does not render the databases unavailable. In fact, even when the backup is missing the most recent journal data, the backup s mirror agent can retrieve it from the primary s host, if it is still online. Once you restore the former primary to operation following failover, it becomes the backup and its databases quickly catch up with those on the new primary, returning the mirror to full HA capability. You can then return the systems to their original roles ior maintain the new arrangement. InterSystems High Availability Solutions 3

HA Solutions Feature Comparison Mirrors can also include disaster recovery (DR) async members, which are asynchronously maintained copies of the primary; a DR async can be promoted to failover member, for example to become backup when a failed primary cannot be restored to operation quickly, or (if physically separate) for disaster recovery when an outage, such as a data center failure, brings down both failover members. Finally, mirrors can contain reporting async members, which maintain asynchronous copies of the production databases for business intelligence and data warehousing purposes. Figure 3: InterSystems IRIS Mirror Mirroring can also be used together with virtualization platform HA to create a hybrid HA approach, under which the virtualization platform responds to unplanned system or OS-level outages while mirroring handles all planned outages and unplanned database outages (including both InterSystems IRIS outages and storage failures) through automatic failover. For complete information about InterSystems IRIS mirroring, including its DR capabilities, see the Mirroring chapter of the High Availability Guide. 6 HA Solutions Feature Comparison The following table provides a very general feature comparison mirroring, clustering, and virtualization as HA solutions. InterSystems IRIS Mirroring OS-Level Clustering Virtualization Platform HA Failover after machine power loss or crash Handles machine failure seamlessly. Handles machine failure seamlessly. Handles physical and virtual machine failures seamlessly. Protection from storage failure and data corruption Built-in replication protects against storage failure; logical replication avoids carrying forward most types of corruption. Relies on shared storage device, so failure is disastrous; storage-level redundancy optional, but can carry forward some types of corruption Relies on shared storage device, so failure is disastrous; storage-level redundancy optional, but can carry forward some types of corruption. Failover after InterSystems IRIS shutdown, hang, or crash Rapid detection and failover is built in. Can be configured to fail over after InterSystems IRIS outage. Can be configured to fail over after InterSystems IRIS outage. InterSystems IRIS upgrades Allows for minimum- downtime InterSystems IRIS upgrades.* InterSystems IRIS upgrades require downtime. InterSystems IRIS upgrades require downtime. 4 InterSystems High Availability Solutions

Using Distributed Caching with a Failover Strategy InterSystems IRIS Mirroring OS-Level Clustering Virtualization Platform HA Application mean time to recovery Failover time is typically seconds. Failover time can be minutes. Failover time can be minutes. External file synchronization Only databases are replicated; external files need external solution. All files are available to both nodes. All files available after failover. * Requires a configuration in which application code, classes, and routines are kept in databases separate from those that contain application data 7 Using Distributed Caching with a Failover Strategy Whatever approach you take to HA, a distributed cache cluster based on the Enterprise Cache Protocol (ECP) can be used to provide a layer of insulation between the users and the database server. Users remain connected to the cluster s application servers when the data server fails; user sessions and automatic transactions that are actively accessing data during the outage pause until the data server becomes available again through completion of failover or restart of the failed system. Bear in mind, however, that adding distributed caching to your HA strategy can increase complexity and introduce additional points of failure. For information about distributed caching, see Horizontally Scaling Systems for User Volume with InterSystems Distributed Caching in the Scalability Guide. InterSystems High Availability Solutions 5