White Paper: Clustering of Servers in ABBYY FlexiCapture

By: Jim Hill
Published: May 2018

Introduction

Configuring an ABBYY FlexiCapture Distributed system in a cluster using Microsoft Windows Server clustering provides many advantages. Each server in the cluster (called a node) can be configured either for failover or for network load balancing (NLB). The distributed version of FlexiCapture has been architected to take full advantage of these features. There are licensing implications when implementing clustering and/or load balancing, so please contact your ABBYY account manager for details.

Key Advantages of Configuring a Cluster in the ABBYY FlexiCapture System

There are several advantages to clustering the ABBYY FlexiCapture Distributed application. The primary benefits are fault tolerance and distributed workloads, as discussed below.

1. Fault Tolerance. Configuring FlexiCapture in a cluster distributes each function across multiple servers, which makes the whole installation more fault tolerant. If one server is unable to provide a particular function, that function can automatically be picked up by the other server in the cluster assigned to it. This eliminates a single point of failure for the system and provides the high availability necessary for a global capture application.

2. Distribution of Workloads Among Servers. Configuring FlexiCapture in a cluster makes it easy to scale processing capacity: just add another server node to the cluster.

3. Greater Availability. This is important for companies where FlexiCapture is needed on a 24 x 7 basis, such as when the system is used for a global capture application or where it supports critical business functions such as invoice order processing.

4. Easier Management and Administration of the System. Administration of the cluster is done centrally through the Microsoft Failover Clustering utility and the NLB Manager in Windows Server.
Configuration of clustering on the FlexiCapture system is performed through out-of-the-box functions. Additional FlexiCapture licenses may not be required; this is discussed later in this article. Please check with your ABBYY partner or sales representative to be certain before planning the implementation.

Understand Cluster Types

Two Cluster Types and Multiple Configurations

NLB Versus Failover. Microsoft provides two distinct types of cluster configurations: the network load balancing (NLB) cluster and the failover cluster. The NLB type is used to scale the FlexiCapture system progressively as processing needs increase. The failover cluster type picks up the processing task when an operation fails on a single node in the cluster. Note that failover and NLB clusters cannot work on the same server installation.

Tel: (248) 447-0100
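The difference between the two cluster types can be illustrated with a small Python sketch. It is a conceptual model only: the `Node` class, the node names, and the round-robin rotation are assumptions made for illustration, and round-robin is a simplification rather than a description of how Windows NLB actually distributes traffic.

```python
from dataclasses import dataclass
from itertools import cycle


@dataclass
class Node:
    """A cluster node; the names used below are hypothetical."""
    name: str
    healthy: bool = True


def nlb_assign(nodes, requests):
    """NLB-style: spread requests across every healthy node.

    Round-robin is used here only to keep the illustration simple.
    """
    healthy = [n for n in nodes if n.healthy]
    if not healthy:
        raise RuntimeError("no healthy node available")
    rotation = cycle(healthy)
    return {req: next(rotation).name for req in requests}


def failover_owner(nodes):
    """Failover-style: one node owns the function; the next healthy
    node in priority order takes over only when the owner fails."""
    for node in nodes:
        if node.healthy:
            return node.name
    raise RuntimeError("no healthy node available")


nodes = [Node("node-a"), Node("node-b")]
balanced = nlb_assign(nodes, ["r1", "r2", "r3"])  # work is shared
owner_before = failover_owner(nodes)              # node-a owns the role
nodes[0].healthy = False
owner_after = failover_owner(nodes)               # node-b takes over
```

In the NLB model both nodes do work at the same time, which is why adding a node adds capacity. In the failover model the second node adds resilience rather than capacity.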

Active Versus Passive. Each cluster type can be configured in either an active or a passive mode. In an active failover cluster configuration, each node (server in the cluster) is running and performing processing functions at the same time. In a passive failover cluster configuration, the second and successive nodes don't come into action until the previous node fails. In FlexiCapture, the failover cluster is generally configured in passive mode for things like the licensing server. More will be explained about this later in this article.

Hardware Considerations

Anyone who has worked with a Microsoft cluster will remember that each node requires two network cards: one for the regular network connection and one dedicated to the private network traffic between nodes, which carries the two-way heartbeat information.

Basic Configuration of Servers and Nodes

Figure 1 shows the various FlexiCapture stations and servers in the FlexiCapture Distributed system. For high-capacity processing, each function must be installed on a separate server, as shown in Figure 1. Keep in mind the distinct differences between the two types of clusters, the network load balancing (NLB) cluster and the failover cluster.
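The passive failover behavior and the heartbeat traffic described under Hardware Considerations can be sketched as follows. This is a minimal model under stated assumptions, not the Windows Server implementation: the `HeartbeatMonitor` class, the node names, and the 5-second timeout are all hypothetical, and timestamps are passed in explicitly so the example is deterministic.

```python
from dataclasses import dataclass


@dataclass
class HeartbeatMonitor:
    """Tracks the last heartbeat received from the active node."""
    timeout_seconds: float   # hypothetical threshold, not a Windows default
    last_seen: float = 0.0   # timestamp of the most recent heartbeat

    def record(self, now: float) -> None:
        """Note that a heartbeat arrived at time `now`."""
        self.last_seen = now

    def active_is_alive(self, now: float) -> bool:
        """The active node counts as alive until the timeout elapses."""
        return (now - self.last_seen) <= self.timeout_seconds


def serving_node(monitor: HeartbeatMonitor, now: float,
                 active: str = "node-a", passive: str = "node-b") -> str:
    """Passive mode: the standby node serves only after the active
    node has stopped sending heartbeats for longer than the timeout."""
    return active if monitor.active_is_alive(now) else passive


monitor = HeartbeatMonitor(timeout_seconds=5.0)
monitor.record(now=100.0)
early = serving_node(monitor, now=103.0)  # heartbeat is 3 s old
late = serving_node(monitor, now=106.0)   # heartbeat is 6 s old
```

Keeping heartbeats on a dedicated network card means that ordinary application traffic cannot delay them and trigger a false failover.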

Figure 1: Basic FlexiCapture Station Diagram

Application Layer

Application Server. Can be installed on a network load balancing cluster. Provides a web service in IIS (Internet Information Services) that verifies user authentication and authorization and performs other functions, including execution of the capture workflows. This server hosts the admin and monitoring console web page through which the main functions of the FlexiCapture system are administered. As an alternative configuration, it is possible to route network connections to specific cluster nodes instead of using NLB affinity settings. When you first install the application server, you will have to create the database and configure the file storage for this application server. This is done through the web administration console familiar to anyone who has installed FlexiCapture Distributed, and it requires that you have supplied sufficient DBA credentials for the database creation. Note: scripts are provided in the installation directory (C:\inetpub\wwwroot\FlexiCaptureXX\Server by default) that can be given to your DBA for manual database creation if security is a concern.

Licensing Server (also called Protection Server). Can be installed on a failover cluster. Manages the software application licensing for all users, including those logging in through the various rich clients and web stations. Note that each licensing server must have its own FlexiCapture license (or the same production license with an additional activation) because if one node fails the second node will require a license.

Processing Layer

Processing Server. Can be installed on a failover cluster. This manages a pool of processing stations which provide the software OCR and other functions. The processing server is actually the only service which is shared between the nodes, which makes it easy to test the failover service.

Database Server. FlexiCapture requires access to a Microsoft SQL Server or Oracle instance; for development instances, Microsoft SQL Server Express can be used instead. Also, for development it is possible to install all servers and stations on a single machine, with the understanding that processing performance will be limited.
Clustering of the database server is outside the scope of this paper, as it falls into the category of database administration. However, FlexiCapture can work with SQL Server installed on a failover cluster. ABBYY recommends that the database server always be installed on a separate machine.

Processing Stations. Can be installed on an NLB cluster. The web-based stations (which connect to IIS) can also be installed on NLB clusters. Processing stations are the workers which provide the OCR processing power needed to complete the data extraction functions within the allotted timeframe. There is much to understand about choosing the number of cores and the amount of memory assigned to processing stations. For example, adding CPU cores only allows additional batches to be processed simultaneously; it does not make existing batches process faster. There is a point of diminishing returns when adding cores if the underlying system cannot support them. Disk bandwidth is almost more important than core count, because at some point the cores will be limited by the bandwidth of the disks. It is a best practice to install the processing stations on their own servers. Multiple processing stations on multiple servers inherently provide both load balancing and failover, although stations can also be limited to specific tasks. For instance, you can have one processing station server that only does import, three that only do recognition, one for export, and so on.

Data Layer

File Storage Server. FlexiCapture requires a network location to store files while work is in process. For a development system or smaller FlexiCapture processing environments, this function may be performed by storing files within the database server. Files are only stored in the file storage server as long as steps remain to be completed in the workflow. For example, verification operators in an invoice processing solution may need a week or longer to gather required information about a particular invoice document. During that time the invoice document will remain within the FlexiCapture file storage server. The moment that the last remaining workflow step has been completed, the document and data are exported. Exported data will remain in the system until the set retention window is exceeded. The default retention window is 14 days, and it is possible to configure the system such that data is never deleted. For extremely high-volume processing environments, it is recommended that an external storage server (NAS, the lower cost option, or SAN) be used which provides read/write access at 1 Gb/second. In a medium-capacity environment, a disk array in RAID 10 (or, less ideally, RAID 1) should be provided along with very fast disk drives or SSD units.

Figure 2 shows a high-level diagram of the basic servers and cluster types.
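The retention rule for exported data described above can be expressed as a short calculation. The following Python sketch is an illustration only: the function name and the use of `None` to model "never delete" are assumptions, not FlexiCapture settings, and only the 14-day default comes from the text.

```python
from datetime import datetime, timedelta
from typing import Optional

DEFAULT_RETENTION_DAYS = 14  # default retention window stated in the text


def is_past_retention(exported_at: datetime, now: datetime,
                      retention_days: Optional[int] = DEFAULT_RETENTION_DAYS) -> bool:
    """Return True once exported data has exceeded the retention window.

    Passing retention_days=None models a system configured so that
    data is never deleted (an assumption made for this sketch).
    """
    if retention_days is None:
        return False
    return now - exported_at > timedelta(days=retention_days)


exported = datetime(2018, 5, 1)
on_day_14 = is_past_retention(exported, datetime(2018, 5, 15))  # window not yet exceeded
on_day_15 = is_past_retention(exported, datetime(2018, 5, 16))  # window exceeded
kept_forever = is_past_retention(exported, datetime(2030, 1, 1), retention_days=None)
```

A calculation like this can help when sizing the file storage server, since exported documents continue to occupy space for the whole retention window.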

Figure 2: Servers and Cluster Types

ABBYY FlexiCapture Licensing Implications

Licensing is an important consideration in the planning of the cluster. As mentioned in the introduction, there are some licensing implications to clustering ABBYY FlexiCapture. When ordering a cluster license, you must consider both the page count and the licensing of the various stations. This is because, while the primary node is unable to serve the licensing function, the secondary node will be required to provide sufficient licensing for the users. Considerations include the number of pages expected, the type of FlexiCapture system (invoices versus plain FlexiCapture), and the number and type of station licenses required. The following diagram is useful to help plan the licensing needs for the failover server operating the licensing function. You will want to create your own detailed map of the servers being clustered, with details including the main IP address, heartbeat IP address, and DNS-assigned server name for each.
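One way to keep the server map described above is as structured data that can be checked automatically. The Python sketch below is only an illustration: the `ClusterNode` fields mirror the details listed in the text, while every DNS name, role label, IP address, and the heartbeat subnet are hypothetical placeholders.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class ClusterNode:
    """One row of the server map; all values below are placeholders."""
    dns_name: str
    role: str
    main_ip: str
    heartbeat_ip: str


def heartbeat_misconfigured(nodes, heartbeat_subnet):
    """Return the DNS names of nodes whose heartbeat address is outside
    the private heartbeat subnet or is the same as the main address."""
    subnet = ip_network(heartbeat_subnet)
    return [
        n.dns_name
        for n in nodes
        if ip_address(n.heartbeat_ip) not in subnet
        or n.heartbeat_ip == n.main_ip
    ]


server_map = [
    ClusterNode("fc-lic-01", "licensing (failover)", "10.0.0.11", "192.168.100.11"),
    ClusterNode("fc-lic-02", "licensing (failover)", "10.0.0.12", "10.0.0.112"),
]
problems = heartbeat_misconfigured(server_map, "192.168.100.0/24")
```

Here the second node would be flagged because its heartbeat address sits on the main network instead of the private one.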

Figure 3: Architecture Showing Stations

Conclusion

Configuring clustering of the servers used in ABBYY FlexiCapture provides many advantages, including high availability, load distribution among the servers, easier management, and centralized administration. Clustering is a built-in function of FlexiCapture, and it can be done for any of the servers without additional licensing costs, except for the licensing server itself.

How User Friendly Consulting Can Help

We provide consulting services for ABBYY FlexiCapture, including both the installation and administration of the product, as well as assisting with or performing development using the web service API. Please reach out to us if you would like us to solve your document conversion or data extraction challenge, or just for help with properly configuring your existing ABBYY FlexiCapture system. We provide a wide range of consulting options, including ABBYY trained and certified personnel, as well as a wide range of training options to get your employees up to speed on the product very quickly. We also distribute other ABBYY products, such as ABBYY Recognition Server, that are geared towards bulk document conversion or language translation.