Planning for an on-premise deployment
Fusion Architecture

Part number: VV/GUIDE/CUST/62/D

Copyright 2017 VoiceVault Inc. All rights reserved.

This document may not be copied, reproduced, transmitted or distributed in part or in whole by any means without the prior written approval of VoiceVault Inc. This specification is confidential and proprietary to VoiceVault Inc. and is provided to you under a license agreement or nondisclosure agreement. The content of this document is provided as-is and for informational use only. The information contained in this document is subject to change without notice and should not be interpreted as a commitment by VoiceVault Inc. VoiceVault Inc. assumes no responsibility or liability for any errors or inaccuracies that may appear in this document. Except as permitted by such license, no part of this publication may be reproduced, stored in a retrieval system or transmitted, in any form or by any means, electronic, mechanical, recording or otherwise, without the prior written permission of VoiceVault Inc.

VoiceVault is a trademark of Biometric Security Holdings Limited. All other trademarks and trade names mentioned herein are hereby acknowledged and recognized as property of their respective owners.

VoiceVault Inc.
400 Continental Blvd, 6th Floor
El Segundo, CA 90245, USA
(310) 426 2792
info@voicevault.com
Table of Contents

Introduction
    Overview
Fusion System Prerequisites
    Web Layer
    Data Layer
    Processing Layer
    Browser Support (for the Management UI)
    Virtualization Support
Concepts and Components
    Web Layer (API Farm)
    Data Layer (Biometric Data Store)
    Processing Layer (Engine Farm)
Scalability and High Availability
    Web Layer
    Data Layer
    Processing Layer
Base Level Production System
    Overview
    Application Performance
    Typical Base Level System Requirements
Introduction

This architecture document is designed to assist in the planning of an on-premise deployment of the VoiceVault Fusion voice biometric system.

Overview

The following is a conceptual diagram showing the relationships between the various tiers of the VoiceVault Fusion system. The shaded icons on the left represent existing customer infrastructure (in this case a typical IVR-based system) and the icons to the right show the VoiceVault Fusion components.

The following diagram describes the fundamental components and layers of VoiceVault Fusion.
Fusion System Prerequisites

Web Layer

- Windows Server 2012 x64 Standard or Enterprise
- IIS 8.0 or above
- .NET Framework 4.5

Data Layer

- Windows Server 2012 x64 Standard or Enterprise
- SQL Server 2012 x64

Processing Layer

- Windows Server 2012 x64 Standard or Enterprise
- .NET Framework 4.5

Browser Support (for the Management UI)

- Internet Explorer 10 or functionally equivalent browser
- Firefox 21.0 or functionally equivalent browser
- Google Chrome 27.0 or functionally equivalent browser

Virtualization Support

VoiceVault Fusion is supported on:

- VMware ESX 4.1 Update 3
- VMware ESXi 5.0 Update 2
- Amazon Xen-virtualized images

VoiceVault Fusion has not been tested on, but is expected to work with:

- Microsoft Hyper-V role of Windows Server 2012
- Citrix XenServer 6.1

Concepts and Components

Web Layer (API Farm)

The web layer hosts the VoiceVault Fusion Web Services application programming interfaces (APIs). All requests to VoiceVault Fusion from customer systems are made through the web layer and passed through into the data layer, and responses from the data layer are read by the web layer to return to the calling application.
Calling patterns are designed to be asynchronous, with client applications polling for updates.

The Web Layer consists of the following main components:

- The Biometric API is the programming interface that exposes the voice verification functionality. It is hosted as a Windows Communication Foundation (WCF) .NET 4.5 Web Service using SOAP over HTTP bindings and SSL/TLS transport security.
- The REST API, implemented as a REST-based XML service layer on top of the Biometric API, exposes the most common VoiceVault Fusion operations, such as those used for mobile application development. It is called using simple HTTP operations, either POST or GET, secured using an SSL transport.
- The Management API provides a development interface for building user profile management or reporting into applications, with access restricted by role-based user control. It is hosted as a WCF .NET 4.5 Web Service using SOAP over HTTP secured by SSL/TLS.
- The Management UI provides a web-based console that allows administrators to manage configuration, claimant and processing engine settings, to obtain system activity reports, and to monitor system health. It is implemented as an ASP.NET application, intended to be deployed within a secure network segment.

Data Layer (Biometric Data Store)

The data layer stores all information required by the VoiceVault Fusion voice biometric system, such as:

- Biometric data, including a claimant's enrollment voice model
- Configuration data, including thresholds and processing-related settings
- Transactional data, including live enrollment/verification dialogues
- Audit data, including the audio audit trail for enrollments and verifications

All of the static, dynamic and transient data used by the Biometric Processing Engines for voice enrollment and verification is stored in the data layer.
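The asynchronous calling pattern described for the Web Layer can be illustrated with a small polling helper. This is only a sketch: the `get_status` callable, the `"PENDING"` status string, and the interval and timeout values are hypothetical stand-ins, not part of the VoiceVault Fusion API, whose actual operations and status values are defined in the API documentation.

```python
import time
from typing import Callable, Optional


def poll_until_complete(
    get_status: Callable[[], str],
    interval_s: float = 0.25,
    timeout_s: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Optional[str]:
    """Call get_status() repeatedly until it stops returning "PENDING".

    get_status is a hypothetical wrapper around whatever web-layer call
    reports the state of a previously submitted request. Returns the final
    status, or None if the timeout elapses first.
    """
    deadline = clock() + timeout_s
    while True:
        status = get_status()
        if status != "PENDING":  # assumed placeholder status value
            return status
        if clock() >= deadline:
            return None
        sleep(interval_s)  # wait before asking the web layer again
```

The `sleep` and `clock` parameters exist only so the helper can be exercised without real delays; a client would normally leave them at their defaults.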
Processing Layer (Engine Farm)

A processing engine performs all of the calculations associated with the voice biometric algorithms, including Speech Quality Measurement (SQM), voice model generation, voice model comparison, and replay attack detection.

The processing layer distributes operations using a pull model: requests are placed into the database by the web layer and pulled from the database by the next available processing engine. Once the engine has completed work on a request, it places the results back into the database.

The processing engine itself is a stateless service that performs mathematical operations relating to the voice biometric algorithms. Any number of processing engines can be deployed, and these can be associated with individual configurations as defined by a client application.

Scalability and High Availability

VoiceVault Fusion is specifically designed for high availability and scalability by leveraging the Microsoft technologies that the solution is built upon, and it supports all standard load balancing technologies.

The following diagram illustrates a recommended logical network deployment:

The main components in this deployment are shown duplicated in order to provide a high level of availability.

Web Layer

The Web Layer scales out through the addition of web server nodes, load balanced by any standard stateless HTTP load balancing solution:

- Round robin DNS
- Weighted least-load routing
- Microsoft Network Load Balancing
- Content Switching Appliance (such as Citrix NetScaler)

This technique is also used to meet availability requirements for the system. The exact choice of load balancing technique depends on the predicted load.

Note that the Management UI cannot be load balanced.
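The pull model described for the Processing Layer (Engine Farm) can be sketched as follows. This is an illustration under stated assumptions, not the Fusion implementation: `RequestStore` is a hypothetical in-memory stand-in for the database, and `run_engine` and the `process` callable are invented names for a stateless engine loop.

```python
import threading
from collections import deque
from typing import Callable, Deque, Dict, Optional, Tuple


class RequestStore:
    """Hypothetical in-memory stand-in for the shared database."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._pending: Deque[Tuple[int, str]] = deque()
        self._results: Dict[int, str] = {}
        self._next_id = 0

    def submit(self, payload: str) -> int:
        """Web layer side: place a request into the store."""
        with self._lock:
            request_id = self._next_id
            self._next_id += 1
            self._pending.append((request_id, payload))
            return request_id

    def claim_next(self) -> Optional[Tuple[int, str]]:
        """Engine side: atomically take the oldest pending request, if any."""
        with self._lock:
            return self._pending.popleft() if self._pending else None

    def complete(self, request_id: int, result: str) -> None:
        """Engine side: write the result back for the web layer to read."""
        with self._lock:
            self._results[request_id] = result

    def result(self, request_id: int) -> Optional[str]:
        with self._lock:
            return self._results.get(request_id)


def run_engine(store: RequestStore, process: Callable[[str], str]) -> int:
    """Pull and process requests until none remain; return the count handled.

    The engine keeps no state between requests, so any number of these
    loops can run side by side against the same store.
    """
    handled = 0
    while True:
        claimed = store.claim_next()
        if claimed is None:
            return handled
        request_id, payload = claimed
        store.complete(request_id, process(payload))
        handled += 1
```

Because each engine claims work only when it is free, adding engines needs no dispatcher in front of them; in this sketch the store's lock is the only coordination point.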
Data Layer

The Data Layer is scaled using the capabilities available in Microsoft SQL Server, such as Microsoft Cluster Services for SQL Server, database mirroring (including automated fail-over using a Witness server) and transactional replication. Existing Microsoft SQL Server systems that use high-scalability configurations can be used in place of the recommended deployment, provided that the configuration can be supported by the Microsoft SQL Client used by VoiceVault Fusion on its own. Specifically, scalability and availability must be achievable through configuration in the Fusion app.config files.

Processing Layer

The processing layer can scale both vertically, by adding CPUs, and horizontally, by providing additional hardware to host more processing engines. The exact number of processing engines that need to be deployed will depend on the peak level of biometric transactions that need to be processed and the desired system response time.

Base Level Production System

Overview

It is assumed that the hardware will be virtualized, so the hardware requirements are expressed in terms of cores, RAM and, where necessary, disk space. The examples below are for a typical production system with load requirements of up to 30,000 identity verification or enrollment transactions per hour. Such a system would require a single farm of two processing engine servers (assuming dual quad-core 2.4 GHz CPUs and 8 GB RAM), which provides 100% redundant capacity.

Application Performance

In a properly sized system, the end-to-end biometric processing (within either an enrollment or verification step) will be sub-second, typically supporting in excess of 10 transactions per second for a small/medium installation.
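The sizing figures above can be related with some simple arithmetic. The sketch below is a rough reading of them, not a VoiceVault sizing tool: the function names are invented, the per-server throughput of 10 transactions per second is an assumption borrowed from the installation-level figure quoted above, and the load is treated as evenly spread over the hour, which understates a true peak.

```python
import math


def peak_tps(transactions_per_hour: int) -> float:
    """Average transactions per second if the hourly load is evenly spread."""
    return transactions_per_hour / 3600.0


def servers_required(
    transactions_per_hour: int,
    per_server_tps: float,
    redundancy: float = 1.0,
) -> int:
    """Servers needed to carry the load, plus a redundant fraction.

    per_server_tps is an assumed throughput per processing engine server.
    redundancy=1.0 means 100% spare capacity, i.e. double the base count.
    """
    base = math.ceil(peak_tps(transactions_per_hour) / per_server_tps)
    return math.ceil(base * (1.0 + redundancy))


# 30,000 transactions per hour is roughly 8.3 per second; under the
# assumed 10 tps per server, one server carries the load and 100%
# redundancy doubles that to two.
base_level_servers = servers_required(30_000, per_server_tps=10.0)
```

Under these assumptions the result agrees with the two processing engine servers quoted for the base level system; a measured per-server throughput should replace the assumed value in any real plan.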
For mobile applications over a cellular connection, or for a web application over the internet, there will be additional overhead due to the infrastructure and latency of the network, which is influenced by factors such as the device used, network provider, current cell loading, signal strength and backbone network bandwidth. Trials of mobile connectivity over 4G with a simulated loaded network indicate that download speeds of over 10 Mbit/s, upload speeds of up to 6 Mbit/s, and latencies of around 30 milliseconds are possible. Real-world results will vary widely, but upload speeds would typically be expected to be well in excess of 2 Mbit/s, with latencies of 100-200 ms.

For a well-designed system using typical compressed audio files (A-Law/U-Law), the total end-to-end response time should be well under 1 second (for VoiceVault processing and network latency combined). In addition, with careful asynchronous design of the application, it is possible to record further utterances while previous utterances in an enrollment/verification session are being submitted and processed, so this small delay will only be experienced by the user after the last utterance is recorded.

Typical Base Level System Requirements

Web Layer:
- Two Windows Server nodes with IIS deployed:
    - Two dedicated cores running at least 2.4 GHz
    - 4 GB RAM
- .NET Framework 4.5

Data Layer:
- Two Windows Server nodes with SQL Server deployed:
    - Four dedicated cores running at least 2.4 GHz
    - 16 GB RAM
    - 2 TB free disk space
- SQL Server 2012 configured for high availability/failover as required

Processing Layer:
- Windows Server nodes, variable depending upon virtualization decision:
    - Total of eight dedicated cores, with the number of Windows OS instances at the discretion of the customer (e.g. 4 dual-core Windows Server installs or 2 quad-core Windows Server installs)
    - Each node requires a minimum of 4 GB RAM
    - Each node requires 20 GB free disk space
- .NET Framework 4.5

Load Balancing:
- Load balancing solution at customer discretion
- Recommended with two dedicated cores running at least 2.4 GHz
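The response-time expectation given under Application Performance can be sanity-checked with a rough budget. The sketch below is illustrative only: it assumes the A-Law/U-Law audio is 8 kHz, 8-bit (G.711-style, 8,000 bytes per second), that the quoted latency applies once per request, and that utterance length and processing time are whatever the caller supplies. None of these figures are stated by the source beyond the 2 Mbit/s and 100-200 ms values.

```python
# Assumption: 8 kHz sampling at 8 bits per sample, as in G.711 A-law/u-law.
ASSUMED_AUDIO_BYTES_PER_SECOND = 8000


def upload_seconds(audio_seconds: float, uplink_mbit_s: float) -> float:
    """Time to transmit the audio payload, ignoring protocol overhead."""
    bits = audio_seconds * ASSUMED_AUDIO_BYTES_PER_SECOND * 8
    return bits / (uplink_mbit_s * 1_000_000)


def end_to_end_seconds(
    audio_seconds: float,
    uplink_mbit_s: float,
    latency_ms: float,
    processing_s: float,
) -> float:
    """Upload time plus network latency plus biometric processing time."""
    return (
        upload_seconds(audio_seconds, uplink_mbit_s)
        + latency_ms / 1000.0
        + processing_s
    )


# A hypothetical 3-second utterance over a 2 Mbit/s uplink with 200 ms
# latency and an assumed 0.5 s of processing.
estimate = end_to_end_seconds(3.0, uplink_mbit_s=2.0, latency_ms=200.0, processing_s=0.5)
```

With these inputs the upload itself accounts for about a tenth of a second, so under the stated assumptions latency and processing time dominate the budget rather than audio transfer.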