IBM FileNet Content Manager and IBM GPFS


IBM FileNet Content Manager support for IBM General Parallel File System (GPFS)

September 2014
IBM SWG Enterprise Content Management

Copyright IBM Corporation 2014
Enterprise Content Management
www.ibm.com

No part of this document may be reproduced in any form by any means without prior written authorization of IBM.

This document is provided as is without warranty of any kind, either expressed or implied, including, but not limited to, the implied warranty of merchantability or fitness for a particular purpose. This document is intended for informational purposes only. It could include technical inaccuracies or typographical errors. The information herein and any conclusions drawn from it are subject to change without notice. Many factors have contributed to the results described herein and IBM does not guarantee comparable results. Performance numbers will vary greatly depending upon system configuration. All data in this document pertains only to the specific test configuration and specific releases of the software described.

CONTENTS

Introduction
More about GPFS
GPFS configuration models
    SAN / direct access model (supported)
    NSD / shared disk model (supported)
    Share nothing model (not currently supported)
Configuration best practices
    Storage considerations
    NSD server considerations
    Tie breaker disks
    Optimal number of quorum nodes
GPFS performance considerations
    pagepool
    maxFilesToCache
    maxStatCache
    maxMBpS
    blocksize
    Separate disks for data and meta-data
Conclusion
References
Author Information

Introduction

IBM General Parallel File System (GPFS) is a POSIX compliant file management infrastructure that provides both outstanding performance and reliability. GPFS has been proven to perform in small as well as large clustered deployments, reaching up to thousands of nodes hosting multi-petabyte file systems. GPFS also provides industry leading high availability for IBM FileNet Content Manager.

IBM FileNet Content Manager fully supports IBM GPFS for the file system tier of your enterprise ECM solution. By choosing the proper model, GPFS has the flexibility to meet any IT organization's needs.

This document provides a high-level overview of GPFS and of how FileNet Content Manager can leverage GPFS's capabilities to manage file system content efficiently and effectively. It also briefly discusses how to configure GPFS for optimal performance and high availability with IBM FileNet Content Manager. Following the best practices outlined in this document will help ensure IBM FileNet Content Manager provides the performance and reliability expected from a superior file system technology.

More about GPFS

Advanced document management capabilities are achieved using the GPFS Information Lifecycle Management (ILM) toolset. The ability to define storage policies and create storage pools gives IT departments the flexibility to store content on multiple storage tiers. This improves application performance and saves money by moving less frequently accessed data to less costly storage.

Robust clustering features, automatic replication and snapshot capabilities provide a high level of fault tolerance. A properly configured GPFS cluster can remain online even after suffering multiple failures, providing zero downtime.

For more information regarding GPFS, please consult the GPFS Knowledge Center or contact your IBM sales representative.

GPFS configuration models

SAN / direct access model (supported)

The SAN model, where all GPFS nodes are designated as server nodes with direct access to the underlying storage devices, is typically the best performing configuration, although it has drawbacks that should be noted. The SAN model can be expensive and is less flexible than the NSD model, because every GPFS node requires the hardware to connect directly to the storage device. It may therefore be impractical for large GPFS clusters to follow the SAN model, and those customers should consider the NSD model.

The SAN model provides excellent high availability characteristics. When each node is designated as a server node and tie breaker disk(s) are in place, the cluster can remain online in the most trying of circumstances. Consider a 4 node cluster where each node has direct fiber connections to an underlying SAN storage device. If 3 of the nodes lose SAN connectivity, quorum can still be maintained with one node and the use of a tie breaker disk. The nodes that lost connectivity to the storage device essentially become clients of the single remaining GPFS server and can retrieve data over the Ethernet communication channel until direct SAN connectivity is restored.

Figure 1: SAN model

NSD / shared disk model (supported)

The Network Shared Disk (NSD) model is considered the most flexible and scalable configuration. This model allows for GPFS servers and clients, where the servers have direct access to the storage device(s) and a client can access this data from any network addressable location. When using the NSD model, it is suggested to use the highest performing Ethernet option available between nodes. Both Ethernet and InfiniBand are supported for the LAN fabric.

A minimum of two GPFS servers with direct disk access is important not only for maintaining high availability but also for performance: application performance may be impacted negatively if the GPFS server(s) cannot meet the I/O demands of the GPFS clients.

Figure 2: NSD model

Share nothing model (not currently supported)

This model is not supported by IBM FileNet CM. Due to the stateless nature of FileNet CM, each Content Platform Engine (CPE) instance must have access to all underlying content. This model is primarily designed for Hadoop MapReduce and SAP HANA applications. Cloud deployments can also leverage this model, since each tenant represents a separate application environment with full data isolation between tenants, but that type of configuration is out of the scope of this document.
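The 4 node quorum scenario described for the SAN model can be sketched as a small calculation. This is a deliberately simplified model, not how GPFS implements quorum internally: the function name and parameters are hypothetical, and it assumes that plain node quorum needs more than half of the quorum nodes, while with tie breaker disks a single quorum node suffices as long as it can reach a majority of the tie breaker disks.

```python
def has_quorum(quorum_nodes_total, quorum_nodes_up,
               tiebreaker_disks_total=0, tiebreaker_disks_reachable=0):
    """Simplified quorum check (illustrative only, not GPFS's algorithm)."""
    if quorum_nodes_up < 1:
        return False
    if tiebreaker_disks_total > 0:
        # Assumption: with tie breaker disks, one surviving quorum node is
        # enough if it can reach a majority of the tie breaker disks.
        return tiebreaker_disks_reachable > tiebreaker_disks_total // 2
    # Plain node quorum: more than half of the quorum nodes must be up.
    return quorum_nodes_up > quorum_nodes_total // 2


# 4 node cluster, only 1 node still counts toward quorum.
print(has_quorum(4, 1))        # no tie breaker disk
print(has_quorum(4, 1, 1, 1))  # one tie breaker disk, still reachable
```

Under these assumptions the first call reports that quorum is lost and the second reports that it is kept, which matches the behaviour described above for a cluster with a tie breaker disk in place.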

Configuration best practices

Storage considerations

It is important to ensure the SAN device is highly available, meaning there are no single points of failure. Highly available configurations include using multiple RAID controllers, configuring SAN failover and defining primary and backup servers for each LUN. It is best practice for an NSD to be associated with a single LUN. Each LUN can have up to 8 GPFS servers defined.

NSD server considerations

For optimal performance, it is necessary to configure multiple I/O paths for each GPFS server. A minimum of 2 I/O paths is suggested for parallel I/O operations. For the SAN model, even though all GPFS servers have a direct SAN connection, you should define servers for the NSDs so that if the fiber connection fails, access to the NSD from that I/O server can still occur over the network. GPFS will always prefer block devices over a network shared disk.

Tie breaker disks

Tie breaker disks should only be needed in small clusters. In a two node cluster, it is best to require both nodes for quorum and then create tie breaker disk(s) so that quorum is still maintained if one node goes offline. GPFS best practices state it is best to create either 1 or 3 tie breaker disks. Large clusters do not typically benefit from tie breaker disks, since there are usually enough nodes in the cluster to maintain quorum without them.

Optimal number of quorum nodes

In general, 3 to 5 quorum nodes are suggested, and never more than 7, as additional quorum nodes can have a negative impact on cluster performance during recovery without providing increased availability.

GPFS performance considerations

pagepool

Sufficient pagepool (pinned memory) is critical for optimal application performance. GPFS does not use the operating system file cache and relies on the pagepool for cache, similar to a database bufferpool. The pagepool can be increased dynamically but not reduced dynamically, and it can be set at the node level for increased flexibility.
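The numeric guidance in this section can be collected into a small planning check. The sketch below is only an illustration: `ClusterPlan`, `review_plan` and `pagepool_change_is_dynamic` are hypothetical names, not part of any GPFS tool, and the thresholds are simply the figures quoted above (up to 8 servers per LUN, at least 2 I/O paths, 1 or 3 tie breaker disks, no more than 7 quorum nodes).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ClusterPlan:
    """Hypothetical description of a planned GPFS cluster."""
    quorum_nodes: int
    tiebreaker_disks: int
    io_paths_per_server: int
    servers_per_lun: int


def review_plan(plan: ClusterPlan) -> List[str]:
    """Return warnings for values outside the guidance in this section."""
    warnings = []
    if plan.servers_per_lun > 8:
        warnings.append("more than 8 GPFS servers defined for a LUN")
    if plan.io_paths_per_server < 2:
        warnings.append("fewer than 2 I/O paths per GPFS server")
    if plan.tiebreaker_disks not in (0, 1, 3):
        warnings.append("tie breaker disk count should be 1 or 3 when used")
    if plan.quorum_nodes > 7:
        warnings.append("more than 7 quorum nodes")
    elif plan.quorum_nodes < 3 and plan.tiebreaker_disks == 0:
        warnings.append("small cluster without tie breaker disks")
    return warnings


def pagepool_change_is_dynamic(current_mib: int, requested_mib: int) -> bool:
    """The pagepool can be increased dynamically but not reduced."""
    return requested_mib >= current_mib


two_node = ClusterPlan(quorum_nodes=2, tiebreaker_disks=1,
                       io_paths_per_server=2, servers_per_lun=2)
print(review_plan(two_node))
print(pagepool_change_is_dynamic(4096, 8192))
```

A two node cluster with one tie breaker disk passes every check here, while a request to shrink the pagepool would be flagged as needing more than a dynamic change.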

maxFilesToCache

Controls the number of files that can be held in the pagepool. General guidance states this setting should be large enough to handle the number of concurrently open files in addition to recently used files. Consider full text indexing of recently ingested content. FileNet CM will need to access the content element once the index request is processed in order to perform extraction. If this content is found in the cache, overall indexing performance can improve dramatically. For scenarios where recently ingested content is not required for immediate retrieval, the default value should suffice.

maxStatCache

Defaults to 4 times maxFilesToCache, which should be sufficient for general purpose workloads. FileNet CM workloads that are heavily oriented towards content retrieval may see improved performance by increasing this value. If you have increased the maxFilesToCache parameter to a very large value, maxStatCache may be set unnecessarily high and will consume memory that could be used by other applications.

maxMBpS

Recommended to be set to twice the throughput required by the system. In GPFS v3.5, the default is 2048, which should be sufficient for most applications.

blocksize

The default block size should be sufficient for most applications, although a larger block size may be considered if large content is being stored on the file system. For maximum performance, it is important to consider the GPFS block size when defining the RAID stripe size for the underlying LUN(s). The GPFS block size should match, or be a multiple of, the RAID stripe size.

Separate disks for data and meta-data

GPFS provides the ability to define a disk to hold data only, file system meta-data only, or both data and meta-data. For general purposes, it is not necessary to separate meta-data from data. If your FileNet Content Manager deployment frequently performs file system meta-data operations, application performance may be improved by dedicating multiple disks to file system meta-data only.

Conclusion

IBM FileNet Content Manager fully supports IBM GPFS for the file system tier of your enterprise ECM solution. With either the SAN or the NSD model, GPFS has the flexibility to meet any IT organization's needs. Following the best practices outlined in this document will help ensure both models provide the performance and reliability expected from a superior file system technology.
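As a worked illustration of the tuning relationships described under GPFS performance considerations, the sketch below computes the stated default for maxStatCache, the suggested maxMBpS, and whether a block size lines up with a RAID stripe size. The function names and every input value are hypothetical examples, not measured or recommended settings.

```python
def default_max_stat_cache(max_files_to_cache: int) -> int:
    """maxStatCache defaults to 4 times maxFilesToCache."""
    return 4 * max_files_to_cache


def suggested_max_mbps(required_throughput_mb_s: int) -> int:
    """maxMBpS is recommended to be twice the required throughput."""
    return 2 * required_throughput_mb_s


def block_size_fits_stripe(block_size_kib: int, raid_stripe_kib: int) -> bool:
    """True when the block size equals or is a multiple of the stripe size."""
    return block_size_kib % raid_stripe_kib == 0


# Hypothetical inputs chosen only to show the arithmetic.
print(default_max_stat_cache(10000))      # 4 * 10000
print(suggested_max_mbps(600))            # 2 * 600
print(block_size_fits_stripe(1024, 256))  # 1024 is a multiple of 256
print(block_size_fits_stripe(1024, 384))  # 1024 is not a multiple of 384
```

With these example inputs, a required throughput of 600 MB/s gives a suggested maxMBpS of 1200, which is below the GPFS v3.5 default of 2048 mentioned above, so the default would already cover it.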

References

1. IBM FileNet P8 Software: http://www.ibm.com/software/ecm/filenet
2. IBM GPFS Knowledge Center: http://www.ibm.com/support/knowledgecenter/ssfkcn/gpfs_welcome.html

Author Information

Michael Bordash, ECM Server System Test Engineer

Contributors

Matthew Vest, ECM Server System Test & Performance Engineering Senior Manager
Dave Royer, ECM Performance Architect, Senior Software Engineer

Special thanks to the following members of the ECM CE development team: Tim Morgan

Disclaimer

The information in this publication is not intended as a substitution of the IBM FileNet product documentation provided by IBM. Please see www.ibm.com/software/ecm/filenet for more information about what publications are considered to be product documentation.

References in this publication to IBM products, programs or services do not imply that IBM intends to make these available in all countries in which IBM operates. Any reference to an IBM product, program, or service is not intended to state or imply that only IBM's product, program, or service may be used. Any functionally equivalent program that does not infringe any of IBM's intellectual property rights may be used instead of the IBM product, program or service.

Information in this publication was developed in conjunction with use of the equipment specified, and is limited in application to those specific hardware and software products and levels. The information contained in this publication was derived under specific operating and environmental conditions. While IBM has reviewed the information for accuracy under the given conditions, the results obtained in your operating environments may vary significantly. Accordingly, IBM does not provide any representations, assurances, guarantees, or warranties regarding performance.

Any information about non-IBM ("vendor") products in this document has been supplied by the vendor and IBM assumes no responsibility for its accuracy or completeness.
IBM, IBM FileNet Content Manager, DB2, WebSphere, AIX, Rational, and Tivoli are trademarks or registered trademarks of IBM Corporation in the United States, other countries, or both. Java, and all Java-based trademarks and logos are trademarks or registered trademarks of Sun Microsystems, Inc. in the United States, other countries, or both. UNIX is a trademark of The Open Group. Windows is a registered trademark of Microsoft Corporation in the United States, other countries, or both. Other company, product, and service names may be trademarks or service marks of others.

Copyright IBM Corporation 2012. Produced in the United States of America. All Rights Reserved.

The e-business logo, the eserver logo, IBM, the IBM logo, IBM Directory Server, DB2, FileNet, FileNet Content Manager and WebSphere are trademarks of International Business Machines Corporation in the United States, other countries or both.

The following are trademarks of other companies: Solaris, Java and all Java-based trademarks and logos are trademarks of Sun Microsystems, Inc. in the United States, other countries or both. Windows and Windows 2008 Enterprise Edition are trademarks of Microsoft Corporation in the United States and/or other countries. Oracle 9i and all Oracle-based trademarks and logos are trademarks of the Oracle Corporation in the United States, other countries or both. Other company, product and service names may be trademarks or service marks of others.

INTERNATIONAL BUSINESS MACHINES CORPORATION PROVIDES THIS PAPER AS IS WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF NON-INFRINGEMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Some states do not allow disclaimer of express or implied warranties in certain transactions, therefore, this statement may not apply to you.

Information in this paper as to the availability of products was believed accurate as of the time of publication.
IBM cannot guarantee that identified products will continue to be made available by their suppliers.

This information could include technical inaccuracies or typographical errors. Changes may be made periodically to the information herein; these changes may be incorporated in subsequent versions of the paper. IBM may make improvements and/or changes in the product(s) and/or the program(s) described in this paper at any time without notice.

Any references in this document to non-IBM web sites are provided for convenience only and do not in any manner serve as an endorsement of those Web sites. The materials at those Web sites are not part of the materials for this IBM product and use of those Web sites is at your own risk.

IBM may have patents or pending patent applications covering subject matter described in this document. The furnishing of this document does not give you any license to these patents.

Copyright IBM Corporation 2014 IBM 3565 Harbor Boulevard Costa Mesa, CA 92626-1420 USA Printed in the USA 01-07 All Rights Reserved. IBM and the IBM logo are trademarks of IBM Corporation in the United States, other countries, or both. All other company or product names are registered trademarks or trademarks of their respective companies. The IBM home page on the Internet can be found at ibm.com