Technical Brief: Specifying a PC for Mascot


Matrix Science
8 Wyndham Place
London W1H 1PP
United Kingdom
Tel: +44 (0)20 7723 2142
Fax: +44 (0)20 7725 9360
info@matrixscience.com
http://www.matrixscience.com

Specifying a PC for Mascot

  Introduction
  Processor (CPU)
  Random Access Memory (RAM)
  Hard Disk Storage
  Operating System
  Web Server Software
  Mascot Cluster Mode

Introduction

Any recent, high-specification PC containing either Intel or AMD processor(s) should make a suitable platform for Mascot. If you are buying a new PC, then a dual-processor system, or one which can be upgraded to dual processors, will be a good investment. Systems with more than two processors usually carry a substantial price premium. If you plan to do high-throughput work, and need to run Mascot on more than two processors, a cluster of dual-processor boxes will usually offer the most cost-effective solution.

Processor (CPU)

Two conditions must be satisfied in order for search speed to be proportional to processor speed. First, the FASTA sequence databases must be memory mapped, as discussed below. Second, the processor cache must be large enough and fast enough to prevent the limited bandwidth between processor and memory becoming a bottleneck. This second factor is critical because, even though the processor may be running at 1 GHz or more, the memory bus will normally be running at 133 MHz or less. The processors mentioned below all have adequate cache provision, particularly the Pentium Xeon variants, which are available with a choice of cache sizes.

The variety of processors on the market, and the rate at which new models are introduced, make it difficult to give specific recommendations. From Intel, the Coppermine Pentium III, Pentium III Xeon, Pentium 4, and Pentium Xeon families are all suitable choices. From AMD, the Athlon is known to perform well.

With new or unusual processors, operating system compatibility can be an issue. Hardware compatibility lists for Windows 2000 and Red Hat Linux can be found here:

http://www.microsoft.com/windows2000/professional/howtobuy/upgrading/compat/default.asp
http://hardware.redhat.com/hcl/genpage2.cgi

2001 Matrix Science Ltd.

Not all processors are compatible with multiprocessor operation, and the operating system may also impose restrictions. The Intel Pentium III supports dual-processor operation, but going beyond this requires a Xeon variant. At time of writing, it appears that Intel's marketing strategy is to restrict the standard Pentium 4 to single-processor usage, and to promote the Xeon variants (called simply Pentium Xeon) for multiprocessor applications.

You should recognise that very little commercial software actually uses multiple processors, and relatively few 4-way and 8-way systems are in use, so there is a greater chance of encountering hardware and operating system problems with multiprocessor boards than with single-processor boards.

We have observed excellent scalability with dual Pentium III processors running under Microsoft Windows 2000, NT 4, and Linux. That is, throughput from a dual-processor system comes very close to double that obtained from a single processor. However, we cannot predict or guarantee the scalability of Mascot on hardware configurations that have not been specifically tested.

Random Access Memory (RAM)

RAM requirements are strongly dependent on the selection of databases you plan to search. Mascot Monitor makes a compressed copy of each FASTA database, in which the title lines have been removed and the sequence strings have been packed in a byte-efficient manner. The compressed copy of each database is mapped into RAM and, if there is sufficient room, can be locked in place.

When a search calls for a database that is not in memory, the search duration is increased by the time taken to read the database from disk. For a long search, such as a no-enzyme specificity search of a large LC-MS/MS dataset, this additional time may be negligible. For a short search, reading from disk may take longer than the search itself.

Databases should always be memory mapped, even though a system might not have sufficient physical RAM to hold them all. Memory mapping only consumes virtual address space, and enables the file to be accessed more efficiently. However, it doesn't guarantee that a particular database will be in memory when a search calls for it; some other process may have caused it to be paged out. So, the smaller, frequently searched databases should be locked into memory, guaranteeing that they are always loaded in RAM.

RAM requirements can be estimated from the sizes of the FASTA files you intend to lock in memory. For a protein database, the required RAM is roughly 80% of the FASTA file size, while for a nucleic acid database it is roughly 40%. Some examples are given in the following table, but the comprehensive sequence databases increase significantly in size every month.

  Database      FASTA (MB)   RAM (MB)   Compression
  Swiss-Prot         104         86     1 : 0.82
  MSDB               272        220     1 : 0.81
  dbEST             5429       2115     1 : 0.39

You also need to allow approximately 60 MB for the operating system (Windows) and some 10 MB for each executing Mascot search. So, for a single non-redundant protein database, 512 MB RAM is sufficient. To have Swiss-Prot, MSDB and dbEST, plus a few smaller databases, locked in memory at the same time requires at least 2.5 GB. Since many PC motherboards only support a maximum of 1 or 2 GB RAM, this looks like a problem. But, in practice, it is rarely necessary for a database as large as dbEST to be locked in memory. Being composed of short stretches of nucleic acid sequence, it is not suitable for peptide mass fingerprint searches, and tends to be used as a database of last resort for large searches, where the overhead of reading it from disk represents only a small part of the total search time.

Hard Disk Storage

The Mascot program files require very little disk space in comparison to the sequence databases and the accumulating result files. For the sequence databases, you will need to maintain free disk space of the order of 3 times the size of the largest FASTA file. This is because, during a database update, the disk may hold the current FASTA file and its associated compressed files plus the equivalent for the incoming database.

The space needed for result files depends on the overall search profile and on how long results are to remain on-line. Individual result file sizes range from 20 kB for a peptide mass fingerprint search through to several MB for a large LC-MS/MS dataset.

Disk drives are very inexpensive, and most PCs support up to four IDE devices. It is difficult to have too much disk space, especially if you plan to search databases similar in size to dbEST.

If any databases are not memory mapped, short searches may be disk I/O bound, and a fast disk (e.g. fast wide SCSI) or a disk array (e.g. RAID) can then become an important factor in maximising throughput.

Operating System

Supported operating systems for Mascot on Intel are:

  Operating System                                     Max. CPU
  Microsoft Windows XP Professional                    2
  Microsoft Windows 2000 Professional                  2
  Microsoft Windows 2000 Server                        4
  Microsoft Windows 2000 Advanced Server               8
  Microsoft Windows 2000 Data Center                   32
  Microsoft Windows NT4 Workstation                    2
  Microsoft Windows NT4 Server                         4
  Microsoft Windows 2000 Enterprise Server             8
  Red Hat Linux 7.1, kernel version 2.4.2 or later     N/A
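The rules of thumb given in the Random Access Memory and Hard Disk Storage sections can be worked through numerically. The following is only a rough sketch in Python, not part of Mascot: the function names and the layout of the input dictionary are hypothetical, and the constants are simply the approximate figures quoted in this brief (80% and 40% of FASTA size, 60 MB for Windows, 10 MB per executing search, 3 times the largest FASTA file).

```python
"""Rough RAM and disk sizing from the rules of thumb in this brief.

All names here are illustrative; none of this is a Mascot API.
"""

# Approximate RAM needed per database, as a fraction of FASTA file size.
RAM_FRACTION = {"protein": 0.80, "nucleic": 0.40}

OS_OVERHEAD_MB = 60   # approximate allowance for the operating system (Windows)
PER_SEARCH_MB = 10    # approximate allowance for each executing search
DISK_FACTOR = 3       # free disk space relative to the largest FASTA file


def estimate_ram_mb(locked_dbs, concurrent_searches=1):
    """Estimate the RAM (MB) needed to lock the given databases in memory.

    locked_dbs maps a database name to (fasta_size_mb, kind), where kind
    is "protein" or "nucleic".
    """
    db_ram = sum(size * RAM_FRACTION[kind] for size, kind in locked_dbs.values())
    return db_ram + OS_OVERHEAD_MB + PER_SEARCH_MB * concurrent_searches


def estimate_free_disk_mb(fasta_sizes_mb):
    """Estimate the free disk space (MB) to keep available for database updates."""
    return DISK_FACTOR * max(fasta_sizes_mb)


if __name__ == "__main__":
    # FASTA sizes taken from the table in this brief.
    dbs = {
        "Swiss-Prot": (104, "protein"),
        "MSDB": (272, "protein"),
        "dbEST": (5429, "nucleic"),
    }
    print(f"RAM to lock all three: {estimate_ram_mb(dbs):.0f} MB")
    sizes = [size for size, _ in dbs.values()]
    print(f"Free disk to maintain: {estimate_free_disk_mb(sizes)} MB")
```

With the three databases from the table and one executing search, the RAM estimate comes out at a little over 2,500 MB, in line with the "at least 2.5 GB" figure quoted above. Because the 80% and 40% figures are approximations, the per-database values differ slightly from the measured RAM column in the table.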

Web Server Software

Mascot requires a web server for administration and interactive use. In the case of Windows, Microsoft's Internet Information Server (IIS) is the obvious choice unless you are committed to some other package. IIS is bundled with Windows 2000 and included in Option Pack 4 for NT. The Mascot installation program automatically configures IIS versions 4 and later. If you decide to use a web server other than IIS, some manual configuration will be required.

The web server provided with NT4 Workstation is called Microsoft Personal Web Server (PWS). This server is very similar to IIS, but differs in a few key features, such as the maximum number of simultaneous connections. Full details can be found at:

http://www.microsoft.com/ntworkstation/news/mktbulletins/ntwvnts.asp

Apache is a good choice for Linux. It can also be used under Windows, but the current Windows version doesn't support non-parsed headers, which prevents the display of progress reports during a search.

http://www.apache.org

Running a web browser on the same PC as the web server can take a surprising amount of processor time, so search times may suffer. If the same PC is also used for instrument control and data acquisition, you may need to adjust job priorities using Windows Task Manager to ensure that the instrument gets adequate priority.

Mascot Cluster Mode

A Mascot licence for 4 or more processors automatically supports operation on a cluster of systems connected by a dedicated 100 Base-T LAN. A cluster offers several advantages over a single, multiprocessor system:

- Mass-market, reliable, low-cost PC hardware can be used.
- The cluster can be incrementally expanded as workload increases.
- The RAM required to map sequence databases is distributed across multiple systems, circumventing the limits of a single system.
- The limited bandwidth of the PC bus is effectively multiplied by the number of systems in the cluster.
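To get a feel for the third advantage, the per-system RAM requirement of a cluster can be estimated with a few lines of arithmetic. The sketch below is a hedged illustration only. It assumes that the mapped databases are split evenly across the systems, which this brief does not state, and it reuses the approximate 60 MB operating system allowance from the RAM section. The function name and parameters are hypothetical.

```python
import math


def cluster_plan(processors, db_ram_mb, cpus_per_node=2, os_overhead_mb=60):
    """Return (number of systems, approximate RAM per system in MB).

    Assumption (not from the brief): the compressed databases are divided
    evenly between the systems in the cluster.
    """
    if processors < 4:
        # The brief ties cluster support to licences for 4 or more processors.
        raise ValueError("cluster mode is described for 4 or more processors")
    nodes = math.ceil(processors / cpus_per_node)
    ram_per_node_mb = db_ram_mb / nodes + os_overhead_mb
    return nodes, ram_per_node_mb


if __name__ == "__main__":
    # Example: 8 processors in dual-processor boxes, 2,400 MB of mapped databases.
    nodes, ram = cluster_plan(8, 2400)
    print(f"{nodes} systems, about {ram:.0f} MB RAM each")
```

Under these assumptions, a database set that exceeds the 1 or 2 GB limit of a single motherboard fits comfortably once it is spread over four dual-processor systems. The allowance for executing searches is left out here and would need to be added per system.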