Experiences with HP SFS / Lustre in HPC Production

Computing Centre (SSCK), University of Karlsruhe
Laifer@rz.uni-karlsruhe.de

Outline
» What is HP StorageWorks Scalable File Share (HP SFS)?
  A Lustre product from HP, available since December 2004
» Performance measurements depending on the underlying hardware at SSCK
» Experiences with HP SFS
  SSCK has one of the first Lustre production installations in Europe

HP SFS system architecture
[Diagram: a client population is connected via the interconnect (Quadrics / Myrinet / GigE / InfiniBand) to the Admin, MDS, and several OSS servers; each server is attached to dual-connected storage, and the Admin and MDS servers also have connections to the site network]
Legend
  Admin: Administration server
  MDS: Metadata server
  OSS: Object Storage Server
  STO: Dual-connected storage subsystem, either EVA3000 or SFS20 storage arrays

Value added of HP SFS compared to free Lustre
» HP selects appropriate hardware
  Only dedicated hardware is supported on the server side
» HP selects a current Lustre version
  Freely available Lustre releases become available with up to 1 year delay
» HP adds additional software for failover and management
  Both components are not part of free Lustre
  The management software supplies a central point of administration
» HP runs additional tests and puts patches on top of the code
» HP delivers software, documentation, and licences
  Software includes client RPM packages for XC clusters
» HP supplies support

HP XC6000 cluster installation schedule at SSCK
[Timeline: Phase 0 in Q1 04, Phase 1 in Q4 04 and Q1 05, Phase 2 in Q1 06]
Phase 0 (Q1 2004), Development
» 16 two-way nodes: 12 Integrity rx2600, 4 ProLiant DL360 G3, single rail QsNet II
» 2 TB storage system
Phase 1 (Q4 2004), Production
» 116 two-way nodes: 108 Integrity rx2600, 8 ProLiant DL360 G3, single rail QsNet II
» 11 TB storage system
Phase 1 (Q1 2005), Production
» 12 eight-way nodes: 6 Integrity rx8620, two partitions each, single rail QsNet II
Phase 2 (Q1 2006), Production
» 218 four-way nodes: two sockets with dual-core Montecito, single or dual rail QsNet II
» 30 TB storage system

HP SFS on SSCK's HP XC6000
[Diagram: MDS and Admin serve both $HOME and $WORK and allow > 50 million files; $HOME has 3.8 TB of storage, $WORK has 7.6 TB]
Legend
  Admin: Administration server
  MDS: Metadata server
  OSS: Object Storage Server
  EVA: EVA5000 storage array
  C: Client

Performance measurement environment
» Used HP SFS software version was 1.1-0
  Based on Cluster File Systems' Lustre version 1.2.6
» Underlying hardware
  Clients are IA64 systems (rx2600, 1.5 GHz, 2 CPUs, 12 GB memory)
  Quadrics QsNet II (Elan4) interconnect
  EVA5000 (not EVA3000) storage systems with 2 controllers
  OSS disks are 146 GB 10K, MDS disks are 72 GB 15K
  Servers are IA32 systems (DL360 G3, 3.2 GHz, 2 CPUs, 4/2 GB memory)
  One file system ($HOME) with 2 OSS and 128 KB stripe size
  One file system ($WORK) with 4 OSS and 1 MB stripe size
» Performance measurement details
  Measurements were done in parallel to production; the visible impact should be low
  Benchmarking software was bonnie++ (a sketch of such a run follows below)
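For illustration only, a minimal Python sketch of how a multi-node bonnie++ run like the ones above might be driven. This is not the original SSCK procedure; the node names, target directory, and file size are assumptions of the sketch.

```python
#!/usr/bin/env python3
"""Sketch: start bonnie++ on several client nodes at once via ssh and collect
each node's machine-readable summary line.  Node names, target directory and
file size are assumptions; the real measurements used multiple processes per
node and varying node counts."""

import subprocess
from concurrent.futures import ThreadPoolExecutor

NODES = [f"xc{i:03d}" for i in range(1, 9)]            # hypothetical client node names
CMD = "bonnie++ -d /work/bench -s 24576 -n 0"          # 24 GiB file (2x client RAM), skip file-creation phase

def run_on(node: str) -> str:
    """Run bonnie++ on one node and return its final (machine-readable) output line."""
    out = subprocess.run(["ssh", node, CMD], capture_output=True, text=True, check=True)
    return out.stdout.strip().splitlines()[-1]

with ThreadPoolExecutor(max_workers=len(NODES)) as pool:   # all nodes write in parallel
    for node, summary in zip(NODES, pool.map(run_on, NODES)):
        print(node, summary)
```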

Sequential block write performance with 4 OSS
[Chart: throughput (MB/s, 0-500) vs. number of nodes (1-9), with 1 and 2 processes per node; about 400 MB/s from one process, about 115 MB/s per OSS]

Block vs. character write performance with 2 OSS
[Chart: throughput (MB/s, 0-250) vs. number of nodes (1-41), block write and character write with 1 process per node; again about 115 MB/s per OSS, no performance degradation with many clients, character operations reduce throughput on the clients only (see the sketch below)]
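A minimal local illustration of the client-side effect shown above: writing the same amount of data one byte per call costs far more client CPU time than writing it in large blocks, even though the same bytes reach the servers. The path and sizes are arbitrary choices of this sketch, not values from the measurements.

```python
#!/usr/bin/env python3
"""Sketch: compare block writes (1 MiB per call) with character writes
(1 byte per call) of the same total amount of data."""

import os
import time

PATH = "/tmp/io_demo.dat"       # on the cluster this would be a file in $WORK
TOTAL = 16 * 1024 * 1024        # 16 MiB is enough to show the per-call overhead

def timed_write(chunk_size: int) -> float:
    """Write TOTAL bytes in chunks of chunk_size and return the elapsed time."""
    buf = b"x" * chunk_size
    start = time.perf_counter()
    with open(PATH, "wb") as f:                  # buffered, similar to bonnie++'s putc() test
        for _ in range(TOTAL // chunk_size):
            f.write(buf)
    return time.perf_counter() - start

print(f"block write     (1 MiB per call): {timed_write(1024 * 1024):.2f} s")
print(f"character write (1 B per call):   {timed_write(1):.2f} s")
os.remove(PATH)
```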

Sequential block read performance with 4 OSS
[Chart: throughput (MB/s, 0-900) vs. number of nodes (1-9), with 1 and 2 processes per node; about 300 MB/s from one process, about 190 MB/s per OSS]

Block vs. character read performance with 2 OSS
[Chart: throughput (MB/s, 0-375) vs. number of nodes (1-41), block read and character read with 1 process per node; again about 190 MB/s per OSS, again no impact of character operations on server performance]

File creation performance
[Chart: file creates per second (0-6000) vs. number of nodes (1-11), with 1 and 2 processes per node; about 5000 file creates per second (see the sketch below)]
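A rough single-client sketch of the metric shown above. The slide's numbers come from bonnie++ running on many nodes in parallel; the directory and file count here are arbitrary.

```python
#!/usr/bin/env python3
"""Sketch: create many empty files in one directory and report creates per
second.  Each create is one metadata operation handled by the MDS."""

import os
import time

TARGET_DIR = "/tmp/create_test"    # on the cluster: a directory in $HOME or $WORK
N_FILES = 10_000

os.makedirs(TARGET_DIR, exist_ok=True)
start = time.perf_counter()
for i in range(N_FILES):
    with open(os.path.join(TARGET_DIR, f"f{i:06d}"), "w"):   # open+close creates an empty file
        pass
elapsed = time.perf_counter() - start
print(f"{N_FILES / elapsed:.0f} creates/s")

for i in range(N_FILES):           # clean up
    os.remove(os.path.join(TARGET_DIR, f"f{i:06d}"))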

Performance measurement summary
» Raw LUN performance using 2 controllers on 1 EVA5000 in parallel showed about 120 MB/s for writes and about 195 MB/s for reads
» Main benchmarking results
  Write performance is about 115 MB/s per OSS
  Read performance can reach 190 MB/s per OSS
» Possible results per OSS with 4 SFS20 storage arrays: about 400 MB/s for writes and about 580 MB/s for reads
  SFS20 was not yet available when SSCK's hardware was delivered
» Performance mainly depends on the installed hardware
  Linear scaling with the number of OSS (a rough check follows below)
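A back-of-the-envelope check of the linear-scaling claim, using the per-OSS rates from the slides and assuming that neither the clients nor the interconnect become the limit:

```python
#!/usr/bin/env python3
"""Sketch: expected aggregate throughput is roughly the per-OSS rate times
the number of OSS.  Rates are the measured per-OSS figures from the slides."""

WRITE_PER_OSS = 115   # MB/s, measured
READ_PER_OSS = 190    # MB/s, measured

for n_oss in (2, 4, 8):
    print(f"{n_oss} OSS: ~{n_oss * WRITE_PER_OSS} MB/s write, "
          f"~{n_oss * READ_PER_OSS} MB/s read expected")
```

For the 4-OSS $WORK file system this gives roughly 460 MB/s for writes, consistent with the measured aggregate of about 400 MB/s.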

Performance of OSS components
» Quadrics Elan4: internally about 1300 MB/s; only PCI-X adapters exist
» PCI-X bus on servers: about 900 MB/s
» Dual-ported 2 Gb FC adapter: about 195 MB/s; actually only 1 port is used
» EVA5000 storage array: about 120 MB/s for writes, nearly 500 MB/s for reads
[Bar chart: throughput (MB/s, 0-1500) for QsNet II write/read, PCI-X, 2 Gb FC, and EVA5000 write/read; the bottleneck reasoning is sketched below]
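The reasoning behind the chart can be summarised as: the throughput one OSS can deliver is bounded by the slowest component in its data path. A small sketch using the approximate figures from the slide:

```python
#!/usr/bin/env python3
"""Sketch: find the limiting component of the OSS data path for writes and
reads.  Bandwidths are the approximate per-component figures from the slide."""

# component -> approximate bandwidth in MB/s
write_path = {"Elan4": 1300, "PCI-X bus": 900, "FC adapter (1 port)": 195, "EVA5000 writes": 120}
read_path  = {"Elan4": 1300, "PCI-X bus": 900, "FC adapter (1 port)": 195, "EVA5000 reads": 500}

for direction, path in (("write", write_path), ("read", read_path)):
    component, mbs = min(path.items(), key=lambda kv: kv[1])
    print(f"per-OSS {direction} limit: ~{mbs} MB/s, set by {component}")
# -> writes are capped by the EVA5000 (~120 MB/s) and reads by the single FC port (~195 MB/s),
#    matching the measured 115 and 190 MB/s per OSS.
```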

Experiences with HP SFS 1.1-0
» Works quite stably when everything is up and running
  The production server system usually runs for weeks without problems
  MDS threads got blocked after about 4 weeks; solved with a patch
» Filesystem operations continue after a problem is repaired
  Usually batch jobs continue to run
» Understanding the system behaviour is not easy:
  Some Lustre error messages are critical and some are normal
  The status of clients can influence the servers, e.g. takeover is faster if all clients can be reached
  Timing has an influence, e.g. takeover only occurs if the failover server has been up for more than 10 minutes
» After dumps, check local disk space (see the sketch below)
  The filesystem /local on the OSS is hidden and not visible to the df command
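One way such a check might be scripted, since df does not list /local: query the mount point directly. That /local is hidden from df is from the slide; using statvfs and the 20% threshold are assumptions of this sketch.

```python
#!/usr/bin/env python3
"""Sketch: report free space on the hidden /local filesystem of an OSS by
querying the path directly instead of relying on df's mount list."""

import os

MOUNT_POINT = "/local"                      # hidden dump area on the OSS (from the slide)

st = os.statvfs(MOUNT_POINT)                # works on any mounted path, listed by df or not
free_gb = st.f_bavail * st.f_frsize / 1024**3
total_gb = st.f_blocks * st.f_frsize / 1024**3
print(f"{MOUNT_POINT}: {free_gb:.1f} GB free of {total_gb:.1f} GB")
if total_gb and free_gb / total_gb < 0.2:   # warn when less than 20% is left
    print("warning: /local is filling up, old dumps should be removed")
```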

Conclusion
» We are working together with HP to reach a highly reliable system
  Parallel file systems are very complex, hence it is normal for new file systems to have critical software bugs
» HP SFS has the most important features of a parallel file system
  Performance, resilience, scalability, and ease of administration
  Additional features are needed for using file systems from two clusters, e.g. support for different Lustre versions on clients and servers
» HP SFS and Lustre are very interesting and promising products
  They work and are heavily used on SSCK's production system
» Now is the right time to start using HP SFS / Lustre!