Demonstration Milestone for Parallel Directory Operations


This milestone was submitted to the PAC for review on 2012-03-23. The document was signed off on 2012-04-06.

Overview
This document describes the work required to demonstrate that the Parallel Directory Operations (PDO) code meets the agreed acceptance criteria. The Parallel Directory Operations code is functionally complete. The purpose of this final demonstration phase is to show that the code provides an enhancement to Lustre when used in a production-like environment.

Acceptance Criteria
The PDO patch will be accepted as properly functioning if:
1. Parallel create/remove performance improves in a large shared directory for mknod and narrow-stripe files (0-4 stripes) on a multi-core server (8+ cores).
2. There is no performance regression for other metadata performance tests.
3. There is no functionality regression.

Baseline
The baseline measurement for this demonstration will be Lustre 2.1 enhanced with the Parallel Directory Operations code. Baseline measurements will be made with Parallel Directory Operations disabled. Parallel Directory Operations and the related performance enhancements have been included in Lustre-Master since build #431. The isolated Parallel Directory Operations code can be found as commit 8e85867.

NOTE: In order to deliver the maximum benefit of the Parallel Directory Operations work, a number of additional enhancements were made to Lustre. These include:
- Multiple OI tables (LU-822)
- Configurable BH LRU size (LU-50)
- Root node BH ref of IAM container (LU-32)
These enhancements remain active even when Parallel Directory Operations is disabled. As a result, the overall performance increase observed simply by enabling Parallel Directory Operations is reduced compared to a Lustre version that predates the start of the Parallel Directory Operations work.
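For completeness, a minimal sketch of how a tester might confirm that the baseline tree carries this work, assuming a local git checkout of the Lustre master branch (the checkout path is hypothetical; the git commands themselves are standard):

    # Assumed location of the Lustre checkout on the test node.
    cd /usr/src/lustre-release
    # Confirm the isolated Parallel Directory Operations commit is present.
    git log --oneline | grep 8e85867
    # List changes referencing the related enhancement tickets.
    git log --oneline | grep -E 'LU-822|LU-50|LU-32'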

Hardware platform
The demonstration will take place on the Whamcloud cluster Toro. The hardware specification includes 16 client nodes and 3 server nodes (1 x MDS, 2 x OSS).

MDS node ('Fat Intel Node'):
- 2 x Intel Xeon X5650 2.67GHz six-core processors (2 hyper-threads per core)
- 16GB DDR3 1333MHz memory
- 2-port 40Gb/s 4X QSFP InfiniBand adapter card (Mellanox MT26428)
- 1 QDR IB port on the motherboard
- SSD as external journal device (Intel SSDSA2CW120G3)
- SATA II enterprise hard drive as MDT (single disk, WDC WD2502ABYS-02B7A0)

OSS nodes ('Fat AMD Node'):
- 2 x AMD Opteron 6128 2.0GHz eight-core processors
- 16GB DDR3 1333MHz memory
- 2-port 40Gb/s 4X QSFP InfiniBand adapter card (Mellanox MT26428)
- 1 QDR IB port on the motherboard
- 3 x 1TB SATA II enterprise hard drives (single disk, WDC WD1003FBYX-01Y7B0)

Client nodes ('Type 1 Thin Clients'):
- 2 x quad-core Intel E5507 2.26GHz/4MB/800
- Mellanox ConnectX 6 QDR InfiniBand 40Gb/s controller (MTS3600Q-1BNC)
- 12GB DDR3 1333MHz E/R memory
- 3.5" 250GB SATA II RAID enterprise hard drive (SATA II 3.0Gb/s, 16MB buffer, 7200 RPM)

InfiniBand connects all the nodes: an MTS3600Q managed QDR switch with 36 ports.

Test Methodology
The intention of this phase of testing is to demonstrate that Parallel Directory Operations behaves in accordance with the acceptance criteria within a realistic, production-like environment (defined above). To perform a test in this environment, the following tools are used.

mds_survey
mds_survey is a metadata performance benchmarking tool developed by Whamcloud; it can only be used with the Lustre software stack. The current version of mds_survey runs over MDD directly, bypassing the RPC and LDLM layers, so it shows the pure metadata performance of the local MDS stack. The user can specify 1-N threads that create/stat/remove files under 1-M target directories. Because the goal of this project is to improve shared-directory metadata performance, tests are carried out with a single target directory. Results gained from mds_survey are attached to LU-822.
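As a point of reference, below is a minimal sketch of a single-directory run of this kind. It assumes the mds-survey script shipped with lustre-iokit (referred to as mds_survey in this document) is executed on the MDS node; the variable names, thread counts, and MDT target name shown are illustrative assumptions rather than the exact invocation used for the milestone:

    # Run on the MDS node; values are illustrative only.
    thrlo=4                     # lowest thread count in the sweep
    thrhi=32                    # highest thread count in the sweep
    dir_count=1                 # a single shared target directory, per the project goal
    file_count=1000000          # one million files per test pass
    targets="lustre-MDT0000"    # assumed MDT device name on the Toro MDS
    export thrlo thrhi dir_count file_count targets
    sh mds-survey               # reports create/stat/remove rates per thread count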

mdtest
mdtest is used for testing the metadata performance of a filesystem. Here it measures parallel directory operation performance through the whole Lustre stack, and it also verifies that there is no performance drop for single-thread metadata operations. mpirun is used to launch mdtest on the client nodes so that they operate in parallel under the target directory. A short example of using mdtest:

    mpirun --machinefile mfile.16 -np 512 mdtest -d /mnt/lustre/testdir -n 2048 -F -C -r

mfile.16 is a regular text file that contains a list of client hostnames. -np 512 indicates that a total of 512 mdtest processes will be started across the client nodes, and -n 2048 means that each process will create/remove 2048 files under the target directory.

Oprofile
In the previous round of tests it was realized that oprofile would not help, because significant issues remain with the incomplete SMP locking work. Without this work, oprofile only exposes lock-contention overhead. For this reason, oprofile data is omitted from this test.

Test filesystem configuration
- 16 clients, each running 16 threads.
- Tests are run against a shared directory and against unique directories to show there is no regression.
- Operations: mknod, opencreate (0-stripe), opencreate (1-stripe), opencreate (2-stripe), opencreate (4-stripe), unlink.
- The duration to repeat each operation one million times is measured.
- Each test is repeated five times; the median and maximum are recorded.

Test results
The following assets will be collected from each test:
- mds_survey output
- mdtest output
- Oprofile output

Test duration
Each test iteration is expected to take two hours. Each test requires five iterations and there are five different tests to complete (five tests x five iterations x two hours is roughly 50 hours), so the expected duration of the test is on the order of three days of uninterrupted operation.

Appendix A: Benchmark characteristics
mknod: a single RPC is issued per file.
Open-create: open-create benchmarks also include a close, so two RPCs are required per file (a short arithmetic sketch follows Appendix B).

Appendix B: Path-finding
Demonstration results from December 2011.
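Tying the numbers together (illustrative arithmetic only, using the mdtest example parameters above and the per-operation RPC counts from Appendix A):

    echo $((512 * 2048))       # files handled in the example run: 1048576, roughly one million
    echo $((512 * 2048 * 1))   # RPCs for an mknod run: one RPC per file
    echo $((512 * 2048 * 2))   # RPCs for an open-create run: open-create plus close, two RPCs per file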