Lustre OSS Read Cache Feature SCALE Test Plan


Document Change History

   Author      Date        Description of Document Change
   Jack Chen   10/08/08    Add scale testing to Feature Test Plan
   Jack Chen   10/30/08    Create Scale Test Plan, draft
   Jack Chen   11/04/08    Add performance with OSS read cache enabled/disabled
   Jack Chen   11/06/08    Add test goals per Alex's suggestion
   Jack Chen   02/11/09    Q&A, add new test cases

   Client Approval By:
   Client Approval Date:

I. Test Plan Overview

This test plan describes the testing activities and responsibilities that are planned to be performed by Lustre QE. Per this test plan, Lustre QE will provide large-scale, system-level testing for the OSS read cache feature.

Executive Summary
Create a test plan to provide testing of the OSS read cache project on a large-scale cluster. Input from the developers is required, as is access to a customer's large cluster lab. The output will be that all tests pass.

Problem Statement
We need to test the OSS read cache feature on a large-scale cluster to make sure the feature is scalable.

Goal
Verify that OSS read cache functions on a large system, and compare performance data with OSS read cache enabled and disabled:
1) See improvement in the specific cases the cache targets.
2) Prove there is no noticeable regression in the other cases.

Success Factors
All tests need to run successfully. Performance with OSS read cache enabled should be better than with OSS read cache disabled.

Testing Plan: Pre-gate landing
Define the setup steps that need to happen for the hardware to be ready. Who is responsible for these tests? Get system time on a large system. Pre-feature testing has been completed and signed off by SUN QE for this feature.

Specify the date these tests will start and the length of time they will take to complete.
Date started: 2008-11-05
Time estimate for new test creation: 1 week

Time estimate for one run:
   OSS read cache feature tests (large-scale): 2~3 days
   Performance tests: 2~3 days
* If defects are found, the tests should be repeated. The estimated time to complete testing depends on:
-- the number of defects found during testing;
-- the time needed by a developer to fix the defects.

Specify (at a high level) what tests will be completed (new or existing, manual or automated):
   IOR, PIOS: existing tests, automated.

Requirements for large-scale testing:
   MDS number:              8
   OSS number:              32~128, depending on how much memory each OSS node has; normally, the more the better
   Client number:           more than 512
   Memory size on MDSs:     not specified
   Memory size on OSSs:     total OSS memory more than ½ of total client memory; the more, the better
   Memory size on Clients:  not specified

Specify how you will restore the hardware (if required) and notify the client that your testing is done. We will need feedback from the user; we recommend using Bugzilla for outputs. A Bugzilla ticket is filed for each failure, and summary and status reports are posted in the bug we create for this test.

Test Cases: Post-gate landing
All these tests are (or will be) integrated into acceptance-small as largescale.sh (LARGE_SCALE). To run this large-scale test:
1. Install lustre.rpm and lustre-tests.rpm on all cluster nodes.
2. Specify the cluster configuration file; see cfg/local.sh and cfg/ncli.sh for details (a sketch of such a file follows these steps).
3. Run the test without reformatting Lustre:
   SETUP=: CLEANUP=: FORMAT=: ACC_SM_ONLY=LARGE_SCALE NAME=<config_file> sh acceptance-small.sh
   or run it standalone:
   SETUP=: CLEANUP=: FORMAT=: NAME=<config_file> sh large-scale.sh
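For reference, here is a minimal sketch of a cluster configuration file. The variable names are assumptions modeled on the cfg/local.sh and cfg/ncli.sh files shipped with lustre-tests, and the hostnames are placeholders; consult the shipped cfg files for the authoritative set.

   # Sketch of a cluster configuration file (all values are placeholders).
   FSNAME=lustre
   mds_HOST=mds1                 # metadata server node
   ost_HOST=oss1                 # first OSS node
   OSTCOUNT=4                    # number of OSTs
   RCLIENTS="client1 client2"    # remote client nodes
   PDSH="pdsh -S -w"             # parallel shell used by the test framework
   MOUNT=/mnt/lustre             # client mount point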

Requirements:
1. Installed Lustre build packages on all cluster nodes.
2. Installed Lustre user tools (lctl).
3. Shared directory with the lustre-tests build on all clients.
4. Formatted Lustre file system, mounted by the clients.
5. A configuration file that corresponds to the formatted Lustre system.
6. Installed dd, tar, dbench, iozone, IOR, PIOS.
Report: output file.

Test Cases: Improvement testing
Use a kernel build to measure whether OSS read cache improves read performance.
Requirements:
1. Installed Lustre build packages on all cluster nodes.
2. Installed Lustre user tools (lctl).
3. Formatted Lustre file system, mounted by the clients.
4. Copy the Lustre source to the Lustre mount point.
5. Disable OSS read cache; see the performance test cases.
6. Build:
   1) download the Linux kernel source
   2) record the start time
   3) compile the Linux kernel
   4) record the end time
7. Enable OSS read cache and repeat step 6.
8. Compare the build times with the cache enabled and disabled.
Expected result: the build with OSS read cache enabled should take less time than the build with it disabled. A timing sketch follows.
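A minimal sketch of the timing loop, assuming the kernel source tree already sits in the Lustre mount; KSRC and the job count are placeholders, and the lctl commands must actually be run as root on every OSS node (shown inline here for brevity):

   #!/bin/sh
   # Time a kernel build with the OSS read cache disabled (0), then enabled (1).
   KSRC=/mnt/lustre/linux               # kernel source copied onto Lustre
   for state in 0 1; do
       # run these two commands on each OSS node, as root
       lctl set_param -n obdfilter.*.read_cache_enable $state
       lctl set_param -n obdfilter.*.writethrough_cache_enable $state
       cd $KSRC && make clean >/dev/null 2>&1
       start=$(date +%s)                # set start time
       make -j8 >/dev/null 2>&1         # compile the Linux kernel
       end=$(date +%s)                  # set end time
       echo "read_cache_enable=$state: build took $((end - start)) seconds"
   done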

Test Cases: Performance
Compare test data between cache enabled and disabled, and test for performance regressions.

Test case 1: IOR

1) Run IOR with OSS cache disabled.
Create a file system with all of the OSTs and set the stripe across all of them. Run IOR using ${MPIRUN} from one client and record the performance numbers. Install and configure Lustre first. Manual steps to run IOR:
1) Disable the OSS cache:
   lctl set_param -n obdfilter.*.cache 0
   lctl set_param -n obdfilter.*.read_cache_enable 0
   lctl set_param -n obdfilter.*.writethrough_cache_enable 0
2) Disable debug:
   /sbin/sysctl -w lnet.subsystem_debug=0
   /sbin/sysctl -w lnet.debug=0
3) Set the stripe count to -1:
   /usr/bin/lfs setstripe ${MOUNTPOINT} 1048576 -1 -1
   Run IOR as mpiuser, with and without -C, for file-per-process:
   ${MPIRUN} -machinefile ${MACHINE_FILE} -np ${NPROC} ${IOR} -t $bufsize -b $blocksize -f $input_file
   Note the options: -C -v -w -r -k (-T $((15*60))).
   MACHINE_FILE lists all of the Lustre clients used.
   NPROC is the number of MPI processes; the best number is the number of clients.
   BUFSIZE is 1M.
   BLOCKSIZE depends on how much memory all of the Lustre client nodes have:
   $Total_client_memory x 2 = $BLOCKSIZE x $NPROC
   (see the block-size sketch after this test case).
4) Set the stripe count to 1 and run IOR as mpiuser, with and without -C, for file-per-process, using the same command and notes as step 3.
5) Enable debug and rerun steps 3) and 4).
6) Collect vmstat 1 output and oprofile data for all of the runs; this is very important data.

2) Run IOR with OSS cache enabled.
The only step that differs is 1): enable the OSS read cache first:
   lctl set_param -n obdfilter.*.cache 1
   lctl set_param -n obdfilter.*.read_cache_enable 1
   lctl set_param -n obdfilter.*.writethrough_cache_enable 1
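To make the block-size arithmetic concrete, here is a small sketch; the variable names and the per-client memory figure are illustrative only:

   # Derive the IOR block size from: total_client_memory x 2 = BLOCKSIZE x NPROC.
   NPROC=512                       # MPI processes, one per client
   CLIENT_MEM_MB=16384             # memory per client node in MB (placeholder)
   TOTAL_MEM_MB=$((NPROC * CLIENT_MEM_MB))
   BLOCKSIZE_MB=$((TOTAL_MEM_MB * 2 / NPROC))   # here: 2 x per-client memory
   echo "IOR block size: -t 1m -b ${BLOCKSIZE_MB}m"

Sized this way, the aggregate data set (BLOCKSIZE x NPROC) is twice the aggregate client memory, so reads cannot be satisfied from the clients' own page caches and must hit the OSSes.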

Test case 2: PIOS

Run PIOS with OSS read cache disabled.
Create a file system with all of the OSTs and set the stripe across all of them. Run PIOS using pdsh from all of the clients and record the performance numbers. Install and configure Lustre first. Manual steps to run PIOS:
1) Disable the OSS cache:
   lctl set_param -n obdfilter.*.cache 0
   lctl set_param -n obdfilter.*.read_cache_enable 0
   lctl set_param -n obdfilter.*.writethrough_cache_enable 0
2) Set the stripe:
   /usr/bin/lfs setstripe ${MOUNTPOINT} 1048576 -1 -1
3) Run PIOS as mpiuser:
   $PIOS -t 1,8,16,40 -n 1024 -c 1M -s 8M -o 16M -p $LUSTRE
   $PIOS -t 1,8,16,40 -n 1024 -c 1M -s 8M -o 16M -p $LUSTRE verify
   $PIOS -t 1,8,16,40 -n 1024 -c 1M -s 8M -o 16M -L fpp -p $LUSTRE
   $PIOS -t 1,8,16,40 -n 1024 -c 1M -s 8M -o 16M -L fpp -p $LUSTRE verify
4) Collect vmstat 1 output and oprofile data for all of the runs; this is very important data.

Run PIOS with OSS read cache enabled.
The only step that differs is 1): enable the OSS read cache first:
   lctl set_param -n obdfilter.*.cache 1
   lctl set_param -n obdfilter.*.read_cache_enable 1
   lctl set_param -n obdfilter.*.writethrough_cache_enable 1
A wrapper sketch for the four PIOS invocations follows.
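The four PIOS command lines can be wrapped in a small script. This is a sketch only: $PIOS and $LUSTRE are assumed to be set as in the manual steps above, and the vmstat capture implements step 4):

   # Run the four PIOS invocations while logging vmstat for the runs.
   vmstat 1 > vmstat.pios.log 2>&1 &    # step 4): collect vmstat data
   VMSTAT_PID=$!
   for layout in "" "-L fpp"; do        # shared file, then file-per-process
       $PIOS -t 1,8,16,40 -n 1024 -c 1M -s 8M -o 16M $layout -p $LUSTRE
       $PIOS -t 1,8,16,40 -n 1024 -c 1M -s 8M -o 16M $layout -p $LUSTRE verify
   done
   kill $VMSTAT_PID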

Benchmarking
No benchmarks will be done.

II. Test Plan Approval
Review date for the Test Plan review with the client:
Date the Test Plan was approved by the client (and by whom):
Date(s) agreed to by the client to conduct testing:

III. Test Plan Final Report
Test report requirements:
1. Lustre configuration:
   1) Node information: processor model, processor count, memory size, disks, network type
   2) Disk read speed
   3) How many OSSs and clients were used
2. For Improvement testing and Post-gate testing, report the test output files.
3. For Performance testing, report the recorded test data as a spreadsheet, plus the output file and the vmstat 1 / oprofile files for each run.

Example:

IOR (transfer size: 1M)
   Number of processes                       2      4      8
   File-per-process, max write (MB/sec)      57.21  54.29  51.37
   File-per-process, max read (MB/sec)       72.66  67.51  63.23
   Single-shared-file, max write (MB/sec)    54.64  55.23  49.27
   Single-shared-file, max read (MB/sec)     66.47  83.22  68.48

PIOS measurement (MB/s), 1 cycle: threads T:1,4,8,16,32,40 N:1024 C:1M S:16M O:32M
   Scenario                        1    4    8    16   32   40
   First Write, Single File        36   38   32   36   37   37
   Read & Verify, Single File      24   43   36   41   42   42
   First Write, Multiple Files     11   74   72   72   69   63
   Read & Verify, Multiple Files   21   22   38   54   61   60

Benchmarking Conclusions
Summary of the test:

Next Steps

<COPYRIGHT 2008 SUN MICROSYSTEMS>