Intel Enterprise Edition Lustre (IEEL-2.3) [DNE-1 enabled] on Dell MD Storage
Evaluation of Lustre File System software enhancements for improved Metadata performance
Wojciech Turek, Paul Calleja, John Taylor (University of Cambridge HPCS)
Quy Ta, Onur Celebioglu (Dell HPC Engineering)
Contents
Abstract
1. Introduction to Lustre Metadata Performance and Scalability
2. Lustre Distributed Namespace [DNE-1]
3. MDT reply reconstruction improvement
4. Test System Reference Specification
5. Benchmarking Methodology and Tools
6. Benchmarks
7. Conclusion
8. References
Abstract

The Lustre filesystem is well known for its ability to handle the large sequential I/O patterns typically seen in HPC workloads, which is the main reason why Lustre is used in many top HPC centers in the world today. Lustre has steadily gained popularity and has become the first-choice parallel filesystem in many academic and industry HPC environments. In recent years the requirement for I/O performance has increased dramatically, with growing demand not only for traditional HPC I/O bandwidth but also for higher IOPS and metadata performance. Traditionally Lustre has not been known for high IOPS performance due to the serial nature of the Lustre metadata server architecture. This legacy position has now changed due to significant improvements to the Lustre software in terms of metadata performance and scalability. In this paper we investigate the implementation of scalable Lustre metadata servers, called Distributed Namespace phase 1 (DNE-1). The HPC team within the University of Cambridge undertook a detailed study and optimisation of Lustre metadata performance in partnership with the Dell HPC engineering team at Dell Austin. The base design of the testbed utilised a new Dell/Intel DNE-1 Lustre blueprint design, which will be published at the end of 2015/start of 2016. The work presented here has also been extended to look at how single-client multithreaded metadata performance can be improved by further Lustre code enhancements soon to be released in IEEL. The paper demonstrates that the removal of metadata server serialization, together with the removal of serialized multithreaded single-client metadata transactions, significantly improves both the performance and scalability of Lustre metadata activity, thereby removing the traditional metadata performance limitations of Lustre.
The paper looks ahead to the IEEL release in 2016, where these software features will be combined with DNE-2, and describes future work to investigate further hardware optimisations of MDS processor and memory configuration, combined with new Intel non-volatile memory technologies, that could provide yet another large boost to metadata performance.
1. Introduction to Lustre Metadata Performance and Scalability

Today's large-scale data processing systems can consist of thousands of compute nodes (Lustre clients) running tens of thousands of concurrent processes. Up to now, within a Lustre file system it has been possible to scale the number of Lustre Object Storage Servers and increase the I/O throughput of the cluster, but it has only been possible to have one Metadata Target (MDT) per filesystem. There have been many development efforts to improve Lustre metadata performance; however, as client numbers increase, the single MDS represents a fundamental bottleneck limiting the scalability of metadata transactions within the Lustre filesystem. The Distributed Namespace (DNE) development addresses this fundamental limit by distributing the filesystem metadata over multiple metadata servers and metadata targets. Phase 1 of the development project focuses on distributing the Lustre namespace by allowing directory entries to reference sub-directories on different metadata targets (MDTs). In the DNE-1 implementation this is referred to as Remote Directories and allows metadata workloads distributed across multiple directories to scale. DNE-2 will introduce shared directories, in which directory entries in a single directory are striped over multiple MDTs, allowing metadata performance within a single directory to scale. This is often referred to as Striped Directories. As of today, DNE-1 has been fully implemented and is available in production releases of Lustre. DNE-2 has been available as a preview feature since Lustre-2.7, and the full implementation will be available in the community release 2.8, which is planned for release by the end of 2015 (it is expected in Intel Enterprise Edition for Lustre 3.0 at the start of 2016).
2. Lustre Distributed Namespace [DNE-1]

Removing serialization of metadata servers

Remote Directories: Lustre sub-directories are distributed over multiple metadata targets (MDTs). Sub-directory distribution is defined by an administrator using a Lustre-specific mkdir command.

Figure 1 DNE-1

The Lustre architecture with DNE is shown in Figure 2, where the metadata can now be distributed across multiple MDT devices and multiple MDS servers. This new architecture enables the metadata component to scale in the same manner as the object storage component.

Figure 2 Lustre System Architecture
To enable DNE on a Lustre filesystem, one simply needs to format another MDT device with the same fsname as the existing MDT, using the next available MDT index, and then mount it on the MDS. If using IEEL-2.3+, the easiest way to enable DNE is from the Intel Manager for Lustre (IML) web interface, as shown in Figure 3. Creation of remote directories requires administrative privileges. An administrator can allocate a sub-directory to a given MDT using the command:

client# lfs mkdir -i <mdt_index> /mount_point/remote_dir

This command will allocate the sub-directory remote_dir onto the MDT of index <mdt_index>. Only an administrator can create remote sub-directories allocated to separate MDTs. Creating remote sub-directories in parent directories not hosted on MDT0000 is not recommended, because the failure of the parent MDT will leave the namespace below it inaccessible. For this reason, by default it is only possible to create remote sub-directories off MDT0000. To relax this restriction and enable remote sub-directories within any MDT, an administrator must issue the command:

lctl set_param mdd.*.enable_remote_dir=1

Figure 3 Intel Manager for Lustre DNE enablement
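As a concrete sketch of the steps above (the device paths, fsname and MGS network name are illustrative assumptions, not taken from the test system), adding a second MDT and creating a remote directory on it could look like this:

```shell
# Format a second MDT for the existing filesystem "lustre",
# using the next free MDT index (here 1).
mkfs.lustre --mdt --fsname=lustre --index=1 \
    --mgsnode=mgs@o2ib /dev/mapper/mdt1

# Mount the new MDT on the metadata server.
mkdir -p /mnt/lustre-mdt1
mount -t lustre /dev/mapper/mdt1 /mnt/lustre-mdt1

# Optionally allow remote sub-directories below any MDT (run on the MDS).
lctl set_param mdd.*.enable_remote_dir=1

# On a client, as root: create a sub-directory hosted on MDT index 1.
lfs mkdir -i 1 /mnt/lustre/remote_dir
```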
3. MDT reply reconstruction improvement

Removing serialization of client metadata transactions

Currently, the MDT cannot handle more than one filesystem-modifying RPC at a time because there is only one slot per client in the MDT last_rcvd file. Consequently, filesystem-modifying MDC requests are serialized, leading to poor metadata performance scaling on a single Lustre client. The Intel Lustre support ticket LU-5319 tracks the work on implementing support for multiple slots per client for reply reconstruction of filesystem-modifying MDT requests, to improve the metadata performance of a single client for multithreaded workloads. Until a fix for this issue is implemented within IEEL, a workaround is to create multiple mount points per client and to structure the workload so it is distributed across these mount points. In production workloads this approach would not be practical; however, it is useful to implement the workaround here on our Intel EE for Lustre testbed, to understand what level of improvement can be achieved once the serialization problem is resolved and the fix is available in the production version of Lustre (currently the fix is available for community Lustre 2.8 and it will be available in the Intel EE for Lustre 3.0 release early next year).
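A minimal sketch of the multiple-mount-point workaround (fsname, MGS network name and paths are assumptions): each benchmark thread is pointed at its own client mount, so each gets its own serialized MDC request stream.

```shell
# Mount the same filesystem at 16 mount points on one client node.
for i in $(seq 0 15); do
    mkdir -p /mnt/lustre$i
    mount -t lustre mgs@o2ib:/lustre /mnt/lustre$i
done
```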
4. Test System Reference Specification

This section describes the base test system configuration in detail. The main system (shown in Figure 4) consists of two MDS servers and two OSS servers configured as failover pairs. The system is running Intel Enterprise Edition for Lustre 2.3, which enables the use of DNE-1. The Lustre system is managed by the Intel Manager for Lustre (IML) server. This server is a single point for configuration and monitoring of Lustre filesystems. The metadata storage is provided by two Dell PowerVault MD3420 RAID enclosures. Each enclosure is fully populated with 22 SAS 15K HDDs and 2 x 146GB SSDs. The enclosures are connected to both MDS servers, allowing shared access to all disks from both servers. This enables failover functionality for the MDS servers and better load balancing of MDT devices across the available MDS servers. Similarly, the OST targets are provided by a Dell PowerVault MD3460 RAID enclosure. Both the MD3460 and MD3420 disk enclosures contain two active-active RAID controllers. This provides full redundancy for accessing the disks, but also improves performance and load-balances workloads. The RAID disk enclosures and servers are connected via direct 12Gbps SAS connections. All servers are interconnected by a high-speed, low-latency InfiniBand FDR network. We find that the Dell hardware in this configuration is a very good match for Lustre filesystem performance capabilities and provides a balanced storage system architecture.

Figure 4 Test system
IML - Intel Manager for Lustre
  Platform: R430
  CPU: 1 x Xeon E5, 2.4GHz
  RAM: 32GB
  Boot disk: 2 x SSD
  Disk: 2 x 1TB NL-SAS
  LOM Network: 2 x 1GbE

MDS - Metadata Server
  Platform: R630
  CPU: 2 x Xeon E5-2630 v3 8C 2.4GHz
  RAM: 128GB
  Disk: 2 x SSD
  Fast Network: FDR InfiniBand
  LOM Network: 2 x 1GbE

MDT - RAID Storage Enclosure
  Platform: MD3420
  Disks: 22 x SAS 15K HDDs + 2 x SSDs
  Controller: Dual RAID controllers
  Cache: 8GB of cache per controller

OSS - Object Storage Server
  Platform: R430
  CPU: 2 x Xeon E5, 2.4GHz
  RAM: 64GB
  Boot disk: 2 x SSD
  Fast Network: FDR InfiniBand
  MGT Network: 2 x 1GbE

OST - RAID Storage Enclosure
  Platform: MD3460
  Disks: 60 x 4TB NL-SAS HDDs + 4 x SSDs
  Controller: Dual RAID controllers

Table 1 Hardware Specification
5. Benchmarking Methodology and Tools

The metadata performance was measured using the MDTEST benchmark tool. The MDTEST benchmark measures the rate of the most common metadata operations, such as directory and file creation/deletion and stat operations. The benchmark uses MPI to coordinate threads across the test nodes, making it suitable for testing single-client performance as well as hundreds or thousands of clients. The MDTEST benchmark was run using a maximum of 64 Lustre client nodes and up to 1024 threads, and a minimum of one million files and directories were used per test. The following parameters were used:

mdtest -n <number of files/directories per test directory> -i 3 -y -N 1 -t -u -d <test_directories>

-n: every process will create/stat/remove # directories and files
-i: number of iterations the test will run
-y: sync file after writing
-N: stride # between neighbor tasks for file/dir stat (local=0)
-t: time unique working directory overhead
-u: unique working directory for each task
-d: the directory in which the tests will run
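An illustrative invocation of the parameters above (rank counts, hostfile and test paths are assumptions; mdtest accepts multiple test directories separated by '@', which pairs naturally with the multiple-mount-point workaround):

```shell
# 64 client nodes x 16 ranks = 1024 MPI ranks, 1024 files/dirs per
# rank, spread across per-mount test directories via '@'-separated paths.
mpirun -np 1024 --hostfile ./clients \
    mdtest -n 1024 -i 3 -y -N 1 -t -u \
    -d /mnt/lustre0/mdtest@/mnt/lustre1/mdtest
```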
Table 2 lists the test cases performed on the test system. The tests were selected to highlight the DNE-1 metadata performance improvements, but also to show the major reply-reconstruction metadata bottleneck affecting single-client performance. The tests provide a preview of how performance will improve once the bottleneck is removed and parallel MDS servers are deployed.

IEEL-2.3-SCD-1MDT: Results from a single Lustre client test in default mode: no tuning, no multi mounts, single MDT
IEEL-2.3-SCO-1MDT: Results from a single Lustre client test in optimised mode: small-files Lustre client tuning, multi mounts, single MDT
IEEL-2.3-MCO-1MDT: Results from a multi Lustre client test in optimised mode: small-files tuning, multi mounts, single MDT
IEEL-2.3-MCO-2MDT: Results from a multi Lustre client test in optimised mode: small-files tuning, multi mounts, 2 x MDT
IEEL-2.3-MCO-4MDT: Results from a multi Lustre client test in optimised mode: small-files tuning, multi mounts, 4 x MDT
DNE Comparison - Create ops: Charts for file create operations to help visualise deltas between distributed metadata configurations
DNE Comparison - Stats ops: Charts for file stat operations to help visualise deltas between distributed metadata configurations
DNE Comparison - Remove ops: Charts for file remove operations to help visualise deltas between distributed metadata configurations

Table 2 Test Cases
6. Benchmarks

Single MDT tests

Figure 5 Single MDT Configuration

In the single MDT series of tests, a single Dell PowerVault MD3420 was used as the Lustre metadata target. The RAID enclosure is populated with 22 SAS 15K disks and all disks are configured as a single RAID1 disk group. The RAID group is then mapped as a single virtual disk to the metadata servers. Only one RAID controller is active in this configuration; the second controller is in standby-failover mode.
Single Client Default - IEEL-2.3-SCD-1MDT

This test case looks at the performance of a typical Lustre client with one Lustre filesystem mount point and no Lustre client-side specific tuning. This test case only uses a single MDT and therefore does not use the DNE-1 feature, i.e. this is the baseline Lustre performance.

Figure 6 IEEL-2.3-SCD-1MDT - File Operations

Figure 7 IEEL-2.3-SCD-1MDT - Directory Operations
The metadata performance of a single client only scales for stat operations. It is very clear that the create and remove operations do not scale well with an increasing number of threads; the operation rates stop scaling at just 4 threads. This is caused by the serialization of the filesystem-modifying MDC requests (there is 1 outstanding MDC RPC call for each MDT, versus 8 OSC RPC calls for each OST). The fix for the serialization has been developed and implemented in the latest community Lustre-2.8. The details about the fix can be found in LU-5319. The next test case will show metadata performance after applying the workaround, allowing the Lustre metadata server to handle multiple MDC requests in parallel.

Single Client Optimised - IEEL-2.3-SCO-1MDT

Figure 8 and Figure 9 show results for the optimised single-client test case. The Lustre client side has been optimised for small-file performance and for highly transactional workloads; see Table 3 for the parameters. In order to mitigate the serialization of modifying MDC requests, Lustre has been mounted 16 times per node and each thread uses a different mount point. This way we avoid serialization of the modifying RPC requests on the metadata server. The results show a significant improvement for all types of operations. Figure 10 and Figure 11 show a comparison before and after optimisation. The most significant change is for file create and remove operations; performance scales almost linearly with an increasing number of threads. This is a major improvement for Lustre's metadata capability and enables Lustre to perform much better for multithreaded or farm workloads on a single compute node.

Linux Client Tuning: the parameters tuned from their defaults were MAX_RPCS_IN_FLIGHT, MAX_DIRTY_MB and MAX_PAGES_PER_RPC.

Table 3 Client Side Tuning
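As a sketch, these client parameters are set per OSC via lctl. The values below are examples only (assumptions, not the tuned settings used in the study):

```shell
# Example values only -- not the study's actual tuned values.
# Applied on each Lustre client node.
lctl set_param osc.*.max_rpcs_in_flight=64
lctl set_param osc.*.max_dirty_mb=1024
lctl set_param osc.*.max_pages_per_rpc=1024
```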
Figure 8 IEEL-2.3-SCO-1MDT - File Operations

Figure 9 IEEL-2.3-SCO-1MDT - Directory Operations
Comparison of IEEL-2.3-SCD-1MDT vs IEEL-2.3-SCO-1MDT

Figure 10 IEEL-2.3-SCD-1MDT vs IEEL-2.3-SCO-1MDT - File Ops

Figure 11 IEEL-2.3-SCD-1MDT vs IEEL-2.3-SCO-1MDT - Dir Ops
Multithreaded 64 Clients Optimised - IEEL-2.3-MCO-1MDT

The metadata performance continues to scale beyond a single client, but it is quickly saturated at just 32 threads for the create and remove operations. The stat operations scale much better and reach peak performance at 512 threads.

Figure 12 Multi Node Optimised - 1MDT - File Ops

Figure 13 Multi Node Optimised - 1MDT - Dir Ops
Two MDTs Tests

Figure 14 Two MDTs Configuration

In this configuration two Dell PowerVault MD3420 disk RAID enclosures are used. Both systems are populated with 22 x SAS 15K HDDs and 2 x SSDs, and all SAS disks in each disk enclosure are configured as a single RAID1 disk group. Only one RAID controller in each disk enclosure is active in this configuration; the second controller is in standby-failover mode.

Multithreaded 64 Clients Optimised - IEEL-2.3-MCO-2MDT

After adding the additional MDT, the test results show a significant improvement for file operations, but performance stops scaling at 128 threads for create and remove operations. The directory operations have also improved, though not as much as the file operations.

Figure 15 Multi Client Optimised - IEEL-2.3-MCO-2MDT - File Ops
Figure 16 Multi Client Optimised - IEEL-2.3-MCO-2MDT - Dir Ops

Four MDTs Tests

Figure 17 Four MDTs Configuration

In this configuration two Dell PowerVault MD3420 disk RAID enclosures are used. Both systems are fully populated with 22 x SAS 15K HDDs and 2 x SSDs. In contrast to the two-MDT configuration, the SAS disks in each disk enclosure are split into two RAID1 disk groups, one per RAID controller, so that both controllers are active.

Multithreaded 64 Clients Optimised - IEEL-2.3-MCO-4MDT

Adding more MDTs continues to improve metadata performance scaling. In this test case we use the same hardware as in the 2 MDT case, but this time the hardware is configured to present 4 MDTs, which has a very positive impact on recorded performance. It is therefore beneficial to split the disks into two smaller RAID disk groups, one for each controller. This is most likely because the disk enclosure has two controllers, and creating two RAID groups per enclosure allows both controllers to be used at the same time.
Figure 18 64 Nodes Multithreaded - IEEL-2.3-MCO-4MDT - File Ops

Figure 19 64 Nodes Multithreaded - IEEL-2.3-MCO-4MDT - Dir Ops
DNE Scaling

Figure 20 DNE Scaling Comparison - File Creates

Figure 21 MDT Scaling Comparison - File Stats
Figure 22 MDT Scaling Comparison - File Removes

When comparing the results of the DNE scaling test cases (Figure 20, Figure 21 and Figure 22), it is clear that performance scales with additional MDTs. The comparison also shows that single-MDT performance is quite limited and can be saturated with 32 threads. As MDTs are added to the system, the peak performance increases with the number of threads. The test results suggest that scaling will continue with additional metadata targets.

7. Conclusion

This paper clearly shows that the two metadata performance enhancements, (1) DNE-1 (parallelisation of metadata servers) and (2) the MDT reply reconstruction improvement (parallelisation of client metadata transactions at the MDS), have a very large positive effect on Lustre metadata performance. With these enhancements the metadata performance of Lustre has been transformed, both in terms of multithreaded single-node performance and in terms of multi-node metadata scalability. These enhancements resolve Lustre's legacy metadata performance issues, allowing industry-leading parallel filesystem performance in terms of both I/O throughput and metadata performance. The next release of Intel EE for Lustre, due in early 2016, will include the MDT reply reconstruction improvement and also an implementation of DNE-2, which allows the striping of a single directory across multiple MDTs. This will bring together all the features required to fully unlock the scalable metadata performance required by the data-intensive workloads that we now find in modern HPC data centers. This next set of Intel EE for Lustre metadata software improvements will also enable Lustre to take advantage of new Intel advancements in non-volatile memory technologies to boost metadata performance even further.
The next paper in this series will investigate these new Intel EE for Lustre software features as expressed on Dell storage hardware, in combination with hardware optimisations using non-volatile RAM technologies and MDS server processor and memory configurations, to provide a definitive review of modern Lustre metadata performance possibilities.
8. References

[1] Lustre Manual
[2] LU-5319
[3] DNE-1
[4] Lustre tuning paper
A ClusterStor update Torben Kling Petersen, PhD Principal Architect, HPC Sonexion (ClusterStor) STILL the fastest file system on the planet!!!! Total system throughput in excess on 1.1 TB/s!! 2 Software
More informationTPC-E testing of Microsoft SQL Server 2016 on Dell EMC PowerEdge R830 Server and Dell EMC SC9000 Storage
TPC-E testing of Microsoft SQL Server 2016 on Dell EMC PowerEdge R830 Server and Dell EMC SC9000 Storage Performance Study of Microsoft SQL Server 2016 Dell Engineering February 2017 Table of contents
More informationDell PowerEdge R720xd with PERC H710P: A Balanced Configuration for Microsoft Exchange 2010 Solutions
Dell PowerEdge R720xd with PERC H710P: A Balanced Configuration for Microsoft Exchange 2010 Solutions A comparative analysis with PowerEdge R510 and PERC H700 Global Solutions Engineering Dell Product
More informationDemonstration Milestone for Parallel Directory Operations
Demonstration Milestone for Parallel Directory Operations This milestone was submitted to the PAC for review on 2012-03-23. This document was signed off on 2012-04-06. Overview This document describes
More information1. ALMA Pipeline Cluster specification. 2. Compute processing node specification: $26K
1. ALMA Pipeline Cluster specification The following document describes the recommended hardware for the Chilean based cluster for the ALMA pipeline and local post processing to support early science and
More informationHPC Innovation Lab Update. Dell EMC HPC Community Meeting 3/28/2017
HPC Innovation Lab Update Dell EMC HPC Community Meeting 3/28/2017 Dell EMC HPC Innovation Lab charter Design, develop and integrate Heading HPC systems Lorem ipsum Flexible reference dolor sit amet, architectures
More informationDDN s Vision for the Future of Lustre LUG2015 Robert Triendl
DDN s Vision for the Future of Lustre LUG2015 Robert Triendl 3 Topics 1. The Changing Markets for Lustre 2. A Vision for Lustre that isn t Exascale 3. Building Lustre for the Future 4. Peak vs. Operational
More informationEfficient Object Storage Journaling in a Distributed Parallel File System
Efficient Object Storage Journaling in a Distributed Parallel File System Presented by Sarp Oral Sarp Oral, Feiyi Wang, David Dillow, Galen Shipman, Ross Miller, and Oleg Drokin FAST 10, Feb 25, 2010 A
More informationLustre Clustered Meta-Data (CMD) Huang Hua Andreas Dilger Lustre Group, Sun Microsystems
Lustre Clustered Meta-Data (CMD) Huang Hua H.Huang@Sun.Com Andreas Dilger adilger@sun.com Lustre Group, Sun Microsystems 1 Agenda What is CMD? How does it work? What are FIDs? CMD features CMD tricks Upcoming
More informationData Management. Parallel Filesystems. Dr David Henty HPC Training and Support
Data Management Dr David Henty HPC Training and Support d.henty@epcc.ed.ac.uk +44 131 650 5960 Overview Lecture will cover Why is IO difficult Why is parallel IO even worse Lustre GPFS Performance on ARCHER
More informationAltair OptiStruct 13.0 Performance Benchmark and Profiling. May 2015
Altair OptiStruct 13.0 Performance Benchmark and Profiling May 2015 Note The following research was performed under the HPC Advisory Council activities Participating vendors: Intel, Dell, Mellanox Compute
More informationLustre File System. Proseminar 2013 Ein-/Ausgabe - Stand der Wissenschaft Universität Hamburg. Paul Bienkowski Author. Michael Kuhn Supervisor
Proseminar 2013 Ein-/Ausgabe - Stand der Wissenschaft Universität Hamburg September 30, 2013 Paul Bienkowski Author 2bienkow@informatik.uni-hamburg.de Michael Kuhn Supervisor michael.kuhn@informatik.uni-hamburg.de
More informationIBM Emulex 16Gb Fibre Channel HBA Evaluation
IBM Emulex 16Gb Fibre Channel HBA Evaluation Evaluation report prepared under contract with Emulex Executive Summary The computing industry is experiencing an increasing demand for storage performance
More informationFhGFS - Performance at the maximum
FhGFS - Performance at the maximum http://www.fhgfs.com January 22, 2013 Contents 1. Introduction 2 2. Environment 2 3. Benchmark specifications and results 3 3.1. Multi-stream throughput................................
More informationCloudian Sizing and Architecture Guidelines
Cloudian Sizing and Architecture Guidelines The purpose of this document is to detail the key design parameters that should be considered when designing a Cloudian HyperStore architecture. The primary
More informationMicrosoft SQL Server in a VMware Environment on Dell PowerEdge R810 Servers and Dell EqualLogic Storage
Microsoft SQL Server in a VMware Environment on Dell PowerEdge R810 Servers and Dell EqualLogic Storage A Dell Technical White Paper Dell Database Engineering Solutions Anthony Fernandez April 2010 THIS
More informationIntegration Path for Intel Omni-Path Fabric attached Intel Enterprise Edition for Lustre (IEEL) LNET
Integration Path for Intel Omni-Path Fabric attached Intel Enterprise Edition for Lustre (IEEL) LNET Table of Contents Introduction 3 Architecture for LNET 4 Integration 5 Proof of Concept routing for
More informationWHITE PAPER AGILOFT SCALABILITY AND REDUNDANCY
WHITE PAPER AGILOFT SCALABILITY AND REDUNDANCY Table of Contents Introduction 3 Performance on Hosted Server 3 Figure 1: Real World Performance 3 Benchmarks 3 System configuration used for benchmarks 3
More informationSONAS Best Practices and options for CIFS Scalability
COMMON INTERNET FILE SYSTEM (CIFS) FILE SERVING...2 MAXIMUM NUMBER OF ACTIVE CONCURRENT CIFS CONNECTIONS...2 SONAS SYSTEM CONFIGURATION...4 SONAS Best Practices and options for CIFS Scalability A guide
More informationEvaluation Report: HP StoreFabric SN1000E 16Gb Fibre Channel HBA
Evaluation Report: HP StoreFabric SN1000E 16Gb Fibre Channel HBA Evaluation report prepared under contract with HP Executive Summary The computing industry is experiencing an increasing demand for storage
More informationScaling Internet TV Content Delivery ALEX GUTARIN DIRECTOR OF ENGINEERING, NETFLIX
Scaling Internet TV Content Delivery ALEX GUTARIN DIRECTOR OF ENGINEERING, NETFLIX Inventing Internet TV Available in more than 190 countries 104+ million subscribers Lots of Streaming == Lots of Traffic
More informationAndreas Dilger, Intel High Performance Data Division Lustre User Group 2017
Andreas Dilger, Intel High Performance Data Division Lustre User Group 2017 Statements regarding future functionality are estimates only and are subject to change without notice Performance and Feature
More informationMetadata Performance Evaluation LUG Sorin Faibish, EMC Branislav Radovanovic, NetApp and MD BWG April 8-10, 2014
Metadata Performance Evaluation Effort @ LUG 2014 Sorin Faibish, EMC Branislav Radovanovic, NetApp and MD BWG April 8-10, 2014 OpenBenchmark Metadata Performance Evaluation Effort (MPEE) Team Leader: Sorin
More informationConsolidating OLTP Workloads on Dell PowerEdge R th generation Servers
Consolidating OLTP Workloads on Dell PowerEdge R720 12 th generation Servers B Balamurugan Phani MV Dell Database Solutions Engineering March 2012 This document is for informational purposes only and may
More informationLustre overview and roadmap to Exascale computing
HPC Advisory Council China Workshop Jinan China, October 26th 2011 Lustre overview and roadmap to Exascale computing Liang Zhen Whamcloud, Inc liang@whamcloud.com Agenda Lustre technology overview Lustre
More informationNetwork Request Scheduler Scale Testing Results. Nikitas Angelinas
Network Request Scheduler Scale Testing Results Nikitas Angelinas nikitas_angelinas@xyratex.com Agenda NRS background Aim of test runs Tools used Test results Future tasks 2 NRS motivation Increased read
More informationA Generic Methodology of Analyzing Performance Bottlenecks of HPC Storage Systems. Zhiqi Tao, Sr. System Engineer Lugano, March
A Generic Methodology of Analyzing Performance Bottlenecks of HPC Storage Systems Zhiqi Tao, Sr. System Engineer Lugano, March 15 2013 1 Outline Introduction o Anatomy of a storage system o Performance
More informationRACKSPACE ONMETAL I/O V2 OUTPERFORMS AMAZON EC2 BY UP TO 2X IN BENCHMARK TESTING
RACKSPACE ONMETAL I/O V2 OUTPERFORMS AMAZON EC2 BY UP TO 2X IN BENCHMARK TESTING EXECUTIVE SUMMARY Today, businesses are increasingly turning to cloud services for rapid deployment of apps and services.
More informationLustre * Features In Development Fan Yong High Performance Data Division, Intel CLUG
Lustre * Features In Development Fan Yong High Performance Data Division, Intel CLUG 2017 @Beijing Outline LNet reliability DNE improvements Small file performance File Level Redundancy Miscellaneous improvements
More informationDell Fluid Data solutions. Powerful self-optimized enterprise storage. Dell Compellent Storage Center: Designed for business results
Dell Fluid Data solutions Powerful self-optimized enterprise storage Dell Compellent Storage Center: Designed for business results The Dell difference: Efficiency designed to drive down your total cost
More informationNAMD Performance Benchmark and Profiling. January 2015
NAMD Performance Benchmark and Profiling January 2015 2 Note The following research was performed under the HPC Advisory Council activities Participating vendors: Intel, Dell, Mellanox Compute resource
More informationData life cycle monitoring using RoBinHood at scale. Gabriele Paciucci Solution Architect Bruno Faccini Senior Support Engineer September LAD
Data life cycle monitoring using RoBinHood at scale Gabriele Paciucci Solution Architect Bruno Faccini Senior Support Engineer September 2015 - LAD Agenda Motivations Hardware and software setup The first
More informationRemote Directories High Level Design
Remote Directories High Level Design Introduction Distributed Namespace (DNE) allows the Lustre namespace to be divided across multiple metadata servers. This enables the size of the namespace and metadata
More informationAll-Flash High-Performance SAN/NAS Solutions for Virtualization & OLTP
All-Flash High-Performance SAN/NAS Solutions for Virtualization & OLTP All-flash configurations are designed to deliver maximum IOPS and throughput numbers for mission critical workloads and applicati
More informationPicking the right number of targets per server for BeeGFS. Jan Heichler March 2015 v1.3
Picking the right number of targets per server for BeeGFS Jan Heichler March 2015 v1.3 Picking the right number of targets per server for BeeGFS 2 Abstract In this paper we will show the performance of
More informationSoftware Defined Storage at the Speed of Flash. PRESENTATION TITLE GOES HERE Carlos Carrero Rajagopal Vaideeswaran Symantec
Software Defined Storage at the Speed of Flash PRESENTATION TITLE GOES HERE Carlos Carrero Rajagopal Vaideeswaran Symantec Agenda Introduction Software Technology Architecture Review Oracle Configuration
More informationRed Hat Gluster Storage performance. Manoj Pillai and Ben England Performance Engineering June 25, 2015
Red Hat Gluster Storage performance Manoj Pillai and Ben England Performance Engineering June 25, 2015 RDMA Erasure Coding NFS-Ganesha New or improved features (in last year) Snapshots SSD support Erasure
More informationAccelerate Applications Using EqualLogic Arrays with directcache
Accelerate Applications Using EqualLogic Arrays with directcache Abstract This paper demonstrates how combining Fusion iomemory products with directcache software in host servers significantly improves
More informationSTAR-CCM+ Performance Benchmark and Profiling. July 2014
STAR-CCM+ Performance Benchmark and Profiling July 2014 Note The following research was performed under the HPC Advisory Council activities Participating vendors: CD-adapco, Intel, Dell, Mellanox Compute
More informationReduce Costs & Increase Oracle Database OLTP Workload Service Levels:
Reduce Costs & Increase Oracle Database OLTP Workload Service Levels: PowerEdge 2950 Consolidation to PowerEdge 11th Generation A Dell Technical White Paper Dell Database Solutions Engineering Balamurugan
More informationUsing DDN IME for Harmonie
Irish Centre for High-End Computing Using DDN IME for Harmonie Gilles Civario, Marco Grossi, Alastair McKinstry, Ruairi Short, Nix McDonnell April 2016 DDN IME: Infinite Memory Engine IME: Major Features
More informationHigh Performance Supercomputing using Infiniband based Clustered Servers
High Performance Supercomputing using Infiniband based Clustered Servers M.J. Johnson A.L.C. Barczak C.H. Messom Institute of Information and Mathematical Sciences Massey University Auckland, New Zealand.
More informationWelcome! Virtual tutorial starts at 15:00 BST
Welcome! Virtual tutorial starts at 15:00 BST Parallel IO and the ARCHER Filesystem ARCHER Virtual Tutorial, Wed 8 th Oct 2014 David Henty Reusing this material This work is licensed
More informationMaximizing NFS Scalability
Maximizing NFS Scalability on Dell Servers and Storage in High-Performance Computing Environments Popular because of its maturity and ease of use, the Network File System (NFS) can be used in high-performance
More informationDDN. DDN Updates. DataDirect Neworks Japan, Inc Nobu Hashizume. DDN Storage 2018 DDN Storage 1
1 DDN DDN Updates DataDirect Neworks Japan, Inc Nobu Hashizume DDN Storage 2018 DDN Storage 1 2 DDN A Broad Range of Technologies to Best Address Your Needs Your Use Cases Research Big Data Enterprise
More informationAll-Flash High-Performance SAN/NAS Solutions for Virtualization & OLTP
All-Flash High-Performance SAN/NAS Solutions for Virtualization & OLTP All-flash configurations are designed to deliver maximum IOPS and throughput numbers for mission critical workloads and applicati
More informationLAMMPS-KOKKOS Performance Benchmark and Profiling. September 2015
LAMMPS-KOKKOS Performance Benchmark and Profiling September 2015 2 Note The following research was performed under the HPC Advisory Council activities Participating vendors: Intel, Dell, Mellanox, NVIDIA
More informationExtraordinary HPC file system solutions at KIT
Extraordinary HPC file system solutions at KIT Roland Laifer STEINBUCH CENTRE FOR COMPUTING - SCC KIT University of the State Roland of Baden-Württemberg Laifer Lustre and tools for ldiskfs investigation
More informationSSD Architecture Considerations for a Spectrum of Enterprise Applications. Alan Fitzgerald, VP and CTO SMART Modular Technologies
SSD Architecture Considerations for a Spectrum of Enterprise Applications Alan Fitzgerald, VP and CTO SMART Modular Technologies Introduction Today s SSD delivers form-fit-function compatible solid-state
More informationIME (Infinite Memory Engine) Extreme Application Acceleration & Highly Efficient I/O Provisioning
IME (Infinite Memory Engine) Extreme Application Acceleration & Highly Efficient I/O Provisioning September 22 nd 2015 Tommaso Cecchi 2 What is IME? This breakthrough, software defined storage application
More informationVeritas NetBackup on Cisco UCS S3260 Storage Server
Veritas NetBackup on Cisco UCS S3260 Storage Server This document provides an introduction to the process for deploying the Veritas NetBackup master server and media server on the Cisco UCS S3260 Storage
More informationLustreFS and its ongoing Evolution for High Performance Computing and Data Analysis Solutions
LustreFS and its ongoing Evolution for High Performance Computing and Data Analysis Solutions Roger Goff Senior Product Manager DataDirect Networks, Inc. What is Lustre? Parallel/shared file system for
More informationImproving Packet Processing Performance of a Memory- Bounded Application
Improving Packet Processing Performance of a Memory- Bounded Application Jörn Schumacher CERN / University of Paderborn, Germany jorn.schumacher@cern.ch On behalf of the ATLAS FELIX Developer Team LHCb
More informationFlashGrid Software Enables Converged and Hyper-Converged Appliances for Oracle* RAC
white paper FlashGrid Software Intel SSD DC P3700/P3600/P3500 Topic: Hyper-converged Database/Storage FlashGrid Software Enables Converged and Hyper-Converged Appliances for Oracle* RAC Abstract FlashGrid
More informationMicrosoft SQL Server 2012 Fast Track Reference Architecture Using PowerEdge R720 and Compellent SC8000
Microsoft SQL Server 2012 Fast Track Reference Architecture Using PowerEdge R720 and Compellent SC8000 This whitepaper describes the Dell Microsoft SQL Server Fast Track reference architecture configuration
More informationXyratex ClusterStor6000 & OneStor
Xyratex ClusterStor6000 & OneStor Proseminar Ein-/Ausgabe Stand der Wissenschaft von Tim Reimer Structure OneStor OneStorSP OneStorAP ''Green'' Advancements ClusterStor6000 About Scale-Out Storage Architecture
More informationFour-Socket Server Consolidation Using SQL Server 2008
Four-Socket Server Consolidation Using SQL Server 28 A Dell Technical White Paper Authors Raghunatha M Leena Basanthi K Executive Summary Businesses of all sizes often face challenges with legacy hardware
More informationConsolidating Microsoft SQL Server databases on PowerEdge R930 server
Consolidating Microsoft SQL Server databases on PowerEdge R930 server This white paper showcases PowerEdge R930 computing capabilities in consolidating SQL Server OLTP databases in a virtual environment.
More information