Performance Analysis and Evaluation of LANL's PaScalBB I/O Nodes Using Quad-Data-Rate InfiniBand and Multiple 10-Gigabit Ethernet Bonding
Hsing-bung Chen, Alfred Torrez, Parks Fields
HPC-5, Los Alamos National Lab, Los Alamos, New Mexico 87111, USA
{hbchen, atorrez, parks}@lanl.gov

Juan C. Franco, Daniel Illescas, Rocio Perez-Medina, Jharrod LaFon, Ben Haynes, John Herrera
INST-OFF, HPC Summer School, Los Alamos National Lab

Abstract - In LANL's PaScalBB network, I/O nodes carry data traffic between backend compute nodes and global scratch-based file systems. An I/O node is normally equipped with one InfiniBand NIC for backend traffic and one or more 10-Gigabit Ethernet NICs for parallel file system data traffic. With the growing deployment of multiple multi-core processors in server and storage systems, overall platform efficiency and CPU and memory utilization depend increasingly on interconnect bandwidth and latency. PCI-Express (PCIe) generation 2.0 has recently become available and has doubled the transfer rates available. This additional I/O bandwidth balances the system and makes higher data rates for external interconnects such as InfiniBand feasible. As a result, InfiniBand Quad-Data-Rate (QDR) mode has become available on the InfiniBand Host Channel Adapter (HCA) with a 40 Gb/sec signaling rate. Combining HCA QDR data rates with multiple 10-Gigabit Ethernet links in an I/O node has created the potential to solve some of the I/O traffic bottlenecks that currently exist. We set up a small-scale PaScalBB testbed and conducted a sequence of I/O node performance tests. The goal of this I/O node performance testing is to identify an enhanced network configuration that we can apply to LANL's Cielo machine and to future LANL HPC machines using the PaScalBB architecture.

Keywords - Server I/O networking, High Performance Networking, InfiniBand, 10 Gigabit Ethernet, Link aggregation, Load balancing

1 INTRODUCTION

Commercial off-the-shelf based cluster computing systems have delivered reasonable performance to technical and commercial areas for years. High-speed computing, global storage, and networking (IPC and I/O) are the three most critical elements for building a large-scale HPC cluster system. Unless these three elements are well balanced, we cannot fully utilize an HPC cluster. High-bandwidth I/O networking provides a data super-highway to meet the needs of constantly increasing computation power and storage capacity.

LANL's PaScalBB server I/O architecture is designed to support data-intensive scientific applications running on very large-scale clusters. The main goal of PaScalBB is to provide high-performance, efficient, reliable, parallel, and scalable I/O capabilities for such applications. Data-intensive scientific simulation-based analysis normally requires efficient transfer of a huge volume of complex data among simulation, visualization, and data manipulation functions. To date, PaScalBB has been implemented on most of the HPC production machines at LANL: Roadrunner (the first Petaflops machine), RedTail, LOBO, Turing, TLCC, and others.

I/O nodes are used in LANL's PaScalBB network to carry data traffic between backend compute nodes and global scratch-based file systems. An I/O node is normally equipped with one InfiniBand NIC for backend IPC traffic and one or more 10-Gigabit Ethernet NICs for parallel file system data traffic. With the growing deployment of multiple multi-core processors in server and storage systems, overall platform efficiency and CPU and memory utilization depend increasingly on interconnect bandwidth and latency.
PCI-Express (PCIe) generation 2.0 has recently become available and has doubled the transfer rates available. This additional I/O bandwidth balances the system and makes higher data rates for external interconnects such as InfiniBand feasible. As a result, InfiniBand Quad-Data-Rate (QDR) mode has become available on the InfiniBand Host Channel Adapter (HCA) with a 40 Gb/sec signaling rate. Combining HCA QDR rates with multiple 10-Gigabit Ethernet links has the potential to solve some of the I/O traffic bottlenecks that currently exist. We set up a small-scale PaScalBB testbed and conducted a sequence of I/O node performance tests. The goal of this testing is to identify an enhanced network configuration that we can apply to LANL's Cielo machine and to future LANL HPC machines using the PaScalBB architecture.

The rest of this paper is organized as follows. In Section 2 we describe LANL's PaScalBB server I/O infrastructure. Section 3 introduces InfiniBand/QDR and 10-Gigabit Ethernet technologies. We then illustrate our experimental setup and discuss testing results and performance data in Section 4. Finally, we present our conclusions and future work in Section 5.

2 PASCALBB SERVER I/O BACKBONE ARCHITECTURE

LANL's PaScalBB [10] adopts several hardware and software components to provide a unique and scalable server I/O networking architecture.
Figure 1 illustrates the system components used in PaScalBB.

2.1 Hardware Components used in PaScalBB

2.1.1 Level-1 High-Speed Interconnection Network

The Level-1 interconnect uses (a) high-speed interconnect systems such as Quadrics, Myrinet, or InfiniBand to fulfill the requirements of low-latency, high-speed, high-bandwidth cluster IPC communication and (b) aggregated I/O-aware multi-path routes for load balancing and failover.

2.1.2 Level-2 IP-based Interconnection Network

The Level-2 interconnect uses multiple Gigabit Ethernet switches/routers with layer-3 network routing support to provide latency-tolerant I/O communication and global IP-based storage systems. Without using a federated network solution, we can linearly expand the Level-2 IP-based network by employing a global host domain multicasting feature in the metadata servers of a global file system. With this support we can maintain a single-name-space global storage system and provide a linearly growing cost path for I/O networking.

2.1.3 Compute node

A compute node is equipped with at least one high-speed interface card connected to a high-speed interconnect fabric in Level-1. The node is set up with Linux multi-path equalized routing to multiple available I/O nodes for load balancing and failover (high availability). A compute node is used for computing only and is not involved in any routing activities.

2.1.4 I/O node

An I/O routing node has two kinds of network interfaces. One high-speed interface card is connected to the Level-1 network for communication with compute nodes, and one or more Gigabit Ethernet interface cards (bondable) are connected to the Level-2 linearly scaling Gigabit switches. I/O nodes serve as the routing gateways between the Level-1 and Level-2 networks. Every I/O node has the same networking capability.

2.2 System Software Components used in PaScalBB

2.2.1 Equal-Cost Multi-path routing for load balancing

Multi-path routing is used to provide balanced outbound traffic to the multiple I/O gateways. It also supports failover and dead-gateway detection for choosing good routes through active I/O gateways. Linux multi-path routing is a destination-address-based load-balancing algorithm. Multi-path routing should improve system performance through load balancing and reduce end-to-end delay; it overcomes the capacity constraint of single-path routing and routes through less congested paths. Each compute node is set up with N-way multi-path routes through N I/O nodes. Multi-path routing also balances the bandwidth gap between the Level-1 and Level-2 interconnects. We use the Equal-Cost Multi-path (ECMP) routing strategy on compute nodes so that compute nodes can evenly distribute traffic workloads over all I/O nodes. With this bi-directional multi-path routing, we can sustain parallel data paths for both write (outbound) and read (inbound) data transfers. This is especially useful when applied to concurrent socket I/O sessions on IP-based storage systems: PaScalBB can evenly allocate socket I/O sessions to the available I/O routing nodes.
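To make the N-way multi-path setup concrete, the following minimal sketch (not the production PaScalBB configuration) shows how a compute node's equal-cost default route across several I/O-node gateways could be installed with the iproute2 tools. The gateway addresses and the IPoIB interface name ib0 are illustrative placeholders.

```python
#!/usr/bin/env python3
"""Illustrative sketch: install an N-way equal-cost multipath (ECMP) default
route on a compute node so outbound I/O traffic is spread over N I/O-node
gateways. The gateway addresses and the IPoIB interface name 'ib0' are
placeholders, not the production PaScalBB values."""

import subprocess
import sys

def ecmp_route_cmd(gateways, dev="ib0"):
    """Build one iproute2 command that replaces the default route with a
    multipath route carrying one equal-weight nexthop per I/O gateway."""
    cmd = ["ip", "route", "replace", "default"]
    for gw in gateways:
        cmd += ["nexthop", "via", gw, "dev", dev, "weight", "1"]
    return cmd

if __name__ == "__main__":
    # Example: four I/O-node gateways reachable over the Level-1 (IPoIB) fabric.
    io_gateways = ["10.10.0.1", "10.10.0.2", "10.10.0.3", "10.10.0.4"]
    cmd = ecmp_route_cmd(io_gateways)
    print(" ".join(cmd))              # inspect the generated command
    if "--apply" in sys.argv:         # actually install the route (requires root)
        subprocess.run(cmd, check=True)
```

Because every nexthop carries the same weight, outbound flows are spread evenly over the I/O gateways, matching the destination-address-based load balancing described above.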
I/O nodes are used heavily in LANL's PaScalBB network to carry data traffic between backend compute nodes and global scratch-based file systems. An I/O node is normally equipped with one InfiniBand NIC for backend IPC traffic and one or more 10-Gigabit Ethernet NICs for parallel file system data traffic [6][7][8].

3 INFINIBAND AND 10 GIGABIT ETHERNET

InfiniBand [3] is a standard switched-fabric communication link used in high-performance computing and enterprise data centers. The InfiniBand Architecture (IBA) is designed to provide high-bandwidth, low-latency computing; the scalability to support thousands of nodes and multiple processor cores per server; and efficient utilization of compute processing resources. The TOP-500 list published in November 2010 shows that more than 42% of the listed computing systems use InfiniBand as their primary high-speed interconnect, and the growth rate of InfiniBand in the TOP-500 systems is about 30%. This indicates strong momentum in the adoption of InfiniBand technology in the HPC and enterprise communities.

Ethernet has long been the dominant LAN technology, and the availability of 10-Gigabit Ethernet has now enabled new applications in the data center and in IP-based storage systems. Because 10-Gigabit Ethernet is based on the core Ethernet technology, it takes advantage of the wealth of improvements developed over the years and simplifies the migration to this higher-speed technology.

With the growing deployment of multiple multi-core processors in server and storage systems, overall platform efficiency and CPU and memory utilization depend increasingly on interconnect bandwidth and latency. PCI-Express (PCIe) generation 2.0 has recently become available and has doubled the transfer rates available. This additional I/O bandwidth balances the system and makes higher data rates for external interconnects such as InfiniBand feasible. As a result, InfiniBand Quad-Data-Rate (QDR) mode has become available on the InfiniBand Host Channel Adapter (HCA) with a 40 Gb/sec signaling rate. Combining InfiniBand HCA QDR data rates with multiple 10-Gigabit Ethernet links in I/O nodes has created the potential to solve some of the I/O traffic bottlenecks that currently exist in HPC machines.
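The 40 Gb/sec figure quoted above is a signaling rate. With the 8b/10b line encoding used by SDR, DDR, and QDR links, the theoretical peak data rate of a 4X port is lower, as the short calculation below illustrates; these are textbook peak figures, not measurements from our testbed.

```python
#!/usr/bin/env python3
"""Illustrative arithmetic: theoretical peak data rates of 4X InfiniBand
SDR/DDR/QDR links, assuming the standard 8b/10b line encoding used by these
link speeds. These are textbook peak figures, not testbed measurements."""

LANES = 4                      # a 4X HCA port aggregates four lanes
ENCODING_EFFICIENCY = 8 / 10   # 8b/10b: 8 data bits per 10 signalled bits

signaling_gbps_per_lane = {"SDR": 2.5, "DDR": 5.0, "QDR": 10.0}

for mode, lane_rate in signaling_gbps_per_lane.items():
    signaling_gbps = lane_rate * LANES                  # QDR: 40 Gb/s signaling
    data_gbps = signaling_gbps * ENCODING_EFFICIENCY    # QDR: 32 Gb/s of data
    data_gbytes = data_gbps / 8                         # QDR: 4 GB/s per direction
    print(f"{mode}: {signaling_gbps:.0f} Gb/s signaling -> "
          f"{data_gbps:.0f} Gb/s data ({data_gbytes:.0f} GB/s per direction)")
```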
4 EXPERIMENTAL TESTING SETUP AND PERFORMANCE EVALUATION

We set up a small-scale PaScalBB test bed and conducted a sequence of I/O node performance tests.

4.1 Testing setup and configuration

Hardware equipment includes (a) twelve Linux server machines, each with dual quad-core Intel Nehalem 5600-series processors and 16 GB of DDR3 memory: seven compute nodes, each with one Mellanox ConnectX InfiniBand QDR HCA; one I/O node with a Mellanox ConnectX InfiniBand QDR HCA [10] and multiple Mellanox ConnectX 10-Gigabit Ethernet NICs; and four data nodes, each with one 10-Gigabit Ethernet connection; (b) one Mellanox 36-port InfiniBand QDR switch; and (c) one Arista 24-port 10-Gigabit Ethernet switch [11]. Software components include (a) Fedora 12 64-bit Linux, (b) the OFED (OpenFabrics Enterprise Distribution) [9] InfiniBand/10-Gigabit Ethernet system software, (c) the Linux Ethernet bonding driver, and (d) netperf [12], a network performance benchmark.

4.2 Performance testing and evaluation

4.2.1 InfiniBand SDR/DDR/QDR performance testing

Figure 2 shows the one-way communication bandwidth for IB/SDR (single data rate), IB/DDR (double data rate), and IB/QDR (quad data rate). It illustrates an improvement of 75% in bi-directional bandwidth when moving from DDR to QDR. Figure 3 shows the latency testing results for IB/SDR, IB/DDR, and IB/QDR; these results demonstrate the lower latency of QDR. Figure 4 shows MPI I/O testing using message sizes from 1 MB to 200 MB; the results show that IB/QDR provides consistent bandwidth across the range of message sizes used in MPI applications. Figure 5 shows the results of (a) QDR/UC (unreliable connection) one-way communication bandwidth, (b) QDR/RC (reliable connection) one-way communication bandwidth, and (c) QDR/SRQ (shared receive queue) bi-directional communication bandwidth. IB/QDR can reach a peak of more than 5600 MB/sec of bi-directional bandwidth with multiple concurrent netperf streams.
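The aggregate-bandwidth figures reported here and in Section 4.2.3 were obtained by running several netperf streams concurrently. The following sketch shows one way such a multi-stream driver could be scripted; the host names, stream count, and test duration are illustrative placeholders and do not reproduce our exact test harness.

```python
#!/usr/bin/env python3
"""Illustrative sketch: run several concurrent netperf TCP_STREAM tests from
one node to a set of data nodes (each running netserver) and report the
aggregate throughput. Host names, stream count, and duration are placeholders
and do not reproduce the exact harness used to produce the figures."""

import subprocess
from concurrent.futures import ThreadPoolExecutor

DATA_NODES = ["data1", "data2", "data3", "data4"]  # hypothetical netserver hosts
STREAMS_PER_NODE = 2
DURATION_S = 30

def run_stream(host):
    """Run one netperf TCP_STREAM test; with '-P 0' netperf prints a single
    result line whose last field is the throughput in 10^6 bits/sec."""
    out = subprocess.run(
        ["netperf", "-H", host, "-t", "TCP_STREAM",
         "-l", str(DURATION_S), "-P", "0"],
        capture_output=True, text=True, check=True).stdout
    return float(out.split()[-1])

if __name__ == "__main__":
    targets = DATA_NODES * STREAMS_PER_NODE
    with ThreadPoolExecutor(max_workers=len(targets)) as pool:
        mbps = list(pool.map(run_stream, targets))
    total = sum(mbps)
    print(f"aggregate throughput: {total:.0f} * 10^6 bits/sec (~{total / 8:.0f} MB/sec)")
```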
4.2.2 10-Gigabit Ethernet performance testing

Figure 6 shows the performance results for a back-to-back connection using a single 10-Gigabit Ethernet link between two server nodes; we reach 95% of the bandwidth of the physical 10-Gigabit link. Figure 7 shows the performance of triple 10-Gigabit Ethernet bonding over a back-to-back connection; we reach a peak of 2300 MB/sec from the three-link bond. Figure 8 shows the performance of quad 10-Gigabit Ethernet bonding over a back-to-back connection. It improves bandwidth by only 5%-10% compared with the three-link bond, which may be due to the Ethernet chipset processing capability or the Linux TCP/IP software stack.

4.2.3 I/O node performance testing and justification

Figure 9 shows the results of using four compute nodes to send concurrent multiple streams of netperf traffic through one I/O node to four different data nodes. The data include the bandwidth of the four individual links and the accumulated bandwidth, which reaches about 2950 MB/sec. Figure 10 shows the result of using seven compute nodes; here we can push the bandwidth to 4100 MB/sec. Figures 9 and 10 show that we gain more bandwidth as more compute nodes send network traffic, which also demonstrates the scaling capability of LANL's PaScalBB server I/O infrastructure.

In Figure 11, we verify the advantage of the Linux Ethernet bonding capability. We tried two Ethernet bonding algorithms implemented in the Linux kernel: mode-0 and mode-5. Mode-0, named balance-rr (round-robin policy), transmits data packets in sequential order from the first available slave through the last; this mode provides load balancing and fault tolerance. Mode-5, named balance-tlb (adaptive transmit load balancing), supports channel/port bonding that does not require any special switch support. Outgoing data traffic is distributed according to the current load on each slave link, while incoming data traffic is received by the current slave link; if the receiving slave fails, another slave takes over the MAC address of the failed slave. The purpose of this testing is to identify a traffic load-balancing algorithm that best accommodates the parallel file systems used in HPC machines. Our results show that mode-5 (adaptive transmit load balancing) obtains 10%-15% more bandwidth than mode-0 (a simple round-robin policy); a configuration sketch for the two modes is given at the end of this section.

From the above results, we conclude that there is a clear advantage to using multiple bonded 10-Gigabit Ethernet links in an I/O node when transferring data through an IB/QDR link. We also learned how to tune the 10-Gigabit Ethernet bonding algorithms to best fit an HPC parallel file system such as the Panasas PanFS ActiveScale parallel file storage system.
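For reference, the sketch below shows one way the I/O node's 10-Gigabit Ethernet ports could be bonded through the Linux bonding driver's sysfs interface and switched between mode-0 (balance-rr) and mode-5 (balance-tlb). The interface names are placeholders, and the exact configuration used on our testbed is not reproduced here.

```python
#!/usr/bin/env python3
"""Illustrative sketch: bond the I/O node's 10-Gigabit Ethernet ports through
the Linux bonding driver's sysfs interface and select the bonding mode
(balance-rr is mode-0, balance-tlb is mode-5). It assumes the bonding module
is already loaded (e.g. 'modprobe bonding') so that bond0 exists; interface
names are placeholders, and the script must run as root on a quiesced node."""

BOND = "bond0"
SLAVES = ["eth2", "eth3", "eth4"]   # hypothetical 10GigE ports on the I/O node
MODE = "balance-tlb"                # use "balance-rr" to test mode-0 instead

def sysfs_write(path, value):
    with open(path, "w") as f:
        f.write(value)

def configure_bond():
    # The mode can only be changed while the bond is down and has no slaves.
    sysfs_write(f"/sys/class/net/{BOND}/bonding/mode", MODE)
    # MII link monitoring interval in milliseconds, used for failover detection.
    sysfs_write(f"/sys/class/net/{BOND}/bonding/miimon", "100")
    # Enslave each 10GigE port (ports should be down before being enslaved).
    for slave in SLAVES:
        sysfs_write(f"/sys/class/net/{BOND}/bonding/slaves", f"+{slave}")

if __name__ == "__main__":
    configure_bond()
```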
5 CONCLUSIONS AND FUTURE WORK

We evaluate the bandwidth performance of IB/SDR, IB/DDR, and IB/QDR, and we evaluate various bonding algorithms for multiple 10-Gigabit Ethernet links. We verify the capability of an I/O node equipped with one IB/QDR HCA and multiple 10-Gigabit Ethernet links, study the Linux Ethernet bonding algorithms, and observe the scaling behavior of an I/O node as it handles more network traffic. From this work we identify a better network setup and configuration for LANL's PaScalBB network, and we have applied our testing results to LANL's production machines.

As part of future work, we intend to conduct evaluations on larger test beds, possibly using available production HPC machines, and to study the impact of new PaScalBB network setups and configurations. We also intend to carry out more in-depth studies applying different network benchmarking tests, MPI-IO tests, and parallel file system tests.

This work was carried out under the auspices of the National Nuclear Security Administration of the U.S. Department of Energy at Los Alamos National Laboratory.

REFERENCES

[1] Hari Subramoni, Matthew Koop, and Dhabaleswar K. Panda, Designing Next Generation Clusters: Evaluation of InfiniBand DDR/QDR on Intel Computing Platforms, HOTI, IEEE Annual Symposium on High-Performance Interconnects.
[2] Matthew J. Koop, Wei Huang, Karthik Gopalakrishnan, and Dhabaleswar K. Panda, Performance Analysis and Evaluation of PCIe 2.0 Quad-Data Rate InfiniBand, HOTI, IEEE Annual Symposium on High-Performance Interconnects.
[3] InfiniBand Roadmap, InfiniBand Trade Association.
[4] HPC Advisory Council Network of Expertise, Interconnect Analysis: 10GigE and InfiniBand in High Performance Computing, 2009.
[5] Munira Hussain, Gilad Shainer, Tong Liu, and Onur Celebioglu, Comparing DDR and QDR InfiniBand in 11th-Generation Dell PowerEdge Clusters, Dell Power Solutions, 2010 Issue 1.
[6] Gary Grider, Hsing-bung Chen, James Nunez, Steve Poole, Rosie Wacha, Parks Fields, Robert Martinez, Paul Martinez, and Satsangat Khalsa, PaScal - A New Parallel and Scalable Server I/O Networking Infrastructure for Supporting Global Storage/File Systems in Large-size Linux Clusters, Proceedings of the 25th IEEE International Performance, Computing, and Communications Conference (IPCCC 2006), April 2006.
[7] Hsing-bung Chen, Gary Grider, and Parks Fields, A Cost-Effective, High Bandwidth Server I/O Network Architecture for Cluster Systems, 2007 IEEE IPDPS Conference.
[8] Hsing-bung Chen, Parks Fields, and Alfred Torrez, An Intelligent Parallel and Scalable Server I/O Networking Environment for High Performance Cluster Computing Systems, PAPTA 2008 Conference.
[9] OFED, OpenFabrics Enterprise Distribution.
[10] Mellanox.
[11] Arista Networks.
[12] Netperf.

Figure 1: System diagram of LANL's PaScalBB server I/O architecture. Compute nodes perform outbound N-way load balancing via multi-path routing over the Level-1 interconnect; inbound traffic arrives as M-way multiple streams through the Level-2 interconnect and the global file system; I/O nodes/VLANs use OSPF to route inbound and outbound traffic between the Level-1 and Level-2 networks.
Figure 2: IB/SDR, IB/DDR, and IB/QDR performance testing.
Figure 3: IB/SDR, IB/DDR, and IB/QDR latency testing.
Figure 4: Multithreaded MPI testing using IB/QDR.
Figure 5: IB/QDR bi-directional bandwidth testing.
Figure 6: Back-to-back single 10-Gigabit Ethernet link testing.
Figure 7: Three 10-Gigabit Ethernet bonding performance testing.
Figure 8: Four 10-Gigabit Ethernet bonding performance testing.
Figure 9: Scaling testing using four compute nodes.
Figure 10: Scaling testing using seven compute nodes.
Figure 11: Linux bonding mode-0 vs. mode-5 testing.