LUSTRE NETWORKING
High-Performance Features and Flexible Support for a Wide Array of Networks
White Paper, November 2008

Abstract

This paper provides information about Lustre networking that can be used to plan cluster file system deployments for optimal performance and scalability. The paper includes information on Lustre message passing, Lustre Network Drivers, and routing in Lustre networks, and describes how these features can be used to improve cluster storage management. The final section of this paper describes new Lustre networking features that are currently under consideration or planned for future release.
Table of Contents

Challenges in Cluster Networking
Lustre Networking Architecture and Current Features
    LNET architecture
    Network types supported in Lustre networks
    Routers and multiple interfaces in Lustre networks
Applications of LNET
    Remote direct memory access (RDMA) and LNET
    Using LNET to implement a site-wide or global file system
    Using Lustre over wide area networks
    Using Lustre routers for load balancing
Anticipated Features in Future Releases
    New features for multiple interfaces
    Server-driven QoS
    A router control plane
    Asynchronous I/O
Conclusion
Chapter 1
Challenges in Cluster Networking

Networking in today's datacenters presents many challenges. For performance, file system clients must access servers using native protocols over a variety of networks, preferably leveraging capabilities such as remote direct memory access. In large installations, multiple networks may be encountered, and all storage must be simultaneously accessible over multiple networks through routers and by using multiple network interfaces on the servers. While storage management nightmares such as staging multiple copies of data on file systems local to a cluster are common practice, they are also highly undesirable.

Lustre networking (LNET) provides features that address many of these challenges. Chapter 2 provides an overview of some of the key features of the LNET architecture. Chapter 3 discusses how these features can be used in specific high-performance computing (HPC) networking applications. Chapter 4 looks at how LNET is expected to evolve to enhance load balancing, quality of service (QoS), and high availability in networks on a local and global scale. Chapter 5 provides a short synopsis and recap.
Chapter 2
Lustre Networking Architecture and Current Features

The LNET architecture comprises a number of key features that can be used to simplify and enhance HPC networking.

LNET architecture

The LNET architecture has evolved through extensive research into a set of protocols and application programming interfaces (APIs) to support high-performance, high-availability file systems. In a cluster with a Lustre file system, the system network is the network connecting the servers and the clients. LNET is only used over the system network, where it provides all communication infrastructure required by the Lustre file system.

The disk storage in a Lustre file system is connected to metadata servers (MDSs) and object storage servers (OSSs) using traditional storage area networking (SAN) technologies. However, this SAN does not extend to the Lustre client systems, and typically does not require SAN switches.

Key features of LNET include:

- Remote direct memory access (RDMA), when supported by underlying networks such as Elan, Myrinet, and InfiniBand
- Support for many commonly used network types such as InfiniBand and IP
- High-availability and recovery features that enable transparent recovery in conjunction with failover servers
- Simultaneous availability of multiple network types with routing between them
Figure 1 shows how these network features are implemented in a cluster deployed with LNET.

[Figure 1. Lustre architecture for clusters. Elements shown: a clustered MDS pool (MDS 1 active, MDS 2 standby) with metadata server disk storage containing metadata targets (MDTs); object storage servers OSS 1 through OSS 7 with storage containing object storage targets (OSTs), on commodity storage and on enterprise-class storage arrays with a SAN fabric; 1 to 100,000 Lustre clients; simultaneous support of multiple network types (Elan, Myrinet, InfiniBand, GigE); a router. Annotation: shared storage enables failover OSS.]

LNET is implemented using layered software modules. The file system uses a remote procedure API with interfaces for recovery and bulk transport. This API, in turn, uses the LNET Message Passing API, which has its roots in the Sandia Portals message passing API, a well-known API in the HPC community.

The LNET architecture supports pluggable drivers to provide support for multiple network types individually or simultaneously, similar in concept to the Sandia Portals network abstraction layer (NAL). The drivers, called Lustre Network Drivers (LNDs), are loaded into the driver stack, with one LND for each network type in use.

Routing is possible between different networks. This was implemented early in the Lustre product cycle to provide a key customer, Lawrence Livermore National Laboratory (LLNL), with a site-wide file system (this is discussed in more detail in Chapter 3, Applications of LNET).
Figure 2 shows how the software modules and APIs are layered.

[Figure 2. Modular LNET implemented with layered APIs. Layers shown: vendor network device libraries; Lustre Network Drivers (LNDs); LNET library; Network I/O (NIO) API; Lustre request processing. Annotations: support for multiple network types; similar to Sandia Portals, with some new and different features; moves small and large buffers, uses RDMA, generates events; zero-copy marshalling libraries, service framework and request dispatch, connection and address naming, generic recovery infrastructure. Legend: not supplied; not portable; API; portable Lustre component.]

A Lustre network is a set of configured interfaces on nodes that can send traffic directly from one interface on the network to another. In a Lustre network, configured interfaces are named using network identifiers (NIDs). The NID is a string that has the form <address>@<type><network id>. Examples of NIDs are @tcp0, designating an address on the 0th Lustre TCP network, and 4@elan8, designating address 4 on the 8th Lustre Elan network.

Network types supported in Lustre networks

The LNET architecture includes LNDs to support many network types, including:

- InfiniBand (IB): OpenFabrics IB versions 1.0, 1.2, and 1.3
- TCP: Any network carrying TCP traffic, including GigE, 10GigE, and IPoIB
- Quadrics: Elan3 and Elan4
- Myricom: GM and MX
- Cray: SeaStar and RapidArray

The LNDs that support these networks are pluggable modules for the LNET software stack.
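The NID form <address>@<type><network id> described above can be split mechanically. The following is a minimal Python sketch for illustration only; it is not LNET code. The names `Nid` and `parse_nid` and the address 192.0.2.10 are hypothetical, and the sketch assumes the network id is the trailing run of digits after the type name.

```python
import re
from typing import NamedTuple


class Nid(NamedTuple):
    """One parsed network identifier (hypothetical helper type)."""
    address: str
    net_type: str
    net_id: int


# <address>@<type><network id>; the type match is non-greedy so that the
# trailing digits are taken as the network id (assumption of this sketch).
_NID_RE = re.compile(r"^(?P<address>[^@]+)@(?P<type>[a-z][a-z0-9]*?)(?P<id>\d+)$")


def parse_nid(text: str) -> Nid:
    """Split a NID string into address, network type, and network id."""
    match = _NID_RE.match(text)
    if match is None:
        raise ValueError(f"not a NID: {text!r}")
    return Nid(match["address"], match["type"], int(match["id"]))


if __name__ == "__main__":
    print(parse_nid("4@elan8"))          # address 4 on the 8th Elan network
    print(parse_nid("192.0.2.10@tcp0"))  # hypothetical IP address on tcp0
```

Two interfaces whose NIDs share the same type and network id belong to the same Lustre network, which is the property the routing discussion relies on.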
Routers and multiple interfaces in Lustre networks

A Lustre network consists of one or more interfaces on nodes, each configured with a NID, that communicate with one another without passing through intermediate router nodes that have their own NIDs. LNET can conveniently define a Lustre network by enumerating the IP addresses of the interfaces forming the Lustre network. A Lustre network is not required to be physically separated from another Lustre network, although that is possible.

When more than one Lustre network is present, LNET can route traffic between networks using routing nodes in the network. An example of this is shown in Figure 3, where one of the routers is also an OSS. If multiple routers are present between a pair of networks, they offer both load balancing and high availability through redundancy.

[Figure 3. Lustre networks connected through routers. Elements shown: Elan clients, an Elan switch, an OSS, an MDS, a router, an Ethernet switch, and TCP clients, on the elan0 and tcp0 Lustre networks. Annotation: TCP clients access the MDS through the router.]

When multiple interfaces of the same type are available, load balancing traffic across all links becomes important. If the underlying network software for the network type supports interface bonding, resulting in one address, then LNET can rely on that mechanism. Such interface bonding is available for IP networks and Elan4, but not presently for InfiniBand.
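How several routers between a pair of networks can give both load balancing and redundancy can be sketched as a round-robin choice over the routers currently believed to be alive. This is a hedged illustration, not LNET's actual router selection logic; the class `RouterPool`, its methods, and the router names are all hypothetical.

```python
class RouterPool:
    """Round-robin selection over the live routers joining two networks.

    Illustrative only: the paper does not describe how LNET chooses
    among routers, so the policy here is an assumption.
    """

    def __init__(self, routers):
        self.routers = list(routers)
        self.alive = {router: True for router in self.routers}
        self._next = 0  # index at which the next search starts

    def mark(self, router, alive):
        """Record that a router has failed (False) or recovered (True)."""
        self.alive[router] = alive

    def pick(self):
        """Return the next live router, skipping any that are marked down."""
        count = len(self.routers)
        for offset in range(count):
            index = (self._next + offset) % count
            router = self.routers[index]
            if self.alive[router]:
                self._next = (index + 1) % count
                return router
        raise RuntimeError("no live router between the two networks")


if __name__ == "__main__":
    pool = RouterPool(["router-a", "router-b"])  # hypothetical router names
    print([pool.pick() for _ in range(4)])       # traffic alternates
    pool.mark("router-a", False)                 # simulate a router failure
    print([pool.pick() for _ in range(2)])       # traffic continues on router-b
```

With all routers up, successive messages are spread across them; when one is marked down, traffic simply continues on the remainder, which is the redundancy property the text describes.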
If the network does not provide channel bonding, Lustre networks can help. Each of the interfaces is placed on a separate Lustre network. The clients on each of these Lustre networks together can utilize all server interfaces. This configuration also provides static load balancing.

Additional features that may be developed in future releases to allow LNET to manage multiple network interfaces even better are discussed in Chapter 4, Anticipated Features in Future Releases.

Figure 4 shows how a Lustre server with several server interfaces can be configured to provide load balancing for clients placed on more than one Lustre network. At the top, two Lustre networks are configured as one physical network using a single switch. At the bottom, they are configured as two physical networks using two switches.

[Figure 4. A Lustre server with multiple network interfaces offering load balancing to the cluster. Top panel: clients on the vib0 and vib1 Lustre networks share one switch and reach a server with multiple interfaces over the vib0 and vib1 network rails. Bottom panel: the same arrangement with a separate switch for each of the vib0 and vib1 Lustre networks.]
Chapter 3
Applications of LNET

LNET provides versatility for deployments. A few opportunities are described in this chapter.

Remote direct memory access (RDMA) and LNET

With the exception of TCP, LNET provides support for RDMA on all network types. When RDMA is used, nodes can achieve almost full bandwidth with extremely low CPU utilization. This is particularly advantageous for nodes that are busy running other software, such as Lustre server software. The LND automatically uses this feature for large message sizes.

However, where nodes are provisioned with sufficient CPU power and high-performance motherboards, TCP networking may be an acceptable trade-off against RDMA. On 64-bit processors, LNET can saturate several GigE interfaces with relatively low CPU utilization, and with the Dual-Core Intel Xeon processor 5100 series, the bandwidth on a 10 GigE network can approach a gigabyte per second. LNET achieves very high bandwidth utilization on TCP networks. For example, end-to-end I/O over a single GigE link routinely exceeds 110 MB/sec with LNET.

The Internet Wide Area RDMA Protocol (iWARP), developed by the RDMA Consortium, is an extension to TCP/IP that supports RDMA over TCP/IP networks. Linux supports the iWARP protocol using the OpenFabrics Alliance (OFA) code and interfaces. The LNET OFA LND supports iWARP as well as IB.

Using LNET to implement a site-wide or global file system

Site-wide file systems and global file systems are implemented to provide transparent access from multiple clusters to one or more file systems. Site-wide file systems are typically associated with one site, while global file systems may span multiple locations and therefore utilize wide area networking.

Site-wide file systems are typically desirable in HPC centers where many clusters exist on different high-speed networks. Typically, it is not easy to extend such networks or to connect them to other networks. LNET makes this possible.
An increasingly popular approach is to build a storage island at the center of such an installation. The storage island contains storage arrays and servers and utilizes an InfiniBand or TCP network. Multiple clusters can connect to this island through Lustre routing nodes. The routing nodes are simple Lustre systems with at least two network interfaces: one to the internal cluster network and one to the network used in the storage island. Figure 5 shows an example of a global file system.

[Figure 5. A global file system implemented using Lustre networks. Elements shown: Cluster 1 clients on an Elan4 switch and Cluster 2 clients on an IP network switch, each connected through routers to a storage island that contains an InfiniBand storage network switch and a server farm of OSS and MDS nodes.]

The benefits of site-wide and global file systems are not to be underestimated. Traditional data management for multiple clusters frequently involves staging data from the file system on one cluster to another. By deploying a site-wide Lustre file system, multiple copies of the data are no longer needed, and substantial savings can be achieved through improved storage management and reduced capacity requirements.

Using Lustre over wide area networks

The Lustre file system has been successfully deployed over wide area networks (WANs). Typically, even over a WAN, 80 percent of raw bandwidth can be achieved, which is significantly more than that achieved by many other file systems over local area networks (LANs). For example, within the United States, Lustre file system deployments have achieved a bandwidth of 970 MB/sec over a WAN using a single 10 GigE interface (from a single client). Between Europe and the United States, 97 MB/sec has been achieved with a single GigE connection. On LANs, observed I/O bandwidths are only slightly higher: 1100 MB/sec on a 10 GigE network and 118 MB/sec on a GigE network.
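The bandwidth figures quoted above can be related to raw link capacity with a short calculation. The sketch below assumes, as a simplification not stated in the paper, that a GigE link carries 1 Gbit/sec (125 MB/sec, decimal megabytes) and a 10 GigE link ten times that, ignoring protocol overhead; the function name `link_efficiency` is hypothetical.

```python
def link_efficiency(observed_mb_per_sec, line_rate_gbit_per_sec):
    """Fraction of the raw line rate that the observed bandwidth represents.

    Assumes decimal units: 1 Gbit/sec = 1000 Mbit/sec = 125 MB/sec.
    """
    raw_mb_per_sec = line_rate_gbit_per_sec * 1000 / 8
    return observed_mb_per_sec / raw_mb_per_sec


if __name__ == "__main__":
    # (description, observed MB/sec from the text, assumed line rate in Gbit/sec)
    measurements = [
        ("WAN, 10 GigE", 970, 10),
        ("WAN, GigE", 97, 1),
        ("LAN, 10 GigE", 1100, 10),
        ("LAN, GigE", 118, 1),
    ]
    for label, observed, rate in measurements:
        print(f"{label}: {link_efficiency(observed, rate):.1%} of raw line rate")
```

Under these assumptions both WAN figures come to a little under 78 percent of the raw line rate, in line with the roughly 80 percent quoted, while the LAN figures come to 88 percent and about 94 percent.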
Routers can also be used advantageously to connect servers distributed over a WAN. For example, a single Lustre cluster may consist of two widely separated groups of Lustre servers and clients, with each group interconnected by an InfiniBand network. As shown in Figure 6, Lustre routing nodes can be used to connect the two groups of Lustre servers and clients via an IP-based WAN. Alternatively, the servers could each have both an InfiniBand and an Ethernet interface. However, this configuration may require more ports on switches, so the routing solution may be more cost effective.

[Figure 6. A Lustre cluster distributed over a WAN. Location A and Location B each contain a Lustre cluster group of clients and servers on an InfiniBand network; a router at each location connects the group to the IP WAN.]

Using Lustre routers for load balancing

Commodity servers can be used as Lustre routers to provide a cost-effective, load-balanced, redundant router configuration. For example, consider an installation with servers on a network with 10 GigE interfaces and many clients attached to a GigE network. It is possible, but typically costly, to purchase IP switching equipment that can connect to both the servers and the clients.
With a Lustre network, the purchase of such costly switches can be avoided. For a more cost-effective solution, two separate networks can be created. A smaller, faster network contains the servers and a set of router nodes with sufficient aggregate throughput. A second client network with slower interfaces contains all the client nodes and is also attached to the router nodes. If this second network already exists and has sufficient free ports to add the Lustre router nodes, no changes to this client network are required. Figure 7 shows an installation with this configuration.

[Figure 7. An installation combining slow and fast networks using Lustre routers. GigE clients on a GigE switch connect through a load-balancing, redundant router farm to 10GigE servers on a 10GigE switch.]

The routers provide a redundant, load-balanced path between the clients and the servers. This network configuration allows many clients together to use the full bandwidth of a server, even if individual clients have insufficient network bandwidth to do so. Because multiple routers stream data to the server network simultaneously, the server network can see data throughput in excess of what a single router can deliver.
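The sizing argument above (enough routers for "sufficient aggregate throughput", and many slow clients together filling one fast server link) reduces to a ceiling division. The sketch below is a rough planning aid under stated assumptions, not a sizing rule from the paper: the helper `units_needed` is hypothetical, the per-router throughput of 400 MB/sec is an invented figure for illustration, and the 118 MB/sec and 1100 MB/sec values are the LAN bandwidths quoted earlier for GigE and 10 GigE.

```python
import math


def units_needed(target_mb_per_sec, per_unit_mb_per_sec):
    """Smallest number of equal units whose combined rate reaches the target."""
    if per_unit_mb_per_sec <= 0:
        raise ValueError("per-unit throughput must be positive")
    return math.ceil(target_mb_per_sec / per_unit_mb_per_sec)


if __name__ == "__main__":
    server_link = 1100      # MB/sec, observed on a 10 GigE LAN (from the text)
    client_link = 118       # MB/sec, observed on a GigE LAN (from the text)
    router_throughput = 400 # MB/sec per router: hypothetical, for illustration

    clients = units_needed(server_link, client_link)
    routers = units_needed(server_link, router_throughput)
    print(f"GigE clients needed to fill one server link: {clients}")
    print(f"Routers needed per server link: {routers} (+1 spare for redundancy)")
```

Adding one router beyond the computed minimum is one simple way to keep full throughput when a single router fails; the right margin depends on the installation.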
Chapter 4
Anticipated Features in Future Releases

LNET offers many features today, and as with most products, enhancements and new features are intended for future releases. Some possible new features include support of multiple network interfaces, implementation of server-driven quality-of-service (QoS) guarantees, asynchronous I/O, and a control interface for routers.

New features for multiple interfaces

As previously mentioned, LNET can currently exploit multiple interfaces by placing them on different Lustre networks. This configuration provides reasonable load balancing for a server with many clients. However, it is a static configuration that does not handle link-level failover or dynamic load balancing.

It is Sun's intention to address these shortcomings with the following design. First, LNET will virtualize multiple interfaces and offer the aggregate as one NID to the users of the LNET API. In concept, this is quite similar to the aggregation (also referred to as bonding or trunking) of Ethernet interfaces using protocols such as 802.3ad Dynamic Link Aggregation.

The key features that a future LNET release may offer are:

- Load balancing: All links are used based on availability of throughput capacity.
- Link-level high availability: If one link fails, the other channels transparently continue to be used for communication.

These features are shown in Figure 8.

[Figure 8. Link-level load balancing and failover. Two client-switch-server diagrams: in one, all traffic uses a single link; in the other, traffic is evenly loaded across links. Annotation: link failure accommodated without server failover.]
From a design perspective, these load-balancing and high-availability features are similar to the features offered with LNET routing described in Chapter 3 in the section Using Lustre routers for load balancing.

A challenge in developing these features is providing a simple way to configure the network. Assigning and publishing NIDs for the bonded interfaces should be simple and flexible and should work even if not all links are available at startup. We expect to use the management server protocol to resolve this issue.

Server-driven QoS

QoS is often a critical issue, for example, when multiple clusters are competing for bandwidth from the same storage servers. A primary QoS goal is to avoid overwhelming server systems with conflicting demands from multiple clusters or systems, resulting in performance degradation for all clusters. Setting and enforcing policies is one way to avoid this.

For example, a policy can be established that guarantees that a certain minimal bandwidth is allocated to resources that must respond in real time, such as for visualization. Or a policy can be defined that gives systems or clusters doing mission-critical work priority for bandwidth over less important clusters or systems. The Lustre QoS system's role is not to determine an appropriate set of policies but to provide capabilities that allow policies to be defined and enforced.

Two components proposed for the Lustre QoS scheduler are a global Epoch Handler (EH) and a Local Request Scheduler (LRS). The EH provides a shared time slice among all servers. This time slice can be relatively large (one second, for example) to avoid overhead due to excessive server-to-server networking and latency. The LRS is responsible for receiving and queuing requests according to a local policy. The EH and LRS together allow all servers in a cluster to execute the same policy during the same time slice. Note that the policy may subdivide the time slices and use the subdivision advantageously. The LRS also provides summary data to the EH to support global knowledge and adaptation.
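The idea of a shared epoch subdivided by a local policy can be sketched in a few lines. This is a toy model under loudly stated assumptions, not the proposed Lustre QoS design: the class `LocalRequestScheduler`, its methods, and the ordering of classes within an epoch are hypothetical, and `now` is assumed to be a clock that the Epoch Handler keeps consistent across servers. The 70/30 split is the rendering and visualization example used in this chapter.

```python
from collections import deque


class LocalRequestScheduler:
    """Toy LRS: queue requests per class and serve the class that owns
    the current part of the shared epoch (illustrative assumption)."""

    def __init__(self, epoch_seconds, shares):
        if abs(sum(shares.values()) - 1.0) > 1e-9:
            raise ValueError("policy shares must sum to 1")
        self.epoch_seconds = epoch_seconds
        self.windows = []  # (start offset, end offset, class name)
        start = 0.0
        for name, share in shares.items():
            end = start + share * epoch_seconds
            self.windows.append((start, end, name))
            start = end
        self.queues = {name: deque() for name in shares}

    def submit(self, name, request):
        """Queue a request under its traffic class."""
        self.queues[name].append(request)

    def active_class(self, now):
        """Class that owns the sub-slice containing time `now`."""
        offset = now % self.epoch_seconds
        for start, end, name in self.windows:
            if start <= offset < end:
                return name
        return self.windows[-1][2]  # guard against rounding at the epoch end

    def next_request(self, now):
        """Next queued request for the active class, or None if it is idle."""
        queue = self.queues[self.active_class(now)]
        return queue.popleft() if queue else None


if __name__ == "__main__":
    lrs = LocalRequestScheduler(1.0, {"rendering": 0.7, "visualization": 0.3})
    lrs.submit("rendering", "write frame 1")
    lrs.submit("visualization", "read frame 0")
    print(lrs.active_class(0.1), lrs.next_request(0.1))
    print(lrs.active_class(0.9), lrs.next_request(0.9))
```

Because every server derives the active class from the same epoch boundaries, all servers apply the same policy during the same time slice, which is the property the EH and LRS are meant to provide together.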
Figure 9 shows how these features can be used to schedule rendering and visualization of streaming data. In this implementation, LRS policy allocates 30 percent of each Epoch time slice to visualization and 70 percent to rendering.

[Figure 9. Using server-driven QoS to schedule video rendering and visualization. A rendering cluster and a visualization cluster share an OSS; Epoch messaging divides each successive epoch into 70 percent rendering and 30 percent visualization.]

A router control plane

Lustre technology is expected to be used in vast worldwide file systems that traverse multiple Lustre networks with many routers. To achieve wide-area QoS guarantees that cannot be achieved with static configurations, the configurations of these networks must change dynamically. A control interface is required between the routers and external administrative systems to handle these situations. Requirements are currently being developed for a Lustre Router Control Plane to help address these issues.

For example, features are being considered for the Lustre Router Control Plane that could be used when data packets are being routed by routers from A to B and also from C to D and, for operational reasons, a preference needs to be given to routing the packets from C to D. The control plane would apply a policy to the routers so that packets would be sent from C to D before packets are sent from A to B.

The Lustre Router Control Plane may also include the capability to provide input to a server-driven QoS subsystem, linking router policies with server policies. It might be particularly interesting to have an interface between the server-driven QoS subsystem and the router control plane to allow coordinated adjustment of QoS in a cluster and a wide area network.
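The A-to-B versus C-to-D example can be illustrated with a priority queue whose priorities are set from outside the router. This is only a sketch of the idea: the paper says requirements for the control plane were still being developed, so the class `PolicyRouterQueue`, its methods, and the strict-priority behavior are all assumptions made here for illustration.

```python
import heapq
from itertools import count


class PolicyRouterQueue:
    """Toy router queue whose per-flow priority is set by a control plane.

    Lower numbers are sent first; packets of equal priority keep their
    arrival order. Strict priority is an assumption of this sketch.
    """

    def __init__(self, default_priority=10):
        self.default_priority = default_priority
        self.flow_priority = {}  # (source, destination) -> priority
        self._heap = []
        self._arrival = count()  # tie-breaker that preserves arrival order

    def set_priority(self, source, destination, priority):
        """Policy pushed by the (hypothetical) control plane."""
        self.flow_priority[(source, destination)] = priority

    def enqueue(self, source, destination, payload):
        """Queue a packet under the priority currently set for its flow."""
        priority = self.flow_priority.get(
            (source, destination), self.default_priority
        )
        heapq.heappush(
            self._heap,
            (priority, next(self._arrival), source, destination, payload),
        )

    def dequeue(self):
        """Remove and return the next packet as (source, destination, payload)."""
        _, _, source, destination, payload = heapq.heappop(self._heap)
        return source, destination, payload

    def __len__(self):
        return len(self._heap)


if __name__ == "__main__":
    queue = PolicyRouterQueue()
    queue.set_priority("C", "D", 0)  # operational preference for C to D
    queue.enqueue("A", "B", "p1")
    queue.enqueue("C", "D", "p2")
    queue.enqueue("A", "B", "p3")
    queue.enqueue("C", "D", "p4")
    while queue:
        print(queue.dequeue())
```

A strict-priority queue like this can starve the lower-priority flow under sustained load; a real policy would more likely assign weights or bandwidth shares, which is where coordination with server-driven QoS would come in.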
Asynchronous I/O

In large compute clusters, the potential exists for significant I/O optimization. When a client writes large amounts of data, a truly asynchronous I/O mechanism would allow the client to register the memory pages that need to be written for RDMA and allow the server to transfer the data to storage without causing interrupts on the client. This makes the client CPU fully available to the application again, which is a significant benefit in some situations.

[Figure 10. Network-level DMA with handshake interrupts and without handshake interrupts. Two panels, "Sending message with DMA handshake" and "Sending message without DMA handshake", each showing a source node and a sink node (LNET over LND) connected by the network. Steps labeled in the panels include: put message description, register source buffer, register sink buffer, get DMA address, send description and source RDMA address, RDMA data, and event.]

LNET supports RDMA; however, currently a handshake at the operating system level is required to initiate the RDMA, as shown in Figure 10 (on the left). The handshake exchanges the network-level DMA addresses to be used. The proposed change to LNET would eliminate the handshake and include the network-level DMA addresses in the initial request to transfer data, as shown in Figure 10 (on the right).
Chapter 5
Conclusion

LNET provides an exceptionally flexible and innovative infrastructure. Among the many features and benefits that have been discussed, the most significant are:

- Native support for all commonly used HPC networks
- Extremely fast data rates through RDMA and unparalleled TCP throughput
- Support for site-wide file systems through routing, eliminating the staging and copying of data between clusters
- Load-balancing router support to eliminate low-speed network bottlenecks

Lustre networking will continue to evolve with planned features to handle link aggregation, server-driven QoS, a rich control interface to large routed networks, and asynchronous I/O without interrupts.
Sun Microsystems, Inc., 4150 Network Circle, Santa Clara, CA USA. Web: sun.com

2008 All rights reserved. Sun, Sun Microsystems, the Sun logo, Lustre, and Solaris are trademarks or registered trademarks of Sun Microsystems, Inc. in the United States and other countries. Intel Xeon is a trademark or registered trademark of Intel Corporation or its subsidiaries in the United States and other countries. Information subject to change without notice.
More informationFuture Routing Schemes in Petascale clusters
Future Routing Schemes in Petascale clusters Gilad Shainer, Mellanox, USA Ola Torudbakken, Sun Microsystems, Norway Richard Graham, Oak Ridge National Laboratory, USA Birds of a Feather Presentation Abstract
More informationIBM Europe Announcement ZP , dated November 6, 2007
IBM Europe Announcement ZP07-0484, dated November 6, 2007 IBM WebSphere Front Office for Financial Markets V2.0 and IBM WebSphere MQ Low Latency Messaging V2.0 deliver high speed and high throughput market
More informationWHITE PAPER: BEST PRACTICES. Sizing and Scalability Recommendations for Symantec Endpoint Protection. Symantec Enterprise Security Solutions Group
WHITE PAPER: BEST PRACTICES Sizing and Scalability Recommendations for Symantec Rev 2.2 Symantec Enterprise Security Solutions Group White Paper: Symantec Best Practices Contents Introduction... 4 The
More informationEvaluating the Impact of RDMA on Storage I/O over InfiniBand
Evaluating the Impact of RDMA on Storage I/O over InfiniBand J Liu, DK Panda and M Banikazemi Computer and Information Science IBM T J Watson Research Center The Ohio State University Presentation Outline
More informationOptimize and Accelerate Your Mission- Critical Applications across the WAN
BIG IP WAN Optimization Module DATASHEET What s Inside: 1 Key Benefits 2 BIG-IP WAN Optimization Infrastructure 3 Data Optimization Across the WAN 4 TCP Optimization 4 Application Protocol Optimization
More informationZEST Snapshot Service. A Highly Parallel Production File System by the PSC Advanced Systems Group Pittsburgh Supercomputing Center 1
ZEST Snapshot Service A Highly Parallel Production File System by the PSC Advanced Systems Group Pittsburgh Supercomputing Center 1 Design Motivation To optimize science utilization of the machine Maximize
More informationThe NE010 iwarp Adapter
The NE010 iwarp Adapter Gary Montry Senior Scientist +1-512-493-3241 GMontry@NetEffect.com Today s Data Center Users Applications networking adapter LAN Ethernet NAS block storage clustering adapter adapter
More informationby Brian Hausauer, Chief Architect, NetEffect, Inc
iwarp Ethernet: Eliminating Overhead In Data Center Designs Latest extensions to Ethernet virtually eliminate the overhead associated with transport processing, intermediate buffer copies, and application
More informationAnalytics of Wide-Area Lustre Throughput Using LNet Routers
Analytics of Wide-Area Throughput Using LNet Routers Nagi Rao, Neena Imam, Jesse Hanley, Sarp Oral Oak Ridge National Laboratory User Group Conference LUG 2018 April 24-26, 2018 Argonne National Laboratory
More informationCRFS: A Lightweight User-Level Filesystem for Generic Checkpoint/Restart
CRFS: A Lightweight User-Level Filesystem for Generic Checkpoint/Restart Xiangyong Ouyang, Raghunath Rajachandrasekar, Xavier Besseron, Hao Wang, Jian Huang, Dhabaleswar K. Panda Department of Computer
More informationParallel File Systems Compared
Parallel File Systems Compared Computing Centre (SSCK) University of Karlsruhe, Germany Laifer@rz.uni-karlsruhe.de page 1 Outline» Parallel file systems (PFS) Design and typical usage Important features
More informationIntel Enterprise Edition Lustre (IEEL-2.3) [DNE-1 enabled] on Dell MD Storage
Intel Enterprise Edition Lustre (IEEL-2.3) [DNE-1 enabled] on Dell MD Storage Evaluation of Lustre File System software enhancements for improved Metadata performance Wojciech Turek, Paul Calleja,John
More informationIntroduction to Ethernet Latency
Introduction to Ethernet Latency An Explanation of Latency and Latency Measurement The primary difference in the various methods of latency measurement is the point in the software stack at which the latency
More informationImplementing Storage in Intel Omni-Path Architecture Fabrics
white paper Implementing in Intel Omni-Path Architecture Fabrics Rev 2 A rich ecosystem of storage solutions supports Intel Omni- Path Executive Overview The Intel Omni-Path Architecture (Intel OPA) is
More informationAddressing Data Management and IT Infrastructure Challenges in a SharePoint Environment. By Michael Noel
Addressing Data Management and IT Infrastructure Challenges in a SharePoint Environment By Michael Noel Contents Data Management with SharePoint and Its Challenges...2 Addressing Infrastructure Sprawl
More informationUSING ISCSI MULTIPATHING IN THE SOLARIS 10 OPERATING SYSTEM
USING ISCSI MULTIPATHING IN THE SOLARIS 10 OPERATING SYSTEM Aaron Dailey, Storage Network Engineering Scott Tracy, Storage Network Engineering Sun BluePrints OnLine December 2005 Part No 819-3730-10 Revision
More informationCreating an agile infrastructure with Virtualized I/O
etrading & Market Data Agile infrastructure Telecoms Data Center Grid Creating an agile infrastructure with Virtualized I/O Richard Croucher May 2009 Smart Infrastructure Solutions London New York Singapore
More informationEditShare XStream EFS Shared Storage
WHITE PAPER EditShare XStream EFS Shared Storage Advantages of EFS Native Client network protocol in media intensive storage workflows 2018 EditShare LLC. All rights reserved. EditShare is a registered
More informationAn Introduction to GPFS
IBM High Performance Computing July 2006 An Introduction to GPFS gpfsintro072506.doc Page 2 Contents Overview 2 What is GPFS? 3 The file system 3 Application interfaces 4 Performance and scalability 4
More informationSR-IOV Support for Virtualization on InfiniBand Clusters: Early Experience
SR-IOV Support for Virtualization on InfiniBand Clusters: Early Experience Jithin Jose, Mingzhe Li, Xiaoyi Lu, Krishna Kandalla, Mark Arnold and Dhabaleswar K. (DK) Panda Network-Based Computing Laboratory
More informationBest Practices for Deploying a Mixed 1Gb/10Gb Ethernet SAN using Dell EqualLogic Storage Arrays
Dell EqualLogic Best Practices Series Best Practices for Deploying a Mixed 1Gb/10Gb Ethernet SAN using Dell EqualLogic Storage Arrays A Dell Technical Whitepaper Jerry Daugherty Storage Infrastructure
More informationARISTA: Improving Application Performance While Reducing Complexity
ARISTA: Improving Application Performance While Reducing Complexity October 2008 1.0 Problem Statement #1... 1 1.1 Problem Statement #2... 1 1.2 Previous Options: More Servers and I/O Adapters... 1 1.3
More informationServer Networking e Virtual Data Center
Server Networking e Virtual Data Center Roma, 8 Febbraio 2006 Luciano Pomelli Consulting Systems Engineer lpomelli@cisco.com 1 Typical Compute Profile at a Fortune 500 Enterprise Compute Infrastructure
More informationIntroduction to TCP/IP Offload Engine (TOE)
Introduction to TCP/IP Offload Engine (TOE) Version 1.0, April 2002 Authored By: Eric Yeh, Hewlett Packard Herman Chao, QLogic Corp. Venu Mannem, Adaptec, Inc. Joe Gervais, Alacritech Bradley Booth, Intel
More informationVirtuLocity VLNCloud Software Acceleration Service Virtualized acceleration wherever and whenever you need it
VirtuLocity VLNCloud Software Acceleration Service Virtualized acceleration wherever and whenever you need it Bandwidth Optimization with Adaptive Congestion Avoidance for Cloud Connections Virtulocity
More informationMellanox Virtual Modular Switch
WHITE PAPER July 2015 Mellanox Virtual Modular Switch Introduction...1 Considerations for Data Center Aggregation Switching...1 Virtual Modular Switch Architecture - Dual-Tier 40/56/100GbE Aggregation...2
More informationAndreas Dilger. Principal Lustre Engineer. High Performance Data Division
Andreas Dilger Principal Lustre Engineer High Performance Data Division Focus on Performance and Ease of Use Beyond just looking at individual features... Incremental but continuous improvements Performance
More informationAdvanced RDMA-based Admission Control for Modern Data-Centers
Advanced RDMA-based Admission Control for Modern Data-Centers Ping Lai Sundeep Narravula Karthikeyan Vaidyanathan Dhabaleswar. K. Panda Computer Science & Engineering Department Ohio State University Outline
More informationHLD For SMP node affinity
HLD For SMP node affinity Introduction Current versions of Lustre rely on a single active metadata server. Metadata throughput may be a bottleneck for large sites with many thousands of nodes. System architects
More informationLustre File System. Proseminar 2013 Ein-/Ausgabe - Stand der Wissenschaft Universität Hamburg. Paul Bienkowski Author. Michael Kuhn Supervisor
Proseminar 2013 Ein-/Ausgabe - Stand der Wissenschaft Universität Hamburg September 30, 2013 Paul Bienkowski Author 2bienkow@informatik.uni-hamburg.de Michael Kuhn Supervisor michael.kuhn@informatik.uni-hamburg.de
More informationDesign a Remote-Office or Branch-Office Data Center with Cisco UCS Mini
White Paper Design a Remote-Office or Branch-Office Data Center with Cisco UCS Mini February 2015 2015 Cisco and/or its affiliates. All rights reserved. This document is Cisco Public. Page 1 of 9 Contents
More informationEMC Celerra CNS with CLARiiON Storage
DATA SHEET EMC Celerra CNS with CLARiiON Storage Reach new heights of availability and scalability with EMC Celerra Clustered Network Server (CNS) and CLARiiON storage Consolidating and sharing information
More informationNetworking for Data Acquisition Systems. Fabrice Le Goff - 14/02/ ISOTDAQ
Networking for Data Acquisition Systems Fabrice Le Goff - 14/02/2018 - ISOTDAQ Outline Generalities The OSI Model Ethernet and Local Area Networks IP and Routing TCP, UDP and Transport Efficiency Networking
More informationNetApp High-Performance Storage Solution for Lustre
Technical Report NetApp High-Performance Storage Solution for Lustre Solution Design Narjit Chadha, NetApp October 2014 TR-4345-DESIGN Abstract The NetApp High-Performance Storage Solution (HPSS) for Lustre,
More informationLustre overview and roadmap to Exascale computing
HPC Advisory Council China Workshop Jinan China, October 26th 2011 Lustre overview and roadmap to Exascale computing Liang Zhen Whamcloud, Inc liang@whamcloud.com Agenda Lustre technology overview Lustre
More informationAn Overview of Fujitsu s Lustre Based File System
An Overview of Fujitsu s Lustre Based File System Shinji Sumimoto Fujitsu Limited Apr.12 2011 For Maximizing CPU Utilization by Minimizing File IO Overhead Outline Target System Overview Goals of Fujitsu
More informationREAL SPEED. Neterion : The Leader in 10 Gigabit Ethernet adapters
experience10 GbE REAL SPEED Neterion : The Leader in 10 Gigabit Ethernet adapters With data volumes growing at explosive rates, network bandwidth has become a critical factor in the IT infrastructure of
More informationExtending PCI-Express in MicroTCA Platforms. Whitepaper. Joey Maitra of Magma & Tony Romero of Performance Technologies
Extending PCI-Express in MicroTCA Platforms Whitepaper Joey Maitra of Magma & Tony Romero of Performance Technologies Introduction: The introduction of MicroTCA platforms has opened the door for AdvancedMC
More informationA Guide to Architecting the Active/Active Data Center
White Paper A Guide to Architecting the Active/Active Data Center 2015 ScaleArc. All Rights Reserved. White Paper The New Imperative: Architecting the Active/Active Data Center Introduction With the average
More informationSmall Enterprise Design Profile(SEDP) WAN Design
CHAPTER 3 Small Enterprise Design Profile(SEDP) WAN Design This chapter discusses how to design and deploy WAN architecture for Small Enterprise Design Profile. The primary components of the WAN architecture
More information1 Copyright 2011, Oracle and/or its affiliates. All rights reserved.
1 Copyright 2011, Oracle and/or its affiliates. All rights ORACLE PRODUCT LOGO Solaris 11 Networking Overview Sebastien Roy, Senior Principal Engineer Solaris Core OS, Oracle 2 Copyright 2011, Oracle and/or
More informationExtending InfiniBand Globally
Extending InfiniBand Globally Eric Dube (eric@baymicrosystems.com) com) Senior Product Manager of Systems November 2010 Bay Microsystems Overview About Bay Founded in 2000 to provide high performance networking
More informationDriveScale-DellEMC Reference Architecture
DriveScale-DellEMC Reference Architecture DellEMC/DRIVESCALE Introduction DriveScale has pioneered the concept of Software Composable Infrastructure that is designed to radically change the way data center
More informationSeven Criteria for a Sound Investment in WAN Optimization
Seven Criteria for a Sound Investment in WAN Optimization Introduction WAN optimization technology brings three important business benefits to IT organizations: Reduces branch office infrastructure costs
More informationDesign a Remote-Office or Branch-Office Data Center with Cisco UCS Mini
White Paper Design a Remote-Office or Branch-Office Data Center with Cisco UCS Mini June 2016 2016 Cisco and/or its affiliates. All rights reserved. This document is Cisco Public. Page 1 of 9 Contents
More informationInfiniband and RDMA Technology. Doug Ledford
Infiniband and RDMA Technology Doug Ledford Top 500 Supercomputers Nov 2005 #5 Sandia National Labs, 4500 machines, 9000 CPUs, 38TFlops, 1 big headache Performance great...but... Adding new machines problematic
More informationOptimized Distributed Data Sharing Substrate in Multi-Core Commodity Clusters: A Comprehensive Study with Applications
Optimized Distributed Data Sharing Substrate in Multi-Core Commodity Clusters: A Comprehensive Study with Applications K. Vaidyanathan, P. Lai, S. Narravula and D. K. Panda Network Based Computing Laboratory
More informationCisco Nexus 4000 Series Switches for IBM BladeCenter
Cisco Nexus 4000 Series Switches for IBM BladeCenter What You Will Learn This document is targeted at server, storage, and network administrators planning to deploy IBM BladeCenter servers with the unified
More informationArchitecting the High Performance Storage Network
Architecting the High Performance Storage Network Jim Metzler Ashton, Metzler & Associates Table of Contents 1.0 Executive Summary...3 3.0 SAN Architectural Principals...5 4.0 The Current Best Practices
More informationMellanox Technologies Delivers The First InfiniBand Server Blade Reference Design
Mellanox Technologies Delivers The First InfiniBand Server Blade Reference Design Complete InfiniBand Reference Design Provides a Data Center in a Box SANTA CLARA, CALIFORNIA and YOKNEAM, ISRAEL, Jan 21,
More informationSNIA Discussion on iscsi, FCIP, and IFCP Page 1 of 7. IP storage: A review of iscsi, FCIP, ifcp
SNIA Discussion on iscsi, FCIP, and IFCP Page 1 of 7 IP storage: A review of iscsi, FCIP, ifcp SNIA IP Storage Forum With the advent of new IP storage products and transport protocol standards iscsi, FCIP,
More information2 to 4 Intel Xeon Processor E v3 Family CPUs. Up to 12 SFF Disk Drives for Appliance Model. Up to 6 TB of Main Memory (with GB LRDIMMs)
Based on Cisco UCS C460 M4 Rack Servers Solution Brief May 2015 With Intelligent Intel Xeon Processors Highlights Integrate with Your Existing Data Center Our SAP HANA appliances help you get up and running
More informationDB2 purescale: High Performance with High-Speed Fabrics. Author: Steve Rees Date: April 5, 2011
DB2 purescale: High Performance with High-Speed Fabrics Author: Steve Rees Date: April 5, 2011 www.openfabrics.org IBM 2011 Copyright 1 Agenda Quick DB2 purescale recap DB2 purescale comes to Linux DB2
More informationHorizontal Scaling Solution using Linux Environment
Systems Software for the Next Generation of Storage Horizontal Scaling Solution using Linux Environment December 14, 2001 Carter George Vice President, Corporate Development PolyServe, Inc. PolyServe Goal:
More informationEdgeConnectSP The Premier SD-WAN Solution
SERVICE PROVIDER EdgeConnectSP The Premier SD-WAN Solution Build High-Performance Managed SD-WAN Services Challenges with Legacy WANs Significant shifts in application and traffic patterns, including the
More informationActive-Active LNET Bonding Using Multiple LNETs and Infiniband partitions
April 15th - 19th, 2013 LUG13 LUG13 Active-Active LNET Bonding Using Multiple LNETs and Infiniband partitions Shuichi Ihara DataDirect Networks, Japan Today s H/W Trends for Lustre Powerful server platforms
More informationPLANEAMENTO E GESTÃO DE REDES INFORMÁTICAS COMPUTER NETWORKS PLANNING AND MANAGEMENT
Mestrado em Engenharia Informática e de Computadores PLANEAMENTO E GESTÃO DE REDES INFORMÁTICAS COMPUTER NETWORKS PLANNING AND MANAGEMENT 2010-2011 Metodologia de Projecto 4 - Project Methodology 4 1 Hierarchical
More informationSAN Virtuosity Fibre Channel over Ethernet
SAN VIRTUOSITY Series WHITE PAPER SAN Virtuosity Fibre Channel over Ethernet Subscribe to the SAN Virtuosity Series at www.sanvirtuosity.com Table of Contents Introduction...1 VMware and the Next Generation
More informationQLE10000 Series Adapter Provides Application Benefits Through I/O Caching
QLE10000 Series Adapter Provides Application Benefits Through I/O Caching QLogic Caching Technology Delivers Scalable Performance to Enterprise Applications Key Findings The QLogic 10000 Series 8Gb Fibre
More informationModule 2 Storage Network Architecture
Module 2 Storage Network Architecture 1. SCSI 2. FC Protocol Stack 3. SAN:FC SAN 4. IP Storage 5. Infiniband and Virtual Interfaces FIBRE CHANNEL SAN 1. First consider the three FC topologies pointto-point,
More informationFlashGrid Software Enables Converged and Hyper-Converged Appliances for Oracle* RAC
white paper FlashGrid Software Intel SSD DC P3700/P3600/P3500 Topic: Hyper-converged Database/Storage FlashGrid Software Enables Converged and Hyper-Converged Appliances for Oracle* RAC Abstract FlashGrid
More informationEnabling Efficient and Scalable Zero-Trust Security
WHITE PAPER Enabling Efficient and Scalable Zero-Trust Security FOR CLOUD DATA CENTERS WITH AGILIO SMARTNICS THE NEED FOR ZERO-TRUST SECURITY The rapid evolution of cloud-based data centers to support
More informationCisco Unified Computing System Delivering on Cisco's Unified Computing Vision
Cisco Unified Computing System Delivering on Cisco's Unified Computing Vision At-A-Glance Unified Computing Realized Today, IT organizations assemble their data center environments from individual components.
More informationVoltaire. Fast I/O for XEN using RDMA Technologies. The Grid Interconnect Company. April 2005 Yaron Haviv, Voltaire, CTO
Voltaire The Grid Interconnect Company Fast I/O for XEN using RDMA Technologies April 2005 Yaron Haviv, Voltaire, CTO yaronh@voltaire.com The Enterprise Grid Model and ization VMs need to interact efficiently
More informationBirds of a Feather Presentation
Mellanox InfiniBand QDR 4Gb/s The Fabric of Choice for High Performance Computing Gilad Shainer, shainer@mellanox.com June 28 Birds of a Feather Presentation InfiniBand Technology Leadership Industry Standard
More information