Extending InfiniBand Globally


Extending InfiniBand Globally. Eric Dube (eric@baymicrosystems.com), Senior Product Manager of Systems. November 2010

Bay Microsystems Overview. About Bay: founded in 2000 to provide high-performance networking solutions. Silicon engineering and headquarters: San Jose, CA. Systems engineering and business development: Germantown, MD; MA. Corporate focus: development of complex integrated circuits for high-performance packet processing and optical transport applications, and systems that deliver high-performance, protocol-agnostic inter-working and Data Center Transport Extension (DCTE) via InfiniBand.

Extending InfiniBand Globally: Campus, Metro, or Wide Area Network (from 1 to 1000s of miles). With InfiniBand's adoption rate steadily growing for high-performance computing, financial services, and high-speed storage applications, extending InfiniBand between data centers is essential for disaster recovery, multi-site backups, and real-time data access. While InfiniBand's credit mechanism is an excellent and reliable way to provide flow control, existing InfiniBand hardware doesn't provide enough port buffering for deployment beyond a single site. Unless the number of virtual lanes is reduced, sustained bandwidth begins to drop after just a few hundred meters because of inadequate port buffering. Even with the minimum number of virtual lanes configured, not enough packets can be kept in flight: port buffer credit starvation sets in at roughly 500-600 meters or less, depending on the data rate.
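These distances follow from the bandwidth-delay product: a sender can only keep as many bytes in flight as the receiver has advertised credits for, so once the data rate multiplied by the credit-loop round-trip time exceeds the available port buffer, throughput falls off. A minimal sketch of that arithmetic, with the per-VL buffer size as an assumed illustrative figure rather than a number from any vendor datasheet:

```python
# Sketch: relate InfiniBand link-level credits to reach, via the bandwidth-delay
# product. The 8 KB per-VL buffer below is a hypothetical illustrative figure.

FIBER_SPEED_M_PER_S = 2.0e8          # ~speed of light in fiber (~5 us per km)

def max_full_rate_distance_m(data_rate_gbps: float, buffer_bytes: float) -> float:
    """Distance at which rate * credit-loop RTT equals the receive buffer."""
    rate_bytes_per_s = data_rate_gbps * 1e9 / 8
    rtt_budget_s = buffer_bytes / rate_bytes_per_s   # data out + credits back
    return rtt_budget_s * FIBER_SPEED_M_PER_S / 2

def buffering_needed_bytes(data_rate_gbps: float, distance_km: float) -> float:
    """Buffer needed to keep a link of this length running at full rate."""
    rate_bytes_per_s = data_rate_gbps * 1e9 / 8
    rtt_s = 2 * (distance_km * 1e3) / FIBER_SPEED_M_PER_S
    return rate_bytes_per_s * rtt_s

# With ~8 KB of credit per VL (assumed), a 4X SDR link (8 Gbps) stalls past ~800 m:
print(max_full_rate_distance_m(8, 8 * 1024))      # ~819 m
# Keeping that same 8 Gbps link full across 1000 km needs on the order of 10 MB:
print(buffering_needed_bytes(8, 1000) / 2**20)    # ~9.5 MiB
```

The same relationship explains why extension devices with large external buffers, rather than the HCAs and switches themselves, are what make metro and wide area reach practical.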

ABEx 2020 Global InfiniBand Extension Platform. [Diagram: two ABEx 2020 units bridge native InfiniBand (4X SDR) subnets, each presenting a native IB port and an IB pseudo-port as a 2-port IB switch, and carry up to two Gigabit Ethernet links across a metro/wide area network: SONET/SDH/ATM OC-48 to OC-192, 10G Ethernet (LAN/WAN), IP, or dark fiber.] Specialized devices are needed that provide enhanced buffering and end-to-end flow control to extend native InfiniBand over campus, metro, and wide area networks to any point on the globe.
- IB device type: 2-port switch
- IB data rate: 4X InfiniBand SDR (full 8 Gbps line-rate performance)
- IB virtual lanes: 4 operational (VL0-3) + 1 management (VL15)
- Ethernet encapsulation: up to two 1G Ethernet links can be encapsulated over the same wide area network connection
- Pluggable optics: SFP (1/2.5G) and XFP (10G) slots provide flexible connectivity options
- Range: production fabrics running at 15,000 km today; simulated up to 35,000 km

InfiniBand Applications and Multi-site Deployments
- Global file systems and storage architectures: high-performance, high-volume data sharing and storage virtualization between sites
- High Performance Computing (HPC): clustered applications and cloud computing; post-processing and visualization
- Financial services: disaster recovery for low-latency trading and market data feed applications
- Clustered databases and warehouses: multi-site failover and data mirroring; real-time local access and information sharing
- Distributed healthcare applications: high-resolution imaging deployments

Large Data JCTD: Petascale Global Storage Architecture. [Diagram: seven sites connected over a global InfiniBand network, each with a security element, storage server nodes, and a petabyte-scale storage array; sustains >250 MB/sec across an OC-48 link.]
- Large distributed network of compute and storage resources on a single IB infrastructure
- Tens of terabytes of traffic transmitted daily
- Petascale storage and transport
- Very large data set gathering, archiving, processing, and rendering for the US Government
- In production since late 2006

Large Data JCTD: Lustre File Transfer Over Distance
Distance / Delay | Connection Line Rate | Files Transferred | % of Theoretical Max (Bandwidth)
~150 fiber miles (2.5 ms latency) | SONET OC-192 | Single | 62.8% (590.6 MBps)
~150 fiber miles (2.5 ms latency) | SONET OC-192 | Multiple | 99.6% (935.7 MBps)
~2,000 fiber miles (34.5 ms latency) | SONET OC-192 | Single | 59.1% (555.2 MBps)
~2,000 fiber miles (34.5 ms latency) | SONET OC-192 | Multiple | 91.1% (856.4 MBps)
~13,000 fiber miles (206 ms latency) | SONET OC-48 (partial) | Single | 86.0% (182.3 MBps)
~13,000 fiber miles (206 ms latency) | SONET OC-48 (partial) | Multiple | 94.6% (200.7 MBps)
Maximum theoretical data transfer rate: SONET OC-192 = 939.7 MBps; partial SONET OC-48 = 212.0 MBps. 1 MB = 1,048,576 bytes (2^20).
* Slide content and performance data obtained from the Large Data JCTD public presentation.
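The "% of Theoretical Max" column is simply the measured throughput divided by the stated maximum for the link; a quick sketch for two of the rows above:

```python
# Sketch: reproduce the "% of Theoretical Max" figures from the measured
# throughput and the stated link maximums (MBps, where 1 MB = 2^20 bytes).

LINK_MAX_MBPS = {"SONET OC-192": 939.7, "SONET OC-48 (partial)": 212.0}

def pct_of_theoretical(measured_mbps: float, link: str) -> float:
    return 100.0 * measured_mbps / LINK_MAX_MBPS[link]

print(round(pct_of_theoretical(935.7, "SONET OC-192"), 1))           # 99.6
print(round(pct_of_theoretical(182.3, "SONET OC-48 (partial)"), 1))  # 86.0
```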

Large Data JCTD: IB Performance Comparison. [Charts: typical RDMA/IB performance vs. typical TCP/IP/Ethernet performance, tested on a 1 Gbps circuit (~8,000 miles, ~13,000 fiber miles) and on an 8 Gbps circuit (~1,200 miles, ~2,000 fiber miles).]
- RDMA over IB provides very efficient use of available bandwidth with near-linear scaling (RDMA/IB performance ~80% vs. TCP/IP performance ~40%)
- RDMA/IB CPU usage is estimated to be 4x lower
- The InfiniBand connection is lossless, with nearly perfect fair sharing of bandwidth across multiple concurrent data flows
* Slide content and performance data obtained from the Large Data JCTD public presentation.

InfiniBand Pseudowire Services for Converged Network Cores. [Diagram: InfiniBand, Ethernet, and ATM services carried across a service provider IP/MPLS core network.] With more and more service providers moving toward converged network cores (such as IP/MPLS), supporting those core network services will be important for providing cost-effective global InfiniBand extension.
InfiniBand over MPLS networks:
- A draft specification has been submitted and presented to the IETF for review and consideration
InfiniBand tunneling using Layer 2 Tunneling Protocol version 3 (L2TPv3):
- L2TPv3 is an IETF standard that provides a robust control plane for dynamically establishing and maintaining sessions
- A draft specification has been implemented for InfiniBand tunneling over IP

InfiniBand Extension over Packet-Switched Networks via L2TPv3. Physical view: two InfiniBand subnets, each attached to an ABEx 2020, are joined by an InfiniBand pseudowire carried via L2TPv3 over a wide area IP network service. InfiniBand view: the core network IP service is transparent to the InfiniBand network and applications; the ABEx 2020 units appear as two 2-port InfiniBand switches that bridge the two InfiniBand subnets into a single seamless network.
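For a sense of what "pseudowire via L2TPv3" means on the wire, here is an illustrative sketch of one plausible encapsulation per RFC 3931 (L2TPv3 over IP uses IP protocol 115, a 32-bit session ID, and an optional cookie). This is not Bay's actual wire format; the draft referenced on the previous slide defines the real InfiniBand pseudowire encapsulation.

```python
# Illustrative only: wrap a raw InfiniBand packet in an L2TPv3-over-IP data
# message header (RFC 3931). Field choices here are assumptions for clarity,
# not the encapsulation defined by the InfiniBand pseudowire draft.
import struct

L2TPV3_IP_PROTO = 115  # IP protocol number used when L2TPv3 runs directly over IP

def encapsulate_ib_packet(ib_packet: bytes, session_id: int, cookie: bytes = b"") -> bytes:
    """Prepend an L2TPv3 session header (session ID + optional cookie) to an IB packet."""
    assert len(cookie) in (0, 4, 8), "RFC 3931 allows a 0-, 4-, or 8-byte cookie"
    header = struct.pack("!I", session_id) + cookie
    # The result would be carried as the payload of an IP datagram with
    # protocol number L2TPV3_IP_PROTO toward the remote extension unit.
    return header + ib_packet

frame = encapsulate_ib_packet(b"\x00" * 64, session_id=0x1234, cookie=b"\xde\xad\xbe\xef")
print(len(frame))  # 4-byte session ID + 4-byte cookie + 64-byte IB packet = 72
```

Because the IP core only sees L2TPv3 data messages, the InfiniBand subnet manager, credits, and applications on either side remain unaware of the transport in between, which is exactly the transparency described above.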

SC09: Univ. of Utah / Ames Lab InfiniBand Pseudowire Demo. Global high-performance storage demonstration using the PVFS2 file system to transfer large files over InfiniBand between sites, over links of ~1,000 and ~5,000 fiber miles. Native InfiniBand is extended via L2TPv3 between sites over a shared packet-switched network running IP.

InfiniBand to Wide Area Network Data Rate Mapping
Connection Type | Actual Data Rate | WAN Speed | WAN Transport
1X InfiniBand SDR | 2 Gbps | 2.5G | OC-48 / OTU1
4X InfiniBand SDR | 8 Gbps | 10G | OC-192 / OTU2 / 10GE
12X InfiniBand SDR | 24 Gbps | ? | -
1X InfiniBand DDR | 4 Gbps | ? | -
4X InfiniBand DDR | 16 Gbps | ? | -
12X InfiniBand DDR | 32 Gbps | 40G | OC-768 / OTU3 / 40GE
1X InfiniBand QDR | 8 Gbps | 10G | OC-192 / OTU2 / 10GE
4X InfiniBand QDR | 32 Gbps | 40G | OC-768 / OTU3 / 40GE
8X InfiniBand QDR | 64 Gbps | ? | -
12X InfiniBand QDR | 96 Gbps | 100G | OTU4 / 100GE
1X InfiniBand EDR | 25 Gbps | ? | -
4X InfiniBand EDR | 100 Gbps | 100G | OTU4 / 100GE
8X InfiniBand EDR | 200 Gbps | ? | -
12X InfiniBand EDR | 300 Gbps | ? | -
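The "Actual Data Rate" column follows from the lane count, the per-lane signaling rate, and the line-code efficiency (8b/10b for SDR/DDR/QDR, 64b/66b for EDR); a minimal sketch of that arithmetic:

```python
# Sketch: derive an InfiniBand link's usable data rate from lane count,
# per-lane signaling rate, and line-code overhead, then compare against
# the WAN circuit speeds listed in the table above.

LANE_SIGNALING_GBPS = {"SDR": 2.5, "DDR": 5.0, "QDR": 10.0, "EDR": 25.78125}
ENCODING_EFFICIENCY = {"SDR": 8/10, "DDR": 8/10, "QDR": 8/10, "EDR": 64/66}

def ib_data_rate_gbps(lanes: int, speed: str) -> float:
    return lanes * LANE_SIGNALING_GBPS[speed] * ENCODING_EFFICIENCY[speed]

print(ib_data_rate_gbps(4, "SDR"))   # 8.0  -> fits a 10G circuit (OC-192 / OTU2 / 10GE)
print(ib_data_rate_gbps(12, "QDR"))  # 96.0 -> fits a 100G circuit (OTU4 / 100GE)
print(ib_data_rate_gbps(4, "EDR"))   # ~100 -> fits a 100G circuit (OTU4 / 100GE)
```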

Introducing the InfiniBand Extension (IBEx) Products
- Metro/Campus IB extension: 1-350 km over 1/10/40G Ethernet, IPv4/IPv6, dark fiber
- Global IB extension: 1-15,000 km over 1/10/40G Ethernet, IPv4/IPv6, SONET/SDH OC-48/192/768, ITU-T G.709 OTU3, dark fiber, WDM

IBEx G40: 4X/8X InfiniBand QDR Extension Platform (availability Q1 2011). [Front-panel ports: IB Port 1, IB Port 2 / 40G Ethernet (LAN), 10G Ethernet, 40G WAN, serial management.] IBEx G40 main features:
- 40G InfiniBand extension platform providing connectivity for 4X/8X InfiniBand QDR (up to 40 Gbps) [2 x QSFP] and 10G Ethernet [1 x XFP]
- Utilizes Bay's own Newport 40G framer ASIC to support extension over 40G Ethernet, IPv4/IPv6, SONET/SDH OC-768, ITU-T G.709 OTU3, dark fiber, and WDM
- Enhanced internal port buffering and flow control enabling global InfiniBand extension at full line rate from 1-15,000 km (see the sketch after this list)
- Strong embedded Advanced Encryption Standard (AES-256): complete WAN-side encryption for all LAN-side interfaces, including InfiniBand and Ethernet
- Full support for all InfiniBand virtual lanes: 8 operational (VL0-7) + 1 management (VL15)
- Compact, low-power, 1U 19-inch rack-mountable chassis with redundant, hot-swappable power supplies and fans
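To put the buffering claim in perspective, a back-of-envelope bandwidth-delay calculation (not a vendor specification) shows how much in-flight data a full-rate 40 Gbps link spanning 15,000 km implies:

```python
# Back-of-envelope sketch: in-flight data implied by running 40 Gbps at full
# rate over 15,000 km of fiber (bandwidth-delay product); not a vendor spec.

FIBER_DELAY_S_PER_KM = 5e-6                  # ~5 microseconds per km of fiber

rate_bytes_per_s = 40e9 / 8                  # 40 Gbps
rtt_s = 2 * 15_000 * FIBER_DELAY_S_PER_KM    # ~150 ms credit-loop round trip
print(rate_bytes_per_s * rtt_s / 2**20)      # ~715 MiB must be buffered in flight
```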

Contact and Additional Information
Contacts:
- Chuck Gershman, President & CEO - chuck@baymicrosystems.com - (408) 437-0400 x105
- Eric Dube, Senior Product Manager of Systems - eric@baymicrosystems.com - (301) 944-8149
Please stop by and see our 4X InfiniBand QDR over 40G wide area network demonstration in the InfiniBand Trade Association (IBTA) Booth #1161 at SC10!


Performance Data: Bandwidth vs. Message Size. [Chart: bandwidth (MB/sec) vs. message size (bytes) for Unreliable Connection (UC) transfers, comparing HCAs only, HCAs+SW, HCAs+ABEx, and HCAs+SW+ABEx at 0, 10, 100, 1,000, and 10,000 km. Sustained bandwidth of 7.5 Gbps over any distance.]

Performance Data: Bandwidth vs. Distance. [Chart: bandwidth (MB/sec) vs. distance (10 to 10,000 km), comparing Reliable Connection (RC) transfers at 512-byte, 8 KB, 64 KB, and 4 MB message sizes with Unreliable Connection (UC) transfers at 512-byte and 64 KB message sizes.]