ETHERNET OVER INFINIBAND


14th ANNUAL WORKSHOP 2018
Evgenii Smirnov and Mikhail Sennikovsky
ProfitBricks GmbH
April 10, 2018

ETHERNET OVER INFINIBAND: CURRENT SOLUTIONS

- mlx4_vnic
  - Currently deprecated
  - Requires specialized HW (BridgeX gateway)
- VXLAN over IPoIB
  - Some stability issues in IPoIB on our workload patterns
  - IPoIB in CM mode doesn't scale well with multi-threaded transfers
  - IPoIB in UD mode has lower performance for single-threaded transfers
  - Extra complexity due to many layers: IB/IPoIB/IPv6/UDP/VXLAN/Ethernet

OUR EOIB SOLUTION

- A high-speed and scalable Ethernet-over-InfiniBand Linux driver
- Allows up to 5*10^8 virtual networks, separated on the InfiniBand layer (see the note below)
- Presented as a standard Ethernet network interface with all the usual benefits: the ip tool, ethtool, bridging, VLANs, etc.
- Supports checksum and segmentation offloading on mlx4
- Does not require specific IB hardware (e.g. BridgeX)
- Similar to the EoIB concept presented by Ali Ayoub at OFA 2013
- An equivalent of Omni-Path VNIC for InfiniBand
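One way to arrive at the 5*10^8 figure (our reading; the slide does not spell it out): a virtual network is identified by a PKEY + MLID pair (see the VES & VPORT slide below), so the upper bound is the 15 usable PKEY bits times the multicast LID range 0xC000-0xFFFE, i.e. 2^15 * 16383 = 536,838,144, roughly 5.4*10^8.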

OVERVIEW

Example with three hosts and three separated virtual networks:
- Network 0xF000:0xC100: Host A, Host B, Host C
- Network 0xF050:0xC100: Host A, Host B
- Network 0xF000:0xC200: Host B, Host C

MAIN CONCEPTS: VES & VPORT

Ethernet overlay network on top of InfiniBand UD transport; a broadcast domain is identified by a PKEY + MLID pair. (A sketch of how VES and VPORT relate follows the list.)

- VES (Virtual Ethernet Switch)
  - Can have one or more VPORTs
  - Works as a self-learning switch with its Forwarding Database (FDB)
- VPORT (Virtual Port)
  - Performs the actual data transmission
  - Identified by VES and QPN
  - The virtual Ethernet interface uses the VPORT API to talk to the EoIB network
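A minimal C sketch of these two objects, using hypothetical names (eoib_ves, eoib_vport, eoib_fdb) that are not taken from the actual driver; it only captures the relationships stated on the slide.

```c
/* Hypothetical structures for illustration only; the real driver's
 * types are not shown on the slide.  A VES is tied to one broadcast
 * domain (PKEY + MLID), owns an FDB and one or more VPORTs, and a
 * VPORT is identified by its VES plus a UD queue pair number (QPN). */
#include <stdint.h>

struct eoib_fdb;    /* self-learning forwarding database (see the FDB slide) */
struct eoib_vport;

struct eoib_ves {
	uint16_t pkey;               /* partition key of the broadcast domain */
	uint16_t mlid;               /* multicast LID of the broadcast domain */
	struct eoib_fdb *fdb;        /* MAC+VLAN -> LID+QPN mappings */
	struct eoib_vport *vports;   /* one or more attached virtual ports */
};

struct eoib_vport {
	struct eoib_ves *ves;        /* owning virtual Ethernet switch */
	uint32_t qpn;                /* UD QPN; VES + QPN identify the VPORT */
	struct eoib_vport *next;     /* next VPORT on the same VES */
};
```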

VES & VPORT MULTICAST SETUP

Example: despite using the same MLID, VPORT 1 and VPORT 2 cannot communicate with VPORT 3, because VPORT 3 uses a different PKEY and is subscribed to a different MGID.

FRAME FORMAT

IB packet: LRH | GRH | BTH | DETH | Payload | Inv CRC | Var CRC
UD payload: EoIB header | Ethernet header | Payload

EoIB header (32 bits, fields from bit 31 down to bit 0; a build/validate sketch follows the list):
- Signature = 2'b11 (bits 31-30)
- Version = 2'b00 (bits 29-28)
- Reserved, MBZ
- TSS
- Reserved, MBZ (bits 16-0)

- EoIB uses the Mellanox mlx4_vnic encapsulation header format
- Signature and Version are set to the values used by mlx4_vnic
- The header enables:
  - HW RX-path offloads (e.g. checksum validation)
  - Software TSS on HCAs older than ConnectX-4 (see the TSS field above)
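A small C sketch of building and validating the fixed part of this header. Only the signature and version placement (the top four bits) is taken from the slide; the TSS and reserved bit positions are not recoverable from the transcription and are left as zero here.

```c
/* Illustrative only: mlx4_vnic-compatible signature/version handling
 * for the 32-bit EoIB encapsulation word described above. */
#include <stdint.h>
#include <stdbool.h>

#define EOIB_SIGNATURE  0x3u  /* 2'b11, bits 31-30 */
#define EOIB_VERSION    0x0u  /* 2'b00, bits 29-28 */

static inline uint32_t eoib_build_header(void)
{
	/* TSS and reserved/MBZ bits stay zero in this sketch */
	return (EOIB_SIGNATURE << 30) | (EOIB_VERSION << 28);
}

static inline bool eoib_header_valid(uint32_t hdr)
{
	return ((hdr >> 30) & 0x3u) == EOIB_SIGNATURE &&
	       ((hdr >> 28) & 0x3u) == EOIB_VERSION;
}
```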

ADDRESS RESOLUTION & FDB

- VES works as a self-learning switch with a Forwarding Database (FDB)
- The FDB maps MAC + VLAN to LID + QPN
- Both 802.1Q and 802.1ad (QinQ) are supported
- The FDB is updated based on incoming traffic
- If no FDB mapping exists for the destination MAC + VLAN, the outgoing frame is sent via IB multicast (see the sketch below)
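A C sketch of the TX decision and RX learning just described. The fdb_lookup/fdb_learn/send_* helpers are hypothetical placeholders, not the driver's real API; the key/value layout follows the slide (MAC + VLAN resolved to LID + QPN).

```c
/* Illustration of the self-learning forwarding logic described above. */
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct fdb_key {
	uint8_t  mac[6];
	uint16_t vlan;     /* 802.1Q VID; 802.1ad (QinQ) would add an outer tag */
};

struct fdb_value {
	uint16_t lid;      /* destination port's LID */
	uint32_t qpn;      /* destination VPORT's UD QPN */
};

/* Hypothetical helpers a real driver would provide. */
bool fdb_lookup(const struct fdb_key *key, struct fdb_value *out);
void fdb_learn(const struct fdb_key *key, const struct fdb_value *val);
void send_unicast_ud(const struct fdb_value *dst, const void *frame, size_t len);
void send_multicast(const void *frame, size_t len);  /* PKEY + MLID broadcast domain */

static void eoib_xmit(const struct fdb_key *dst, const void *frame, size_t len)
{
	struct fdb_value hit;

	if (fdb_lookup(dst, &hit))
		send_unicast_ud(&hit, frame, len);   /* known destination */
	else
		send_multicast(frame, len);          /* unknown: flood via IB multicast */
}

/* On RX, the sender's MAC + VLAN and its LID/QPN are learned: */
static void eoib_rx_learn(const struct fdb_key *src, uint16_t slid, uint32_t src_qpn)
{
	struct fdb_value val = { .lid = slid, .qpn = src_qpn };
	fdb_learn(src, &val);
}
```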

EXAMPLE: PING DIAGRAM

(Diagram slide: ping sequence between hosts; no text content in the transcription.)

HW OFFLOADING SUPPORT

The following HW offloads are supported (currently only for mlx4); a generic sketch of how a Linux driver advertises them follows the list:
- IP / TCP / UDP checksum calculation on TX
- IP / TCP / UDP checksum validation on RX
- Large send offload
- Transmit side scaling (TSS)
- Receive side scaling (RSS)
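For orientation only, this is how a Linux network driver typically advertises such offloads using standard kernel feature flags; it is a generic sketch, not the EoIB driver's actual code.

```c
/* Generic illustration of advertising the offloads listed above. */
#include <linux/netdevice.h>

static void eoib_advertise_offloads(struct net_device *dev)
{
	/* TX checksum offload, RX checksum validation and large send offload */
	dev->hw_features = NETIF_F_SG | NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM |
			   NETIF_F_RXCSUM | NETIF_F_TSO;
	dev->features |= dev->hw_features;

	/* TSS/RSS correspond to multiple TX/RX queues, which would be
	 * requested when the netdev is allocated, e.g. with
	 * alloc_etherdev_mqs(priv_size, num_tx_queues, num_rx_queues). */
}
```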

CONFIGURATION: IP, ETHTOOL

- Basic configuration example
- Other settings can be specified with ip link add:
  - Generic settings such as the Ethernet address, number of RX and TX queues, etc.
  - EoIB-specific settings such as IB device & port, FDB size, IB rate, queue-to-MSI-X interrupt mapping, Q_Key
- ethtool configuration supports TX/RX/TSO offloads, statistics, etc.

BENCHMARKING: BANDWIDTH

- uperf multi-threaded TCP test results summary
- Measured on Intel Xeon E5-2680 and ConnectX-3 VPI in FDR10 mode
- Linux kernel 4.4, Mellanox OFED 3.4

BENCHMARKING: CPU USAGE

- uperf multi-threaded TCP test results summary
- Measured on Intel Xeon E5-2680 and ConnectX-3 VPI in FDR10 mode
- Linux kernel 4.4, Mellanox OFED 3.4

FUTURE PLANS

TODOs we plan to work on:
- Support for mlx5
- Open-source it and offer it to the upstream kernel
- Performance improvements and tuning

TODOs we do NOT plan to work on (so far ;)):
- Path speed discovery
- Support of multiple InfiniBand subnets

If you are working on a similar project, we would be happy to cooperate.

THANK YOU

Development team:
- Eugene Crosser <evgenii.cherkashin@profitbricks.com>
- Evgenii Smirnov <evgenii.smirnov@profitbricks.com>
- Mikhail Sennikovsky <mikhail.sennikovskii@profitbricks.com>
- Sergii Riabchun <sergii.riabchun@profitbricks.com>

ProfitBricks GmbH, the IaaS company: http://www.profitbricks.com/