
IBM WebSphere MQ Low Latency Messaging Software Tested With Arista 10 Gigabit Ethernet Switch and Mellanox ConnectX-2 EN with RoCE Adapter Delivers Reliable Multicast Messaging With Ultra Low Latency and High Throughput

Arista, IBM and Mellanox, August 2010

Introduction

Microseconds matter in the ultra-competitive world of High Frequency Trading (HFT). Trading architectures that deliver the lowest-latency solution are the difference between profit and loss for these financial firms. IBM, Mellanox and Arista have combined to demonstrate a low latency messaging transport solution that facilitates the high-speed delivery of market, trade, reference and event data, in or between front-, middle- and back-office operations.

10 Gigabit Ethernet prevails as the interconnect technology of choice for these applications. With its fully non-blocking throughput, record density, low latency, robust feature set, ease of operation and leading TCO, the Arista 7100 Series 10 Gigabit Ethernet switch is ideal for HFT applications. Arista's high-throughput, low latency 10Gb Ethernet switch, coupled with IBM's WebSphere MQ Low Latency Messaging (WMQLLM) software using the Mellanox ConnectX-2 10 Gigabit Ethernet adapter with RDMA over Converged Ethernet (RoCE) and OFED (OpenFabrics Enterprise Distribution) drivers, is a powerful combination of networking and messaging software that delivers the lowest latency solution for environments like High Frequency Trading.

Key Results Summary

- WebSphere MQ Low Latency Messaging (WMQLLM) 2.3.0 Reliable Multicast Messaging (RMM) over 10GbE maintained a median latency of 4μs at rates up to 100,000 messages/second and 6μs at one million messages/second.
- Application throughput for the tested solution was on the order of 8.4 Gbps.
- Latency remained consistently low as the number of multicast groups increased from one to one thousand.
- Across all message sizes and message rates tested, the average latency of LLM over 10GbE was 5X to 12X lower than that of LLM over 1GbE.

Test Goals

The main objective of this testing was to measure the single hop latency of WebSphere MQ Low Latency Messaging (WMQLLM) 2.3.0 using the industry-standard 10GbE RoCE protocol with Mellanox's ConnectX-2 adapter (MT25448), and to determine the maximum end-to-end throughput rate. Maximum throughput and single hop latency were measured on a 10GbE network fabric.

System under test

WMQLLM Software Version: 2.3.0

Machines
  Vendor Model:             IBM x3650 M2 (quad core)
  Processors:               2 x Intel Xeon Quad Core X5570, 2.93 GHz
  Cache:                    8 MB Level 2 cache
  Front Side Bus Speed:     1333 MHz
  Memory:                   16 GB
  10GbE Adapter:            Mellanox ConnectX-2 MT25448 [ConnectX-2 EN with RoCE, PCIe 2.0 2.5 GT/s] (rev a0)
  Driver:                   OFED-1.5.1
  Operating System Version: Linux SuSE 11 (2.6.18-164.el5) x86_64

Network Elements
  Switch:                   Arista 7124S

Network Accessories
  Connectors:               SFP+
  Cables:                   LC to LC fiber optic cables

Single hop latency test

Test description

The test setup consists of two machines, A and B, connected through an Arista 7124S switch. The test is a simple reflector test. On machine A an LLM transmitter sends messages to an LLM receiver on machine B over reliable multicast. The messages are immediately sent back to machine A by an LLM transmitter on machine B and are received by an LLM receiver on machine A. A time stamp is written to a subset of the messages before each message is submitted to the LLM transmitter on machine A, and the time stamp is extracted by the LLM receiver on machine A after the message completes the round trip. The single hop latency is calculated as half the round trip time.

All latency tests ran for 5 minutes. Approximately 300,000 latency samples were recorded for each 5-minute test, and latency statistics were calculated from these samples. The test was repeated with variable message rates. Results are shown for messages of 45 bytes.

A diagram of the test configuration is shown below in Figure 1.

Figure 1: Test configuration
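To make the measurement approach concrete, the following is a minimal sketch of the same reflector pattern using plain unicast UDP sockets rather than the LLM reliable multicast API. The port number, message framing, sample count and the timestamping of every message (rather than a subset) are illustrative assumptions, not the configuration used in the test: a timestamp is carried in each message, the reflector echoes it back, and the single hop latency is taken as half the measured round trip.

```python
# Minimal reflector-style latency sketch with plain UDP sockets (not the LLM API).
# Machine B runs "reflect"; machine A runs "measure" and reports single hop latency
# as half of the measured round-trip time, as described in the test above.
import socket
import statistics
import struct
import sys
import time

PORT = 9000          # hypothetical port
MSG_SIZE = 45        # message size used in the latency tests

def reflect(bind_addr="0.0.0.0"):
    """Machine B: receive each message and immediately send it back to the sender."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((bind_addr, PORT))
    while True:
        data, peer = sock.recvfrom(2048)
        sock.sendto(data, peer)

def measure(reflector_ip, samples=300_000):
    """Machine A: timestamp each message, let B reflect it, and compute statistics."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.connect((reflector_ip, PORT))
    one_way_us = []
    padding = b"\x00" * (MSG_SIZE - 8)               # pad to 45 bytes after the 8-byte timestamp
    for _ in range(samples):
        t_send = time.perf_counter_ns()
        sock.send(struct.pack("!q", t_send) + padding)
        data = sock.recv(2048)
        t_recv = time.perf_counter_ns()
        (t_echoed,) = struct.unpack("!q", data[:8])
        one_way_us.append((t_recv - t_echoed) / 2 / 1000.0)   # single hop = half RTT, in microseconds
    print(f"median  {statistics.median(one_way_us):.2f} us")
    print(f"average {statistics.mean(one_way_us):.2f} us")
    print(f"max     {max(one_way_us):.2f} us")
    print(f"std     {statistics.pstdev(one_way_us):.2f} us")

if __name__ == "__main__":
    if sys.argv[1] == "reflect":
        reflect()
    else:
        measure(sys.argv[2])
```

Running the script with "reflect" on machine B and with "measure <address of B>" on machine A prints median, average, maximum and standard deviation in microseconds, mirroring the columns of Table 1.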

The results of the latency tests are presented in Table 1.

LLM over RDMA (RoCE), Mellanox ConnectX-2 (MTU = 2K)

Rate [msgs/sec]   Median [μs]   Average [μs]   Max [μs]   Std [μs]
10,000            4.00          4.50           15.50      0.90
50,000            4.00          4.50           26.00      1.10
100,000           4.00          4.50           41.00      2.10
250,000           4.50          5.00           88.00      7.30
500,000           4.50          6.00           180.00     23.30
1,000,000         6.00          7.50           140.00     18.10

Table 1: Single hop latency results for LLM over 10GbE using RoCE

Key point: As the message rate increases to 1,000,000 msgs/sec, the median latency of the tested solution remains consistently low, ranging from 4 microseconds to 6 microseconds.

Throughput test

Test description

The test setup consists of two machines, A and B, connected through an Arista 7124S switch. On machine A an LLM transmitter sends messages to an LLM receiver on machine B over reliable multicast. The reported throughput is the maximum rate at which messages can be delivered from the sender to the receiver without any loss. All throughput tests ran for 5 minutes, and the test was repeated with different message sizes. It is important to note that the reported throughput is that of the application and does not include the transport and network overhead.
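As a rough illustration of the gap between application goodput and wire-level utilization, the sketch below adds Ethernet, IP and UDP header overhead to the application throughput reported for 45-byte messages in Table 2 below. It assumes, purely for illustration, that small messages are coalesced into MTU-sized datagrams; the actual LLM packetization and its per-message framing overhead are not described here, so the numbers are indicative only.

```python
# Rough illustration only: converts application-level goodput to approximate wire-level
# utilization by adding per-datagram header overhead. Assumes (hypothetically) that the
# transport coalesces small messages into MTU-sized UDP datagrams and ignores any
# per-message framing added by LLM itself.
ETH_OVERHEAD = 14 + 4 + 8 + 12     # Ethernet header + FCS + preamble + inter-frame gap (bytes)
IP_UDP_OVERHEAD = 20 + 8           # IPv4 + UDP headers (bytes)
MTU = 1500                         # Ethernet MTU used in the throughput test

def wire_factor(msg_size, msgs_per_datagram):
    """Wire bytes transmitted per byte of application payload."""
    payload = msg_size * msgs_per_datagram
    assert payload + IP_UDP_OVERHEAD <= MTU, "datagram would not fit in one MTU"
    frame = payload + IP_UDP_OVERHEAD + ETH_OVERHEAD
    return frame / payload

# Example: 45-byte messages packed 32 per datagram (45 * 32 = 1440 bytes of payload).
factor = wire_factor(45, 32)
app_goodput_gbps = 8.40            # application throughput for 45-byte messages (Table 2 below)
print(f"overhead factor: {factor:.3f}")
print(f"approximate wire utilization: {app_goodput_gbps * factor:.2f} Gbit/s of the 10GbE link")
```

Even under this simplified model, the 8.40 Gbit/s of application payload reported for 45-byte messages corresponds to noticeably higher utilization of the 10GbE link once headers are counted.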

The results of the throughput tests are presented in Table 2.

LLM over UDP, Mellanox ConnectX-2 (Ethernet MTU = 1500 bytes)

Msg size [bytes]   Throughput [Gbit/sec]
45                 8.40
100                8.20
1,000              8.46

Table 2: Maximum throughput for LLM over 10GbE

Key point: The tested solution can deliver close to the full 10GbE of network throughput across different message sizes.

Multicast Scalability Test

The main objective of this test is to determine the multicast scalability, in terms of the number of different multicast groups, of the Arista switch when used in conjunction with IBM's WMQLLM software running on an IBM Blade HS-21 (quad core) system with the Mellanox ConnectX-2 Ethernet adapter.

Test description

The test setup consists of four machines, A, B, C and D, connected through an Arista 7124S switch. A reflector test using multiple multicast groups runs between machines C and D to load the switch with multicast data. While this load test is running, the basic latency test (described in the 'Single hop latency test' section above) runs between machines A and B using a single multicast group and produces latency results. The test is repeated with the load test between C and D configured to use different numbers of groups.
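For readers unfamiliar with how a single receiver can subscribe to many multicast groups, the sketch below joins a configurable number of consecutive groups on one UDP socket. It is an illustration with plain sockets only; the base group address, port and group count are hypothetical and are not the LLM or switch configuration used in the test.

```python
# Illustration only: join a configurable number of consecutive multicast groups on one
# UDP socket. The base group address, port and group count are hypothetical, not the
# configuration used in the test.
import ipaddress
import socket

BASE_GROUP = ipaddress.IPv4Address("239.1.1.1")   # hypothetical base multicast group
PORT = 9100                                       # hypothetical port

def join_groups(num_groups, iface_ip="0.0.0.0"):
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", PORT))
    for i in range(num_groups):
        group = str(BASE_GROUP + i)
        # ip_mreq structure: 4-byte group address followed by 4-byte local interface address
        mreq = socket.inet_aton(group) + socket.inet_aton(iface_ip)
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)
    return sock

# The scalability test varied the load across 1, 16, 64, 256, 512 and 1,000 groups.
# Joining very large numbers of groups on one host may require raising the OS limit
# (net.ipv4.igmp_max_memberships on Linux) or spreading the groups across sockets.
receiver = join_groups(16)
data, sender = receiver.recvfrom(2048)            # blocks until traffic arrives on any joined group
```

In the actual test the multicast load is generated and consumed by LLM transmitters and receivers on machines C and D, while the latency probe between A and B uses a single group.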

The results of the multicast scalability tests are presented in Table 3.

LLM over UDP, rate = 100,000 msgs/sec

Multicast groups   Median latency [μs]
1                  13.50
16                 12.50
64                 13.00
256                13.00
512                12.50
1,000              12.50

Table 3: Single hop latency results for LLM over UDP in the multicast scalability test

Key point: As the number of multicast groups grows, the tested solution scales to deliver multiple flows of data to multiple groups of subscribers efficiently while maintaining consistently low latency.

Summary

The results reported here clearly show the significant performance benefits of a 10GbE-based solution combining Arista's switching technology, Mellanox ConnectX-2 EN with RoCE adapters, and IBM's WebSphere MQ Low Latency Messaging. This unique combination of networking technology and messaging software provides the lowest latency solution for today's demanding HFT environments and scales to very high message rates.

In addition, for applications that require high bandwidth, for example exchanges or trading applications that need to store information to a transaction logger, or for system architectures anticipating increased future traffic volumes, the tested solution delivers close to 100% utilization of the available network bandwidth.

Finally, the ability to forward traffic with consistently low latency as the number of multicast groups increases is important in financial market environments where multiple flows of data need to be delivered effectively to multiple subscribers to maximize application performance. The tested solution clearly demonstrated this capability.

About Arista

Arista Networks was founded to deliver cloud networking solutions for large data center and computing environments. Arista Networks ignited the low-latency 10GbE revolution with the Arista 7100 and reinvented the modular data center Ethernet switch with the Arista 7500. Arista leads the data center Ethernet switching industry with innovation in switching hardware, silicon-based performance, and the EOS platform, a pioneering software architecture with self-healing and live in-service software upgrade capabilities. Arista markets its products worldwide through distribution partners, systems integrators and resellers with a strong dedication to partner and customer success. For more information, visit http://www.aristanetworks.com.

About IBM

For more information, visit http://www.ibm.com/financialmarkets/fasterdata.

About Mellanox

Mellanox Technologies is a leading supplier of end-to-end connectivity solutions for servers and storage that optimize data center performance. Mellanox products deliver market-leading bandwidth, performance, scalability, power conservation and cost-effectiveness while converging multiple legacy network technologies into one future-proof solution. For the best in performance and scalability, Mellanox connectivity solutions are a preferred choice for Fortune 500 data centers and the world's most powerful supercomputers. Founded in 1999, Mellanox Technologies is headquartered in Sunnyvale, California and Yokneam, Israel. For more information, visit Mellanox at www.mellanox.com.