Gateware Defined Networking (GDN) for Ultra Low Latency Trading and Compliance

1 Gateware Defined Networking (GDN) for Ultra Low Latency Trading and Compliance
STAC Summit: Panel: FPGA for trading today: December 2015
John W. Lockwood, PhD, CEO, Algo-Logic Systems, Inc. (408)
2015 Algo-Logic Systems Inc., All rights reserved. STAC FPGA Panel

2 GDN Powers Algo-Logic IP Cores, Pre-built FPGA Applications, and Systems
GDN: Gateware Defined Networking
[Diagram: Accelerated Server with IP core, FPGA gateware, HDD, SSD, NIC+FPGA, CPU cores, and 10G / 40G / 100 GE links]
Low latency MAC, TCP, protocol parsers, order book cores
Pre-programmed apps in multiple FPGA vendor devices
Pre-programmed apps in multiple FPGA cards
Integrated switch solutions
Integrated server systems
Data center deployments to co-location facility

3 Algo-Logic's Family of Accelerated Finance Applications
Tick-to-Trade System
Low Latency Library
Full Order Book
Low Latency TCP: 76 ns MAC to application
10GE PHY/MAC: 89 ns round-trip latency
Market Data Filter
Protocol Parsers: all major exchanges
[Diagram: Accelerated Server with FPGA, HDD, SSD, CPU cores, and 10G / 40G / 100 GE links]
Algorithms in Logic: all apps run in FPGA
Not STAC Benchmarks

4 Algo-Logic's Tick-to-Trade System: Full Offload to FPGA
System software: your control/GUI interface(s) running on your server
Your unique design requirements: in-house integration and API customizations
GDN* design customization areas:
  Top-of-order-book info
  Orders, symbols, trigger criteria, etc.
  Heartbeats, echo info, stats and status
  Execution reports and logs
* Gateware Defined Networking (GDN)
FPGA card: FPGA interfaces + API customizations
  10GE multicast data
  Algo-Logic ULL PHY+MAC (IP customizations)
  Algo-Logic Market Data Filter Module, UDP parser (IP customizations)
  Algo-Logic Protocol Parsing Libraries (IP customizations)
  Algo-Logic Full Order-Book Processing: filtered trade events and top-of-book data
  Your trading logic, algorithms, order criteria and triggers
  Risk checks module
  Inject market orders
  Algo-Logic 76-nanosecond TCP/UDP endpoint (IP customizations)
  Algo-Logic ULL PHY+MAC
  Orders to exchange(s), execution reports from exchange(s) (at your co-location site)
Algo-Logic Systems GDN ULL trading solutions are sub-microsecond
Traditional legacy software trading systems: 20 to 50 microseconds

5 Algo-Logic's Key Value Store (KVS)
Key/Value Store (KVS)
  Simplifies implementation of large-scale distributed computation algorithms
  Data center servers exchange data over standard Ethernet
Examples (key -> value):
  Directory: Company -> Phone # (Algo-Logic -> (408))
  Forwarding tables: IP address -> Interface : MAC address (value: Eth6 : 02:33:29:F2:AB:CC)
  Data deduplication: Content hash -> Storage block ID (value: XYZ)
  Stock trading: Order ID -> Symbol, Side, Price (ATY -> AAPL, B,)
  Graph search: Vertex -> Edge list (v140 -> v201, v206, v225)
Challenges
  Operating system delays packets and limits throughput
  Per-core processing is inefficient at high-speed packet processing
Solutions
  Kernel bypass with DPDK
  Offload of packet processing with FPGA
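
The graph-search row above can be illustrated with a minimal software sketch. It assumes a plain in-memory dictionary as the store; the class name `KeyValueStore`, the helper `reachable`, and the extra edge for `v201` are hypothetical and are not taken from the slides or from Algo-Logic's implementation.

```python
from collections import deque


class KeyValueStore:
    """Exact-match key/value store backed by a Python dict (illustrative only)."""

    def __init__(self):
        self._table = {}

    def put(self, key, value):
        self._table[key] = value

    def get(self, key, default=None):
        return self._table.get(key, default)

    def delete(self, key):
        # Returns True if the key existed and was removed.
        return self._table.pop(key, None) is not None


def reachable(store, start):
    """Breadth-first search where each vertex's edge list is one KVS lookup."""
    seen = {start}
    queue = deque([start])
    while queue:
        vertex = queue.popleft()
        for neighbor in store.get(vertex, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen


edges = KeyValueStore()
edges.put("v140", ["v201", "v206", "v225"])  # edge list shown on the slide
edges.put("v201", ["v206"])                  # hypothetical extra edge, for illustration
visited = reachable(edges, "v140")
```

Each step of the search is a single exact-match lookup, which is why lookup latency and throughput dominate the cost of this kind of workload.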

6 Implementation of KVS with Socket I/O, Kernel Bypass, and GDN in FPGA
Benchmark: same application, Key/Value Store (KVS)
Running on the same PC: Intel i7-4770k CPU, NIC, and Altera Stratix V A7 FPGA
With three different implementations: Socket I/O, Kernel bypass with DPDK, FPGA
[Diagram: three datapaths, each receiving and returning OCSMs over 10G Ethernet]
Traditional Socket I/O (Algo-Logic software on Intel GE NIC and Core i7-4770k CPU):
  Intel 10G NIC, Kernel, Driver, Message Process
Kernel Bypass with DPDK (Algo-Logic software on Intel GE NIC and Core i7-4770k CPU):
  Intel DPDK Supported NIC, Store, Dequeue, Message Buffer, Message Process, Response Generation
  Note: message read once into CPU cache
GDN in FPGA (Algo-Logic gateware on Nallatech P385 with Altera Stratix V A7 FPGA):
  Parser, Modifier, Request Generator, OCSM Header Identifier, Key/Value Extractor, Key/Value Search, Exact Match Search Engine (EMSE), Response Decoder, Response Generator, OCSM Header Reconstruct
Other diagram labels: Receive Queue, Dequeue, Enqueue, Transmit Queue
Legend: Data Transfer, Control Handoff
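
The stage names in the FPGA datapath (header identification, key/value extraction, exact-match search, response generation) can be sketched in software as a chain of small functions. This is only an illustration: the 17-byte message layout, the opcodes, and the status codes below are hypothetical and are not the real OCSM format, which the slides do not define.

```python
import struct

# Hypothetical fixed-size message layout (NOT the real OCSM format):
# 1-byte opcode or status, 8-byte key, 8-byte value, network byte order.
MSG = struct.Struct("!BQQ")
OP_GET, OP_PUT = 1, 2
STATUS_OK, STATUS_MISS = 0, 1


def identify_header(packet):
    """Stage 1: validate the message size and return the opcode byte."""
    if len(packet) != MSG.size:
        raise ValueError("unexpected message size")
    return packet[0]


def extract_key_value(packet):
    """Stage 2: pull the key and value fields out of the message."""
    _, key, value = MSG.unpack(packet)
    return key, value


def search(table, opcode, key, value):
    """Stage 3: exact-match lookup or update; unknown opcodes report a miss."""
    if opcode == OP_PUT:
        table[key] = value
        return STATUS_OK, value
    if opcode == OP_GET and key in table:
        return STATUS_OK, table[key]
    return STATUS_MISS, 0


def generate_response(status, key, value):
    """Stage 4: rebuild a message carrying the status and the result."""
    return MSG.pack(status, key, value)


def process(table, packet):
    """Run one request through all four stages in order."""
    opcode = identify_header(packet)
    key, value = extract_key_value(packet)
    status, result = search(table, opcode, key, value)
    return generate_response(status, key, result)


table = {}
put_reply = process(table, MSG.pack(OP_PUT, 42, 7))
get_reply = process(table, MSG.pack(OP_GET, 42, 0))
miss_reply = process(table, MSG.pack(OP_GET, 99, 0))
```

In software these stages run one after another on a CPU core for each message. In gateware each stage would typically be its own block of logic, so different messages can occupy different stages at the same time.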

7 Measured Latency, Throughput, and Power Results
[Diagram: FPGA with PHY, MAC, GDN traffic classifier, associative rule-match CAM, key extractor, parser, flow or ACL target queues, MACs, PHYs]
[Diagram: rack of search servers, additional KVS servers, provision controller, UPS power, 10G and 40G links, DPDK and RTL datapaths]

All Datapaths Summary     Latency (µseconds)   Tested Throughput (CSMs/sec)    Power (µjoules/CSM)
KVS in Software Sockets
KVS in DPDK
KVS in FPGA

All Datapaths Summary     Latency (µseconds)   Maximum Throughput (CSMs/sec)   Power (µjoules/CSM)
GDN vs. Sockets           88x less             13x                             21x less
GDN vs. DPDK              14x less             3.2x                            13x less

Advance Results: Not STAC Benchmarks

8 Tighter Spread = Less Jitter
[Chart: KVS Latency in FPGA, DPDK, and Sockets. Latency comparison: 100k packets, 1 OCSM per packet, 1k pps. Vertical axis: percentage observed [%]. Horizontal axis: latency distribution [µs].]
KVS in FPGA: best latency, no jitter. Altera Stratix V RTL average: 0.467 µs
KVS in DPDK: lowers latency, some jitter. DPDK average: 6.29 µs
KVS in software: worst latency, worst jitter. Sockets average: 41.40 µs
[Inset chart: socket implementation latency distribution with one OCSM, Intel i7. Average: 41.54 µs]
Lower Latency = Faster Response
Advance Results: Not STAC Benchmarks
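
As a rough cross-check, the average latencies quoted on this slide can be turned into ratios and compared with the "88x less" and "14x less" latency figures in the summary table of slide 7. The sketch below uses only the three averages transcribed above; the function and variable names are illustrative.

```python
# Average latencies in microseconds, as transcribed from the chart on this slide.
AVERAGE_LATENCY_US = {
    "FPGA": 0.467,     # Altera Stratix V RTL average
    "DPDK": 6.29,      # DPDK average
    "Sockets": 41.40,  # Sockets average
}


def latency_ratio(baseline_us, accelerated_us):
    """How many times lower the accelerated latency is than the baseline."""
    return baseline_us / accelerated_us


fpga_us = AVERAGE_LATENCY_US["FPGA"]
ratios = {
    name: latency_ratio(latency_us, fpga_us)
    for name, latency_us in AVERAGE_LATENCY_US.items()
    if name != "FPGA"
}

for name, ratio in ratios.items():
    print(f"GDN in FPGA vs. {name}: {ratio:.1f}x lower average latency")
```

The Sockets ratio falls between 88 and 89 and the DPDK ratio between 13 and 14, which is close to the summary figures. Averages alone say nothing about jitter, which is what the spread of each distribution in the chart shows.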

9 Key Results: Gateware Defined Networking
Gateware Defined Networking (GDN):
  Lowers latency: 7x to 45x over optimized DPDK and traditional Linux networking software
  Increases throughput: 3x to 13x improvement in throughput / server
  Reduces power: 13x to 21x less power / server
Advance Results: Not STAC Benchmarks

10 Algo-Logic is Partnered to Provide Solutions on All Major Platforms

11 Thank You!
Algo-Logic Systems, Inc.
Phone: (408)
Web:
Corporate Headquarters: 2255-D Martin Ave, Santa Clara, CA


More information

Netchannel 2: Optimizing Network Performance

Netchannel 2: Optimizing Network Performance Netchannel 2: Optimizing Network Performance J. Renato Santos +, G. (John) Janakiraman + Yoshio Turner +, Ian Pratt * + HP Labs - * XenSource/Citrix Xen Summit Nov 14-16, 2007 2003 Hewlett-Packard Development

More information

Rapid Platform Deployment: Allows clients to concentrate their efforts on application software.

Rapid Platform Deployment: Allows clients to concentrate their efforts on application software. Overview Aliathon Ltd. in partnership with Nallatech brings to market a demo design based on the Universal Network Probe technology described in Aliathon Application Note 06. This design demonstrate the

More information

10 G bit TCP Offload Engine + PCIe/DMA SOC IP

10 G bit TCP Offload Engine + PCIe/DMA SOC IP Intilop Corporation 4800 Great America Pkwy Ste-231 Santa Clara, CA 95054 Ph: 408-496-0333 Fax:408-496-0444 www.intilop.com 10 G bit TCP Offload Engine + PCIe/DMA SOC IP INT 10012 (Very-Low Latency XTOE+PCIe+DMA+Host_I/F)

More information

Product Overview. Programmable Network Cards Network Appliances FPGA IP Cores

Product Overview. Programmable Network Cards Network Appliances FPGA IP Cores 2018 Product Overview Programmable Network Cards Network Appliances FPGA IP Cores PCI Express Cards PMC/XMC Cards The V1151/V1152 The V5051/V5052 High Density XMC Network Solutions Powerful PCIe Network

More information

5051 & 5052 PCIe Card Overview

5051 & 5052 PCIe Card Overview 5051 & 5052 PCIe Card Overview About New Wave New Wave DV provides high performance network interface cards, system level products, FPGA IP cores, and custom engineering for: High-bandwidth low-latency

More information

100% PACKET CAPTURE. Intelligent FPGA-based Host CPU Offload NIC s & Scalable Platforms. Up to 200Gbps

100% PACKET CAPTURE. Intelligent FPGA-based Host CPU Offload NIC s & Scalable Platforms. Up to 200Gbps 100% PACKET CAPTURE Intelligent FPGA-based Host CPU Offload NIC s & Scalable Platforms Up to 200Gbps Dual Port 100 GigE ANIC-200KFlex (QSFP28) The ANIC-200KFlex FPGA-based PCIe adapter/nic features dual

More information

Towards a Software Defined Data Plane for Datacenters

Towards a Software Defined Data Plane for Datacenters Towards a Software Defined Data Plane for Datacenters Arvind Krishnamurthy Joint work with: Antoine Kaufmann, Ming Liu, Naveen Sharma Tom Anderson, Kishore Atreya, Changhoon Kim, Jacob Nelson, Simon Peter

More information

Total Cost of Ownership Analysis for a Wireless Access Gateway

Total Cost of Ownership Analysis for a Wireless Access Gateway white paper Communications Service Providers TCO Analysis Total Cost of Ownership Analysis for a Wireless Access Gateway An analysis of the total cost of ownership of a wireless access gateway running

More information

Baseband Device Drivers. Release rc1

Baseband Device Drivers. Release rc1 Baseband Device Drivers Release 19.02.0-rc1 December 23, 2018 CONTENTS 1 BBDEV null Poll Mode Driver 1 1.1 Limitations....................................... 1 1.2 Installation.......................................

More information

Network Design Considerations for Grid Computing

Network Design Considerations for Grid Computing Network Design Considerations for Grid Computing Engineering Systems How Bandwidth, Latency, and Packet Size Impact Grid Job Performance by Erik Burrows, Engineering Systems Analyst, Principal, Broadcom

More information

Designing Next Generation FS for NVMe and NVMe-oF

Designing Next Generation FS for NVMe and NVMe-oF Designing Next Generation FS for NVMe and NVMe-oF Liran Zvibel CTO, Co-founder Weka.IO @liranzvibel Santa Clara, CA 1 Designing Next Generation FS for NVMe and NVMe-oF Liran Zvibel CTO, Co-founder Weka.IO

More information

Improving DPDK Performance

Improving DPDK Performance Improving DPDK Performance Data Plane Development Kit (DPDK) was pioneered by Intel as a way to boost the speed of packet API with standard hardware. DPDK-enabled applications typically show four or more

More information

Building a Platform Optimized for the Network Edge

Building a Platform Optimized for the Network Edge Building a Platform Optimized for the Network Edge MPLS + SDN + NFV WORLD 2018 Nicolas Bouthors, Enea Innovation Agenda Software Virtualization - Key Requirements Leveraging DPDK Multi-Function VNFs at

More information

Chapter 4. Routers with Tiny Buffers: Experiments. 4.1 Testbed experiments Setup

Chapter 4. Routers with Tiny Buffers: Experiments. 4.1 Testbed experiments Setup Chapter 4 Routers with Tiny Buffers: Experiments This chapter describes two sets of experiments with tiny buffers in networks: one in a testbed and the other in a real network over the Internet2 1 backbone.

More information

CSE398: Network Systems Design

CSE398: Network Systems Design CSE398: Network Systems Design Instructor: Dr. Liang Cheng Department of Computer Science and Engineering P.C. Rossin College of Engineering & Applied Science Lehigh University March 14, 2005 Outline Classification

More information

Programmable NICs. Lecture 14, Computer Networks (198:552)

Programmable NICs. Lecture 14, Computer Networks (198:552) Programmable NICs Lecture 14, Computer Networks (198:552) Network Interface Cards (NICs) The physical interface between a machine and the wire Life of a transmitted packet Userspace application NIC Transport

More information

Meltdown and Spectre Interconnect Performance Evaluation Jan Mellanox Technologies

Meltdown and Spectre Interconnect Performance Evaluation Jan Mellanox Technologies Meltdown and Spectre Interconnect Evaluation Jan 2018 1 Meltdown and Spectre - Background Most modern processors perform speculative execution This speculation can be measured, disclosing information about

More information

Fusion Engine Next generation storage engine for Flash- SSD and 3D XPoint storage system

Fusion Engine Next generation storage engine for Flash- SSD and 3D XPoint storage system Fusion Engine Next generation storage engine for Flash- SSD and 3D XPoint storage system Fei Liu, Sheng Qiu, Jianjian Huo, Shu Li Alibaba Group Santa Clara, CA 1 Software overhead become critical Legacy

More information

with Sniffer10G of Network Adapters The Myricom ARC Series DATASHEET

with Sniffer10G of Network Adapters The Myricom ARC Series DATASHEET The Myricom ARC Series of Network Adapters with Sniffer10G Lossless packet processing, minimal CPU overhead, and open source application support all in a costeffective package that works for you Building

More information

An Intelligent NIC Design Xin Song

An Intelligent NIC Design Xin Song 2nd International Conference on Advances in Mechanical Engineering and Industrial Informatics (AMEII 2016) An Intelligent NIC Design Xin Song School of Electronic and Information Engineering Tianjin Vocational

More information

A-GEAR 10Gigabit Ethernet Server Adapter X520 2xSFP+

A-GEAR 10Gigabit Ethernet Server Adapter X520 2xSFP+ Product Specification NIC-10G-2BF A-GEAR 10Gigabit Ethernet Server Adapter X520 2xSFP+ Apply Dual-port 10 Gigabit Fiber SFP+ server connections, These Server Adapters Provide Ultimate Flexibility and Scalability

More information