Agilio CX 2x40GbE with OVS-TC


PERFORMANCE REPORT

OVS-TC with an Agilio CX SmartNIC can improve a simple L2 forwarding use case at least 2X. When scaled to real-life use cases with complex rules, tunneling and more flows, the Netronome solution provides an average improvement of 16X over the competition.

CONTENTS
1. LIST OF ACRONYMS
2. OVERVIEW
3. AGILIO OVS-TC ARCHITECTURE OVERVIEW
4. TESTING METHODOLOGY
4.1. PHY-OVS-PHY
4.2. PHY-VM-PHY
5. FLOW SCALABILITY
5.1. PHY-VM-PHY GUEST.TESTPMD.VF
5.1.1. FRAME RATE
5.1.2. THROUGHPUT
5.2. PHY-VM-PHY VXLAN - GUEST.TESTPMD.VF
5.2.1. FRAME RATE
5.2.2. THROUGHPUT
5.2.3. LATENCY
6. RULE COMPLEXITY
6.1. PHY-OVS-PHY
6.1.1. FRAME RATE @ 86B
6.2. PHY-VM-PHY GUEST.TESTPMD.VF
6.2.1. FRAME RATE @ 86B
6.3. PHY-VM-PHY VXLAN - GUEST.TESTPMD.VF
6.3.1. FRAME RATE @ 130B
7. SUMMARY
8. APPENDIX
8.1. INFRASTRUCTURE SPECIFICATIONS
8.1.1. DPDK BUILD SPECIFICATIONS
8.1.2. BOOT SETTINGS
8.1.3. OPERATING SYSTEM SETTINGS

1. LIST OF ACRONYMS

DPDK: Intel's Data Plane Development Kit
Gb/s: Throughput unit: Gigabits per second
Mb/s: Throughput unit: Megabits per second
Mpps: Frame rate unit: Million packets per second
NIC: Network Interface Card
OVS: Open vSwitch
PF: PCIe Physical Function
PMD: DPDK Poll-Mode Driver
SR-IOV: PCIe Single Root Input/Output Virtualization
TC: Traffic Control
VF: PCIe Virtual Function
VM: Guest Virtual Machine
XVIO: Netronome's Express Virtio
VNF: Virtual Network Function

2. OVERVIEW

OVS is the most commonly used virtual switch datapath. Kernel-OVS is implemented as a match/action forwarding engine based on flows that are inserted, modified or removed by user space. OVS was later extended with an additional user space datapath based on DPDK. The addition of OVS-DPDK improved performance but also created new challenges: OVS-DPDK bypasses the Linux kernel networking stack, requires third-party modules and implements its own security model for user space access to networking hardware. DPDK applications are also more difficult to configure optimally, and while OVS-DPDK management solutions exist, debugging can become a challenge without access to the tools generally available for the Linux kernel networking stack. As more networking engineers run into this complexity, the demand for simpler, continuously improved solutions grows.

OVS using TC is the newest kernel-based approach. It improves upon Kernel-OVS and OVS-DPDK by providing a standard upstream interface for hardware acceleration. This report discusses how an offloaded OVS-TC solution performs against software-based OVS-DPDK. OVS-TC with an Agilio CX SmartNIC can improve a relatively simple Layer 2 (L2) forwarding use case at least 2X. When scaled to real-life use cases with complex rules (see Section 6), tunneling and more flows, the Netronome solution provides an average improvement of 16X over the competition.

[Figure 2.1: 1-port Mpps performance comparison, PHY-VM-PHY VXLAN, Agilio CX vs. Intel XL710 2x40GbE. Y-axis: million packets per second; X-axis: packet size (64B to 1518B). Series: OVS-TC flows/rules (1K, 64K, 128K) using 1 CPU core, and OVS-DPDK flows/rules (1K, 64K, 128K) using 8 CPU cores. Annotated improvement factors for the complex use cases with more flows and rules: 24X, 13X and 11X.]

[Figure 2.2: Netronome performance advantage for average packet sizes between 64B and 512B. Y-axis: performance improvement factor in Mpps, Netronome vs. competition; X-axis: complexity in flows (F) and flows+rules (F+R), from simple use cases (1F, 1K F, 64K F) to real-life use cases (64K F+R, 256K F+R, 1M F+R).]
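The offload evaluated here is driven through standard upstream interfaces rather than a proprietary control plane. As context for the architecture described in the next section, the sketch below shows how OVS-TC hardware offload is typically switched on; the port names and the Ubuntu service name are assumptions, not details taken from this report.

```python
#!/usr/bin/env python3
"""Sketch: enable OVS-TC (TC flower) hardware offload.

The SmartNIC netdev names and the openvswitch-switch service name are
assumptions; substitute the values that apply on the host under test.
"""
import subprocess

def run(cmd):
    # Echo and execute a command, raising if it fails.
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

PHY_PORTS = ["enp3s0np0", "enp3s0np1"]  # hypothetical SmartNIC physical ports

# 1. Enable the TC hardware offload feature flag on each physical port.
for port in PHY_PORTS:
    run(["ethtool", "-K", port, "hw-tc-offload", "on"])

# 2. Ask OVS to push kernel datapath flows down via TC instead of keeping
#    them purely in software.
run(["ovs-vsctl", "set", "Open_vSwitch", ".", "other_config:hw-offload=true"])

# 3. The hw-offload setting is read at daemon start, so restart OVS.
run(["systemctl", "restart", "openvswitch-switch"])
```

Traffic that cannot be handled in hardware still has the kernel datapath shown in Figure 3.1 to fall back on.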

3. AGILIO OVS-TC ARCHITECTURE OVERVIEW

Figure 3.1 depicts the OVS-TC data plane being offloaded to the Netronome SmartNIC.

[Figure 3.1: Agilio OVS-TC architectural overview. User space holds the ovsdb-server and ovs-vswitchd components; the kernel holds the OVS kernel datapath and the TC datapath (match tables, actions, tunnels and future features). The TC datapath is offloaded to the Agilio CX SmartNIC, which either switches packets directly or delivers them to the host.]

By running the OVS-TC data plane functions on the Agilio SmartNIC, the TC datapath is dramatically accelerated while higher-level functionality and features remain under software control, retaining the leverage and benefits derived from a sizeable open source development community.

OVS contains a user space agent and a kernel-based datapath. The user space agent is responsible for switch configuration and flow table population. As shown in Figure 3.1, it is broken up into two fundamental components: ovs-vswitchd and ovsdb-server. The user space agent can accept configuration via a local command line (e.g. ovs-vsctl) as well as from a remote controller. When using a remote controller, it is common to use the OVSDB protocol to interact with ovsdb-server for OVS configuration.

The kernel TC datapath is where the Agilio offload hooks are inserted. With this solution, the OVS software still runs on the server, but the OVS-TC datapath match/action state is synchronized down to the Agilio SmartNIC via hooks provided in the Linux kernel.

4. TESTING METHODOLOGY

Repeatability is essential when testing complex systems and is often difficult to achieve. Running performance tests and reviewing results are not trivial tasks, as expert knowledge is required across multiple technology fields and host configurations, such as BIOS optimization, Linux host optimization, Kernel-based Virtual Machine (KVM)/Quick Emulator (QEMU) optimization, CPU pinning and process affinity, OVS configuration, etc.

The key goal of the benchmarking is to quantify OVS datapath performance and the ability to deliver data to and from host or guest applications. The tests are designed to make use of DPDK so that the offloaded OVS datapath performance is evaluated rather than the Linux IP networking stack; more details are provided in the Infrastructure Specifications section of the appendix. Ixia was used as an external hardware traffic generator (IxNetwork 8.41), which can transmit and receive traffic at line rate. An example DPDK application, testpmd, was used as a VNF to forward traffic.
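The exact testpmd invocation is not given in the report. Below is a plausible sketch of how the guest-side VNF could be launched, using the queue, descriptor and burst values listed in the appendix (Section 8.1.1); the VF PCI addresses and CPU core list are hypothetical.

```python
#!/usr/bin/env python3
"""Sketch: launch DPDK testpmd as the packet-forwarding VNF in the guest.

Queue, descriptor and burst values follow the appendix (Section 8.1.1);
the VF PCI addresses and CPU core list are hypothetical.
"""
import subprocess

VF_PCI_ADDRS = ["0000:00:05.0", "0000:00:06.0"]  # placeholder guest VF addresses
CORES = "0-2"                                    # 1 main core + 2 forwarding cores

cmd = ["testpmd", "-l", CORES, "-n", "4"]        # EAL: cores and memory channels
for addr in VF_PCI_ADDRS:
    cmd += ["-w", addr]                          # EAL: whitelist the two VFs (DPDK 17.05 syntax)
cmd += [
    "--",                        # application arguments follow
    "--forward-mode=io",         # forward frames between the ports without modification
    "--rxq=2", "--txq=2",        # 2 RX/TX queues per virtual port (appendix)
    "--rxd=1024", "--txd=1024",  # 1024-entry descriptor rings (appendix)
    "--burst=64",                # 64-packet bursts (appendix)
]
# Without the interactive flag, testpmd starts forwarding immediately.
subprocess.run(cmd, check=True)
```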

[Figure 4.1.1: PHY-OVS-PHY topology. Traffic flows between the two SmartNIC physical ports (phy0, phy1) entirely within the OVS-TC accelerated datapath; the x86 kernel OVS datapath and the VM are not in the data path.]

[Figure 4.2.1: PHY-VM-PHY topology. Traffic enters a SmartNIC physical port, is switched by the OVS-TC accelerated datapath to an SR-IOV virtual function across the PCIe x8 bus, is forwarded by DPDK testpmd running in the VM on the x86 host, and exits through the other physical port.]

4.1. PHY-OVS-PHY

The PHY-OVS-PHY benchmarking topology measures the capability of the SmartNIC to forward traffic between physical ports. An Ixia is used to generate and sink traffic for the PHY-OVS-PHY topology. OVS OpenFlow rules were set up to forward traffic between the two physical ports. Figure 4.1.1 shows the data flow in a purely accelerated fashion.

4.2. PHY-VM-PHY

The PHY-VM-PHY benchmarking topology simulates the scenario of a VNF: incoming traffic is received by a guest application and transmitted back over a different interface. In a production environment the VNF could perform routing, deep packet inspection, NAT, firewall, caching, etc. In the PHY-VM-PHY test topology, traffic is forwarded without modification so that guest resources are dedicated to packet forwarding. In this test case the traffic crosses the PCIe bus and is forwarded in the guest using testpmd as the VNF; testpmd is an example application that is part of DPDK. The OVS OpenFlow rules are accelerated and offloaded to the Agilio SmartNIC. Figure 4.2.1 shows the data flow tested for the PHY-VM-PHY topology.

5. FLOW SCALABILITY

The flow scalability test implements simple in/out OVS OpenFlow rules and executes the same topology with different flow counts. For this report, the 1K and 1M flow traffic profiles have been highlighted to demonstrate the difference in performance between a small number of flows and a larger number of flows. A sketch of how this style of port-to-port wiring is typically configured appears below.
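The following is a minimal sketch of that wiring, with hypothetical bridge and port names (the report does not list the names used on the device under test). The tc filter query at the end is one way to confirm that the resulting datapath flows were installed as offloaded TC flower filters.

```python
#!/usr/bin/env python3
"""Sketch: PHY-OVS-PHY wiring with simple port-to-port OpenFlow rules.

The bridge and port names (br0, enp3s0np0/np1) are assumptions; the
report does not list the names used on the device under test.
"""
import subprocess

def run(cmd):
    print("+", " ".join(cmd))
    return subprocess.run(cmd, check=True, capture_output=True, text=True).stdout

BRIDGE = "br0"
P0, P1 = "enp3s0np0", "enp3s0np1"   # hypothetical SmartNIC physical ports

# Create the bridge and attach both physical ports.
run(["ovs-vsctl", "--may-exist", "add-br", BRIDGE])
run(["ovs-vsctl", "--may-exist", "add-port", BRIDGE, P0])
run(["ovs-vsctl", "--may-exist", "add-port", BRIDGE, P1])

# Simple in/out rules: whatever arrives on one port leaves on the other.
run(["ovs-ofctl", "add-flow", BRIDGE, f"in_port={P0},actions=output:{P1}"])
run(["ovs-ofctl", "add-flow", BRIDGE, f"in_port={P1},actions=output:{P0}"])

# With hw-offload enabled, kernel datapath flows are installed as TC flower
# filters on the port's ingress qdisc; listing them shows what the SmartNIC
# actually offloaded.
print(run(["tc", "-s", "filter", "show", "dev", P0, "ingress"]))
```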

[Figure 5: Flow scalability test topology. The Ixia generates traffic profiles ranging from 64 up to 1M flows toward the device under test; OVS OpenFlow rules (match: ip src + ip dest, action: out_port) forward the traffic in and out, optionally through the VM.]

5.1. PHY-VM-PHY GUEST.TESTPMD.VF

5.1.1. FRAME RATE

[Graph 5.1.1: Frame rate, PHY-VM-PHY, Agilio OVS-TC vs. OVS-DPDK (flow scalability). Y-axis: Ixia aggregate Rx frame rate (Mpps); X-axis: frame size (bytes). Series: OVS-TC with Agilio CX 2x40G and OVS-DPDK with Intel XL710 2x40G, each at 1K and 1M flows.]

Agilio OVS-TC achieves an 85% higher frame rate than OVS-DPDK at 66B for 1,000 flows, providing a clear advantage for the Agilio CX 2x40GbE SmartNIC. OVS-TC utilizes one CPU core and delivers 14.7Mpps; the Intel XL710 with OVS-DPDK uses eight CPU cores to deliver 7.9Mpps.

5.1.2. THROUGHPUT

[Graph 5.1.2: Throughput, PHY-VM-PHY, Agilio OVS-TC vs. OVS-DPDK (flow scalability). Y-axis: Ixia aggregate Rx throughput (Gb/s); X-axis: frame size (bytes). Series: OVS-TC with Agilio CX 2x40G and OVS-DPDK with Intel XL710 2x40G, each at 1K and 1M flows.]

Agilio OVS-TC outperforms OVS-DPDK at all frame sizes. The most substantial performance delta appears at the 256B and 512B frame sizes: at 512B, the Agilio CX 2x40GbE provides 34.4Gb/s of throughput compared to 27.1Gb/s for the XL710 2x40GbE OVS-DPDK solution.

5.2. PHY-VM-PHY VXLAN - GUEST.TESTPMD.VF

5.2.1. FRAME RATE

[Graph 5.2.1: Frame rate, PHY-VM-PHY with VXLAN, Agilio OVS-TC vs. OVS-DPDK (flow scalability). Y-axis: Ixia aggregate Rx frame rate (Mpps); X-axis: frame size (bytes). Series: OVS-TC with Agilio CX 2x40G (one vCPU) and OVS-DPDK with Intel XL710 2x40G, each at 1K and 1M flows.]

On this PHY-VM-PHY with VXLAN test, Agilio OVS-TC achieves a 41.3% higher frame rate than OVS-DPDK for VXLAN traffic at 130B, and it maintains a similar performance delta up to a 512B frame size. At 1,000 flows, the measurement was 11.3Mpps for the Agilio CX 2x40GbE and 8Mpps for the XL710 at 64B.
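The VXLAN results are quoted at different frame sizes than the untunneled tests because each frame carries additional encapsulation on the wire. A rough worked example follows, assuming an IPv4 outer header and no outer VLAN tag; the report does not state whether its quoted sizes refer to the inner or the encapsulated frame.

```python
"""Worked example: per-frame VXLAN encapsulation overhead (IPv4 outer, no VLAN)."""

OUTER_ETH = 14   # outer Ethernet header (no 802.1Q tag)
OUTER_IP4 = 20   # outer IPv4 header, no options
OUTER_UDP = 8    # outer UDP header
VXLAN_HDR = 8    # VXLAN header
OVERHEAD = OUTER_ETH + OUTER_IP4 + OUTER_UDP + VXLAN_HDR   # 50 bytes

for inner in (64, 128, 256, 512):
    outer = inner + OVERHEAD
    # Fraction of the on-wire frame that is tunnel overhead.
    print(f"inner {inner:4d}B -> on-wire {outer:4d}B "
          f"({OVERHEAD / outer:.0%} encapsulation overhead)")
```

The smaller the inner frame, the larger the share of link capacity consumed by the tunnel headers, which is why small-packet VXLAN frame rates are the hardest case for both solutions.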

5.2.2. THROUGHPUT

[Graph 5.2.2: Throughput, PHY-VM-PHY with VXLAN, Agilio OVS-TC vs. OVS-DPDK (flow scalability). Y-axis: Ixia aggregate Rx throughput (Gb/s); X-axis: frame size (bytes). Series: OVS-TC with Agilio CX 2x40G and OVS-DPDK with Intel XL710 2x40G, each at 1K and 1M flows.]

On this test, the Agilio CX 2x40GbE delivers 37.8Gb/s of throughput with 1,000 flows, which translates to 95% of line rate for VXLAN traffic at 512B. In the same test instance, OVS-DPDK performance remains below line rate at 27.1Gb/s.

5.2.3. LATENCY

[Graph 5.2.3: Latency, PHY-VM-PHY with VXLAN, Agilio OVS-TC vs. OVS-DPDK (flow scalability). Y-axis: latency (ns); X-axis: frame size (bytes). Series: OVS-TC with Agilio CX 2x40G and OVS-DPDK with Intel XL710 2x40G at 1K flows; at 64B the measured latency is roughly 17,000ns for OVS-TC and 23,000ns for OVS-DPDK.]

To complete this test case, latency was measured for both solutions: Agilio OVS-TC achieves an average of 25% lower latency for VXLAN tunneling than OVS-DPDK while using only 1/8th of the CPU resources.
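The line-rate percentages quoted in the throughput results above can be sanity-checked with standard Ethernet arithmetic: every frame on the wire carries an extra 20 bytes of preamble, start-of-frame delimiter and inter-frame gap, which caps both the frame rate and the L2 goodput of a port. The small helper below assumes a 40Gb/s per-port rate (matching the 2x40GbE NICs under test); treating "line rate" this way is an assumption, since the report does not spell out its accounting.

```python
"""Sanity-check helper: theoretical Ethernet line rate for a given frame size."""

PER_FRAME_OVERHEAD = 20  # preamble (7) + SFD (1) + inter-frame gap (12) bytes

def line_rate(frame_bytes: int, link_gbps: float = 40.0):
    """Return (max frame rate in Mpps, max L2 goodput in Gb/s) for one port."""
    bits_per_frame = (frame_bytes + PER_FRAME_OVERHEAD) * 8
    mpps = link_gbps * 1e9 / bits_per_frame / 1e6
    goodput_gbps = mpps * 1e6 * frame_bytes * 8 / 1e9
    return mpps, goodput_gbps

for size in (64, 128, 256, 512, 1024, 1518):
    mpps, gbps = line_rate(size)
    print(f"{size:5d}B: {mpps:6.2f} Mpps, {gbps:5.1f} Gb/s L2 goodput per 40G port")

# At 512B a single 40G port tops out at roughly 38.5 Gb/s of L2 goodput, which
# puts the 37.8 Gb/s measured for OVS-TC above close to the practical ceiling.
```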

6. RULE COMPLEXITY

The rule complexity test scales the number of OVS OpenFlow rules (matching on source and destination IP addresses) and executes the same topology with an identical number of flows (1:1 rules/flows). This test allows granular performance characteristics of both solutions to be observed with a large number of flows and rules. Each installed rule receives a uniform share of the traffic. For this report, the 64, 1K, 8K, 64K, 128K and 1M traffic profiles have been highlighted to show the difference in performance between a small number of flows and rules and a larger number. One way such a 1:1 rule set can be generated and installed is sketched below.

[Figure 6: Rule complexity test topology. The Ixia generates an increasing number of flows toward the device under test while a matching number of OVS OpenFlow rules (match: ip src + ip dest, action: out_port) is installed, scaling from simple use cases (64 flows/rules) to real-life use cases with more flows and rules (up to 1M flows/rules).]
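The report does not show how the 1:1 rule sets were installed. One straightforward approach, sketched here with hypothetical bridge, port and address choices, is to emit one rule per (source IP, destination IP) pair into a file and load the batch with ovs-ofctl add-flows; the Ixia profile then has to generate exactly those pairs so each rule sees a uniform share of the traffic.

```python
#!/usr/bin/env python3
"""Sketch: generate a 1:1 flow/rule set matching on IP src + dst.

Bridge name, output port and address ranges are hypothetical; they must
match the traffic profile configured on the Ixia.
"""
import ipaddress
import subprocess

BRIDGE = "br0"
OUT_PORT = "enp3s0np1"          # hypothetical output port
NUM_RULES = 64_000              # e.g. the 64K flows/rules test point

src_base = int(ipaddress.IPv4Address("10.0.0.1"))
dst_base = int(ipaddress.IPv4Address("20.0.0.1"))

with open("rules.txt", "w") as f:
    for i in range(NUM_RULES):
        src = ipaddress.IPv4Address(src_base + i)
        dst = ipaddress.IPv4Address(dst_base + i)
        # One OpenFlow rule per flow: match IP source + destination, output port.
        f.write(f"ip,nw_src={src},nw_dst={dst},actions=output:{OUT_PORT}\n")

# Install the whole batch in one transaction.
subprocess.run(["ovs-ofctl", "add-flows", BRIDGE, "rules.txt"], check=True)
```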

6.1. PHY-OVS-PHY

6.1.1. FRAME RATE @ 86B

[Graph 6.1.1: Frame rate @ 86B, PHY-OVS-PHY, Agilio OVS-TC vs. OVS-DPDK (1:1 flows/rules). Y-axis: Ixia aggregate Rx frame rate (Mpps); X-axis: number of OVS-TC wildcard rules (64, 1K, 8K, 64K, 128K, 1M). Series: OVS-TC with Agilio CX 2x40G; OVS-DPDK with Intel XL710 2x40G.]

The results of the standard OVS-DPDK test highlight that an increasing number of flows/rules has a negative impact on throughput: as the flows/rules increase, the frame rate drops. At a 64B packet size, OVS-DPDK delivers 17.6Mpps with 64 flows/rules, but when scaled to 1M flows/rules the performance drops by 24X to approximately 0.73Mpps. As the number of flows and rules increases, OVS-DPDK performance degrades dramatically.

6.2. PHY-VM-PHY GUEST.TESTPMD.VF

6.2.1. FRAME RATE @ 86B

[Graph 6.2.1: Frame rate @ 86B, PHY-VM-PHY, Agilio OVS-TC vs. OVS-DPDK (1:1 flows/rules). Y-axis: Ixia aggregate Rx frame rate (Mpps); X-axis: number of OVS-TC wildcard rules (64, 1K, 8K, 64K, 128K, 1M). Series: OVS-TC with Agilio CX 2x40G; OVS-DPDK with Intel XL710 2x40G.]

At an 86B packet size, OVS-TC delivers 13Mpps with 64K flows/rules while OVS-DPDK achieves 3.3Mpps. Compared to OVS-DPDK across this test, OVS-TC running on the Agilio SmartNIC performs on average 5X better than the software-only solution.

6.3. PHY-VM-PHY VXLAN - GUEST.TESTPMD.VF

6.3.1. FRAME RATE @ 130B

[Graph 6.3.1: Frame rate @ 130B, PHY-VM-PHY VXLAN, Agilio OVS-TC vs. OVS-DPDK (1:1 flows/rules). Y-axis: Ixia aggregate Rx frame rate (Mpps); X-axis: number of OVS-TC wildcard rules (64, 1K, 8K, 64K, 128K, 1M). Series: OVS-TC with Agilio CX 2x40G; OVS-DPDK with Intel XL710 2x40G.]

In this PHY-VM-PHY with VXLAN test, the average packet-per-second throughput of hardware-accelerated OVS-TC with one CPU core allocated is 10X higher than that of OVS-DPDK with eight CPU cores allocated.

7. SUMMARY

Agilio CX SmartNICs with OVS-TC significantly outperform OVS-DPDK with Intel NICs. Even when four server CPU cores with their hyperthread partners are allocated to OVS-DPDK, Agilio with OVS-TC delivers up to a 24X increase in packets-per-second throughput over OVS-DPDK.

OVS-TC leads the way in virtual switching at scale by accelerating the Linux networking stack. This benchmarking confirms that Agilio SmartNICs with OVS-TC can deliver more data to more applications running on a given server. This translates directly to improved server efficiency and a dramatic reduction in TCO, as fewer servers and less data center infrastructure (such as switches, racks and cabling) are needed for a given application workload.

8. APPENDIX

8.1. INFRASTRUCTURE SPECIFICATIONS

The benchmarking setup was performed using the following servers:

NIC: Agilio CX 2x40GbE
  Hardware:
    Motherboard: Supermicro X10DRi
    CPU: Intel(R) Xeon(R) CPU E5-2630 v4, 40 CPUs @ 2.2GHz, 2 NUMA nodes
    Memory: 4x 16GB DIMM DDR4 @ 2400 MHz
  Software:
    OS: Ubuntu 16.04.4 LTS (Xenial Xerus)
    Kernel: 4.15.0-415-generic
    Virtualization (QEMU): compiled against libvirt 3.6.0; using libvirt 3.6.0; using API: QEMU 2.10.1; running hypervisor: QEMU 2.10.1
    OVS: 2.8.1

NIC: Intel XL710 2x40GbE
  Hardware:
    Motherboard: Supermicro X10DRi
    CPU: Intel(R) Xeon(R) CPU E5-2630 v4, 40 CPUs @ 2.2GHz, 2 NUMA nodes
    Memory: 4x 16GB DIMM DDR4 @ 2400 MHz
  Software:
    OS: Ubuntu 16.04.4 LTS (Xenial Xerus)
    Kernel: 4.15.0-415-generic
    Virtualization (QEMU): compiled against libvirt 3.6.0; using libvirt 3.6.0; using API: QEMU 2.10.1; running hypervisor: QEMU 2.10.1
    OVS: 2.8.1 with DPDK enabled (v17.05.2); 8x hyperthreaded CPUs (4 physical cores with hyperthread partners); 4x RX queues per physical port; 4x TX queues per physical port; 2x RX queues per vhost-user port

Table 1: Host machine specifications

8.1.1. DPDK BUILD SPECIFICATIONS

DPDK app: testpmd (DPDK tag v17.05.2)

Build variables:
  DEFAULT_PKT_BURST = 64
  RTE_TEST_RX_DESC_DEFAULT = 1024
  RTE_TEST_TX_DESC_DEFAULT = 1024

VNF variables:
  8x hyperthreaded CPUs (4 physical cores with hyperthread partners)
  2x RX queues per virtual port
  2x TX queues per virtual port

Table 2: DPDK build parameters

8.1.2. BOOT SETTINGS

BIOS settings:
  NUMA enabled
  Enhanced Intel SpeedStep Technology disabled
  C-states and P-states disabled
  Hyper-threading enabled
  Intel Virtualization Technology for Directed I/O enabled
  CPU Power and Performance Policy: Performance
  Memory Power Optimization: Performance Optimized

Host boot settings:
  Hugepage size = 2MB; number of hugepages = 16,384 (32GB total; 16GB per NUMA node)
  Isolated CPUs: 0-9, 20-29

8.1.3. OPERATING SYSTEM SETTINGS

Linux OS service settings:
  Disabled: NetworkManager, firewalld, iptables, irqbalance

2903 Bunker Hill Lane, Suite 150, Santa Clara, CA 95054
Tel: 408.496.0022
www.netronome.com

©2018 Netronome. All rights reserved. Netronome is a registered trademark and the Netronome Logo is a trademark of Netronome. All other trademarks are the property of their respective owners. PR-AGILIO-OVS-7/18