
NET1343BU NSX Performance Samuel Kommu #VMworld #NET1343BU

Disclaimer This presentation may contain product features that are currently under development. This overview of new technology represents no commitment from VMware to deliver these features in any generally available product. Features are subject to change, and must not be included in contracts, purchase orders, or sales agreements of any kind. Technical feasibility and market demand will affect final delivery. Pricing and packaging for any new technologies or features discussed or presented have not been determined.

Objectives
1. NSX Performance Characteristics
2. Recommendations for Optimal Performance
3. Benchmarking Considerations

Agenda
1. NSX Overview of the Overlays
2. Performance Tuning
3. VMware NIC Compatibility Guide
4. East-West Performance
5. North-South Performance
6. IETF
7. Summary & QA

NSX Overview Quick Overview of Overlays

VXLAN Frame Format
VXLAN is an IP-over-UDP encapsulation performed by the VTEPs. The encapsulated frame consists of:
- Outer Ethernet Header (14 bytes): Outer Dest MAC, Outer Source MAC, optional Outer 802.1Q, Ether Type
- Outer IP Header (20 bytes): IP Protocol, Header Checksum, Outer Source IP, Outer Dest IP
- Outer UDP Header (8 bytes): Source Port, Dest Port (8472), UDP Length, UDP Checksum
- VXLAN Header (8 bytes): VXLAN Flags, Reserved, VXLAN Network Identifier (VNI), Reserved
- Original Ethernet Frame: Inner Dest MAC, Inner Source MAC, optional Inner 802.1Q, Ether Type, Original Ethernet Payload, FCS
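To make the 8-byte VXLAN header concrete, here is a minimal sketch of packing it as laid out above (illustrative only; the VNI value is made up):

```python
# Minimal sketch of the 8-byte VXLAN header described above. The VNI value is
# made up for illustration; flags=0x08 sets the I bit (a valid VNI is present).
import struct

def vxlan_header(vni: int) -> bytes:
    flags = 0x08 << 24                 # I flag in the top byte, rest reserved
    vni_field = (vni & 0xFFFFFF) << 8  # 24-bit VNI followed by 8 reserved bits
    return struct.pack("!II", flags, vni_field)

hdr = vxlan_header(5001)
print(len(hdr), hdr.hex())             # 8 bytes: 0800000000138900

# The full encapsulated frame is then:
# outer Ethernet + outer IP + outer UDP (dst 8472) + hdr + original Ethernet frame
```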

Geneve Frame Format
Geneve uses the same IP-over-UDP encapsulation model, performed by the TEPs, but with a variable-length header:
- Outer Ethernet Header (14 bytes): Outer Dest MAC, Outer Source MAC, optional Outer 802.1Q, Ether Type
- Outer IP Header (20 bytes): IP Protocol, Header Checksum, Outer Source IP, Outer Dest IP
- Outer UDP Header (8 bytes): Source Port, Dest Port (6081), UDP Length, UDP Checksum
- Geneve Header (8+ bytes): Version, Length, Flags, Protocol Type, VNI, variable-length Options
- Original Ethernet Frame: Inner Dest MAC, Inner Source MAC, optional Inner 802.1Q, Ether Type, Original Ethernet Payload, FCS
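A matching sketch of the variable-length Geneve header (illustrative; the VNI and option bytes are placeholders, and protocol type 0x6558 indicates an encapsulated Ethernet frame):

```python
# Sketch of the variable-length Geneve header described above: 8 base bytes
# plus options in 4-byte multiples. VNI and options are placeholders.
import struct

def geneve_header(vni: int, options: bytes = b"") -> bytes:
    assert len(options) % 4 == 0                  # options come in 4-byte units
    ver_optlen = (0 << 6) | (len(options) // 4)   # version 0, option length
    flags = 0x00                                  # O and C bits clear
    proto = 0x6558                                # Transparent Ethernet Bridging
    base = struct.pack("!BBH", ver_optlen, flags, proto)
    base += struct.pack("!I", (vni & 0xFFFFFF) << 8)  # 24-bit VNI + reserved byte
    return base + options

print(len(geneve_header(5001)))            # 8 bytes with no options
print(len(geneve_header(5001, bytes(8))))  # 16 bytes with one 8-byte option
```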

Performance Tuning: Parameters that matter

MTU / MSS (Maximum Transmission Unit / Maximum Segment Size)
Two ESX hosts (ESX 1 and ESX 2) with 40G pNICs connect through a 40G switch. The pNIC/underlay MTU is set to 1600 or 9000 (the maximum packet size allowed on the fabric), while the VM vNIC MTU is 1500 or 8900, leaving room for the overlay encapsulation headers.
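As a rough illustration of the sizing arithmetic above (a sketch; the ~50-byte overhead assumes VXLAN, or Geneve without options):

```python
# Overlay MTU/MSS arithmetic for the values above. Overhead assumes VXLAN
# (or Geneve without options); Geneve overhead grows with TLV options.
INNER_ETH = 14      # inner Ethernet header carried inside the UDP payload
OUTER_IP = 20       # outer IPv4 header
OUTER_UDP = 8       # outer UDP header
OVERLAY_HDR = 8     # VXLAN header (Geneve is 8+ bytes)

ENCAP_OVERHEAD = INNER_ETH + OUTER_IP + OUTER_UDP + OVERLAY_HDR  # 50 bytes

def max_vm_mtu(underlay_mtu: int) -> int:
    """Largest vNIC MTU whose encapsulated packet still fits the underlay MTU."""
    return underlay_mtu - ENCAP_OVERHEAD

def tcp_mss(vm_mtu: int) -> int:
    """Guest TCP MSS: vNIC MTU minus inner IPv4 (20) and TCP (20) headers."""
    return vm_mtu - 20 - 20

print(max_vm_mtu(1600))   # 1550 -> a 1500-byte vNIC MTU fits comfortably
print(max_vm_mtu(9000))   # 8950 -> the slide's 8900 leaves extra headroom
print(tcp_mss(1500))      # 1460
```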

TSO for Overlay Traffic
With NIC-based TSO, the VM hands down a large TCP segment (up to 65K); the overlay headers (MAC/IP/UDP/VXLAN) are applied once and the NIC hardware segments the payload into MTU-sized encapsulated packets on the wire. With CPU-based TSO, the hypervisor performs that segmentation in software, spending CPU cycles on every MTU-sized packet. For best performance, use hardware TSO for overlay traffic.
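A conceptual sketch of what the segmentation step does (illustrative only; with hardware TSO the NIC performs this split, not the host CPU):

```python
# Conceptual TCP segmentation for overlay traffic: one large send is split
# into MSS-sized segments, each of which then gets its own inner headers and
# its own VXLAN/UDP/IP/Ethernet encapsulation. Illustrative, not the ESXi datapath.

def segment(payload: bytes, mss: int = 1460):
    return [payload[i:i + mss] for i in range(0, len(payload), mss)]

big_send = bytes(65000)       # a 65K send from the guest TCP stack
segments = segment(big_send)
print(len(segments))          # 45 wire packets at a 1500-byte MTU
```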

LRO for Overlay Traffic
On receive, LRO coalesces the MTU-sized encapsulated packets arriving from the physical fabric (1500/9000) back into large ~32K TCP segments. Software LRO is available on ESX 6.5, and all components in this path, including the Logical Switch, Firewall, etc., work with 32K segments. Some NICs, such as the QLogic 45000 series, support LRO for overlay traffic in hardware.
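A toy sketch of the coalescing idea (illustrative only, not the ESXi or NIC implementation; the 32K cap mirrors the figure above):

```python
# Toy large-receive-offload coalescing: merge consecutive payloads of the same
# flow into aggregates of up to ~32K before handing them up the stack.
from collections import defaultdict

MAX_AGGREGATE = 32 * 1024

def lro_coalesce(packets):
    """packets: iterable of (flow_key, payload) tuples in arrival order."""
    pending = defaultdict(bytes)
    out = []
    for flow, payload in packets:
        if len(pending[flow]) + len(payload) > MAX_AGGREGATE:
            out.append((flow, pending[flow]))   # flush a full aggregate
            pending[flow] = b""
        pending[flow] += payload
    out.extend(pending.items())                 # flush whatever remains
    return out

pkts = [("flow-a", bytes(1460))] * 30
print([len(p) for _, p in lro_coalesce(pkts)])  # [32120, 11680]: one ~32K aggregate plus the remainder
```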

Receive Side Scaling (RSS): Without RSS
Without receive side scaling, all traffic is handled by a single core (Core 1 at 100% usage while the other cores sit at 0%). Multiple cores are not used even if available, so traffic throughput is capped by a single core's capacity.

Receive Side Scaling (RSS): With RSS Enabled
The network adapter has multiple queues to handle receive traffic. A 5-tuple-based hash (Src/Dest IP, Src/Dest MAC and Src Port) distributes traffic across the queues, and a kernel thread per receive queue helps leverage multiple CPU cores (for example, ~20% usage on each core instead of 100% on one).
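An illustrative sketch of the queue-selection idea using the fields listed above (real NICs use a Toeplitz hash and an indirection table; this simplified version only shows how varying outer UDP source ports spread overlay flows across queues):

```python
# Simplified RSS-style queue selection over the outer-header fields the slide
# lists. Real hardware uses a Toeplitz hash with an indirection table.
import zlib

NUM_QUEUES = 4

def select_queue(src_ip, dst_ip, src_mac, dst_mac, src_port):
    key = f"{src_ip}|{dst_ip}|{src_mac}|{dst_mac}|{src_port}".encode()
    return zlib.crc32(key) % NUM_QUEUES

# Different outer UDP source ports (derived per inner flow by the source VTEP)
# give different hash inputs, so flows can spread across the receive queues.
print(select_queue("10.0.0.1", "10.0.0.2", "aa:bb", "cc:dd", 50001))
print(select_queue("10.0.0.1", "10.0.0.2", "aa:bb", "cc:dd", 50187))
```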

Rx/Tx Filters
Same picture as RSS (multiple queues, a kernel thread per queue, load spread across cores), but the inner packet headers are used to queue traffic.

Native Driver
With the native driver in ESX 6.5 there is no VMKLinux translation layer between the VMkernel data structures and the driver, so fewer CPU cycles are spent on translation in the datapath.

# of vNICs vs # of Queues
Scale up throughput from a single VM by using multiple vNICs.

# of Queues on a single vNIC
Scale up throughput on a single vNIC by leveraging multiple queues.

VMware NIC Compatibility Guide What features does my NIC card support?

How To: VMware Compatibility Guide. Where is it? https://www.vmware.com/resources/compatibility/search.php?devicecategory=io

How To: VMware Compatibility Guide. Intel 710: Search for a driver supporting VXLAN offload
1. Select brand name
2. Select the version of ESX
3. Card model name, if available
4. I/O Device Type
5. Features of interest
6. Click on Update and View Results

How To: VMware Compatibility Guide. Intel 710: Search results for a driver supporting VXLAN offload
1. From the results, select the specific card (10GbE SFP+)
2. Click on the ESX version

How To: VMware Compatibility Guide. Intel 710: Search results, continued
Click on [+] to expand and check the features supported.

How To: VMware Compatibility Guide. Intel 710: Found! Looks like this one supports a bunch of features.

Traffic Profiles In a Datacenter

End-to-End Traffic Flows: East-West & North-South
East-West traffic between VMs on logical switches (LS1, LS2) across ESX hosts is a mix of bandwidth-hungry and other flows. North-South traffic to the rest of the DC/cloud passes through the NSX Edge and the ToR, and carries a larger mix of packet sizes with higher PPS requirements.

Typical Applications and Traffic Profiles in a Datacenter
- Long flows (designed to maximize bandwidth): logs, backups, FTP, web servers
- Short flows (lower bandwidth requirements): databases, especially in-memory ones or cache layers; even in those cases bulk requests are made for efficiency (refer to Facebook's work on Memcache)
- Small packets (<200 bytes; generally no SLAs, with some rare cases where latency is important): DNS, DHCP, TCP ACKs, keep-alive messages
The majority of packets are a mix of different sizes: larger throughput-heavy flows and smaller latency-sensitive flows.

Typical Applications and Traffic Profiles: Physical vs Virtual
- Physical fabric (any): long flows (>1500 bytes) are throughput focused, short flows (<=1500 bytes) are not bandwidth hungry; packet size is ~1500 bytes by default and may be tuned to even higher values (the NICs do the heavy lifting).
- Virtual fabric (E/W): packet sizes are effectively 32K-64K thanks to TSO and LRO, and are a function of CPU type/speed.

Application Layer Performance: Benchmarking Tools, which ones and why (relevance for virtual infrastructure, E/W)
- iperf 2: exercises the TSO/LRO implementation, NIC cards and multiple cores
- iperf 3: similar to iperf 2 but single core; benchmarking only
- NetPerf: similar to iperf with a few more bells and whistles
- PktGen: for some of the N/S benchmarking, especially with the Bare Metal Edge
- Application level (Apache Benchmark, Memcache benchmarks): application-level benchmarks for throughput and latency

East-West Performance Characteristics

Setup for East-West Throughput Performance
Servers: ESXi 6.5 U1; Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz; Hyper-Threading enabled; 256 GB RAM; MTU 9000
NIC card details: Intel XL710 (40 GbE); driver i40e; version 1.4.3
Virtual machines: RedHat 6; 2 vCPU; 2 GB RAM; VMXNET3; MTU 1500 / 8900

End-to-End Traffic Flows: East-West Traffic Flow
Same topology as before, highlighting the east-west path: VM1 on ESX 1 to VM2 on ESX 2 across the physical switch, without traversing the NSX Edge.

Topology: Logical Switch
VM1-VM4 on Host 1 and VM5-VM8 on Host 2, all attached to Logical Switch 1; the hosts connect through a switch via 40G pNICs.

Topology: Tier-1 Router
VMs on Logical Switch 1 (Host 1) and VMs on Logical Switch 2 (Host 2), with the two logical switches connected by a Tier-1 router; the hosts connect through a switch via 40G pNICs.

Topology: Tier-0 Router
Same setup, but the two logical switches are connected by a Tier-0 router.

TCP Throughput ESX (East-West)
Chart: TCP throughput (Gbps) with 1500 and 8900 MTU across the Logical Switch, Tier-0 Router and Tier-1 Router topologies.
Benchmarking methodology: iperf 2.0.5, options -P 4 -t 30, across 4 VM pairs with 4 threads per VM pair.
Observations: ~25 Gbps with 1500 MTU; close to line rate with 9K MTU. To achieve ~80 Gbps, use 2 x 40GbE NIC cards.
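A minimal sketch of driving that methodology from a test harness (hypothetical helper; assumes iperf 2 is installed in the guests, the receiver VMs are already running `iperf -s`, and the IP addresses are placeholders):

```python
# Hypothetical harness for the methodology above: iperf 2 with -P 4 -t 30,
# one sender process per VM pair, run in parallel. Receiver IPs are placeholders
# and each receiver is assumed to be running "iperf -s" already.
import subprocess

RECEIVERS = ["192.168.10.11", "192.168.10.12", "192.168.10.13", "192.168.10.14"]

def run_pair(dst_ip: str) -> subprocess.Popen:
    # -P 4: four parallel TCP streams per VM pair, -t 30: run for 30 seconds
    return subprocess.Popen(["iperf", "-c", dst_ip, "-P", "4", "-t", "30"],
                            stdout=subprocess.PIPE, text=True)

procs = [run_pair(ip) for ip in RECEIVERS]   # 4 VM pairs in parallel
for p in procs:
    out, _ = p.communicate()
    print(out.splitlines()[-1])              # per-pair summary ([SUM]) line
```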

North-South: VM Edge

DPDK
The Edge is DPDK enabled: high forwarding performance (poll mode driver, queue manager, buffer manager, etc.) and a linear performance increase with the addition of cores. More info can be found at http://www.intel.com/go/dpdk

Fast Path
The first packet of a new flow goes through the full set of actions (Action 1 through Action n); subsequent packets take the fast path, using about 75% fewer CPU cycles. Without a hash table, the fast path applies to a cluster of packets that arrive together; with a hash table, it applies to an entire flow.
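A toy illustration of the flow-cache idea behind the fast path (not the Edge implementation): the first packet of a flow runs the full pipeline and the resulting action list is cached against the flow key, so later packets skip the expensive lookups.

```python
# Toy flow cache illustrating the fast-path idea above (not the NSX Edge code).
flow_cache = {}

def slow_path(flow_key):
    """Full pipeline lookup: routing, firewall, encap decisions (placeholders)."""
    return ["decap", "route", "firewall_allow", "encap"]

def process(packet):
    flow_key = (packet["src_ip"], packet["dst_ip"],
                packet["proto"], packet["src_port"], packet["dst_port"])
    actions = flow_cache.get(flow_key)
    if actions is None:                  # new flow: take the slow path once
        actions = slow_path(flow_key)
        flow_cache[flow_key] = actions
    return actions                       # fast path for every later packet

pkt = {"src_ip": "10.0.0.1", "dst_ip": "10.0.0.2",
       "proto": 6, "src_port": 40000, "dst_port": 443}
print(process(pkt))   # slow path on the first packet
print(process(pkt))   # cached fast path afterwards
```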

ESG Performance Environment
Simple end-to-end topology: a single Edge gateway with two uplinks, one for overlay and the other for VLAN.
Performance measurement: iperf (true application TCP performance), multi-thread, multi-VM.
Single Edge Gateway characteristics: Hyper-Threading disabled, CPU power saving disabled, 4 vCPU VM form factor, MTU changed based on benchmark, VMXNET3.
Servers: ESX 6.0 U2 / 6.5 U1; NSX 6.2 / NSX-T; Dell PowerEdge R720 / Intel; 2 x Intel Xeon CPU E5-2680 v2 @ 2.80GHz / 2 x E5-2699 v4 @ 2.20GHz; Hyper-Threading enabled (disabled on the ESG node); 128 GB RAM; MTU 1600 or 9000.
NIC card details: Intel XL710; driver i40e; version 1.4.26-1OEM.550.0.0.1331820.
Virtual machines: RHEL 6 (64-bit); 2 vCPU; 2 GB RAM; MTU 1500; VMXNET3.
ESG details: form factor Quad-Large / Medium (4 vCPU); iperf version 2.0.5.

Topology: NSX-V Edge
VXLAN side: VM3 (172.16.214.90) and VM4 (172.16.214.91) behind the DLR on ESX 1. VLAN side: VM1 (20.20.20.3) and VM2 (20.20.20.6) on a VDS on ESX 2. The ESG (with VPN) connects the VXLAN and VLAN segments; the hosts use 40G pNICs attached to a 40G switch.

Traffic Flow
Traffic flows from the VXLAN VMs (VM3, VM4) on ESX 1 through the DLR and the ESG to the VLAN VMs (VM1, VM2) on ESX 2, across the 40G switch.

Topology: NSX-T VM Edge
VMs on an overlay logical switch (LS1) on ESX 1 send traffic through the NSX Edge VM on ESX 2, which routes from the overlay logical switch onto a VLAN towards the ToR. MTU 1500.

Edge Throughput
Chart: NSX-V ESG and NSX-T VM Edge throughput (Gbps) using iperf.
- The ESG is not limited to 10G and can take advantage of 25/40G ports
- 9K MTU gives close to line-rate throughput
- Tune end to end for best results: hard drive speed, application tuning
Setup details: 4 x vCPU Edge (Quad Large / Medium), hyper-threading disabled on the host, iperf 2, 2 x Intel E5-2680 v2 @ 2.80GHz (NSX-V) and 2 x Intel E5-2699 v4 @ 2.20GHz (NSX-T), sender and receiver VMs with 2 x vCPU per VM, Intel XL710 40G card.

North-South: Bare Metal Edge

Setup for North-South Throughput Performance
Servers: Bare Metal Edge; Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz; Hyper-Threading enabled; 256 GB RAM; MTU 1500
NIC card details: Intel X520; driver ixgbe; version in-box
Virtual machines: RedHat 6; 2 vCPU; 2 GB RAM; VMXNET3; MTU 1500 / 8900

Bare Metal Edge Topology
VM1 through VM8, one per host (ESX 1 through ESX 8), all on Logical Switch 1, send traffic through the Bare Metal Edge towards the ToR; bandwidth usage is measured at the Edge's pNICs. MTU 1500.

Throughput with 256-byte packets: line rate at the 10 Gbps port. This is not the true limit of the Bare Metal Edge; the bottleneck is on the sender/receiver side.
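For context, a rough back-of-the-envelope calculation (assuming standard Ethernet framing overhead) of what line rate at 10 Gbps means in packets per second for 256-byte frames:

```python
# Line-rate packet-rate math for the result above. Assumes standard Ethernet
# framing: 7-byte preamble + 1-byte SFD + 12-byte inter-frame gap per frame,
# with the frame size counting the L2 header and FCS.
LINE_RATE_BPS = 10e9
FRAMING_OVERHEAD = 7 + 1 + 12           # preamble, SFD, inter-frame gap

def line_rate_pps(frame_size: int) -> float:
    bits_on_wire = (frame_size + FRAMING_OVERHEAD) * 8
    return LINE_RATE_BPS / bits_on_wire

print(f"{line_rate_pps(256) / 1e6:.2f} Mpps")    # ~4.53 Mpps at 256 bytes
print(f"{line_rate_pps(1518) / 1e6:.2f} Mpps")   # ~0.81 Mpps at 1518 bytes
```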

Setup for North-South Throughput Performance (servers)
Processor: Intel Xeon CPU E5-2667 v2 @ 3.30GHz; # of sockets: 2; # of logical cores: 16; Hyper-Threading: disabled; RAM: 64 GB; NIC: Intel X540-AT2

IETF Benchmarking Considerations for Virtual Infrastructures

Internet Engineering Task Force: active Internet-Draft. Please share your comments.

In Summary

In Summary: Inter-Host Bottlenecks
- Single-core limits: 5 to 20 Gbps per core, based on MTU
- Multi-core: roughly 4x the single-core limits; can go slightly beyond 80G
- PCIe 3.0 limitations: ~8 Gbps per lane; most NICs are x8 (8-lane), giving a ~64 Gbps limit per NIC
- Use two NICs for >40G throughput
- For high throughput: 2 x 8-lane NICs or 1 x 16-lane NIC, higher MTU, and TSO, LRO & RSS enabled cards
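As a sanity check on the per-lane and per-slot numbers above (a rough sketch; usable bandwidth is further reduced by PCIe protocol overheads such as TLP headers):

```python
# Rough PCIe 3.0 bandwidth check for the figures above: 8 GT/s per lane with
# 128b/130b encoding. TLP/DLLP protocol overhead is ignored, so real usable
# throughput is somewhat lower than these numbers.
GT_PER_SEC = 8e9                 # PCIe 3.0 transfer rate per lane
ENCODING = 128 / 130             # 128b/130b line-encoding efficiency

def pcie3_gbps(lanes: int) -> float:
    return GT_PER_SEC * ENCODING * lanes / 1e9

print(f"per lane: {pcie3_gbps(1):.2f} Gbps")   # ~7.9 Gbps, the slide's ~8 Gbps
print(f"x8 slot:  {pcie3_gbps(8):.1f} Gbps")   # ~63 Gbps, the slide's ~64 Gbps
print(f"x16 slot: {pcie3_gbps(16):.1f} Gbps")  # ~126 Gbps, headroom for 2 x 40G ports
```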

Summary
- Closer to the application layer: large packets are processed at the source with very little overhead
- It's not all software: hardware offloads include RSS, VXLAN TSO, LRO and SSL
- Know your hardware and application limitations: use 2 PCIe slots instead of 1 if looking for greater than 40G bandwidth
- Figure out what's really important for your application: throughput? latency?