UDP Encapsulation in Linux netdev0.1 Conference February 16, Tom Herbert

Similar documents
DPDK Tunneling Offload RONY EFRAIM & YONGSEOK KOH MELLANOX

Securing Network Traffic Tunneled Over Kernel managed TCP/UDP sockets

Cloud Networking (VITMMA02) Network Virtualization: Overlay Networks OpenStack Neutron Networking

Hardware Checksumming

Stacked Vlan - Performance Improvement and Challenges

Stacked Vlan: Performance Improvement and Challenges

What s happened to the world of networking hardware offloads? Jesse Brandeburg Anjali Singhai Jain

MPLS Segment Routing in IP Networks

Evaluation of virtualization and traffic filtering methods for container networks

On the cost of tunnel endpoint processing in overlay virtual networks

Springpath Qiang. Zu Ericsson S. Davari yahoo X. Liu Jabil January 3, This document defines a YANG data model for VxLAN protocol.

NETWORK OVERLAYS: AN INTRODUCTION

How OAM Identified in Overlay Protocols

Accelerating Load Balancing programs using HW- Based Hints in XDP

Cloud e Datacenter Networking

Transport Area Working Group

Cloud e Datacenter Networking

RFC 4301 Populate From Packet (PFP) in Linux

USER SPACE IPOIB PACKET PROCESSING

Optimizing your virtual switch for VXLAN. Ron Fuller, VCP-NV, CCIE#5851 (R&S/Storage) Staff Systems Engineer NSBU

Implementing IP in IP Tunnel

Packet Header Formats

Intel 10Gbe status and other thoughts. Linux IPsec Workshop Shannon Nelson Oracle Corp March 2018

IPSec. Overview. Overview. Levente Buttyán

Disclaimer This presentation may contain product features that are currently under development. This overview of new technology represents no commitme

rx hardening & udp gso willem de bruijn

Chapter 5 Network Layer

Module 28 Mobile IP: Discovery, Registration and Tunneling

vswitch Acceleration with Hardware Offloading CHEN ZHIHUI JUNE 2018

Measuring MPLS overhead

E : Internet Routing

VXLAN Overview: Cisco Nexus 9000 Series Switches

QuickSpecs. HP Z 10GbE Dual Port Module. Models

Intended Status: Standard Track. Dan Wing Calvin Qian. VMware. Expires: January 1, 2018 June 30, 2017

In-situ OAM. IPPM November 6 th, 2018

CSCI-GA Operating Systems. Networking. Hubertus Franke

Position of IP and other network-layer protocols in TCP/IP protocol suite

Datagram. Source IP address. Destination IP address. Options. Data

ERSPAN in Linux. A short history and review. Presenters: William Tu and Greg Rose

IPv6 over IPv4 GRE Tunnels

Identifier Locator Addressing IETF95

QUIC. Internet-Scale Deployment on Linux. Ian Swett Google. TSVArea, IETF 102, Montreal

CSE 4215/5431: Mobile Communications Winter Suprakash Datta

Accelerate Service Function Chaining Vertical Solution with DPDK

Configuring IP Tunnels

OSI Network Layer. Network Fundamentals Chapter 5. Version Cisco Systems, Inc. All rights reserved. Cisco Public 1

I Commands. iping, page 2 iping6, page 4 itraceroute, page 5 itraceroute6 vrf, page 6. itraceroute vrf encap vxlan, page 12

Data Center Configuration. 1. Configuring VXLAN

CS 356: Computer Network Architectures. Lecture 10: IP Fragmentation, ARP, and ICMP. Xiaowei Yang

Protocol-Independent FIB Architecture for Network Overlays

CIS-331 Final Exam Fall 2015 Total of 120 Points. Version 1

cs144 Midterm Review Fall 2010

Cloud Networking with OpenStack. 马力 海云捷迅 (AWcloud)

CS-580K/480K Advanced Topics in Cloud Computing. Network Virtualization

QuickSpecs. Overview. HPE Ethernet 10Gb 2-port 535 Adapter. HPE Ethernet 10Gb 2-port 535 Adapter. 1. Product description. 2.

Configuring Traffic Mirroring

Cisco IP Fragmentation and PMTUD

XDP: 1.5 years in production. Evolution and lessons learned. Nikita V. Shirokov

Open vswitch in Neutron

Objectives. Chapter 10. Upon completion you will be able to:

Alcatel-Lucent 4A Alcatel-Lucent Scalable IP Networks. Download Full Version :

BIG-IP TMOS : Tunneling and IPsec. Version 13.0

DetNet. Flow Definition and Identification, Features and Mapping to/from TSN. DetNet TSN joint workshop IETF / IEEE 802, Bangkok

Configuring Traffic Mirroring

IP Mobility Design Considerations

Agilio OVS Software Architecture

EE 610 Part 2: Encapsulation and network utilities

Introduction to Internetworking

Communication Systems DHCP

White Paper. Huawei Campus Switches VXLAN Technology. White Paper

ICN IDENTIFIER / LOCATOR. Marc Mosko Palo Alto Research Center ICNRG Interim Meeting (Berlin, 2016)

Networks. an overview. dr. C. P. J. Koymans. Informatics Institute University of Amsterdam. February 4, 2008

SFC in the DOCSIS Network James Kim Cable Television Laboratories, Inc.

OVS Acceleration using Network Flow Processors

Internet Engineering Task Force (IETF) April Applicability Statement for the Use of IPv6 UDP Datagrams with Zero Checksums

Higher scalability to address more Layer 2 segments: up to 16 million VXLAN segments.

Packetization Layer Path Maximum Transmission Unit Discovery (PLPMTU) For IPsec Tunnels

Support for Smart NICs. Ian Pratt

SR for SD-WAN over hybrid networks

Lecture 13 Page 1. Lecture 13 Page 3

Advanced Computer Networks. End Host Optimization

Fast packet processing in the cloud. Dániel Géhberger Ericsson Research

Lecture 8 Advanced Networking Virtual LAN. Antonio Cianfrani DIET Department Networking Group netlab.uniroma1.it

IPv6 over IPv4 GRE Tunnels

IPv6 over IPv4 GRE Tunnels

The IPsec protocols. Overview

INTERNET PROTOCOL SECURITY (IPSEC) GUIDE.

IP over IPv6 Tunnels. Information About IP over IPv6 Tunnels. GRE IPv4 Tunnel Support for IPv6 Traffic

Mobile Communications Chapter 9: Network Protocols/Mobile IP

Contents. Tunneling commands 1

Hillstone IPSec VPN Solution

Internet Engineering Task Force (IETF) Category: Standards Track. S. Krishnan Ericsson October 2015

Tunnel within a network

Cubro Packetmaster EX32/32(+)

IPv6 Protocol. Does it solve all the security problems of IPv4? Franjo Majstor EMEA Consulting Engineer Cisco Systems, Inc.

Chapter 20 Network Layer: Internet Protocol 20.1

Introduction to TCP/IP networking

Got Loss? Get zovn! Daniel Crisan, Robert Birke, Gilles Cressier, Cyriel Minkenberg, and Mitch Gusat. ACM SIGCOMM 2013, August, Hong Kong, China

Chapter 5 OSI Network Layer

Category: Informational July Generic Routing Encapsulation over CLNS Networks

Transcription:

UDP Encapsulation in Linux netdev0.1 Conference February 16, 2015 Tom Herbert <therbert@google.com>

Topics UDP encapsulation Common offloads Foo over UDP (FOU) Generic UDP Encapsulation (GUE)

Basic idea of UDP encap Put network packets into UDP payload Two general methods No encapsulation header: protocol of packet is inferred from port number Encapsulation header: extra header between UDP header and packet. Protocol and other data can be there. For example: Data TCP Data UDP GUE TCP Data ETH UDP GUE TCP Data

VM encap example Application 1 1 Application Guest kernel 2 2 Guest kernel Host kernel Encapsulation Decapsulation Host kernel Encapsulator NIC driver 3 4 3 4 Decapsulator NIC driver 4

UDP encap popularity UDP works with existing HW infrastructure RSS in NICs, ECMP in switches Checksum offload Used in nearly all encap, NV data protocols VXLAN, LISP, MPLS, GUE, Geneve, NSH, L2TP Likelihood UDP based encapsulation becomes ubiquitous In time most packets in DC could be UDP!

Offloads Load balancing Checksum offload Segmentation offload

Load balancing For ECMP, RSS, LAG port selection Probably all switches can 5-tuple over UDP/ packets Solution: use source port to represent hash of inner flow ~14 bits of entropy udp_src_flow_port function

TX Checksum offload NETIF_HW_CSUM Initialize checksum to pseudo header csum Input to device start and offset HW checksums from start to end of packet and writes result at offset NETIF CSUM HW can only checksum with certain protocol hdrs Typically UDP/ and TCP/ HW handle pseudo hdr csum also

RX Checksum offload CHECKSUM_COMPLETE HW returns checksum calculation across whole packet Host uses returned value to validate checksum(s) in the packet CHECKSUM_UNNECESSARY HW verfies and returns checksum okay Protocol specific, HW needs to parse packet csum_level allows HW to checksum within encapsulation, multiple checksums

Checksum offload for encapsulation Need to offload inner checksum like TCP UDP also has it s own checksum, this makes things interesting!

The MIGHTHY UDP Checksum for Encaps Want set to zero for performance (particularly switch vendors), but... UDP checksum is required for v6, and UDP checksum covers more of packet than inner checksum, but... RFC6935, RFC6936, and a lot more requirements in encapsulation protocol drafts to allow it, but UDP checksum is actually a good idea for both v4 and v6 when you re using Linux hosts to do encapsulation, let me explain...

Leveraging UDP checksum offload Probably every deployed NIC supports simple UDP checksum for TX and RX Only new NICs support offload of encapsulated checksums Solution: Enable UDP checksum for encap and use it to offload inner checksums Receive: checksum-unnecessary conversion Transmit: remote checksum offload

Checksum unnecessary conversion Device returns checksum unnecessary for non-zero outer UDP checksum Complete checksum of packet starting from the UDP header is ~pseudo_hdr_csum So convert checksum unnecessary to checksum complete Inner checksum(s) verified using checksum complete No checksum computation on host!

Remote checksum offload Defer TX checksum offload to remote Encapsulation header with start and offset data referring to inner checksum Offload outer UDP checksum and send At receive Do what device does: determine checksum from start to end of packet and write to offset Aleady have complete checksum so we can easily find this Write checksum into packet, validate like normal No checksum calculation in host

Segmentation offload Stack operates on bigger than MTU sized packets Offloads in receive and transmit

Transmit segmentation offload Split big TCP packet into small ones GSO (stack), TSO (HW) For each created packet Copy headers from big one Adjust lengths, checksums, sequence number that must be set per packet

GSO for UDP encapsulation UDP GSO function calls skb_udp_tunnel_segment Call GSO segment for next layer: gso_inner_segment Adjust UDP length and checksum per packet For encapsulation header, just copy those bytes* *Assuming encapsulation header does not have fields that must be set per packet

Receive segmentation offload Build large TCP packet from small ones GRO operation is to match packets to same flow for coalesing GRO (stack), LRO (HW)

GRO for UDP encapsulation UDP GRO receive path (udp_gro_receive) Encapsulation specific GRO functions Call GRO function per port Facility to register offloads per port Call GRO receive for next protocol

FOU and GUE FOU and GUE encapsulating

Foo over UDP Packets of protocol over UDP Destination port maps to protocol e.g. (), v6, (sit), GRE, ESP, etc Example: on port 5555

FOU support Logically, a header inserted to facilitate transport fou.c implements RX. encap_rcv in socket Remove UDP and reinject packet as protocol associated with port Ip tunnel implements FOU for, SIT, GRE Insert UDP header between and payload Source port from flow_hash

FOU example Set up receive ip fou add port 5555 ipproto 4 Set up transmit ip link add name tun1 type ipip \ remote 192.168.1.1 \ local 192.168.1.2 \ ttl 225 \ encap fou \ encap-sport auto \ encap-dport 5555

in FOU transmit TCP Data Start with a plain TCP/ packet sent on tun1

in FOU transmit TCP Data Logically prepend header

in FOU transmit protocol is 4 for TCP Data This is encapsulation

in FOU transmit UDP destination port set to 5555 for /UDP UDP UDP port set to hassh value for inner /TCP headers TCP Data Insert UDP header

in FOU transmit UDP TCP Data packet with encapsulation

in FOU transmit ETH UDP TCP Data Add Ethernet header and send

in FOU receive UDP TCP Data Receiver processes UDP packet based on destination port

in FOU receive Adjust transport header offset in sk_buff TCP Data Remove UDP header UDP

in FOU receive TCP Data Now have original packet. Reinject this into kernel, next protocol to prcess is 4

Generic UDP encapsulation (GUE) Extensible and generic encapsulation proto Encapsulation header for carrying packets of protocol Type field, header length, 8 bit protocol 16 bit flags and optional fields indicated by them. More can be defined in extension Private/extension flag

GUE headers UDP and GUE headers

GRE/GUE example Set up receiver ip fou add port 7777 gue Set up transmit ip link add name tun1 type ipip \ remote 192.168.1.1 \ local 192.168.1.2 \ ttl 225 \ encap gue \ encap-sport auto \ encap-dport 7777 \ encap-udp-csum \ encap-remcsum

GRE in GUE transmit v4 packet Application sends packet on tun1

GRE in GUE transmit GRE v4 packet Logically prepend header for GRE/ tunneling

GRE in GUE transmit Next protocol is 47 for GRE UDP destination port set to 7777 for GUE UDP GUE GRE v4 packet Insert UDP/GUE headers

GRE in GUE transmit UDP GUE GRE v4 packet Insert UDP/GUE headers

GRE in GUE transmit ETH UDP GUE GRE v4 packet Add Ethernet and headers and send

GRE in GUE receive UDP GUE GRE v4 packet Process packet based on UDP port (GUE port)

GRE in GUE receive Adjust transport header offset in sk_buff GRE v4 packet UDP GUE Remove UDP/GUE headers

GRE in GUE receive GRE v4 packet Now have original GRE/ packet. Reinject this into kernel, next protocol to prcess is 47 (GRE)

Thanks, and looking forward Good support for UDP encapsulation is the result of a broad community effort Still a lot of intersting work to do in security, control, and performance