DPDK Roadmap Tim O'Driscoll & Chris Wright Open Networking Summit 2017
Agenda
Overview: What is DPDK? What problems does it solve?
Open source community and transition to Linux Foundation: Why is this transition important? Who are the project members?
Future roadmap: What's coming next?
Further info: How to get started and get involved in the project.
Packets per Second - The Problem Statement
[Chart: packets per second vs. packet size (64-1472 bytes) for 10GbE, 40GbE and 100GbE]
Packet size 64 bytes (typical network infrastructure packet size): 40G packets/second: 59.5 million each way; packet arrival interval: 16.8 ns; 2 GHz clock cycles/packet: 33 cycles.
Packet size 1024 bytes (typical server packet size): 40G packets/second: 4.8 million each way; packet arrival interval: 208.8 ns; 2 GHz clock cycles/packet: 417 cycles.
Problem Statement: In typical networked applications most traffic is in the data plane. Network operators require high data plane performance for cost efficiency. Small packet sizes make it difficult to achieve line rate.
DPDK The Data Plane Development Kit (DPDK) is a set of software libraries and drivers for accelerating packet processing workloads on COTS hardware platforms.
High Performance Challenges
- The Linux scheduler causes too much overhead for task switches: bind a single software thread to a logical core.
- Memory and PCIe access is really slow compared to CPU operations: process a bunch of packets during each software iteration and amortize the access cost over multiple packets.
- Data doesn't seem to be near the CPU when it needs to be: for memory access, use HW- or SW-controlled prefetching; for PCIe access, use Data Direct I/O to write data directly into cache.
- The system can't keep up with the number of interrupts for packet Rx: switch from an interrupt-driven network device driver to a polled-mode driver.
- Access to shared data structures is a bottleneck: use access schemes that reduce the amount of sharing (e.g. lockless queues for message passing).
- Page tables are constantly evicted (DTLB thrashing): allow Linux to use huge pages (2MB, 1GB).
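The polled-mode and batching techniques above combine into DPDK's canonical receive loop. The sketch below (which assumes a DPDK environment, an initialized port, and is pinned to one lcore; `BURST_SIZE` and the empty processing step are illustrative) shows a thread polling a NIC queue in bursts rather than taking one interrupt per packet:

```c
#include <stdint.h>
#include <rte_ethdev.h>
#include <rte_mbuf.h>

#define BURST_SIZE 32  /* amortize per-access PCIe/memory cost over a burst */

/* Run on a dedicated lcore, bound 1:1 to a logical core by the EAL,
 * so the Linux scheduler never switches it away mid-burst. */
static int
lcore_rx_loop(void *arg)
{
    uint16_t port_id = *(uint16_t *)arg;
    struct rte_mbuf *bufs[BURST_SIZE];

    for (;;) {
        /* Poll the device instead of waiting for an Rx interrupt. */
        uint16_t nb_rx = rte_eth_rx_burst(port_id, 0, bufs, BURST_SIZE);

        for (uint16_t i = 0; i < nb_rx; i++) {
            /* ... process bufs[i] ... */
            rte_pktmbuf_free(bufs[i]);
        }
    }
    return 0;
}
```

Because the loop never sleeps, each polling lcore runs at 100% CPU by design; that is the trade DPDK makes for deterministic, interrupt-free packet arrival handling.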
DPDK Framework
DPDK History
2010-12: Initial DPDK releases provided under open source BSD license.
2013: DPDK.org open source community established. Helps to facilitate an increase in the use of, and contributions to, DPDK.
2014: First fully open source release (1.7). First multi-vendor CPU and NIC support. First OS distro packaging of DPDK (Fedora, FreeBSD etc.). First DPDK Summit community event held.
2015: Rapid increase in multi-vendor CPU and NIC support. ABI versioning added. Increased OS distro packaging (RHEL, CentOS, Ubuntu etc.). DPDK Summits extended to PRC and Europe.
2016: Continued increase in multi-vendor CPU and NIC support. First LTS release (16.11). Support for hardware and software accelerators added. Technical Board created to aid technical decision making. Community decision to adopt a more formal governance structure.
2017: DPDK transitions to The Linux Foundation. First DPDK Summit being held in Bangalore.
The DPDK Community Diverse and welcoming developer community Supports multiple hardware architectures Kernel-style peer review, open community roadmap process Technical board provides oversight and stewardship
Linux Foundation Now Hosting DPDK Industry support: hardware, software vendors, multiple industry verticals 8 founding Gold member supporters, 5 Silver members
Linux Foundation Now Hosting DPDK
- Linux Foundation stamp of approval reassures new participants that the project is a technical meritocracy.
- Industry co-ordination and investment to grow the DPDK project.
- Outreach and promotion: 3 annual DPDK Summit events.
- Facilitating developer community meetings: DPDK Userspace.
- Enabling technical collaboration in the DPDK community.
DPDK Consumption
Roadmap - Releases
Since 16.04, releases use the Ubuntu numbering scheme of YY.MM. We've transitioned from 3 major releases per year to 4 in 2017. Frequency and dates of releases will be fixed from 2017 onwards.
2016 releases: 16.04, 16.07, 16.11 (LTS). 2017 releases: 17.02, 17.05, 17.08, 17.11.
Roadmap - Themes Consumability/packaging Crypto acceleration, including asymmetric crypto Expansion to include new device types (e.g. compression, programmable devices) Event-driven API Generic Ethdev APIs Container networking optimization
Consumability/Packaging
DPDK now supports LTS releases: a stable release is maintained with back-ported bug fixes over an extended period of time. This provides downstream consumers with a stable target on which to base applications or packages. First LTS release is 16.11. LTS releases will be maintained for 2 years.
Packaging: RPM DPDK and Deb DPDK projects exist in FD.io for DPDK packaging. OS distros that package DPDK include: RHEL (7.1+), CentOS (7.1+), Fedora (22+), Ubuntu (15.10+), FreeBSD (10.1+), Wind River Linux (6+).
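On distros that carry the packages above, installing DPDK is a one-line operation. Exact package names vary by distro and release; the following are typical (assumed, not from the slide):

```shell
# Ubuntu 15.10+ (runtime libraries plus development headers)
sudo apt-get install dpdk dpdk-dev

# RHEL/CentOS 7.1+ and Fedora 22+
sudo yum install dpdk dpdk-devel
```

Using the distro package gives you a version aligned with that distro's support lifecycle; consumers who need a specific LTS (e.g. 16.11) can instead build from the tarball on dpdk.org.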
Crypto Acceleration
The CRYPTODEV API sits above a set of crypto PMDs:
- QAT: PMD for hardware acceleration.
- AESNI MB, AESNI GCM, ARMv8: PMDs for optimized software acceleration libraries.
- SNOW 3G, KASUMI, ZUC: PMDs for optimized software acceleration libraries for wireless algorithms.
- OpenSSL: PMD for non-optimized software implementation.
- Scheduler: PMD to distribute packets across multiple accelerators.
- NULL: PMD for test purposes.
Future work includes: support for additional hardware accelerators; extending the API to support asymmetric crypto; more advanced Scheduler capabilities.
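The value of the cryptodev abstraction is that an application describes the crypto it wants once, and any of the PMDs above can service it. A minimal sketch of such a description (the key buffer and length are placeholders; session creation and enqueue/dequeue details vary by DPDK release and are omitted here):

```c
#include <rte_cryptodev.h>
#include <rte_crypto_sym.h>

/* Hypothetical 128-bit key for illustration only. */
static uint8_t key_bytes[16];

/* A symmetric transform describing AES-CBC encryption. The same
 * struct is accepted whether the session ends up on QAT hardware,
 * the AESNI MB software PMD, or the NULL test PMD. */
static struct rte_crypto_sym_xform cipher_xform = {
    .type = RTE_CRYPTO_SYM_XFORM_CIPHER,
    .next = NULL,   /* could chain an auth xform here for AES-CBC + HMAC */
    .cipher = {
        .op   = RTE_CRYPTO_CIPHER_OP_ENCRYPT,
        .algo = RTE_CRYPTO_CIPHER_AES_CBC,
        .key  = { .data = key_bytes, .length = sizeof(key_bytes) },
    },
};
```

The transform is then turned into a session and attached to crypto operations that are enqueued/dequeued in bursts, mirroring the ethdev Rx/Tx burst model.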
Cryptodev Flow
New Device Types
Event-Driven API
Generic Ethdev APIs
Ethdev API is common, but many capabilities are device-specific. Initial focus was on filtering, to replace the existing device-specific capabilities with a new generic rte_flow API:
- This API provides a generic means to configure hardware to match specific ingress or egress traffic, alter its fate and query related counters according to any number of user-defined rules.
- Matching can be performed on packet data (protocol headers, payload) and properties (e.g. associated physical port, virtual device function ID).
- Possible operations include dropping traffic, diverting it to specific queues, to virtual/physical device functions or ports, performing tunnel offloads, adding marks and so on.
- A target release for deprecating the legacy APIs will be determined when most PMDs have migrated to rte_flow.
Work is in progress on an rte_tm API for Traffic Management, including metering and marking, hierarchical scheduling etc.
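A rule in rte_flow is built from three pieces: attributes (direction), a pattern of match items, and a list of actions. The sketch below (a hypothetical policy, assuming an initialized port; error handling abbreviated) steers all ingress IPv4 traffic to Rx queue 1:

```c
#include <stddef.h>
#include <stdint.h>
#include <rte_flow.h>

static struct rte_flow *
steer_ipv4_to_queue1(uint16_t port_id)
{
    /* Rule applies to ingress traffic only. */
    struct rte_flow_attr attr = { .ingress = 1 };

    /* Match any Ethernet frame carrying an IPv4 packet. With no
     * spec/mask given, the items match all packets of that type. */
    struct rte_flow_item pattern[] = {
        { .type = RTE_FLOW_ITEM_TYPE_ETH },
        { .type = RTE_FLOW_ITEM_TYPE_IPV4 },
        { .type = RTE_FLOW_ITEM_TYPE_END },
    };

    /* Fate: divert matching packets to Rx queue 1. */
    struct rte_flow_action_queue queue = { .index = 1 };
    struct rte_flow_action actions[] = {
        { .type = RTE_FLOW_ACTION_TYPE_QUEUE, .conf = &queue },
        { .type = RTE_FLOW_ACTION_TYPE_END },
    };

    struct rte_flow_error error;
    /* Returns NULL (with error filled in) if the PMD cannot
     * express this rule in hardware. */
    return rte_flow_create(port_id, &attr, pattern, actions, &error);
}
```

Because the pattern/action vocabulary is device-independent, the same rule can be validated against any PMD that has migrated to rte_flow, which is exactly what the legacy device-specific filter APIs could not offer.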
Container Networking Optimization
Virtio in containers is a new approach to high-speed networking for containers. In a VM, QEMU helps with device emulation and interaction with the backend. In containers, we don't have QEMU:
- We could introduce a kernel module, but we're already trying to remove the existing out-of-tree kernel modules from DPDK.
- Instead, all of the work is done in the PMD driver. We present virtio as a virtual device, just like the way that Ring, PCAP, or other virtual devices are used. The control messages are also handled through the driver.
[Diagram: container/app running DPDK with the ethdev layer and virtio PMD on top of a virtio-user vhost-user adapter, which connects over a Unix socket (/tmp/xx.socket) to the vhost backend in a vswitch.]
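Because virtio-user is just another virtual device, it is attached via the EAL `--vdev` option like Ring or PCAP devices. A sketch of launching testpmd inside a container against a vhost-user backend (the socket path is hypothetical; a DPDK vswitch such as OVS-DPDK must already be listening on it):

```shell
# --no-pci: skip PCI scanning inside the container;
# virtio_user0 attaches to the vswitch's vhost-user socket.
testpmd -l 0-1 --no-pci \
    --vdev=virtio_user0,path=/tmp/sock0 \
    -- -i
```

From the application's point of view the resulting port is an ordinary ethdev, so the same DPDK code runs unchanged in a VM (virtio PMD over QEMU) or in a container (virtio-user over the socket).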
Further Info
- Open source website (dpdk.org): download the code, access the documentation, join the mailing lists etc.
- DPDK Summit events: includes videos and presentations from previous events.
- Subscribe to quarterly newsletter.
- Videos and training: Intel Network Builders University, BrightTalk webinars, meet-ups.
- Interested in contributing? Subscribe to the mailing lists. Review the Contributor's Guidelines and contribute patches!
Questions? Tim O'Driscoll tim.odriscoll@intel.com; Chris Wright chrisw@redhat.com