Light: A Scalable, High-performance and Fully-compatible User-level TCP Stack. Dan Li ( 李丹 ) Tsinghua University
1 Light: A Scalable, High-performance and Fully-compatible User-level TCP Stack Dan Li ( 李丹 ) Tsinghua University
2 Data Center Network Performance
3 Hardware Capability of Modern Servers: multi-core CPUs; PCIe 3.0 and 4.0; NICs up to 400 Gbps. The Linux kernel stack becomes the performance bottleneck!
4 Limitations of the Linux Kernel: interrupt-based I/O under high-speed traffic; sockets coupled with VFS; lack of connection locality; accept queue shared across cores. 83% of CPU usage is spent inside the kernel! [Figure: CPU usage breakdown of a web server (Lighttpd) serving a 64-byte file, split into Application, TCP/IP (34%), Packet I/O (4%) and Kernel (without TCP/IP).]
5 Prior Works. Improvements to the Linux kernel: latest Linux 4.14, Fastsocket, MegaPipe, Affinity-Accept, IsoStack, StackMap. The problems of the kernel stack remain, except for the per-core accept queue. User-level I/O: DPDK, PF_RING, netmap, PSIO. User-level TCP stacks: mTCP, IX, mOS, Seastar, F-Stack. Problem: the application source code must be modified.
6 Light Design Goals: a user-level TCP stack. High performance: high throughput and low (tail) latency. Full compatibility: no need to touch the application code at all.
7 Challenges Caused by Full Compatibility: performance interference between application and stack (polling-mode I/O); taking over network-related APIs; distinguishing FD spaces (read(), write()); user-level blocking APIs (send(), recv(), epoll()); fault detection and resource recycling.
8 Architecture Overview (1) Three components of Light. FM (Frontend Module): provides the POSIX API for apps. BM (Backend Module): polls the Command Queue and processes the commands sequentially. PPM (Protocol Process Module): undertakes the major processing logic of the TCP/IP/Ethernet protocols. [Figure: app processes 0 and 1 (program logic, POSIX API, Frontend Module) run on cores 2 and 3; Light processes 0 and 1 (Backend Module, Protocol Process Module) run on cores 0 and 1 on top of DPDK in user space; both sides share hugepage memory holding Light Epoll, Light Socket, the Command Queue and the Accept, Close, TX and RX Ready Queues; the NIC distributes packets with RSS.]
9 Architecture Overview (2) Light-App separation: run the Light stack and the apps on separate cores, with one-to-many and many-to-one matching between the stack and apps. This eliminates the performance interference between application and stack. [Figure: applications on APP cores 0 to 2; Light stack on stack cores 0 and 1, fed by the NIC through RSS.]
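10 Design for Full Compatibility (1) Taking over network-related APIs with LD_PRELOAD and dlsym. [Figure: the dynamic linker binds the application's network-related API calls to the Light FM library, which is hijacked in through LD_PRELOAD; other API calls go to the GNU C library, which the Light FM library also reaches through dlsym.]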
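A minimal sketch of the interposition idea, assuming a glibc-based Linux system: a preloaded object defines `read()` with the libc signature and forwards to the next definition found by `dlsym(RTLD_NEXT, ...)`. The counter `intercepted_reads` and the helper `resolve_real_read` are hypothetical names for illustration and are not taken from Light.

```c
#define _GNU_SOURCE            /* exposes RTLD_NEXT on glibc; must precede all includes */
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

typedef ssize_t (*read_fn)(int, void *, size_t);

/* Hypothetical counter, present only to make the interception observable. */
unsigned long intercepted_reads;

/* Find the next definition of read() after this object, normally glibc's. */
static read_fn resolve_real_read(void)
{
    static read_fn real_read;
    if (real_read == NULL)
        real_read = (read_fn)dlsym(RTLD_NEXT, "read");
    assert(real_read != NULL);
    return real_read;
}

/* Same name and signature as the libc symbol, so the dynamic linker binds
 * the application's read() calls here when this object is searched first. */
ssize_t read(int fd, void *buf, size_t count)
{
    intercepted_reads++;
    /* A frontend library would divert network-related FDs to the
     * user-level stack at this point; this sketch always forwards. */
    return resolve_real_read()(fd, buf, count);
}
```

Built as a shared object (for example `gcc -shared -fPIC hook.c -o libhook.so -ldl`) and loaded with `LD_PRELOAD=./libhook.so`, the wrapper is picked up by an unmodified binary; the file and library names here are placeholders.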
11 Design for Full Compatibility (2) Distinguishing FD spaces, e.g. for ssize_t read(int fd, void *buf, size_t count): other FDs are maintained by the kernel, allocated bottom-up from 0 and served by glibc; network-related FDs are maintained by Light, allocated top-down and served by the Light implementation.
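The split of the FD space can be sketched as follows, under the assumption that the kernel hands out FDs bottom-up and the user-level stack takes them top-down from a fixed table size. `FD_TABLE_SIZE`, `light_fd_alloc`, `fd_is_light` and `route_for_fd` are illustrative names, and the allocator never frees, which a real implementation would need to handle.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical FD table size; a real library might derive it from RLIMIT_NOFILE. */
#define FD_TABLE_SIZE 1024

/* Lowest FD handed out by the user-level stack so far (allocation is top-down). */
static int light_fd_floor = FD_TABLE_SIZE;

/* Take the next network FD from the top of the range. Returns -1 when it
 * would collide with `kernel_high_water`, the largest FD the kernel issued. */
int light_fd_alloc(int kernel_high_water)
{
    assert(kernel_high_water < FD_TABLE_SIZE);
    if (light_fd_floor - 1 <= kernel_high_water)
        return -1;
    return --light_fd_floor;
}

/* True when the FD belongs to the user-level stack's part of the range. */
bool fd_is_light(int fd)
{
    return fd >= light_fd_floor && fd < FD_TABLE_SIZE;
}

typedef enum { ROUTE_KERNEL, ROUTE_LIGHT } fd_route;

/* Decide which implementation a call such as read(fd, ...) should reach. */
fd_route route_for_fd(int fd)
{
    return fd_is_light(fd) ? ROUTE_LIGHT : ROUTE_KERNEL;
}
```

A wrapper for `read()` or `write()` would call `route_for_fd(fd)` first and fall through to glibc for `ROUTE_KERNEL`.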
12 Design for Full Compatibility (3) User-level blocking APIs. epoll_wait(): can monitor both network-related FDs and non-network FDs with blocking semantics. Other blocking APIs: leverage epoll_wait() to realize the blocking semantics. [Figure: the application's epoll_create() creates a Light epoll (1.1), which in turn calls the kernel epoll_create() (1.2); epoll_ctl() adds socket FDs to the Light epoll's listened FDs (2.1) and passes non-network FDs, together with a FIFO FD shared between Light and the kernel, to the kernel epoll_ctl() (2.2); epoll_wait() collects network-related events in the Light epoll (3.1) and calls the kernel epoll_wait() (3.2), which collects non-network events and FIFO-readable events.]
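One way to read this design is that the stack wakes a blocked application by making an FD readable inside the kernel epoll set. The sketch below assumes Linux and uses an anonymous `pipe()` as a stand-in for the FIFO on the slide (a named FIFO or similar would be needed across processes). `struct hybrid_epoll` and its functions are hypothetical names.

```c
#include <assert.h>
#include <sys/epoll.h>
#include <unistd.h>

/* One user-level epoll instance: a kernel epoll plus a wake-up pipe. */
struct hybrid_epoll {
    int kernel_epfd;  /* watches non-network FDs and the wake-up pipe */
    int wake_rd;      /* read end, registered in kernel_epfd */
    int wake_wr;      /* write end, used by the stack side */
};

int hybrid_epoll_init(struct hybrid_epoll *he)
{
    int p[2];
    if (pipe(p) != 0)
        return -1;
    he->kernel_epfd = epoll_create1(0);
    struct epoll_event ev = { .events = EPOLLIN, .data.fd = p[0] };
    if (he->kernel_epfd < 0 ||
        epoll_ctl(he->kernel_epfd, EPOLL_CTL_ADD, p[0], &ev) != 0) {
        if (he->kernel_epfd >= 0)
            close(he->kernel_epfd);
        close(p[0]);
        close(p[1]);
        return -1;
    }
    he->wake_rd = p[0];
    he->wake_wr = p[1];
    return 0;
}

/* Stack side: signal that a network event has been queued. */
int hybrid_epoll_notify(const struct hybrid_epoll *he)
{
    char one = 1;
    return write(he->wake_wr, &one, 1) == 1 ? 0 : -1;
}

/* Application side: block in the kernel. Returns 1 if woken for network
 * events, 0 on timeout or kernel-only events, -1 on error. */
int hybrid_epoll_wait(const struct hybrid_epoll *he, int timeout_ms)
{
    struct epoll_event evs[8];
    int n = epoll_wait(he->kernel_epfd, evs, 8, timeout_ms);
    if (n < 0)
        return -1;
    int network_ready = 0;
    for (int i = 0; i < n; i++) {
        if (evs[i].data.fd == he->wake_rd) {
            char drain[64];
            ssize_t r = read(he->wake_rd, drain, sizeof drain);
            assert(r > 0);   /* EPOLLIN was reported, so data is pending */
            (void)r;
            network_ready = 1;
        }
        /* Any other entry is an ordinary event on a non-network FD. */
    }
    return network_ready;
}
```

After a return value of 1, the caller would drain the user-level ready queues and merge those events with the kernel events before returning to the application.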
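13 Design for Full Compatibility (4) Fault detection and resource recycling. [Figure: an epoll monitor watches one IPC socket per application (IPC socket 1 for App 1, IPC socket 2 for App 2); the kernel delivers an event on IPC socket 2 to the monitor, which triggers fault detection and then resource recycling.]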
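A plausible reading of the figure is that the kernel closes an application's IPC socket when the application exits or crashes, and the stack notices this on its own end. The sketch below shows that detection on a Unix-domain stream socket; it swaps in `poll()` for the epoll monitor to stay short, and `peer_has_exited`, `reap_if_exited` and the `live_apps` counter are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns 1 if the application on the other end of `ipc_fd` is gone,
 * 0 if it is still connected, -1 on error. */
int peer_has_exited(int ipc_fd)
{
    struct pollfd pfd = { .fd = ipc_fd, .events = POLLIN };
    int n = poll(&pfd, 1, 0);              /* non-blocking check */
    if (n < 0)
        return -1;
    if (n == 0)
        return 0;                          /* nothing pending: peer still there */
    char probe;
    ssize_t r = recv(ipc_fd, &probe, 1, MSG_PEEK | MSG_DONTWAIT);
    if (r == 0)
        return 1;                          /* EOF: the peer's end was closed */
    if (r < 0)
        return errno == ECONNRESET ? 1 : -1;
    return 0;                              /* ordinary data from a live peer */
}

/* Hypothetical recycle hook: the counter stands in for freeing the dead
 * application's sockets, queues and buffers. */
int reap_if_exited(int ipc_fd, unsigned *live_apps)
{
    int gone = peer_has_exited(ipc_fd);
    if (gone == 1) {
        assert(*live_apps > 0);
        (*live_apps)--;
        close(ipc_fd);
    }
    return gone;
}
```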
14 Design for High Performance (1). (1) Benefits from DPDK: general techniques (PMD, zero-copy, hugepages, etc.); lockless shared-queue based IPC. (2) TCB management: local listen table and established table; dedicated accept queues.
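The lockless shared-queue IPC can be illustrated with a single-producer, single-consumer ring, which needs no locks because each index has exactly one writer. This is a generic sketch using C11 atomics, not Light's actual queue; `struct spsc_ring`, `ring_push`, `ring_pop` and the slot count are assumptions.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define RING_SLOTS 8u
static_assert((RING_SLOTS & (RING_SLOTS - 1)) == 0, "slot count must be a power of two");

struct spsc_ring {
    _Atomic uint32_t head;        /* next slot to write; advanced only by the producer */
    _Atomic uint32_t tail;        /* next slot to read; advanced only by the consumer */
    uint64_t slots[RING_SLOTS];   /* e.g. encoded commands or event descriptors */
};

/* Producer side (for example a frontend posting a command). */
bool ring_push(struct spsc_ring *r, uint64_t cmd)
{
    uint32_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
    uint32_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);
    if (head - tail == RING_SLOTS)
        return false;                                  /* full */
    r->slots[head & (RING_SLOTS - 1)] = cmd;
    /* Release: the slot contents become visible before the new head. */
    atomic_store_explicit(&r->head, head + 1, memory_order_release);
    return true;
}

/* Consumer side (for example a backend polling the queue). */
bool ring_pop(struct spsc_ring *r, uint64_t *cmd)
{
    uint32_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    uint32_t head = atomic_load_explicit(&r->head, memory_order_acquire);
    if (head == tail)
        return false;                                  /* empty */
    *cmd = r->slots[tail & (RING_SLOTS - 1)];
    atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
    return true;
}
```

Placing such a ring in memory mapped by both processes only works if the atomic types are lock-free on the target platform, which is worth checking with `atomic_is_lock_free`.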
15 Design for High Performance (2). (3) Full connection locality: core locality for passive connections; core locality for active connections. For active connections, use soft-RSS to compute the stack core index and record it in the socket object. In this way, the reply packets can be steered to the same core as the original packets.
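Soft-RSS means reproducing the NIC's receive-side hash in software. A minimal sketch, assuming the NIC uses the Toeplitz hash over the IPv4/TCP 4-tuple: the key below is only an example and must equal the key programmed into the NIC, the modulo stands in for the NIC's indirection table, and `struct light_sock` and `record_reply_core` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Example 40-byte key (assumption); it must match the NIC's configured key. */
const uint8_t example_rss_key[40] = {
    0x6d, 0x5a, 0x56, 0xda, 0x25, 0x5b, 0x0e, 0xc2, 0x41, 0x67,
    0x25, 0x3d, 0x43, 0xa3, 0x8f, 0xb0, 0xd0, 0xca, 0x2b, 0xcb,
    0xae, 0x7b, 0x30, 0xb4, 0x77, 0xcb, 0x2d, 0xa3, 0x80, 0x30,
    0xf2, 0x0c, 0x6a, 0x42, 0xb7, 0x3b, 0xbe, 0xac, 0x01, 0xfa,
};

/* Toeplitz hash: for every set input bit, XOR in the 32-bit key window
 * that starts at that bit position. */
uint32_t toeplitz_hash(const uint8_t *key, size_t key_len,
                       const uint8_t *data, size_t len)
{
    assert(key_len >= len + 4);
    uint32_t window = ((uint32_t)key[0] << 24) | ((uint32_t)key[1] << 16) |
                      ((uint32_t)key[2] << 8) | (uint32_t)key[3];
    uint32_t hash = 0;
    for (size_t i = 0; i < len; i++) {
        for (int bit = 7; bit >= 0; bit--) {
            if (data[i] & (1u << bit))
                hash ^= window;
            /* Slide the window left by one bit, pulling in the next key bit. */
            window = (window << 1) | ((uint32_t)(key[i + 4] >> bit) & 1u);
        }
    }
    return hash;
}

/* Hypothetical socket object; only the field relevant here is shown. */
struct light_sock {
    unsigned stack_core;
};

/* Predict which stack core receives packets arriving with this 4-tuple
 * (host byte order, as seen on the wire by the receiver) and record it. */
unsigned record_reply_core(struct light_sock *sk, uint32_t src_ip, uint32_t dst_ip,
                           uint16_t src_port, uint16_t dst_port, unsigned n_cores)
{
    assert(n_cores > 0);
    const uint8_t tuple[12] = {
        (uint8_t)(src_ip >> 24), (uint8_t)(src_ip >> 16), (uint8_t)(src_ip >> 8), (uint8_t)src_ip,
        (uint8_t)(dst_ip >> 24), (uint8_t)(dst_ip >> 16), (uint8_t)(dst_ip >> 8), (uint8_t)dst_ip,
        (uint8_t)(src_port >> 8), (uint8_t)src_port,
        (uint8_t)(dst_port >> 8), (uint8_t)dst_port,
    };
    uint32_t h = toeplitz_hash(example_rss_key, sizeof example_rss_key, tuple, sizeof tuple);
    sk->stack_core = h % n_cores;  /* assumption: real NICs index an indirection table */
    return sk->stack_core;
}
```

For an outgoing connection the reply travels from the remote end to the local end, so the remote address and port are passed as the source fields.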
16 Implementation. System configuration: Ubuntu (generic kernel), DPDK. Code: written in C (the line count excludes the DPDK library and the protocol stack ported from the kernel). APIs: most TCP-related APIs have been realized.
17 Evaluation (1) Network throughput and multi-core scalability. We use two powerful machines: 1) one runs wrk to generate a high workload of HTTP requests; 2) the other runs Nginx on the kernel stack or the Light stack. [Figure: wrk sends requests to the Nginx server and receives responses.]
18 Evaluation (2) Network throughput and multi-core scalability. Nginx on Light gets 56% higher throughput on 8 CPU cores and achieves a linear speedup ratio of 0.89 in terms of network throughput. [Figure: RPS of Nginx running on Light and on the Linux kernel stack against the number of CPU cores used. The message size is set to 64 bytes.]
19 Evaluation (3) Network throughput and multi-core scalability. Nginx on Light consistently achieves more than 50% higher RPS than on the kernel stack. [Figure: RPS of Nginx running on Light and on the Linux kernel stack against the message size. The number of CPU cores used is 8.]
20 Evaluation (4) Network latency (1). Two machines: 1) one runs wrk to generate a high workload of HTTP requests; 2) the other runs Nginx on the kernel stack or the Light stack. [Figure: wrk sends requests to the Nginx server and receives responses.]
21 Evaluation (5) Network latency (1). Light can reduce the tail latency by two orders of magnitude compared with the kernel stack. [Figure: CDF of round-trip latency for Nginx on Light and on the kernel stack.]
22 Evaluation (6) Network latency (2). We use two machines as NetPIPE server and NetPIPE client respectively, both running on the Light stack or both on the kernel stack. [Figure: requests and responses exchanged between the two NetPIPE machines.]
23 Evaluation (7) Network latency. Compared with the Linux kernel stack, Light reduces the average latency by more than 40%, with a maximum reduction of 52%. [Figure: one-way latency for NetPIPE on Light and on the kernel stack.]
24 Light in DMM (1). Light should develop an adapter library (Light-adapter) that DMM integrates in order to communicate with Light. The Light-adapter must implement the interfaces defined by DMM, including the socket APIs, epoll APIs, fork APIs and the resource recycle APIs. Light should in turn integrate the adapter library developed by DMM (nstack adapter). That library uses the plug-in interface to provide rich features, such as resource (shared memory) and event management. [Figure labels: nsocket API, DMM, Kernel Adapt, Light-adapter, nstack adapter, nrd, Light stack, HAL, NIC.]
25 Light in DMM (2) Key techniques: distributed and centralized nrd deployment (LRD & CRD) to provide end-to-end protocol orchestration; stack-transparent protocol routing (stack orchestrator); POSIX-compatible socket APIs; flexible socket API redirection and mapping (SBR); flexible APIs for integration of third-party stacks (EAL); support for multiple stack instances; support for multiple I/O engines. [Figure: applications (web app, video streaming, online gaming) over a POSIX socket-compatible API (LD_PRELOAD); the DMM socket layer with Socket Bridge (SBR), Socket MUX and Protocol Orchestrator; nrd with Honeycomb and REST; L2~L4 stacks in user space (VPP host stack with IPv4 and IPv6 input/output, TLDK, Light, 3rd-party stack) over the DPDK data plane and EAL; the kernel stack in kernel space; NIC.]
26 Future Work: a network operating system out of the kernel; redesigning the PPM module (new transport protocol, new congestion control mechanism); virtualization / container environments; integrating Light into the DMM framework.
27 Thanks!
A Scalable Event Dispatching Library for Linux Network Servers Hao-Ran Liu and Tien-Fu Chen Dept. of CSIE National Chung Cheng University Traditional server: Multiple Process (MP) server A dedicated process
More informationNFS/RDMA over 40Gbps iwarp Wael Noureddine Chelsio Communications
NFS/RDMA over 40Gbps iwarp Wael Noureddine Chelsio Communications Outline RDMA Motivating trends iwarp NFS over RDMA Overview Chelsio T5 support Performance results 2 Adoption Rate of 40GbE Source: Crehan
More information초고속네트워크보안시스템설계및구현. KyoungSoo Park School of Electrical Engineering, KAIST. (Collaboration with many students & faculty members at KAIST)
초고속네트워크보안시스템설계및구현 KyoungSoo Park School of Electrical Engineering, KAIST (Collaboration with many students & faculty members at KAIST) Agenda High-performance packet processing on x86 systems Kargus High-performance
More informationNo Tradeoff Low Latency + High Efficiency
No Tradeoff Low Latency + High Efficiency Christos Kozyrakis http://mast.stanford.edu Latency-critical Applications A growing class of online workloads Search, social networking, software-as-service (SaaS),
More informationApplication Acceleration Beyond Flash Storage
Application Acceleration Beyond Flash Storage Session 303C Mellanox Technologies Flash Memory Summit July 2014 Accelerating Applications, Step-by-Step First Steps Make compute fast Moore s Law Make storage
More informationMaximizing Network Throughput for Container Based Storage David Borman Quantum
Maximizing Network Throughput for Container Based Storage David Borman Quantum 1 Agenda Assumptions Background Information Methods for External Access Descriptions, Pros and Cons Summary 2 Assumptions
More informationECE 650 Systems Programming & Engineering. Spring 2018
ECE 650 Systems Programming & Engineering Spring 2018 Networking Transport Layer Tyler Bletsch Duke University Slides are adapted from Brian Rogers (Duke) TCP/IP Model 2 Transport Layer Problem solved:
More informationFundamental Questions to Answer About Computer Networking, Jan 2009 Prof. Ying-Dar Lin,
Fundamental Questions to Answer About Computer Networking, Jan 2009 Prof. Ying-Dar Lin, ydlin@cs.nctu.edu.tw Chapter 1: Introduction 1. How does Internet scale to billions of hosts? (Describe what structure
More information