Netchannel 2: Optimizing Network Performance

1 Netchannel 2: Optimizing Network Performance
J. Renato Santos (+), G. (John) Janakiraman (+), Yoshio Turner (+), Ian Pratt (*)
(+) HP Labs, (*) XenSource/Citrix
Xen Summit, Nov 14-16
Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice.

2 Motivation
- TCP performance for GigE today (PV drivers)
  [Figure: throughput (Mb/s) and CPU utilization (%), RX and TX, linux vs. xen]
- Xen PV driver can sustain peak throughput on GigE
- But Xen uses significantly more CPU cycles than Linux
  - Fewer cycles available for applications
  - 10 Gig networks: CPU saturation prevents achieving line rate
- Need to reduce I/O virtualization overhead in Xen networking

3 Netchannel 2
- Netchannel2: new I/O channel protocol to enable Xen networking design changes for improving performance
  - Work in progress
- Software optimizations
  - Implementation optimizations
  - Software design changes
- Devices with direct access (Direct I/O, PCI-IOV)
  - Device exposes multiple virtual interfaces accessed directly by guests
- Multi-queue devices
  - Device has multiple RX queues
  - Avoids data copy on receive

4 Performance Analysis of Xen networking
- Identified main sources of overhead
- Results guided design choices in Netchannel 2
- Emphasis on RX results: higher overhead than TX

5 Performance Analysis of Xen Networking (RX)
RX, UDP traffic, large msg (48KB) (Xen 64bit on Intel core2 duo)
[Figure: CPU (%) breakdown for xen and linux; components: usercopy, kern, xen1, grantcopy, kern0, xen0]
- usercopy: data copy from kernel to user buffer
- kern: kernel code in guest
- xen1: xen code executed in the context of the guest
- grantcopy: data copy using Xen grant (only memory copy cost)
- kern0: kernel code in domain 0
- xen0: xen code executed in the context of domain 0

6 Implementation optimizations on RX
[Figure: CPU (%) breakdown (usercopy, kern, xen1, grantcopy, kern0, xen0) for: current xen, no bridge netfilter, align copy, linear skb, grant opt., linux]
Implementation optimizations (also possible in netchannel 1):
- Disabling netfilter on bridge
- Fix grant copy alignment problem
- Avoid fragments on single page packets
- A few optimizations in grant code

7 Background: receive path on Xen today
[Diagram: driver domain (device driver, netback), IO rings (RX, TX), netfront; labels: post buffer, GR, grant copy, rx skb; Xen; NIC hardware]

8 Netchannel 2: Moving grant copy to guest
[Diagram: driver domain (device driver, netback), IO rings (RX, TX), netfront; labels: rx skb, GR, grant copy; Xen; NIC hardware]

9 Netchannel 2: Moving grant copy to guest
- Improve performance by placing packet in the CPU cache
  - Improve 2nd data copy from kernel to user
  - Avoid polluting dom0 CPU cache
- Improve resource management
  - Assigns CPU cost of data copy to the guest
- Better scalability
  - Eliminates dom0 as data copy bottleneck

10 Netchannel 2: Multi-queue device support
- Device has multiple RX queues
- Dedicate one RX queue to a particular guest
- Program device to demultiplex incoming packets to the dedicated queue using MAC address
- Post RX descriptors pointing to guest memory
- Device places received packet directly into guest memory, avoiding data copy
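The MAC-based demultiplexing step can be pictured with a small model. This is only an illustrative sketch, not the Netchannel 2 code or any NIC's programming interface: the class `RxDemux`, its method names, and the idea of a shared `DEFAULT_QUEUE` for traffic that matches no dedicated queue are all assumptions made for the example.

```python
from typing import Dict

# Hypothetical: queue 0 stands for a shared queue serviced by the driver domain.
DEFAULT_QUEUE = 0


class RxDemux:
    """Toy model of a NIC that steers packets to RX queues by destination MAC."""

    def __init__(self) -> None:
        self._mac_to_queue: Dict[str, int] = {}

    def dedicate_queue(self, guest_mac: str, queue_id: int) -> None:
        # Models "program device to demultiplex": one RX queue per guest MAC.
        self._mac_to_queue[guest_mac.lower()] = queue_id

    def select_queue(self, dst_mac: str) -> int:
        # Destinations with no dedicated queue (e.g. broadcast) use the shared queue.
        return self._mac_to_queue.get(dst_mac.lower(), DEFAULT_QUEUE)


demux = RxDemux()
demux.dedicate_queue("00:16:3e:00:00:01", 1)
demux.dedicate_queue("00:16:3e:00:00:02", 2)
print(demux.select_queue("00:16:3E:00:00:02"))
print(demux.select_queue("ff:ff:ff:ff:ff:ff"))
```

In this model, packets landing on a dedicated queue would go straight into buffers posted for that guest, while anything on the shared queue would still need the copy-based path.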

11 Netchannel 2: Multi-queue device support
[Diagram: dedicated RX queue on the NIC; driver domain (device driver, netback), IO rings (Post RX, RX, TX), netfront; labels: skb, GR, RX buffer; Xen; NIC hardware]

12 Netchannel 2: Multi-queue device support
- Grant copy still used for inter-guest packets (also multicast, broadcast)
[Diagram: dedicated RX queue on the NIC; driver domain (device driver, netback), IO rings (Post RX, RX, TX), netfront, other guest; labels: skb, GR, RX buffer; Xen; NIC hardware]

13 Netchannel 2: Multi-queue device support
- Need to extend linux netdev interface (for native device driver)
- Buffer posted on dedicated queue must be allocated from the respective pool
- Need a new buffer allocation function that selects the memory pool based on the queue id
- In Xen this will be mapped to a function in netback
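A buffer allocation function keyed by queue id might look roughly like the following. This is a sketch under stated assumptions: `QueueBufferPools`, `alloc_rx_buffer`, `free_rx_buffer`, and the 2048-byte buffer size are hypothetical names and values chosen for illustration, not the Linux netdev interface or the netback function the slides refer to.

```python
from collections import deque
from typing import Deque, Dict, Optional


class QueueBufferPools:
    """Toy allocator: each RX queue draws buffers only from its own pool."""

    def __init__(self) -> None:
        self._pools: Dict[int, Deque[bytearray]] = {}

    def add_pool(self, queue_id: int, num_buffers: int, buf_size: int = 2048) -> None:
        # One pool per dedicated queue; buf_size is an arbitrary example value.
        self._pools[queue_id] = deque(bytearray(buf_size) for _ in range(num_buffers))

    def alloc_rx_buffer(self, queue_id: int) -> Optional[bytearray]:
        # Select the memory pool by queue id; never fall back to another pool.
        pool = self._pools.get(queue_id)
        if pool is None:
            raise KeyError(f"no buffer pool registered for queue {queue_id}")
        return pool.popleft() if pool else None

    def free_rx_buffer(self, queue_id: int, buf: bytearray) -> None:
        # Return the buffer to the pool it came from so it can be recycled.
        self._pools[queue_id].append(buf)
```

The point of refusing to fall back to another pool is that a buffer posted on a dedicated queue ends up holding that queue's traffic, so it has to belong to the matching pool.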

14 Netchannel 2: Grant caching in driver domain
- Avoid unmapping grants, expecting that the same memory will be reused in the future
  - Reduces grant mapping/unmapping overhead
- RX buffers should have high locality
  - Windows and Linux tend to recycle RX buffers
- TX buffer recycling behavior is uncertain
  - Need to evaluate experimentally
  - In Linux we could promote buffer recycling if we can modify the skb allocator
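The grant caching idea can be sketched as a small cache of mapped grants keyed by (domain id, grant reference). Everything here is illustrative: `GrantCache`, the fixed capacity, the least-recently-used eviction policy, and the `_map_grant` stand-in for the real map operation are assumptions of this example, not details given in the slides.

```python
from collections import OrderedDict
from typing import Tuple


class GrantCache:
    """Toy cache of mapped grants; a hit avoids a map/unmap round trip."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self._mapped: "OrderedDict[Tuple[int, int], int]" = OrderedDict()
        self._next_handle = 0
        self.hits = 0
        self.misses = 0

    def _map_grant(self, key: Tuple[int, int]) -> int:
        # Stand-in for the expensive grant mapping operation (hypothetical).
        self._next_handle += 1
        return self._next_handle

    def lookup(self, domid: int, gref: int) -> int:
        key = (domid, gref)
        if key in self._mapped:
            self.hits += 1
            self._mapped.move_to_end(key)  # mark as most recently used
            return self._mapped[key]
        self.misses += 1
        if len(self._mapped) >= self.capacity:
            self._mapped.popitem(last=False)  # evict (unmap) least recently used
        handle = self._map_grant(key)
        self._mapped[key] = handle
        return handle

    def invalidate(self, domid: int, gref: int) -> bool:
        # Models the guest asking the driver domain to drop a cached mapping.
        return self._mapped.pop((domid, gref), None) is not None
```

If guests recycle RX buffers as the slide expects, most lookups would be hits, which is the case where the mapping cost is amortized.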

15 Netchannel 2: Grant caching in driver domain
[Diagram: driver domain (device driver, netback with grant cache of GR entries), IO rings (Post RX, RX, TX), netfront; labels: skb, GR, RX buffer, TX packet; Xen; NIC hardware]

16 Netchannel 2: Grant caching in driver domain
[Diagram: driver domain (device driver, netback with grant cache of GR entries), IO rings (Post RX, RX, TX, Control), netfront; labels: skb, GR, RX buffer, TX packet, invalidate grant; Xen; NIC hardware]

17 Netchannel 2: Grant extensions
1. Grant transitivity
   - Guest authorizes grant to be transferred to other domain
   - For guest to guest communication:
     - TX guest issues a grant for driver domain
     - Driver domain transfers grant rights to RX guest
     - RX guest uses grant to copy packet to local memory
2. New grant copy operation with range limit
   - Grant does not give access to a full page; access is limited to a range specified by (offset, size)
   - Prevents guest from snooping on other packets previously received on the same page
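The range-limited grant copy can be illustrated with a bounds check on (offset, size). This is a minimal sketch, assuming a 4096-byte page; `RangeGrant` and `grant_copy` are hypothetical names for this example and do not correspond to the actual Xen grant table operations.

```python
PAGE_SIZE = 4096  # assumed page size for the example


class RangeGrant:
    """Toy grant that exposes only page[offset : offset + size]."""

    def __init__(self, page: bytes, offset: int, size: int) -> None:
        if len(page) != PAGE_SIZE:
            raise ValueError("grant must refer to exactly one page")
        if offset < 0 or size < 0 or offset + size > PAGE_SIZE:
            raise ValueError("granted range must lie within the page")
        self.page = page
        self.offset = offset
        self.size = size


def grant_copy(grant: RangeGrant, offset: int, length: int) -> bytes:
    """Copy `length` bytes at `offset`, only if fully inside the granted range."""
    start, end = grant.offset, grant.offset + grant.size
    if length < 0 or offset < start or offset + length > end:
        raise PermissionError("copy outside the granted (offset, size) range")
    return bytes(grant.page[offset:offset + length])
```

With a grant covering only one packet's bytes, a copy request touching the rest of the page is refused, which is how the range limit keeps earlier packets on the same page private.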

18 Netchannel 2: RX performance improvements
[Figure: CPU breakdown (usercopy, kern, xen1, grantcopy, kern0, xen0) for: current xen, implem opt., guest copy, linux]
Guest copy improvements:
- usercopy: cached packet reduces copy overhead
- grant copy: lower overhead on copy (possibly because copy is cache aligned)
- kern: kernel also benefits from cached packet for packet header access
- xen: grant copy cost moved from dom0 (xen0) to guest (xen1); also, grant optimizations are more effective on the guest side (pin read-only page)

19 Netchannel 2: RX performance improvements
[Figure: CPU breakdown (usercopy, kern, xen1, grantcopy, kern0, xen0) for: current xen, implem opt., guest copy, mqueue, linux]
Multi queue:
- multi-queue behavior emulated by modifying device driver of traditional NIC and dedicating device RX queue to guest

20 Netchannel 2: RX performance improvements
[Figure: CPU breakdown (usercopy, kern, xen1, grantcopy, kern0, xen0) for: current xen, implem opt., guest copy, mqueue, linux]
Multi queue improvements:
- grant copy eliminated
- kern0: reduced network processing cost in dom0
  - no bridge overhead
  - no socket buffer allocation/deallocation overhead

21 Netchannel 2: RX performance improvements
[Figure: CPU (%) breakdown (usercopy, kern, xen1, grantcopy, kern0, xen0) for: current xen, implem opt., guest copy, mqueue, mqueue + grant reuse, linux]
Multi queue + grant cache:
- grant caching emulated assuming 100% hit rate on cache
- Grant cache benefit: eliminates grant operation overhead in dom0 (xen0)

22 Interrupt throttling
- Netback processes RX packets in batches
  - Number of packets in each batch is determined by the number of packets received in each hardware interrupt
- Most remaining Xen overhead is proportional to the number of batches
  - Interrupts, event notification/delivery, Xen scheduler runs
- Increasing batch size should reduce Xen overhead
- NIC can be configured to throttle interrupt rate (coalescing)
  - RX interrupt delayed until N packets received (limits interrupt rate at high throughput)
  - Or after a given timeout (limits latency at low throughput)
- Latency sensitive applications should not be significantly affected by larger batch size
  - Latency effect is limited by the latency coalescing parameter
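The two coalescing triggers (N packets, or a timeout) can be modeled with a short simulation that turns packet arrival times into interrupt batches. This is a simplified sketch, not a model of any particular NIC: the function name `coalesce`, its parameters, and the rule that a timeout is only noticed when the next packet arrives are assumptions of the example.

```python
from typing import List, Optional


def coalesce(arrival_times_us: List[int], max_packets: int, timeout_us: int) -> List[int]:
    """Return the batch size delivered by each simulated RX interrupt.

    An interrupt fires when `max_packets` are pending, or when `timeout_us`
    has elapsed since the first pending packet arrived.
    """
    batches: List[int] = []
    pending = 0
    first_pending: Optional[int] = None
    for t in arrival_times_us:
        # Timeout expired before this packet arrived: flush what was pending.
        if pending and t - first_pending >= timeout_us:
            batches.append(pending)
            pending = 0
        if pending == 0:
            first_pending = t
        pending += 1
        # Packet-count threshold reached: fire the interrupt immediately.
        if pending >= max_packets:
            batches.append(pending)
            pending = 0
    if pending:
        batches.append(pending)  # final timeout flush
    return batches


burst = list(range(128))  # 128 packets, 1 microsecond apart
print(len(coalesce(burst, max_packets=6, timeout_us=1000)))
print(len(coalesce(burst, max_packets=64, timeout_us=1000)))
```

For the same burst, the larger packet threshold yields far fewer batches, and since the per-batch costs (interrupts, event delivery, scheduler runs) dominate the remaining overhead, fewer batches should mean less overhead. At low packet rates the timeout path delivers each packet on its own, so added latency stays bounded by the timeout.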

23 Performance impact of interrupt throttling
[Figure: CPU (%) breakdown (usercopy, kern, xen1, grantcopy, kern0, xen0), relative to linux = 100%]
- Default configuration (6 pkt/intr): copy (value missing), mqueue + grant cache 193%, direct I/O 126%, linux 100%
- Interrupt throttling config (64 pkt/intr): copy (value missing), mqueue + grant cache 123%, direct I/O 109%, linux 100%
Observations:
- Interrupt throttling significantly improves Xen performance
- Default configuration is good for Linux (but not optimal for Xen)
  - Xen users should use different device coalescing settings
- Large batches achieve almost native performance with multi-queue (23% overhead)

24 Conclusion
Netchannel 2 will provide:
- significant performance improvement for traditional NICs
  - Overhead over linux reduced by 3.7 times on RX (370% to 100%)
- near native performance for RX on multi-queue devices
  - 23% overhead over linux
Multi-queue devices can be a good alternative to direct I/O devices (direct access):
- Slightly higher CPU cost
- But no hardware dependency on the guest: a single device driver to maintain, test, debug, etc.
- Easier to migrate, easier to monitor/enforce traffic policies (firewall, rate control, etc)

25 page 25

Xen Network I/O Performance Analysis and Opportunities for Improvement

Xen Network I/O Performance Analysis and Opportunities for Improvement Xen Network I/O Performance Analysis and Opportunities for Improvement J. Renato Santos G. (John) Janakiraman Yoshio Turner HP Labs Xen Summit April 17-18, 27 23 Hewlett-Packard Development Company, L.P.

More information

Network optimizations for PV guests

Network optimizations for PV guests Network optimizations for PV guests J. Renato Santos G. (John) Janakiraman Yoshio Turner HP Labs Summit September 7-8, 26 23 Hewlett-Packard Development Company, L.P. The information contained herein is

More information

Bridging the Gap between Software and Hardware Techniques for I/O Virtualization

Bridging the Gap between Software and Hardware Techniques for I/O Virtualization Bridging the Gap between Software and Hardware Techniques for I/O Virtualization Jose Renato Santos, Yoshio Turner, G.(John) Janakiraman, Ian Pratt HP Laboratories HPL-28-39 Keyword(s): Virtualization,

More information

Support for Smart NICs. Ian Pratt

Support for Smart NICs. Ian Pratt Support for Smart NICs Ian Pratt Outline Xen I/O Overview Why network I/O is harder than block Smart NIC taxonomy How Xen can exploit them Enhancing Network device channel NetChannel2 proposal I/O Architecture

More information

Xenoprof overview & Networking Performance Analysis

Xenoprof overview & Networking Performance Analysis Xenoprof overview & Networking Performance Analysis J. Renato Santos G. (John) Janakiraman Yoshio Turner Aravind Menon HP Labs Xen Summit January 17-18, 2006 2003 Hewlett-Packard Development Company, L.P.

More information

Keeping up with the hardware

Keeping up with the hardware Keeping up with the hardware Challenges in scaling I/O performance Jonathan Davies XenServer System Performance Lead XenServer Engineering, Citrix Cambridge, UK 18 Aug 2015 Jonathan Davies (Citrix) Keeping

More information

A Novel Approach to Gain High Throughput and Low Latency through SR- IOV

A Novel Approach to Gain High Throughput and Low Latency through SR- IOV A Novel Approach to Gain High Throughput and Low Latency through SR- IOV Usha Devi G #1, Kasthuri Theja Peduru #2, Mallikarjuna Reddy B #3 School of Information Technology, VIT University, Vellore 632014,

More information

Enabling Fast, Dynamic Network Processing with ClickOS

Enabling Fast, Dynamic Network Processing with ClickOS Enabling Fast, Dynamic Network Processing with ClickOS Joao Martins*, Mohamed Ahmed*, Costin Raiciu, Roberto Bifulco*, Vladimir Olteanu, Michio Honda*, Felipe Huici* * NEC Labs Europe, Heidelberg, Germany

More information

Speeding up Linux TCP/IP with a Fast Packet I/O Framework

Speeding up Linux TCP/IP with a Fast Packet I/O Framework Speeding up Linux TCP/IP with a Fast Packet I/O Framework Michio Honda Advanced Technology Group, NetApp michio@netapp.com With acknowledge to Kenichi Yasukata, Douglas Santry and Lars Eggert 1 Motivation

More information

Optimizing TCP Receive Performance

Optimizing TCP Receive Performance Optimizing TCP Receive Performance Aravind Menon and Willy Zwaenepoel School of Computer and Communication Sciences EPFL Abstract The performance of receive side TCP processing has traditionally been dominated

More information

Redesigning Xen's Memory Sharing Mechanism for Safe and Efficient I/O Virtualization Kaushik Kumar Ram, Jose Renato Santos, Yoshio Turner

Redesigning Xen's Memory Sharing Mechanism for Safe and Efficient I/O Virtualization Kaushik Kumar Ram, Jose Renato Santos, Yoshio Turner Redesigning Xen's Memory Sharing Mechanism for Safe and Efficient I/O Virtualization Kaushik Kumar Ram, Jose Renato Santos, Yoshio Turner HP Laboratories HPL-21-39 Keyword(s): No keywords available. Abstract:

More information

IBM POWER8 100 GigE Adapter Best Practices

IBM POWER8 100 GigE Adapter Best Practices Introduction IBM POWER8 100 GigE Adapter Best Practices With higher network speeds in new network adapters, achieving peak performance requires careful tuning of the adapters and workloads using them.

More information

IX: A Protected Dataplane Operating System for High Throughput and Low Latency

IX: A Protected Dataplane Operating System for High Throughput and Low Latency IX: A Protected Dataplane Operating System for High Throughput and Low Latency Adam Belay et al. Proc. of the 11th USENIX Symp. on OSDI, pp. 49-65, 2014. Presented by Han Zhang & Zaina Hamid Challenges

More information

double split driver model

double split driver model software defining system devices with the BANANA double split driver model Dan WILLIAMS, Hani JAMJOOM IBM Watson Research Center Hakim WEATHERSPOON Cornell University Decoupling gives Flexibility Cloud

More information

IX: A Protected Dataplane Operating System for High Throughput and Low Latency

IX: A Protected Dataplane Operating System for High Throughput and Low Latency IX: A Protected Dataplane Operating System for High Throughput and Low Latency Belay, A. et al. Proc. of the 11th USENIX Symp. on OSDI, pp. 49-65, 2014. Reviewed by Chun-Yu and Xinghao Li Summary In this

More information

Reducing CPU and network overhead for small I/O requests in network storage protocols over raw Ethernet

Reducing CPU and network overhead for small I/O requests in network storage protocols over raw Ethernet Reducing CPU and network overhead for small I/O requests in network storage protocols over raw Ethernet Pilar González-Férez and Angelos Bilas 31 th International Conference on Massive Storage Systems

More information

Software Routers: NetMap

Software Routers: NetMap Software Routers: NetMap Hakim Weatherspoon Assistant Professor, Dept of Computer Science CS 5413: High Performance Systems and Networking October 8, 2014 Slides from the NetMap: A Novel Framework for

More information

To Grant or Not to Grant

To Grant or Not to Grant To Grant or Not to Grant (for the case of Xen network drivers) João Martins Principal Software Engineer Virtualization Team July 11, 2017 Safe Harbor Statement The following is intended to outline our

More information

Introduction to Oracle VM (Xen) Networking

Introduction to Oracle VM (Xen) Networking Introduction to Oracle VM (Xen) Networking Dongli Zhang Oracle Asia Research and Development Centers (Beijing) dongli.zhang@oracle.com May 30, 2017 Dongli Zhang (Oracle) Introduction to Oracle VM (Xen)

More information

QuickSpecs. HP Z 10GbE Dual Port Module. Models

QuickSpecs. HP Z 10GbE Dual Port Module. Models Overview Models Part Number: 1Ql49AA Introduction The is a 10GBASE-T adapter utilizing the Intel X722 MAC and X557-AT2 PHY pairing to deliver full line-rate performance, utilizing CAT 6A UTP cabling (or

More information

Xen Community Update. Ian Pratt, Citrix Systems and Chairman of Xen.org

Xen Community Update. Ian Pratt, Citrix Systems and Chairman of Xen.org Xen Community Update Ian Pratt, Citrix Systems and Chairman of Xen.org 1 Outline Project Status Xen Client Initiative Xen Cloud Platform New Xen 4.0 Features 2 Announcement The Xen Advisory Board is excited

More information

Network device virtualization: issues and solutions

Network device virtualization: issues and solutions Network device virtualization: issues and solutions Ph.D. Seminar Report Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy by Debadatta Mishra Roll No: 114050005

More information

Optimizing Performance: Intel Network Adapters User Guide

Optimizing Performance: Intel Network Adapters User Guide Optimizing Performance: Intel Network Adapters User Guide Network Optimization Types When optimizing network adapter parameters (NIC), the user typically considers one of the following three conditions

More information

PCI Express x8 Single Port SFP+ 10 Gigabit Server Adapter (Intel 82599ES Based) Single-Port 10 Gigabit SFP+ Ethernet Server Adapters Provide Ultimate

PCI Express x8 Single Port SFP+ 10 Gigabit Server Adapter (Intel 82599ES Based) Single-Port 10 Gigabit SFP+ Ethernet Server Adapters Provide Ultimate NIC-PCIE-1SFP+-PLU PCI Express x8 Single Port SFP+ 10 Gigabit Server Adapter (Intel 82599ES Based) Single-Port 10 Gigabit SFP+ Ethernet Server Adapters Provide Ultimate Flexibility and Scalability in Virtual

More information

Implementation and Analysis of Large Receive Offload in a Virtualized System

Implementation and Analysis of Large Receive Offload in a Virtualized System Implementation and Analysis of Large Receive Offload in a Virtualized System Takayuki Hatori and Hitoshi Oi The University of Aizu, Aizu Wakamatsu, JAPAN {s1110173,hitoshi}@u-aizu.ac.jp Abstract System

More information

How to abstract hardware acceleration device in cloud environment. Maciej Grochowski Intel DCG Ireland

How to abstract hardware acceleration device in cloud environment. Maciej Grochowski Intel DCG Ireland How to abstract hardware acceleration device in cloud environment Maciej Grochowski Intel DCG Ireland Outline Introduction to Hardware Accelerators Intel QuickAssist Technology (Intel QAT) as example of

More information

HIGH-PERFORMANCE NETWORKING :: USER-LEVEL NETWORKING :: REMOTE DIRECT MEMORY ACCESS

HIGH-PERFORMANCE NETWORKING :: USER-LEVEL NETWORKING :: REMOTE DIRECT MEMORY ACCESS HIGH-PERFORMANCE NETWORKING :: USER-LEVEL NETWORKING :: REMOTE DIRECT MEMORY ACCESS CS6410 Moontae Lee (Nov 20, 2014) Part 1 Overview 00 Background User-level Networking (U-Net) Remote Direct Memory Access

More information

VALE: a switched ethernet for virtual machines

VALE: a switched ethernet for virtual machines L < > T H local VALE VALE -- Page 1/23 VALE: a switched ethernet for virtual machines Luigi Rizzo, Giuseppe Lettieri Università di Pisa http://info.iet.unipi.it/~luigi/vale/ Motivation Make sw packet processing

More information

IsoStack Highly Efficient Network Processing on Dedicated Cores

IsoStack Highly Efficient Network Processing on Dedicated Cores IsoStack Highly Efficient Network Processing on Dedicated Cores Leah Shalev Eran Borovik, Julian Satran, Muli Ben-Yehuda Outline Motivation IsoStack architecture Prototype TCP/IP over 10GE on a single

More information

Performance Considerations of Network Functions Virtualization using Containers

Performance Considerations of Network Functions Virtualization using Containers Performance Considerations of Network Functions Virtualization using Containers Jason Anderson, et al. (Clemson University) 2016 International Conference on Computing, Networking and Communications, Internet

More information

Virtualization, Xen and Denali

Virtualization, Xen and Denali Virtualization, Xen and Denali Susmit Shannigrahi November 9, 2011 Susmit Shannigrahi () Virtualization, Xen and Denali November 9, 2011 1 / 70 Introduction Virtualization is the technology to allow two

More information

An Intelligent NIC Design Xin Song

An Intelligent NIC Design Xin Song 2nd International Conference on Advances in Mechanical Engineering and Industrial Informatics (AMEII 2016) An Intelligent NIC Design Xin Song School of Electronic and Information Engineering Tianjin Vocational

More information

A low-overhead networking mechanism for virtualized high-performance computing systems

A low-overhead networking mechanism for virtualized high-performance computing systems J Supercomput (2012) 59:443 468 DOI 10.1007/s11227-010-0444-9 A low-overhead networking mechanism for virtualized high-performance computing systems Jae-Wan Jang Euiseong Seo Heeseung Jo Jin-Soo Kim Published

More information

Nested Virtualization Update From Intel. Xiantao Zhang, Eddie Dong Intel Corporation

Nested Virtualization Update From Intel. Xiantao Zhang, Eddie Dong Intel Corporation Nested Virtualization Update From Intel Xiantao Zhang, Eddie Dong Intel Corporation Legal Disclaimer INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED,

More information

RDMA-like VirtIO Network Device for Palacios Virtual Machines

RDMA-like VirtIO Network Device for Palacios Virtual Machines RDMA-like VirtIO Network Device for Palacios Virtual Machines Kevin Pedretti UNM ID: 101511969 CS-591 Special Topics in Virtualization May 10, 2012 Abstract This project developed an RDMA-like VirtIO network

More information

Fast packet processing in the cloud. Dániel Géhberger Ericsson Research

Fast packet processing in the cloud. Dániel Géhberger Ericsson Research Fast packet processing in the cloud Dániel Géhberger Ericsson Research Outline Motivation Service chains Hardware related topics, acceleration Virtualization basics Software performance and acceleration

More information

NIC-PCIE-4RJ45-PLU PCI Express x4 Quad Port Copper Gigabit Server Adapter (Intel I350 Based)

NIC-PCIE-4RJ45-PLU PCI Express x4 Quad Port Copper Gigabit Server Adapter (Intel I350 Based) NIC-PCIE-4RJ45-PLU PCI Express x4 Quad Port Copper Gigabit Server Adapter (Intel I350 Based) Quad-port Gigabit Ethernet server adapters designed with performance enhancing features and new power management

More information

The Convergence of Storage and Server Virtualization Solarflare Communications, Inc.

The Convergence of Storage and Server Virtualization Solarflare Communications, Inc. The Convergence of Storage and Server Virtualization 2007 Solarflare Communications, Inc. About Solarflare Communications Privately-held, fabless semiconductor company. Founded 2001 Top tier investors:

More information

Much Faster Networking

Much Faster Networking Much Faster Networking David Riddoch driddoch@solarflare.com Copyright 2016 Solarflare Communications, Inc. All rights reserved. What is kernel bypass? The standard receive path The standard receive path

More information

Advanced Computer Networks. End Host Optimization

Advanced Computer Networks. End Host Optimization Oriana Riva, Department of Computer Science ETH Zürich 263 3501 00 End Host Optimization Patrick Stuedi Spring Semester 2017 1 Today End-host optimizations: NUMA-aware networking Kernel-bypass Remote Direct

More information

Kernel Bypass. Sujay Jayakar (dsj36) 11/17/2016

Kernel Bypass. Sujay Jayakar (dsj36) 11/17/2016 Kernel Bypass Sujay Jayakar (dsj36) 11/17/2016 Kernel Bypass Background Why networking? Status quo: Linux Papers Arrakis: The Operating System is the Control Plane. Simon Peter, Jialin Li, Irene Zhang,

More information

vnetwork Future Direction Howie Xu, VMware R&D November 4, 2008

vnetwork Future Direction Howie Xu, VMware R&D November 4, 2008 vnetwork Future Direction Howie Xu, VMware R&D November 4, 2008 Virtual Datacenter OS from VMware Infrastructure vservices and Cloud vservices Existing New - roadmap Virtual Datacenter OS from VMware Agenda

More information

I/O Scalability in Xen

I/O Scalability in Xen I/O Scalability in Xen Kevin Tian kevin.tian@intel.com Eddie Dong eddie.dong@intel.com Yang Zhang yang.zhang@intel.com Sponsored by: & & Agenda Overview of I/O Scalability Issues Excessive Interrupts Hurt

More information

Optimizing the GigE transfer What follows comes from company Pleora.

Optimizing the GigE transfer What follows comes from company Pleora. Optimizing the GigE transfer What follows comes from company Pleora. Selecting a NIC and Laptop Based on our testing, we recommend Intel NICs. In particular, we recommend the PRO 1000 line of Intel PCI

More information

Abstract. Testing Parameters. Introduction. Hardware Platform. Native System

Abstract. Testing Parameters. Introduction. Hardware Platform. Native System Abstract In this paper, we address the latency issue in RT- XEN virtual machines that are available in Xen 4.5. Despite the advantages of applying virtualization to systems, the default credit scheduler

More information

Background. IBM sold expensive mainframes to large organizations. Monitor sits between one or more OSes and HW

Background. IBM sold expensive mainframes to large organizations. Monitor sits between one or more OSes and HW Virtual Machines Background IBM sold expensive mainframes to large organizations Some wanted to run different OSes at the same time (because applications were developed on old OSes) Solution: IBM developed

More information

Xen and the Art of Virtualization

Xen and the Art of Virtualization Xen and the Art of Virtualization Paul Barham, Boris Dragovic, Keir Fraser, Steven Hand, Tim Harris, Alex Ho, Rolf Neugebauer, Ian Pratt, Andrew Warfield Presented by Thomas DuBuisson Outline Motivation

More information

Arrakis: The Operating System is the Control Plane

Arrakis: The Operating System is the Control Plane Arrakis: The Operating System is the Control Plane Simon Peter, Jialin Li, Irene Zhang, Dan Ports, Doug Woos, Arvind Krishnamurthy, Tom Anderson University of Washington Timothy Roscoe ETH Zurich Building

More information

XCo: Explicit Coordination for Preventing Congestion in Data Center Ethernet

XCo: Explicit Coordination for Preventing Congestion in Data Center Ethernet XCo: Explicit Coordination for Preventing Congestion in Data Center Ethernet Vijay Shankar Rajanna, Smit Shah, Anand Jahagirdar and Kartik Gopalan Computer Science, State University of New York at Binghamton

More information

DPDK Summit China 2017

DPDK Summit China 2017 Summit China 2017 Embedded Network Architecture Optimization Based on Lin Hao T1 Networks Agenda Our History What is an embedded network device Challenge to us Requirements for device today Our solution

More information

Live Migration: Even faster, now with a dedicated thread!

Live Migration: Even faster, now with a dedicated thread! Live Migration: Even faster, now with a dedicated thread! Juan Quintela Orit Wasserman Vinod Chegu KVM Forum 2012 Agenda Introduction Migration

More information

Part 1: Introduction to device drivers Part 2: Overview of research on device driver reliability Part 3: Device drivers research at ERTOS

Part 1: Introduction to device drivers Part 2: Overview of research on device driver reliability Part 3: Device drivers research at ERTOS Some statistics 70% of OS code is in device s 3,448,000 out of 4,997,000 loc in Linux 2.6.27 A typical Linux laptop runs ~240,000 lines of kernel code, including ~72,000 loc in 36 different device s s

More information

SR-IOV Networking in Xen: Architecture, Design and Implementation

SR-IOV Networking in Xen: Architecture, Design and Implementation SR-IOV Networking in Xen: Architecture, Design and Implementation Yaozu Dong, Zhao Yu and Greg Rose Abstract. SR-IOV capable network devices offer the benefits of direct I/O throughput and reduced CPU

More information

G Robert Grimm New York University

G Robert Grimm New York University G22.3250-001 Receiver Livelock Robert Grimm New York University Altogether Now: The Three Questions What is the problem? What is new or different? What are the contributions and limitations? Motivation

More information

Data Path acceleration techniques in a NFV world

Data Path acceleration techniques in a NFV world Data Path acceleration techniques in a NFV world Mohanraj Venkatachalam, Purnendu Ghosh Abstract NFV is a revolutionary approach offering greater flexibility and scalability in the deployment of virtual

More information

打造 Linux 下的高性能网络 北京酷锐达信息技术有限公司技术总监史应生.

打造 Linux 下的高性能网络 北京酷锐达信息技术有限公司技术总监史应生. 打造 Linux 下的高性能网络 北京酷锐达信息技术有限公司技术总监史应生 shiys@solutionware.com.cn BY DEFAULT, LINUX NETWORKING NOT TUNED FOR MAX PERFORMANCE, MORE FOR RELIABILITY Trade-off :Low Latency, throughput, determinism Performance

More information

Learning with Purpose

Learning with Purpose Network Measurement for 100Gbps Links Using Multicore Processors Xiaoban Wu, Dr. Peilong Li, Dr. Yongyi Ran, Prof. Yan Luo Department of Electrical and Computer Engineering University of Massachusetts

More information

Large Receive Offload implementation in Neterion 10GbE Ethernet driver

Large Receive Offload implementation in Neterion 10GbE Ethernet driver Large Receive Offload implementation in Neterion 10GbE Ethernet driver Leonid Grossman Neterion, Inc. leonid@neterion.com Abstract 1 Introduction The benefits of TSO (Transmit Side Offload) implementation

More information

Network Adapters. FS Network adapter are designed for data center, and provides flexible and scalable I/O solutions. 10G/25G/40G Ethernet Adapters

Network Adapters. FS Network adapter are designed for data center, and provides flexible and scalable I/O solutions. 10G/25G/40G Ethernet Adapters Network Adapters IDEAL FOR DATACENTER, ENTERPRISE & ISP NETWORK SOLUTIONS FS Network adapter are designed for data center, and provides flexible and scalable I/O solutions. 10G/25G/40G Ethernet Adapters

More information

XE1-P241. XE1-P241 PCI Express PCIe x4 Dual SFP Port Gigabit Server Adapter (Intel I350 Based) Product Highlight

XE1-P241. XE1-P241 PCI Express PCIe x4 Dual SFP Port Gigabit Server Adapter (Intel I350 Based) Product Highlight Product Highlight o Halogen-free dual-port Gigabit Ethernet adapters with fiber interface options o Innovative power management features including Energy Efficient Ethernet (EEE) and DMA Coalescing for

More information

ROB IN Performance Measurements

ROB IN Performance Measurements ROB IN Performance Measurements I. Mandjavidze CEA Saclay, 91191 Gif-sur-Yvette CEDEX, France ROB Complex Hardware Organisation Mode of Operation ROB Complex Software Organisation Performance Measurements

More information

Latest Developments with NVMe/TCP Sagi Grimberg Lightbits Labs

Latest Developments with NVMe/TCP Sagi Grimberg Lightbits Labs Latest Developments with NVMe/TCP Sagi Grimberg Lightbits Labs 2018 Storage Developer Conference. Insert Your Company Name. All Rights Reserved. 1 NVMe-oF - Short Recap Early 2014: Initial NVMe/RDMA pre-standard

More information
