Linux Kernel forwarding. Masakazu Ginzado Co., Ltd.

Transcription

1 Linux Kernel forwarding Masakazu Ginzado Co., Ltd.

2 Agenda Cisco PC-Router Hardware Cisco PC-Router Software Cisco forwarding Linux Kernel forwarding Linux Kernel forwarding Linux PC-Router

3 Hardware Cisco PC-Router Memory Memory Memory Forwarding ASIC CPU

4 Software
                  Cisco          PC-Router
Protection        Loose          Strict
Multi-tasking     Co-operative   Pre-emptive
Multi-Processor   UP             SMP

5 Cisco Process Switching ip_input(process) ip_input(process) Processor Routing Table / / ARP Table I/O memory cbb Interface Processor Interface Processor Network media Network media

6 Cisco Fast Switching ip_input(process) ip_input(process) Processor Route Cache Routing Table / / I/O memory ARP Table Interface Processor Interface Processor cbb Network media Network media

7 Cisco Express Forwarding CEF Table 1 octet octet octet octet octet octet octet Adjacency Table cbb076 FE0/ FE1/0
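The CEF slide shows a table indexed one octet at a time, with entries pointing into an adjacency table. A minimal sketch of that idea, assuming a 256-way trie with one level per octet and shorter prefixes expanded across the slots they cover. The names `MtrieNode`, `insert` and `lookup`, and the adjacency labels in the test, are illustrative and not taken from Cisco IOS.

```python
import ipaddress


class MtrieNode:
    """One trie level: 256 slots, one per possible octet value."""

    def __init__(self):
        self.children = [None] * 256  # next-level node, or None
        self.leaves = [None] * 256    # (prefix_len, adjacency), or None


def insert(root, prefix, adjacency):
    """Install a prefix, expanding it over every slot it covers."""
    net = ipaddress.ip_network(prefix)
    octets = net.network_address.packed
    plen = net.prefixlen
    node, depth = root, 0
    # Descend through the octets that the prefix covers completely,
    # stopping at the level where the prefix ends.
    while plen - 8 * depth > 8:
        o = octets[depth]
        if node.children[o] is None:
            node.children[o] = MtrieNode()
        node = node.children[o]
        depth += 1
    bits = plen - 8 * depth           # 0..8 prefix bits at this level
    start = octets[depth] if bits else 0
    for slot in range(start, start + (1 << (8 - bits))):
        old = node.leaves[slot]
        # Never overwrite a more specific prefix already in the slot.
        if old is None or old[0] <= plen:
            node.leaves[slot] = (plen, adjacency)


def lookup(root, addr):
    """Walk at most four levels and return the longest match."""
    best = None
    node = root
    for o in ipaddress.ip_address(addr).packed:
        if node.leaves[o] is not None:
            best = node.leaves[o][1]
        node = node.children[o]
        if node is None:
            break
    return best
```

The lookup cost is bounded by four array indexings regardless of how many prefixes are installed, which is the property that makes this layout attractive for forwarding.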

8 Hardware ASIC Forwarding Switch Fabric Interface FIA FIA RAWQ PSA memory Rx Buffer Manager PSA FIFO burst memory RAWQ Physical Layer Interface Module Tx Buffer Manager CPU CEF table FIA Fabric Interface ASIC PSA Packet Switching ASIC RawQ SRAM CEF table DRAM

9 Software vs Hardware: chart of forwarding rate in PPS (axis 0 to 400 Mpps) for Cisco 7200 and Cisco 7600

10 Linux Kernel Forwarding ip_rcv ip_rcv_finish ip_forward ip_forward_finish ip_send ip_route_input fib_lookup ip_output rt_hash_code IP Processing ip_finish_output Poll_Queue CPU1 Device 1 Device 2 Device 3 NAPI netif_receive_skb e1000_alloc_rx_buffers e1000_clean_rx_irq Rx_Ring device 3 eth_header eth_type_trans alloc_skb Kernel memory DMA engines dev_queue_xmit e1000_clean_tx_irq Completion queue kfree Root Qdisc device 2 Tx_Ring device 2 qdisc_restart hard_start_xmit e1000_xmit_frame TX-API net_rx_action Interrupt handler net_tx_action H/W Interrupt

11 Receive live-lock & NAPI Old API New API H/W IRQ CPU CPU H/W IRQ H/W IRQ disable IRQ polling packets enable IRQ NIC NIC Packet Packet
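The difference between the old and new API can be sketched as a small simulation: the first packet raises an interrupt, the handler masks further interrupts, and the ring is then drained by polling until a poll finishes under its budget. This is a simplified model, not kernel code; the class `Nic` and its fields are hypothetical.

```python
from collections import deque


class Nic:
    """Toy model of a NIC driven in NAPI style."""

    def __init__(self):
        self.rx_ring = deque()
        self.irq_enabled = True
        self.irq_count = 0

    def receive(self, pkt):
        """A packet arrives; it interrupts only if the IRQ is unmasked."""
        self.rx_ring.append(pkt)
        if self.irq_enabled:
            self.irq_count += 1
            self.irq_enabled = False  # handler masks the IRQ, schedules poll

    def poll(self, budget):
        """Process up to `budget` packets from the ring."""
        done = []
        while self.rx_ring and len(done) < budget:
            done.append(self.rx_ring.popleft())
        if len(done) < budget:
            # Ring drained before the budget ran out: leave polling mode.
            self.irq_enabled = True
        return done
```

Under load, a burst of packets costs one interrupt instead of one per packet, which is how polling avoids receive live-lock.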

12 struct rt_hash_table struct rt_hash_table[] struct rtable struct rtable Src IP Dst IP ifindex struct neighbour
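The routing cache on this slide is a hash table of entries keyed on source IP, destination IP and interface index. A rough sketch of how such a cache behaves, assuming chained buckets and a fall-back to a slower full lookup on a miss. `RouteCache` and `slow_lookup` are hypothetical names, and Python's built-in `hash` stands in for the kernel's `rt_hash_code`.

```python
class RouteCache:
    """Toy route cache keyed on (src, dst, ifindex)."""

    def __init__(self, n_buckets, slow_lookup):
        self.buckets = [[] for _ in range(n_buckets)]
        self.slow_lookup = slow_lookup  # stand-in for the full FIB lookup
        self.hits = 0
        self.misses = 0

    def route(self, src, dst, ifindex):
        key = (src, dst, ifindex)
        chain = self.buckets[hash(key) % len(self.buckets)]
        for cached_key, nexthop in chain:
            if cached_key == key:
                self.hits += 1
                return nexthop
        # Miss: take the slow path and cache the result for next time.
        self.misses += 1
        nexthop = self.slow_lookup(dst)
        chain.append((key, nexthop))
        return nexthop
```

With a fixed destination every packet after the first is a hit; with random destinations almost every packet misses and also allocates a new entry.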

13 struct fib_table struct fib_table struct hlist_head struct fn_hash prefixlen 0 prefixlen 23 Prefix struct fn_zone struct fib_node prefixlen 24 struct hlist_head prefixlen 32 struct fn_zone *fn_zone_list
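The structure on this slide groups routes into one zone per prefix length, each zone holding a hash of prefixes. A minimal sketch of longest-prefix matching over such zones, assuming zones are scanned from the longest prefix length to the shortest. `FibHash` is an illustrative name, and a Python dict stands in for each zone's hash table.

```python
import ipaddress


class FibHash:
    """Toy FIB: one hash table ("zone") per prefix length."""

    def __init__(self):
        self.zones = {}  # prefixlen -> {network address as int: next hop}

    def add(self, prefix, nexthop):
        net = ipaddress.ip_network(prefix)
        zone = self.zones.setdefault(net.prefixlen, {})
        zone[int(net.network_address)] = nexthop

    def lookup(self, addr):
        a = int(ipaddress.ip_address(addr))
        # Try the most specific zone first; the first hit is the answer.
        for plen in sorted(self.zones, reverse=True):
            mask = (0xFFFFFFFF << (32 - plen)) & 0xFFFFFFFF
            hit = self.zones[plen].get(a & mask)
            if hit is not None:
                return hit
        return None
```

Each lookup costs one hash probe per populated prefix length, so a table with many distinct prefix lengths is slower to search than one that holds only a default route.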

14 Test cases (64-byte UDP packets)
DEFAULT RT/FIXED DST: route add default to; about 850 kpps
FULL RT/FIXED DST: route add /16 to, route add /16 to; about 850 kpps
DEFAULT RT/RANDOM DST: route add default to; about 270 kpps
FULL RT/RANDOM DST: route add /16 to, route add /16 to; about 270 kpps
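To put the measured rates in context, they can be compared with the theoretical maximum frame rate of the link. The sketch below assumes Gigabit Ethernet and a 64-byte frame on the wire; neither is stated on the slide. Each frame carries 20 extra bytes of preamble, start-of-frame delimiter and inter-frame gap, giving about 1,488,095 frames per second at 1 Gbit/s.

```python
def line_rate_pps(link_bps, frame_bytes):
    """Maximum Ethernet frames per second for a given frame size."""
    overhead = 8 + 12  # preamble + SFD (8 bytes), inter-frame gap (12 bytes)
    return link_bps / ((frame_bytes + overhead) * 8)


def fraction_of_line_rate(measured_pps, link_bps, frame_bytes):
    """Measured rate as a fraction of the theoretical maximum."""
    return measured_pps / line_rate_pps(link_bps, frame_bytes)
```

Under these assumptions, 850 kpps is roughly 57% of line rate and 270 kpps roughly 18%.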

15 Chart: forwarding rate (PPS) for the four test cases (DEFAULT RT/FIXED DST, DEFAULT RT/RANDOM DST, FULL RT/FIXED DST, FULL RT/RANDOM DST) on VC6beta E6405, CentOS5 E6405, FreeBSD8 E6405, OpenBSD46 E6405 and CentOS5 L5520. E6405: Core2 Duo 2.13GHz. L5520: Xeon 2.27GHz.

16 Kernel Oprofile result by symbols: stacked percentage chart (0% to 100%) for DEF RT/FIXED DST, FULL RT/FIXED DST, DEF RT/RAND DST and FULL RT/RAND DST. Symbols: tg3_poll, ip_route_input, acpi_processor_idle, tg3_start_xmit_dma_bug, kmem_cache_free, fn_hash_lookup, ip_output, netif_receive_skb, qdisc_run, kmalloc, ip_rcv, dst_destroy, kfree, acpi_os_read_port, dst_alloc, kmem_cache_alloc, dev_queue_xmit, rt_may_expire, OTHERS.

17 Kernel Oprofile result by apps: stacked percentage chart (0% to 100%) for DEF RT/FIXED DST, FULL RT/FIXED DST, DEF RT/RAND DST and FULL RT/RAND DST. Modules and source files: tg3.ko, route.c, slab.c, processor_idle.c, dev.c, fib_hash.c, dst.c, sch_generic.c, ip_output.c, skbuff.c, ip_input.c, osl.c, neighbour.c, eth.c, list_debug.c, rcupdate.c, fib_rules.c, softirq.c, ip_forward.c, OTHERS.

18 Ethernet flow control Ethernet Switch PAUSE Frame NIC Buffer is full... Rx Ring
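The PAUSE frame that a NIC sends when its receive buffer fills is an IEEE 802.3x MAC control frame. A short sketch of its layout follows: the destination is the reserved multicast address 01:80:C2:00:00:01, the EtherType is 0x8808, the opcode is 0x0001, and the pause time is counted in quanta of 512 bit times. The function name and the source MAC used in any call are illustrative.

```python
import struct

PAUSE_DST = bytes.fromhex("0180c2000001")  # reserved MAC control multicast
MAC_CONTROL_ETHERTYPE = 0x8808
PAUSE_OPCODE = 0x0001


def build_pause_frame(src_mac, quanta):
    """Return a PAUSE frame without FCS, padded to the 60-byte minimum."""
    if len(src_mac) != 6:
        raise ValueError("src_mac must be 6 bytes")
    if not 0 <= quanta <= 0xFFFF:
        raise ValueError("quanta must fit in 16 bits")
    header = PAUSE_DST + src_mac + struct.pack("!H", MAC_CONTROL_ETHERTYPE)
    payload = struct.pack("!HH", PAUSE_OPCODE, quanta)
    return (header + payload).ljust(60, b"\x00")
```

A quanta value of 0 tells the sender to resume immediately, and 0xFFFF requests the longest pause.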

19 irqbalance CPU #1 CPU #2 from NIC #1 to CPU #1 from NIC #2 to CPU #2 APIC irqbalance Interrupt NIC #1 NIC #2
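The mapping on this slide (NIC #1 to CPU #1, NIC #2 to CPU #2) can also be set by hand. On Linux the affinity of an interrupt is a hexadecimal CPU bitmask written to `/proc/irq/<n>/smp_affinity`. The sketch below only computes the mask string and does not touch `/proc`. The helper name is hypothetical, and it assumes the slide's CPU #1 and CPU #2 correspond to kernel CPU indices 0 and 1.

```python
def affinity_mask(cpus):
    """Return the hex bitmask string selecting the given CPU indices."""
    mask = 0
    for cpu in cpus:
        if cpu < 0:
            raise ValueError("CPU index must be non-negative")
        mask |= 1 << cpu
    return format(mask, "x")
```

For example, pinning one NIC's interrupt to CPU index 0 uses mask `1` and the other NIC's to CPU index 1 uses mask `2`. Pinning by hand is normally done with the irqbalance daemon stopped, since it may otherwise rewrite the masks.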

20 UMA vs NUMA Core MA Nehalem CPU #1 CPU #2 CPU #1 CPU #2 Core0 Core1 Core0 Core1 Core0 Core1 Core0 Core1 Core2 Core3 Core2 Core3 Core2 Core3 Core2 Core3 L2 Cache L2 Cache L2 Cache L2 Cache L3 Cache L3 Cache FSB 10.66GB/s 10.66GB/s x 3 QPI 25.6GB/s Chipset Memory 32GB/s Chipset Memory 32GB/s 10.66GB/s x 2 Memory 21GB/s

21 ASIC

22 ... Today's high-end PCs are equipped with peripheral component interconnect (PCI) shared buses that fit into the multi-gigabit-per-second routing segment, for a price much lower than that of commercial routers. However, commercially available PC Network Interface Cards (NICs) lack programmability, and require not only packets to cross the PCI bus twice, but also to process them in software by the operating system, reducing routing performance. In this paper we describe and assess the performance of an FPGA-based NIC developed to overcome the main limitations of NICs commercially available today. --- Andrea Bianco, Robert Birke, Jorge M. Finochietto, Giulio Galante, Marco Mellia, Fabio Neri, Michele Petracca, "Boosting the performance of PC-based software routers with FPGA-enhanced line cards"

23 ASIC vs FPGA. ASIC: Application Specific Integrated Circuit. FPGA: Field Programmable Gate Array. SRAM

24 KIT STARTER CYCLONE IV GX ALTERA $395.00

25 HDL (Hardware Description Language). Diagram labels: Memory, CPU, PCI Exp bus, Slow Path, Fast Path, PCI Exp, FPGA, Eth Phy.

library IEEE;
use IEEE.std_logic_1164.all;

entity RS_FF is
  port (
    R, S   : in  std_logic;
    Q, Q_B : out std_logic
  );
end RS_FF;

architecture RS_FF of RS_FF is
  signal R_S : std_logic_vector(1 downto 0);
begin
  R_S <= R & S;
  process (R_S) begin
    case R_S is
      when "01" => Q <= '1'; Q_B <= '0';
      ...

26 HDL

27 Benchmark by Intel: chart of forwarding rate (axis 0 to 3.0 Mpps) and throughput (axis 0 to 20.0 Gbps) against packet size. Performed by Vyatta Community Edition software with a single Intel Xeon processor 5540 and Intel 82599EB 10 Gigabit Ethernet Controller. Source: Integrating Services at the Edge

28
Inside Cisco IOS Software Architecture (CCIE Professional Development) - http://my.safaribooksonline.com/
Cisco Express Forwarding - http://my.safaribooksonline.com/
Linux Kernel Source Code
Linux Software Router Data Plane Optimization and Performance Evaluation - http://citeseerx.ist.psu.edu/viewdoc/download?doi= &rep=rep1&type=pdf
Eliminating Receive Livelock in an Interrupt-driven Kernel
Boosting the performance of PC-based software routers with FPGA-enhanced line cards
Integrating Services at the Edge - http://edc.intel.com/download.aspx?id=2977&returnurl=/default.aspx
Vyatta Replacement Guide - Cisco


I/O Handling. ECE 650 Systems Programming & Engineering Duke University, Spring Based on Operating Systems Concepts, Silberschatz Chapter 13 I/O Handling ECE 650 Systems Programming & Engineering Duke University, Spring 2018 Based on Operating Systems Concepts, Silberschatz Chapter 13 Input/Output (I/O) Typical application flow consists of

More information

Non-uniform memory access (NUMA)

Non-uniform memory access (NUMA) Non-uniform memory access (NUMA) Memory access between processor core to main memory is not uniform. Memory resides in separate regions called NUMA domains. For highest performance, cores should only access

More information

100 Gbps Open-Source Software Router? It's Here. Jim Thompson, CTO, Netgate

100 Gbps Open-Source Software Router? It's Here. Jim Thompson, CTO, Netgate 100 Gbps Open-Source Software Router? It's Here. Jim Thompson, CTO, Netgate @gonzopancho Agenda Edge Router Use Cases Need for Speed Cost, Flexibility, Control, Evolution The Engineering Challenge Solution

More information

How to Build a 100 Gbps DDoS Traffic Generator

How to Build a 100 Gbps DDoS Traffic Generator How to Build a 100 Gbps DDoS Traffic Generator DIY with a Single Commodity-off-the-shelf Server (COTS) Surasak Sanguanpong Surasak.S@ku.ac.th DISCLAIMER THE FOLLOWING CONTENTS HAS BEEN APPROVED FOR APPROPIATE

More information

Router Architectures

Router Architectures Router Architectures Venkat Padmanabhan Microsoft Research 13 April 2001 Venkat Padmanabhan 1 Outline Router architecture overview 50 Gbps multi-gigabit router (Partridge et al.) Technology trends Venkat

More information

INT G bit TCP Offload Engine SOC

INT G bit TCP Offload Engine SOC INT 10011 10 G bit TCP Offload Engine SOC Product brief, features and benefits summary: Highly customizable hardware IP block. Easily portable to ASIC flow, Xilinx/Altera FPGAs or Structured ASIC flow.

More information

Introduction Electrical Considerations Data Transfer Synchronization Bus Arbitration VME Bus Local Buses PCI Bus PCI Bus Variants Serial Buses

Introduction Electrical Considerations Data Transfer Synchronization Bus Arbitration VME Bus Local Buses PCI Bus PCI Bus Variants Serial Buses Introduction Electrical Considerations Data Transfer Synchronization Bus Arbitration VME Bus Local Buses PCI Bus PCI Bus Variants Serial Buses 1 Most of the integrated I/O subsystems are connected to the

More information

I/O Channels. RAM size. Chipsets. Cluster Computing Paul A. Farrell 9/8/2011. Memory (RAM) Dept of Computer Science Kent State University 1

I/O Channels. RAM size. Chipsets. Cluster Computing Paul A. Farrell 9/8/2011. Memory (RAM) Dept of Computer Science Kent State University 1 Memory (RAM) Standard Industry Memory Module (SIMM) RDRAM and SDRAM Access to RAM is extremely slow compared to the speed of the processor Memory busses (front side busses FSB) run at 100MHz to 800MHz

More information

Using (Suricata over) PF_RING for NIC-Independent Acceleration

Using (Suricata over) PF_RING for NIC-Independent Acceleration Using (Suricata over) PF_RING for NIC-Independent Acceleration Luca Deri Alfredo Cardigliano Outlook About ntop. Introduction to PF_RING. Integrating PF_RING with

More information

LHCb on-line / off-line computing. 4

LHCb on-line / off-line computing. 4 Off-line computing LHCb on-line / off-line computing Domenico Galli, Bologna INFN CSN Lecce, 24923 We plan LHCb-Italy off-line computing resources to be as much centralized as possible Put as much computing

More information

Topic & Scope. Content: The course gives

Topic & Scope. Content: The course gives Topic & Scope Content: The course gives an overview of network processor cards (architectures and use) an introduction of how to program Intel IXP network processors some ideas of how to use network processors

More information

Motivation to Teach Network Hardware

Motivation to Teach Network Hardware NetFPGA: An Open Platform for Gigabit-rate Network Switching and Routing John W. Lockwood, Nick McKeown Greg Watson, Glen Gibb, Paul Hartke, Jad Naous, Ramanan Raghuraman, and Jianying Luo JWLockwd@stanford.edu

More information

I/O Management Intro. Chapter 5

I/O Management Intro. Chapter 5 I/O Management Intro Chapter 5 1 Learning Outcomes A high-level understanding of the properties of a variety of I/O devices. An understanding of methods of interacting with I/O devices. An appreciation

More information

PCI Performance on the RC32334/RC32332

PCI Performance on the RC32334/RC32332 PCI Performance on the RC32334/RC32332 Application Note AN-367 By Rakesh Bhatia and Pallathu Sadik Revision History April 19, 2002: Initial publication. September 4, 2002: Updated for revision YC silicon.

More information

ARISTA: Improving Application Performance While Reducing Complexity

ARISTA: Improving Application Performance While Reducing Complexity ARISTA: Improving Application Performance While Reducing Complexity October 2008 1.0 Problem Statement #1... 1 1.1 Problem Statement #2... 1 1.2 Previous Options: More Servers and I/O Adapters... 1 1.3

More information

William Stallings Computer Organization and Architecture 10 th Edition Pearson Education, Inc., Hoboken, NJ. All rights reserved.

William Stallings Computer Organization and Architecture 10 th Edition Pearson Education, Inc., Hoboken, NJ. All rights reserved. + William Stallings Computer Organization and Architecture 10 th Edition 2016 Pearson Education, Inc., Hoboken, NJ. All rights reserved. 2 + Chapter 3 A Top-Level View of Computer Function and Interconnection

More information

Support for Smart NICs. Ian Pratt

Support for Smart NICs. Ian Pratt Support for Smart NICs Ian Pratt Outline Xen I/O Overview Why network I/O is harder than block Smart NIC taxonomy How Xen can exploit them Enhancing Network device channel NetChannel2 proposal I/O Architecture

More information

Implementing Software Virtual Routers on Multi-core PCs using Click

Implementing Software Virtual Routers on Multi-core PCs using Click Implementing Software Virtual Routers on Multi-core PCs using Click Mickaël Hoerdt, Dept. of computer engineering Université catholique de Louvain la neuve mickael.hoerdt@uclouvain.be LANCASTER UNIVERSITY

More information

The Convergence of Storage and Server Virtualization Solarflare Communications, Inc.

The Convergence of Storage and Server Virtualization Solarflare Communications, Inc. The Convergence of Storage and Server Virtualization 2007 Solarflare Communications, Inc. About Solarflare Communications Privately-held, fabless semiconductor company. Founded 2001 Top tier investors:

More information

FPGAs and Networking

FPGAs and Networking FPGAs and Networking Marc Kelly & Richard Hughes-Jones University of Manchester 12th July 27 1 Overview of Work Looking into the usage of FPGA's to directly connect to Ethernet for DAQ readout purposes.

More information

Improving Packet Processing Performance of a Memory- Bounded Application

Improving Packet Processing Performance of a Memory- Bounded Application Improving Packet Processing Performance of a Memory- Bounded Application Jörn Schumacher CERN / University of Paderborn, Germany jorn.schumacher@cern.ch On behalf of the ATLAS FELIX Developer Team LHCb

More information

CSCD 330 Network Programming

CSCD 330 Network Programming CSCD 330 Network Programming Network Superhighway Spring 2018 Lecture 13 Network Layer Reading: Chapter 4 Some slides provided courtesy of J.F Kurose and K.W. Ross, All Rights Reserved, copyright 1996-2007

More information

- Knowledge of basic computer architecture and organization, ECE 445

- Knowledge of basic computer architecture and organization, ECE 445 ECE 446: Device Driver Development Fall 2014 Wednesdays 7:20-10 PM Office hours: Wednesdays 6:15-7:15 PM or by appointment, Adjunct office Engineering Building room 3707/3708 Last updated: 8/24/14 Instructor:

More information