DTN End Host performance and tuning


1 DTN End Host performance and tuning: 100 Gigabit Ethernet & NVMe Disks. Richard Hughes-Jones, Senior Network Advisor, Office of the CTO, GÉANT Association. Cambridge Workshop: Moving My Data at High Speeds over the Network, Prague, 12 Jun 2016

2 The GÉANT DTN. A story of how we explored the hardware and what we found. Almost a technical report of progress. Not really a teach-in, but we do give some hints for tuning. Welcome input.

3 The GÉANT DTN Hardware. A lot of help from Boston Labs (London UK) and Mellanox (UK & Israel). Supermicro X10DRT-i+ motherboard. Two 6-core 3.4 GHz Xeon E5 v3 processors. Mellanox ConnectX-4 100 GE NIC, 16-lane PCIe, as many interrupts as cores, driver MLNX_OFED. NVMe SSD set, 8-lane PCIe. Fedora 23 with the fc23.x86_64 kernel.

4 Explore the Hardware. How are the peripherals connected? NUMA: which PCIe interface & bus is connected to which CPU socket (node)? To which core do the IRQs go? Tools: dmesg; lspci -tv; lspci -vv (for the PCIe slot); numactl -H; look in /sys & /proc; cat /proc/irq/<irq>/smp_affinity. [Console output: show_irq_affinity.sh for enp131s0f1 listing the affinity mask of each of IRQs 183-194, and an lspci -tv device tree showing the Intel QPI links, the Mellanox MT27500 (ConnectX-3) and MT27700 (ConnectX-4) NICs, the Intel X540-AT2 10-Gigabit Ethernet controllers and the Intel PCIe Data Center SSDs hanging off the two PCIe root complexes.]
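A quick way to make these checks from the command line is sketched below; the interface name enp131s0f1 is just the example from this host, so substitute your own.

    # Which NUMA node is the NIC attached to? (-1 means no NUMA info exported)
    cat /sys/class/net/enp131s0f1/device/numa_node

    # Which CPU cores belong to which node?
    numactl -H | grep cpus

    # Where do the NIC's interrupts currently go?
    for irq in $(awk '/enp131s0f1/ {sub(":","",$1); print $1}' /proc/interrupts); do
        echo "IRQ $irq -> CPU mask $(cat /proc/irq/$irq/smp_affinity)"
    done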

5 udpmon: UDP Achievable Throughput. Ideal shape: flat portions limited by the capacity of the link, i.e. the available bandwidth on a loaded link; otherwise the shape follows 1/t, so the packet spacing is what matters most. [Plot: receive wire rate in Mbit/s vs spacing between frames in µs, Mumbai-Singapore, one curve per user data size from 50 up to 1472 bytes.] Cannot send packets back-to-back; the end host limits: NIC setup time on PCI / context switches.
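As a rough sanity check on these curves (my arithmetic, not from the slide): the wire rate for a given spacing follows from the frame size plus the per-packet overhead, which for UDP/IPv4 over Ethernet is 28 bytes of UDP+IP headers plus 38 bytes of Ethernet framing (header, FCS, preamble and inter-frame gap):

    wire rate = (user data + 28 + 38) bytes * 8 / spacing

    e.g. 8972 bytes of user data -> 9038 bytes on the wire = 72,304 bits,
    so a spacing of 10 us gives ~7.2 Gbit/s, and the smallest spacing that
    100 Gbit/s line rate allows is 72,304 bits / 100 Gbit/s ~ 0.72 us.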

6 udpmon on Boston Lab hosts: Achievable Throughput & Packet loss. Move IRQs away from core 11, set affinity to lock udpmon to core 11 of node 1. Interrupt coalescence on (3 µs). [Plots, haswell1-2_x_nobackground_5nov15: receive wire rate in Mbit/s and % CPU in kernel mode for send and receive vs spacing between frames in µs, for 8972-byte packets.] About 96% kernel mode when sending flat out: swapping between user and kernel mode.

7 udpmon on the GÉANT DTN: Achievable Throughput & Packet loss. Move IRQs away from core 6, set affinity to lock udpmon to core 6 of node 1. Interrupt coalescence on (16 µs). [Plots, DTN1-2_noFW_a4_4Jun16: receive wire rate in Gbit/s and % CPU in kernel mode for send and receive vs spacing between frames in µs, for user data sizes of 4000, 6000, 7813 and 8972 bytes.] The jumbo-size packet should give the highest curve but does not! A large fraction of time in kernel mode when sending: swapping between user and kernel mode. Also lost packets in the receiving host.

8 udpmon_send: How fast can I transmit? Sending rate as a function of packet size. Move IRQs away from core 6, set affinity to lock udpmon to core 6 of node 1. [Plots, pkt_size_geant_dtn1_4may16: send user data rate in Mbit/s and average send time per packet in µs vs size of user data in the packet in bytes.] The rate drops by 14.5 Gbit/s from 43 Gbit/s, a step of 0.75 µs in the send time at a given user data size; a second drop of 3.6 Gbit/s corresponds to a step of 0.29 µs. Collaborating with Mellanox on this.

9 Some aspects of the Impact of Reality on Throughput

10 udpmon_send: How fast can I transmit? Turn off checksum offload. Move IRQs away from core 6, set affinity to lock udpmon to core 6 of node 1. Only a ~2 Gbit/s drop, but this can also turn off TCP offload! [Plot, pkt_size_txoffrxoff_geant_dtn1_ax4_4jun16: send user data rate in Mbit/s vs size of user data in the packet, checksum offload ON vs OFF.]
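Checksum offload can be inspected and toggled with ethtool; a sketch (interface name illustrative):

    # Show the current offload settings
    ethtool -k enp131s0f1 | egrep 'checksum|segmentation'
    # Turn Rx/Tx checksum offload off; note this can also disable
    # TCP segmentation offload, which depends on Tx checksumming
    ethtool -K enp131s0f1 rx off tx off
    # Restore
    ethtool -K enp131s0f1 rx on tx on tso on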

11 udpmon_send: How fast can I transmit? Turn on the firewalls. Move IRQs away from core 6, set affinity to lock udpmon to core 6 of node 1. Maximum drop of 11 Gbit/s, at 7872 bytes. [Plot, pkt_size_dtn1_ax4_5jun16: send user data rate in Mbit/s vs size of user data in the packet, Xsum on / FW off vs Xsum on / FW on.]

12 udpmon_send: How fast can I transmit? Which CPU core and node? Move IRQs away from core 6, set affinity to lock udpmon to core 6 of node 1; then repeat with udpmon on core 2 (no IRQs), firewalls ON. Maximum drop of 7.2 Gbit/s, at 8972 bytes. [Plot, pkt_size_xsumonfwon_geant_dtn1_ax4_5jun16: send user data rate vs size of user data, core 6 node 1 vs core 2 node 0.] Summary for 7772-byte user data:
Case        Gbit/s
No FW       43.3
FW ON       32.4
Wrong core  25.5
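Pinning the measurement program to a chosen core and node is a one-liner with numactl; a sketch using this host's numbering (udpmon_send arguments omitted):

    # Sender on core 6 with memory from node 1, where the NIC is attached
    numactl --physcpubind=6 --membind=1 ./udpmon_send ...
    # The "wrong core" case for comparison: core 2 on node 0
    numactl --physcpubind=2 --membind=0 ./udpmon_send ...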

13 udpmon_send on servers at Boston Labs: 5 UDP flows. Set affinity to lock udpmon_send to run on CPU cores on node 1. Send 8972-byte packets with a wait time of 3.6 µs (~20 Gbit/s per flow) and record the time-series every 5 s. Three flows had no packet loss at ~20 Gbit/s, did not process IRQs, and ran at ~45% kernel mode. The other two flows had 25-28% packet loss, processed 3-4% softirq, and ran at 96% kernel mode. [Plot, udpmon_tseries_sum_24nov15: throughput in Gbit/s and % packet loss for each of the five flows vs time during the transfer in hours.]

14 TCP Achievable Throughput

15 iperf3: TCP throughput. Throughput vs TCP buffer size. Distribute IRQs over all cores on node 1; run iperf3 on core 6 node 1. Firewalls OFF, TCP offload on, TCP cubic stack. RTT 0.04 ms, so the delay-bandwidth product is 0.5 MBytes. As expected the throughput rises smoothly to a plateau at 0.7 MBytes, reaching 75 Gbit/s, and is constant after slow start. No TCP re-transmitted segments observed (iperf3 and /proc/net/snmp). [Plot, DTN1-2_A6_TCPbuf_31May: BW in Gbit/s vs buffer size in MBytes.]
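A buffer-size scan of this kind can be reproduced with stock iperf3; the host name, core and window values below are illustrative:

    # On the receiver
    iperf3 -s
    # On the sender: scan the socket buffer size (-w) with the
    # process pinned to core 6 (-A)
    for w in 0.1M 0.2M 0.5M 0.7M 1M 2M; do
        iperf3 -c dtn2 -t 20 -w $w -A 6 | tail -3
    done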

16 iperf3: TCP throughput. Which CPU core and node? Distribute IRQs over all cores on node 1; run iperf3 on core 6 node 1, and repeat on core 1 node 0. Firewalls OFF, TCP offload on, TCP cubic stack. RTT 0.04 ms, delay-bandwidth product 0.5 MBytes. Rises smoothly to a plateau at 0.5 MBytes. Throughput falls by 40 Gbit/s, from 75 to 35 Gbit/s, on the wrong node. No TCP re-transmitted segments observed (iperf3 and /proc/net/snmp). [Plot: BW in Gbit/s vs buffer size in MBytes, core 6 node 1 vs core 1 node 0.]

17 iperf3: TCP throughput. Use cores 6 & 1, i.e. node 1 and node 0. Firewalls OFF, TCP offload on, TCP cubic stack. Rises smoothly to the plateau. Throughput: 75 Gbit/s with both send & receive on node 1; 60 Gbit/s with send on node 0 and receive on node 1; 35 Gbit/s with both send & receive on node 0. Very few TCP re-transmitted segments observed. [Plot, DTN1-2_TCPbuf: BW in Gbit/s vs buffer size in MBytes for core 6 - core 6, core 1 - core 1, core 1 - core 6.]

18 iperf3: TCP throughput with Firewall ON. Run iperf3 on core 6 node 1; TCP offload on, TCP cubic stack. RTT 0.04 ms, delay-bandwidth product 0.5 MBytes. Rises smoothly to a plateau at 0.5 MBytes. Achievable throughput falls by 7.3 Gbit/s. No TCP re-transmitted segments observed (iperf3 and /proc/net/snmp). [Plot: BW in Gbit/s vs buffer size in MBytes, no firewall vs with firewall.]

19 iperf: TCP throughput with multiple flows. Distribute IRQs over all cores on node 1; run iperf on cores 6-11 for both receive and send. Firewalls ON, TCP offload on, TCP cubic. Total throughput increases from 60 to 86 Gbit/s going from 2 to 3 flows, reaches 98 Gbit/s for 4 & 5 flows, then starts to fall. Re-transmissions: 0% for 2 and 3 flows, 1-4% for 4 and 5 flows, 1-3% for 8 and 10 flows. Individual flows can vary by ±5 Gbit/s. [Plots, DTN1-2_A6-11_Pn_TCPbuf_5Jun16: BW in Gbit/s vs buffer size in MBytes, one panel per flow count.]
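With iperf3 a multi-flow test can be run either as one process with -P or, since an iperf3 process is single-threaded, as one pinned process per core; a sketch (host, ports and cores illustrative):

    # 4 parallel streams from a single process pinned to core 6
    iperf3 -c dtn2 -t 30 -P 4 -A 6
    # One single-stream process per core 6-11 (a server must be
    # listening on each port: iperf3 -s -p <port>)
    for c in 6 7 8 9 10 11; do
        iperf3 -c dtn2 -t 30 -p $((5200 + c)) -A $c &
    done
    wait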

20 Network Tuning for 100 Gigabit Ethernet. Hyper-threading: turn off in the BIOS. Wait states: disable / minimise use of C-states, in the BIOS or at boot time (but I could not find out how in my BIOS!). Power saving / core frequency: set the governor to performance and set cpufreq to the maximum; this depends on the scaling_driver: acpi-cpufreq allows setting the frequency to cpuinfo_max_freq, intel_pstate does not, but seems fast anyway. NUMA: check and select CPU cores in the node with the Ethernet interface attached: $ numactl -H, $ lspci -tv.
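One way to apply the frequency settings is via the cpupower tool, or directly through sysfs; a sketch:

    # Check which scaling driver and governor are active
    cpupower frequency-info
    # Select the performance governor on all cores
    cpupower frequency-set -g performance
    # sysfs equivalent for a single core
    echo performance > /sys/devices/system/cpu/cpu6/cpufreq/scaling_governor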

21 Network Tuning for 100 Gigabit Ethernet. IRQs: turn off the irqbalance service, which prevents the balancer from changing the affinity scheme: # systemctl stop irqbalance.service. Set the affinity of the NIC IRQs to use CPU cores on the node with the PCIe interface, 1 per CPU; for UDP it seems best NOT to use the CPU cores used by the apps: # cat /proc/irq/<irq>/smp_affinity, # echo 4 > /proc/irq/183/smp_affinity. Interface parameters: ensure interrupt coalescence is ON (3 µs, 16 µs, more?): # ethtool -C <i/f> rx-usecs 8. Ensure Rx & Tx checksum offload is ON and tcp-segmentation-offload is ON: # ethtool -K <i/f> rx on tx on. MTU: set the IP MTU to 9000 bytes; best done in files, e.g. ifcfg-ethX: MTU=9000.
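A sketch of spreading the NIC IRQs one per core across node 1 (cores 6-11 on this host, interface name illustrative); the Mellanox set_irq_affinity.sh script, from the same package as the show_irq_affinity.sh used earlier, does the equivalent job:

    systemctl stop irqbalance.service
    # smp_affinity takes a hex CPU mask, so core 6 is 1<<6 = 40 hex
    core=6
    for irq in $(awk '/enp131s0f1/ {sub(":","",$1); print $1}' /proc/interrupts); do
        printf '%x' $((1 << core)) > /proc/irq/$irq/smp_affinity
        core=$(( core == 11 ? 6 : core + 1 ))
    done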

22 Network Tuning for 100 Gigabit Ethernet. Queues: set txqueuelen, the transmit queue (I used 1000 but 10,000 is recommended); set netdev_max_backlog, the queue between the interface and the IP stack, to say 250000. Kernel parameters: net.core.rmem_max, net.core.wmem_max, net.ipv4.tcp_rmem, net.ipv4.tcp_wmem (min / default / max), net.ipv4.tcp_mtu_probing (jumbo frames), net.ipv4.tcp_congestion_control. Better to choose a few high-speed cores. Best set in the file /etc/sysctl.conf. See also Mellanox's Performance_Tuning_Guide_for_Mellanox_Network_Adapters.pdf and ESnet FasterData: https://fasterdata.es.net/network-tuning/
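A minimal /etc/sysctl.conf sketch along these lines; the values are of the order recommended by the ESnet fasterdata pages for high-speed hosts and are illustrative, not the exact settings used here:

    # Maximum socket buffer sizes (bytes)
    net.core.rmem_max = 2147483647
    net.core.wmem_max = 2147483647
    # TCP autotuning: min / default / max (bytes)
    net.ipv4.tcp_rmem = 4096 87380 2147483647
    net.ipv4.tcp_wmem = 4096 65536 2147483647
    # Queue between the interface and the IP stack
    net.core.netdev_max_backlog = 250000
    # Probe the path MTU (helps with jumbo frames)
    net.ipv4.tcp_mtu_probing = 1
    net.ipv4.tcp_congestion_control = cubic

Apply with sysctl -p after editing the file.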

23 A Look at the Disk Sub- system

24 NVMe Disks. Non-Volatile Memory express: a scalable host controller interface designed for SSDs attached to PCIe, as PCIe cards or 2.5" drives. Block-IO based, with a lockless block layer. The shorter data path bypasses the costly AHCI / SCSI layers, so latency & CPU cycles are reduced by more than 50%: SCSI 6.0 µs and 19,500 cycles vs NVMe 2.8 µs and 9,100 cycles. Parallelism: per-CPU hardware queues.
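The nvme-cli package gives a quick view of what NVMe devices a host has and how they are doing; the device name and PCIe address below are illustrative:

    # List NVMe controllers and namespaces
    nvme list
    # Health and error counters for one controller
    nvme smart-log /dev/nvme0
    # Check the PCIe link width and speed the SSD negotiated
    lspci -vv -s 02:00.0 | grep LnkSta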

25 NVMe Disk Performance: RAID0 with 2 disks. IRQs distributed over all cores on both nodes; run disk_test on core 2 node 0. Measure sequential read and write disk-memory rates as a function of file size; 2 disks in RAID0, xfs file system. Drop at a file size of ~3 GBytes to 27 Gbit/s read and 15 Gbit/s write. [Plots, DTN2_a2_R-2d_filescan_7Jun16: throughput in Gbit/s and GBytes/s for read and write vs file size in GBytes.]

26 NVMe Disk Performance: 1 NVMe disk. IRQs distributed over all cores on both nodes; run disk_test on core 2 node 0. Measure sequential read and write disk-memory rates as a function of file size, xfs file system. For a single disk, read < write, yes! [Plots: throughput in Gbit/s for read and write vs file size in GBytes, 1 disk (DTN2_a8_D1-1d_filescan_7Jun16) beside the 2-disk RAID0 (DTN2_a2_R-2d_filescan_7Jun16).]

27 Disk Tuning Steps for Discussion. Really a to-do-next list: RAID with more disks (note the CPU loads). mkfs.xfs parameters, stripe size 256k? mkfs.xfs -f -l version=2 -i size=1024 -n size= -d su=256k,sw=22 -L myname. Read ahead: blockdev --setra. Requests: echo 512 > /sys/block/sda/queue/nr_requests. Scheduler: echo deadline > /sys/block/sda/queue/scheduler. Other software RAID products. Measure data transfers.
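For the "measure data transfers" step, a simple sequential test with fio is one option (file path and size illustrative):

    # Sequential write then read, 1 MByte blocks, direct IO to
    # bypass the page cache, queue depth 32
    fio --name=seqwrite --filename=/data/testfile --rw=write \
        --bs=1M --size=100g --ioengine=libaio --iodepth=32 --direct=1
    fio --name=seqread --filename=/data/testfile --rw=read \
        --bs=1M --size=100g --ioengine=libaio --iodepth=32 --direct=1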

28 Summary for Discussion. UDP flows are harder than expected; working with Mellanox. The choice of CPU cores seems critical for both UDP and TCP. Shown that TCP performance is good. Explored some of the aspects that impact throughput. Started to understand NVMe disk behaviour, with help from Boston Labs UK. Just starting to run Globus and GridFTP. Let's open the discussion.

29 Thank you. Richard Hughes-Jones, Richard.Hughes- . GEANT Limited on behalf of the GN4 Phase 1 project (GN4-1). The research leading to these results has received funding from the European Union's Horizon 2020 research and innovation programme under Grant Agreement No (GN4-1).

30 Setup at Boston Labs: 100 Gbit Ethernet NIC. A lot of help from Boston Labs (London UK). Supermicro X10DRT-P motherboard. Two 10-core 2.3 GHz Intel Xeon E5-2650 v3 Haswell processors. Mellanox ConnectX-4 100 GE NIC, 16-lane PCIe, as many interrupts as cores. CentOS 6.7 with the el6 kernel. Initially Hyper-Threading was on: 40 CPUs! [Diagram labels: NIC, cmd_throughput_lite.pl.]

31 What is udpmon? A software package for investigating end host and network performance using UDP/IP frames. Programs work in client-server pairs to: transmit streams of sequenced UDP packets at regular, carefully controlled intervals, with variable frame size and frame transmit spacing; receive and check the sequence & timing of the packets; and identify whether packets were lost in the end host or in the network. It allows measurement of: request-response latency; achievable UDP bandwidth, packet loss, packet ordering, jitter; packet dynamics & packet-loss patterns; and the quality of the connection path and its stability.

32 The client-server pairs. udpmon_bw_mon → udpmon_resp: achievable UDP bandwidth, packet loss, packet ordering, jitter; packet dynamics & packet-loss patterns. udpmon_req → udpmon_resp: request-response latency. udpmon_send → udpmon_recv: quality of the connection path and its stability; time series of achievable UDP bandwidth and packet loss.

33 Achievable UDP Throughput Measurements. Send a controlled stream of UDP frames spaced at regular intervals, with 64-bit sequence numbers & a send time stamp; record the packet receive time. [Sequence diagram, sender ↔ receiver: zero stats / set concurrent lockout → OK done; send data frames (n bytes) at regular intervals, noting the time to send, the wait time, the number of packets, the inter-packet time (histogram) and the time to receive; signal end of test → OK done; get remote statistics → statistics sent back: no. received, no. lost + loss pattern, no. out-of-order, no. lost in network, CPU load, no. interrupts & SNMP, Tx & Rx times and 1-way delay time.]
