DTN End Host performance and tuning
1 DTN End Host Performance and Tuning: 100 Gigabit Ethernet & NVMe Disks. Richard Hughes-Jones, Senior Network Advisor, Office of the CTO, GÉANT Association. Cambridge Workshop: Moving My Data at High Speeds over the Network, Prague, 12 Jun 2016
2 The GÉANT DTN. A story of how we explored the hardware and what we found. Almost a technical report of progress. Not really a teach-in, but we do give some hints for tuning. Welcome input.
3 The GÉANT DTN Hardware. A lot of help from Boston Labs (London, UK) and Mellanox (UK & Israel). Supermicro X10DRT-i+ motherboard. Two 6-core 3.4 GHz Xeon E5 v3 processors. Mellanox ConnectX-4 100 GE NIC, 16-lane PCIe, as many interrupts as cores, driver MLNX_OFED. NVMe SSD set, 8-lane PCIe. Fedora 23 with the fc23.x86_64 kernel.
4 Explore the Hardware. How are the peripherals connected? NUMA: which PCIe interface & bus is connected to which CPU socket or node? To which core do the IRQs go?
Tools: dmesg, lspci -tv, lspci -vv (for the PCIe slot), numactl -H. Look in /sys & /proc, e.g. cat /proc/irq/<irq>/smp_affinity
[root@geant_dtn1 mlnx_tuning_scripts]# ./show_irq_affinity.sh enp131sf1
[Output: one line per NIC IRQ (183-194), each giving the hex CPU-affinity mask for that IRQ]
[root@dhcp richard]# lspci -tv
[PCIe device tree, abridged: Mellanox MT27500 Family ConnectX-3 and MT27700 Family ConnectX-4 NICs on one socket's PCIe root; Intel Ethernet Controller 10-Gigabit X540-AT2 ports; several Intel PCIe Data Center SSDs; Xeon E7 v3/Xeon E5 v3 QPI links joining the two sockets]
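The smp_affinity files above hold a hex bitmask of CPU cores, one bit per core. A small sketch (illustrative, not part of the tuning scripts) decoding such a mask into a core list:

```python
def cores_from_affinity(mask_hex: str) -> list[int]:
    """Decode a /proc/irq/<irq>/smp_affinity hex bitmask into CPU core numbers."""
    # Large systems print the mask as comma-separated 32-bit words; strip the commas
    mask = int(mask_hex.replace(",", ""), 16)
    return [core for core in range(mask.bit_length()) if mask & (1 << core)]

print(cores_from_affinity("4"))   # core 2
print(cores_from_affinity("8"))   # core 3
print(cores_from_affinity("c0"))  # cores 6 and 7
```

So an IRQ whose mask reads 4 is pinned to core 2, and 8 to core 3, which is how the per-core spread of NIC interrupts can be read off the listing.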
5 udpmon: UDP Achievable Throughput. Ideal shape: flat portions limited by the capacity of the link, i.e. the available bandwidth on a loaded link. [Plot: received wire rate (Mbit/s) vs spacing between frames (µs), Mumbai-Singapore, for a range of packet sizes up to 1472 bytes] The shape follows 1/t: packet spacing is the most important parameter. Cannot send packets back-to-back; the end host adds NIC setup time on PCI / context switches.
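The 1/t shape follows directly from the arithmetic: for a fixed frame size the offered rate is frame bits divided by the inter-frame spacing, until the link capacity caps it. A sketch of that relation (an assumed helper, not part of udpmon; 100 Gbit/s link assumed):

```python
def achievable_rate_gbps(frame_bytes: int, spacing_us: float, link_gbps: float = 100.0) -> float:
    """Wire rate when frames of frame_bytes are sent every spacing_us microseconds,
    capped by the link capacity (the flat portion of the udpmon curve)."""
    offered = frame_bytes * 8 / (spacing_us * 1e-6) / 1e9  # Gbit/s offered by the sender
    return min(offered, link_gbps)

# 8972-byte jumbo frames: the spacing dominates once the offered load is below line rate
for spacing in (0.72, 1.0, 2.0, 4.0):
    print(f"{spacing:4.2f} us -> {achievable_rate_gbps(8972, spacing):6.2f} Gbit/s")
```

Halving the spacing doubles the rate until the flat portion is reached, which is why the plots are drawn against inter-frame spacing rather than offered load.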
6 udpmon on Boston Lab hosts: Achievable Throughput & Packet loss. Move IRQs from core 11, set affinity to lock udpmon to core 11, node 1. Interrupt coalescence on (3 µs). [Plots (haswell1-2_x_nobackground_5nov15): received wire rate (Mbit/s) and % CPU kernel mode for sender and receiver vs spacing between frames (µs), 8972-byte frames] 96% kernel mode when sending: swapping between user and kernel mode.
7 udpmon on GÉANT DTN: Achievable Throughput & Packet loss. Move IRQs from core 6, set affinity to lock udpmon to core 6, node 1. Interrupt coalescence on (16 µs). [Plots (DTN1-2_noFW_a4_4Jun16): received wire rate (Gbit/s) and % CPU kernel mode for sender and receiver vs spacing between frames (µs), for packet sizes up to 8972 bytes including 7813 bytes] The jumbo-size packet should be highest! Swapping between user and kernel mode; also lost packets in the receiving host.
8 udpmon_send: How fast can I transmit? Sending rate as a function of packet size. Move IRQs from core 6, set affinity to lock udpmon to core 6, node 1. [Plots (pkt_size_geant_dtn1_4may16): send user data rate (Mbit/s) and average send time per packet (µs) vs size of user data in packet (bytes)] Drop of 14.5 Gbit/s from 43 Gbit/s: a step of 0.75 µs in the send time, occurring at a particular user-data size. A second drop of 3.6 Gbit/s: a step of 0.29 µs. Collaborating with Mellanox.
9 Some aspects of the Impact of Reality on Throughput
10 udpmon_send: How fast can I transmit? Turn off checksum offload. Move IRQs from core 6, set affinity to lock udpmon to core 6, node 1. Only ~2 Gbit/s drop, but this can also turn off TCP offload! [Plot (pkt_size_txoffrxoff_geant_dtn1_ax4_4jun16): send user data rate (Mbit/s) vs size of user data in packet (bytes), checksum offload ON vs OFF]
11 udpmon_send: How fast can I transmit? Turn on the firewalls. Move IRQs from core 6, set affinity to lock udpmon to core 6, node 1. Maximum drop of 11 Gbit/s at 7872 bytes. [Plot (pkt_size_dtn1_ax4_5jun16): send user data rate (Mbit/s) vs size of user data in packet (bytes), checksum on, firewall ON vs OFF]
12 udpmon_send: How fast can I transmit? Which CPU core and node? Move IRQs from core 6, set affinity to lock udpmon to core 6, node 1; then run udpmon on core 2 (no IRQs), firewalls ON. Maximum drop of 7.2 Gbit/s at 8972 bytes. [Plot (pkt_size_xsumonfwon_geant_dtn1_ax4_5jun16): send user data rate vs user-data size, core 6 node 1 vs core 2 node 0]
Summary for 7772-byte user data:
Case        Gbit/s
No FW       43.3
FW ON       32.4
Wrong core  25.5
13 udpmon_send on servers at Boston Labs: 5 UDP flows. Set affinity to lock udpmon_send to run on CPU cores on node 1. Send 8972-byte packets with wait time 3.6 µs (~20 Gbit/s per flow); record the time-series every 5 s. 3 flows had no packet loss at ~20 Gbit/s, did not process IRQs, ~45% kernel mode. The other 2 flows had 25-28% packet loss, processed 3-4% softirq, 96% kernel mode. [Plot (udpmon_tseries_sum_24nov): throughput (Gbit/s) and % packet loss per flow vs time during transfer (hr)]
14 TCP Achievable Throughput
15 iperf3: TCP throughput vs TCP buffer size. Distribute IRQs over all cores on node 1; run iperf3 on core 6, node 1. Firewalls OFF, TCP offload on, TCP cubic stack. RTT 0.04 ms, so the delay-bandwidth product is 0.5 MBytes. As expected, throughput rises smoothly to a plateau at 0.7 MBytes, reaching 75 Gbit/s. Throughput is constant after slow start. No TCP retransmitted segments observed (iperf3 and /proc/net/snmp). [Plot (DTN1-2_A6_TCPbuf_31May): BW (Gbit/s) vs buffer size (MByte)]
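The delay-bandwidth product quoted above is just rate × RTT; a quick check (assuming a 100 Gbit/s link and the 0.04 ms RTT of the back-to-back DTNs):

```python
def bdp_mbytes(rate_gbps: float, rtt_ms: float) -> float:
    """Delay-bandwidth product in MBytes: the TCP buffer needed to keep the pipe full."""
    return rate_gbps * 1e9 * rtt_ms * 1e-3 / 8 / 1e6

print(bdp_mbytes(100, 0.04))  # 0.5 MBytes for a 100 Gbit/s path with 0.04 ms RTT
print(bdp_mbytes(75, 0.04))   # buffer needed just to sustain the observed 75 Gbit/s
```

A plateau beginning a little above the BDP is what one would expect: once the buffer exceeds the BDP, the window no longer limits throughput.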
16 iperf3: TCP throughput. Which CPU core and node? Distribute IRQs over all cores on node 1; run iperf3 on core 6, node 1, and repeat on core 1, node 0. Firewalls OFF, TCP offload on, TCP cubic stack. RTT 0.04 ms, delay-bandwidth product 0.5 MBytes. Rises smoothly to a plateau at 0.5 MBytes. Throughput falls by 40 Gbit/s, from 75 to 35 Gbit/s. No TCP retransmitted segments observed (iperf3 and /proc/net/snmp). [Plot: BW (Gbit/s) vs buffer size (MByte), core 6 node 1 vs core 1 node 0]
17 iperf3: TCP throughput. Use cores 6 & 1, on node 1 and node 0. Firewalls OFF, TCP offload on, TCP cubic stack. Rises smoothly to the plateau. Throughput: 75 Gbit/s with both send & receive on node 1; 60 Gbit/s with send on node 0 and receive on node 1; 35 Gbit/s with both send & receive on node 0. Very few TCP retransmitted segments observed. [Plot (DTN1-2_TCPbuf): BW vs buffer size for the core 6 - core 6, core 1 - core 1 and mixed pairings]
18 iperf3: TCP throughput with firewall ON. Run iperf3 on core 6, node 1; TCP offload on, TCP cubic stack. RTT 0.04 ms, delay-bandwidth product 0.5 MBytes. Rises smoothly to a plateau at 0.5 MBytes. Achievable throughput falls by 7.3 Gbit/s. No TCP retransmitted segments observed (iperf3 and /proc/net/snmp). [Plot: BW vs buffer size, no firewall vs with firewall]
19 iperf: TCP throughput, multiple flows. Distribute IRQs over all cores on node 1; run iperf on cores 6-11 for both receive and send. Firewalls ON, TCP offload on, TCP cubic. Total throughput increases 60 → 86 Gbit/s going from 2 to 3 flows, reaches 98 Gbit/s for 4 & 5 flows, then starts to fall. ~0% retransmits with 2 and 3 flows, 1-4% with 4 and 5 flows, 1-3% with 8 and 10 flows. Individual flows can vary by ±5 Gbit/s. [Plots (DTN1-2_A6-11_P2...P8_TCPbuf_5Jun16): BW vs buffer size for 2, 3, 4, 5 and 8 parallel flows]
20 Network Tuning for 100 Gigabit Ethernet.
Hyper-threading: turn off in the BIOS.
Wait states: disable / minimise use of C-states, via the BIOS or at boot time (but I could not find out how in my BIOS!).
Power saving / core frequency: set the governor to performance; set cpufreq to maximum. Depends on the scaling_driver: acpi-cpufreq allows setting cpuinfo_cur_freq to max; intel_pstate does not, but seems fast anyway.
NUMA: check and select CPU cores in the node with the Ethernet interfaces attached: $ numactl -H, $ lspci -tv
21 Network Tuning for 100 Gigabit Ethernet.
IRQs: turn off the irqbalance service (#systemctl stop irqbalance.service); this prevents the balancer from changing the affinity scheme. Set the affinity of the NIC IRQs to use CPU cores on the node with the PCIe slot, 1 per CPU: #cat /proc/irq/<irq>/smp_affinity, #echo 4 > /proc/irq/183/smp_affinity. For UDP it seems best NOT to use the CPU cores used by the apps.
Interface parameters: ensure interrupt coalescence is ON (3 µs, 16 µs, more?): #ethtool -C <i/f> rx-usecs 8. Ensure Rx & Tx checksum offload is ON and tcp-segmentation-offload is ON: #ethtool -K <i/f> rx on tx on
MTU: set the IP MTU to 9000 bytes; best done in files, e.g. ifcfg_ethX: MTU=9000
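The mask written with echo is simply 1 << core in hex; echo 4 pins an IRQ to core 2. A sketch generating one pinning command per NIC IRQ (the IRQ numbers 183+ and cores 6-11 are taken from the slides; purely illustrative):

```python
def affinity_mask(core: int) -> str:
    """Hex smp_affinity mask that pins an IRQ to a single CPU core."""
    return format(1 << core, "x")

# One NIC IRQ per core on the NUMA node holding the PCIe slot (cores 6-11 assumed)
for irq, core in zip(range(183, 189), range(6, 12)):
    print(f"echo {affinity_mask(core)} > /proc/irq/{irq}/smp_affinity")
```

Running the loop prints the echo commands; mask 4 is core 2, mask 40 is core 6, and so on.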
22 Network Tuning for 100 Gigabit Ethernet.
Queues: set txqueuelen, the transmit queue (I used 1000, but 10,000 is recommended). Set netdev_max_backlog, the queue between the interface and the IP stack, say 250,000.
Kernel parameters: net.core.rmem_max, net.core.wmem_max, net.ipv4.tcp_rmem, net.ipv4.tcp_wmem (min / default / max), net.ipv4.tcp_mtu_probing (jumbo frames), net.ipv4.tcp_congestion_control. Better to choose fewer high-speed cores. Best set in the file /etc/sysctl.conf.
See also the Mellanox Performance_Tuning_Guide_for_Mellanox_Network_Adapters.pdf and ESnet FasterData: https://fasterdata.es.net/network-tuning/
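Those kernel parameters end up as lines in /etc/sysctl.conf. A sketch generating such a fragment; the values are assumptions in the spirit of the ESnet FasterData guidance, not figures from the talk:

```python
# Illustrative /etc/sysctl.conf fragment for a 100GE host; all values are assumptions
settings = {
    "net.core.rmem_max": 536870912,               # max socket receive buffer (bytes)
    "net.core.wmem_max": 536870912,               # max socket send buffer (bytes)
    "net.ipv4.tcp_rmem": "4096 87380 536870912",  # min / default / max receive buffer
    "net.ipv4.tcp_wmem": "4096 65536 536870912",  # min / default / max send buffer
    "net.ipv4.tcp_mtu_probing": 1,                # useful with jumbo frames
    "net.ipv4.tcp_congestion_control": "cubic",
    "net.core.netdev_max_backlog": 250000,        # queue between interface and IP stack
}
fragment = "\n".join(f"{key} = {value}" for key, value in settings.items())
print(fragment)
```

The printed fragment can be appended to /etc/sysctl.conf and applied with sysctl -p.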
23 A Look at the Disk Sub-system
24 NVMe Disks. Non-Volatile Memory express: a scalable host controller interface. Designed for SSDs attached via PCIe: PCIe cards or 2.5" drives. Block IO based on a lockless block layer. A shorter data path bypasses the costly AHCI / SCSI layers. Latency & CPU cycles reduced by >50%: SCSI 6.0 µs and 19,500 cycles vs NVMe 2.8 µs and 9,100 cycles. Parallelism: per-CPU HW queues.
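The >50% figure checks out arithmetically; a quick check using the numbers on the slide:

```python
def reduction(before: float, after: float) -> float:
    """Fractional reduction going from the SCSI path to the NVMe path."""
    return (before - after) / before

print(f"latency: {reduction(6.0, 2.8):.0%}")       # µs per I/O, SCSI -> NVMe
print(f"cycles:  {reduction(19_500, 9_100):.0%}")  # CPU cycles per I/O, SCSI -> NVMe
```

Both the latency and the CPU-cycle cost drop by about 53%, consistent with the claim.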
25 NVMe Disk Performance: RAID0 with 2 disks. IRQs distributed over all cores on both nodes; run disk_test on core 2, node 0. Measure sequential read and write disk-memory rates as a function of file size; 2 disks in RAID0, xfs file system. Drop at file size ~30 GBytes, to 27 Gbit/s read and 15 Gbit/s write. [Plots (DTN2_a2_R-2d_filescan_7Jun16, DTN2_a2_filescan_6Jun16): read and write throughput (Gbit/s and GBytes/s) vs file size (GBytes)]
26 NVMe Disk Performance: 1 NVMe disk. IRQs distributed over all cores on both nodes; run disk_test on core 2, node 0. Measure sequential read and write disk-memory rates as a function of file size; xfs file system. For 1 disk, read < write! [Plots (DTN2_a8_D1-1d_filescan_7Jun16 vs DTN2_a2_R-2d_filescan_7Jun16): read and write throughput (Gbit/s) vs file size (GBytes), 1 disk vs 2-disk RAID0]
27 Disk Tuning: Steps for Discussion. Really a to-do-next list:
RAID with more disks; note the CPU loads.
mkfs.xfs parameters, stripe size 256k? e.g. mkfs.xfs -f -l version=2 -i size=1024 -n size= -d su=256k,sw=22 -L myname
Read ahead: blockdev --setra
Requests: echo 512 > /sys/block/sda/queue/nr_requests
Scheduler: echo deadline > /sys/block/sda/queue/scheduler
Other software RAID products.
Measure data transfers.
28 Summary for Discussion. UDP flows are harder than expected; working with Mellanox. Use of CPU cores seems critical for both UDP and TCP. Shown that TCP performance is good. Explored some of the aspects that impact throughput. Started to understand NVMe disk behaviour, with help from Boston Labs UK. Just starting to run globus and GridFTP. Let's open the discussion.
29 Thank you. Richard Hughes-Jones, Richard.Hughes- GEANT Limited on behalf of the GN4 Phase 1 project (GN4-1). The research leading to these results has received funding from the European Union's Horizon 2020 research and innovation programme under Grant Agreement No (GN4-1).
30 Setup at Boston Labs: 100 Gbit Ethernet NIC. A lot of help from Boston Labs (London, UK). Supermicro X10DRT-P motherboard. Two 10-core 2.3 GHz Intel Xeon E5-2650 v3 Haswell processors. Mellanox ConnectX NIC, 16-lane PCIe, as many interrupts as cores. CentOS 6.7 with the el6 kernel. Initially Hyper-Threading was on: 40 CPUs!
31 What is udpmon? A software package for investigating end host and network performance using UDP/IP frames. Programs work in client-server pairs to: transmit streams of sequenced UDP packets at regular, carefully controlled intervals (frame size and frame transmit spacing can be varied); receive and check the sequence & timing of the packets; identify whether packets were lost in the end host or the network. Allows measurement of: request-response latency; achievable UDP bandwidth, packet loss, packet ordering, jitter; packet dynamics & packet loss patterns; quality of the connection path and its stability.
32 The client-server pairs:
udpmon_bw_mon → udpmon_resp: achievable UDP bandwidth, packet loss, packet ordering, jitter; packet dynamics & packet loss patterns.
udpmon_req → udpmon_resp: request-response latency.
udpmon_send → udpmon_recv: quality of the connection path and its stability; time series of achievable UDP bandwidth and packet loss.
33 Achievable UDP Throughput Measurements. Send a controlled stream of UDP frames spaced at regular intervals, with 64-bit sequence numbers & send time stamp; record the packet receive time.
Sender-receiver exchange:
1. Zero stats, set concurrent lockout; receiver replies OK done.
2. Send data frames at regular intervals (n bytes, wait time, number of packets). The sender records the time to send and the inter-packet time (histogram); the receiver records the time to receive.
3. Get remote statistics. The receiver sends statistics back: no. received, no. lost + loss pattern, no. out-of-order, no. lost in network, CPU load, no. interrupts & SNMP, Tx & Rx times & 1-way delay time.
4. Signal end of test; receiver replies OK done.
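From the 64-bit sequence numbers the receiver can separate loss from reordering. A minimal sketch of that bookkeeping (not udpmon's actual code; sequence numbers assumed to start at 0):

```python
def packet_stats(seqs: list[int]) -> dict:
    """Classify received sequence numbers: count received, out-of-order, and lost packets."""
    received = len(seqs)
    # A packet arriving with a lower sequence number than its predecessor is out of order
    out_of_order = sum(1 for a, b in zip(seqs, seqs[1:]) if b < a)
    expected = max(seqs) + 1 if seqs else 0  # highest sequence number seen defines the stream
    lost = expected - len(set(seqs))
    return {"received": received, "out_of_order": out_of_order, "lost": lost}

# Packet 3 lost in transit; packets 5 and 4 arrive swapped
print(packet_stats([0, 1, 2, 5, 4, 6]))  # {'received': 6, 'out_of_order': 1, 'lost': 1}
```

Distinguishing loss in the network from loss in the end host additionally needs the NIC/SNMP counters mentioned above, which this sketch does not model.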
More informationMotivation CPUs can not keep pace with network
Deferred Segmentation For Wire-Speed Transmission of Large TCP Frames over Standard GbE Networks Bilic Hrvoye (Billy) Igor Chirashnya Yitzhak Birk Zorik Machulsky Technion - Israel Institute of technology
More informationVideo capture using GigE Vision with MIL. What is GigE Vision
What is GigE Vision GigE Vision is fundamentally a standard for transmitting video from a camera (see Figure 1) or similar device over Ethernet and is primarily intended for industrial imaging applications.
More informationPerformance Pack. Administration Guide Version R70. March 8, 2009
Performance Pack TM Administration Guide Version R70 March 8, 2009 2003-2009 Check Point Software Technologies Ltd. All rights reserved. This product and related documentation are protected by copyright
More informationAn Extensible Message-Oriented Offload Model for High-Performance Applications
An Extensible Message-Oriented Offload Model for High-Performance Applications Patricia Gilfeather and Arthur B. Maccabe Scalable Systems Lab Department of Computer Science University of New Mexico pfeather@cs.unm.edu,
More informationComparing TCP performance of tunneled and non-tunneled traffic using OpenVPN. Berry Hoekstra Damir Musulin OS3 Supervisor: Jan Just Keijser Nikhef
Comparing TCP performance of tunneled and non-tunneled traffic using OpenVPN Berry Hoekstra Damir Musulin OS3 Supervisor: Jan Just Keijser Nikhef Outline Introduction Approach Research Results Conclusion
More informationAvid Configuration Guidelines Lenovo P520/P520C workstation Single 6 to 18 Core CPU System P520 P520C
Avid Configuration Guidelines Lenovo P520/P520C workstation Single 6 to 18 Core CPU System P520 P520C Page 1 of 14 Dave Pimm Avid Technology April 25, 2018 1.) Lenovo P520 & P520C AVID Qualified System
More informationPerformance Characteristics on Gigabit networks
Version 4.7 Impairment Emulator Software for IP Networks (IPv4 & IPv6) Performance Characteristics on Gigabit networks ZTI Communications / 1 rue Ampère / 22300 LANNION / France Phone: +33 2 9613 4003
More informationAvid Configuration Guidelines HP Z8 G4 workstation Dual 8 to 28 Core CPU System
Avid Configuration Guidelines HP Z8 G4 workstation Dual 8 to 28 Core CPU System Page 1 of 13 Dave Pimm Avid Technology April 23, 2018 1.) HP Z8 G4 AVID Qualified System Specification: Z8 G4 Hardware Configuration
More informationFPGA Augmented ASICs: The Time Has Come
FPGA Augmented ASICs: The Time Has Come David Riddoch Steve Pope Copyright 2012 Solarflare Communications, Inc. All Rights Reserved. Hardware acceleration is Niche (With the obvious exception of graphics
More informationTCP Tuning for the Web
TCP Tuning for the Web Jason Cook - @macros - jason@fastly.com Me Co-founder and Operations at Fastly Former Operations Engineer at Wikia Lots of Sysadmin and Linux consulting The Goal Make the best use
More informationAnalytics of Wide-Area Lustre Throughput Using LNet Routers
Analytics of Wide-Area Throughput Using LNet Routers Nagi Rao, Neena Imam, Jesse Hanley, Sarp Oral Oak Ridge National Laboratory User Group Conference LUG 2018 April 24-26, 2018 Argonne National Laboratory
More informationINT G bit TCP Offload Engine SOC
INT 10011 10 G bit TCP Offload Engine SOC Product brief, features and benefits summary: Highly customizable hardware IP block. Easily portable to ASIC flow, Xilinx/Altera FPGAs or Structured ASIC flow.
More informationPacketShader: A GPU-Accelerated Software Router
PacketShader: A GPU-Accelerated Software Router Sangjin Han In collaboration with: Keon Jang, KyoungSoo Park, Sue Moon Advanced Networking Lab, CS, KAIST Networked and Distributed Computing Systems Lab,
More informationPerformance Characteristics on Gigabit networks
Version 4.6 Impairment Emulator Software for IP Networks (IPv4 & IPv6) Performance Characteristics on Gigabit networks ZTI / 1 boulevard d'armor / BP 20254 / 22302 Lannion Cedex / France Phone: +33 2 9648
More informationThe Convergence of Storage and Server Virtualization Solarflare Communications, Inc.
The Convergence of Storage and Server Virtualization 2007 Solarflare Communications, Inc. About Solarflare Communications Privately-held, fabless semiconductor company. Founded 2001 Top tier investors:
More informationDXE-810S. Manual. 10 Gigabit PCI-EXPRESS-Express Ethernet Network Adapter V1.01
DXE-810S 10 Gigabit PCI-EXPRESS-Express Ethernet Network Adapter Manual V1.01 Table of Contents INTRODUCTION... 1 System Requirements... 1 Features... 1 INSTALLATION... 2 Unpack and Inspect... 2 Software
More informationClearStream. Prototyping 40 Gbps Transparent End-to-End Connectivity. Cosmin Dumitru! Ralph Koning! Cees de Laat! and many others (see posters)!
ClearStream Prototyping 40 Gbps Transparent End-to-End Connectivity Cosmin Dumitru! Ralph Koning! Cees de Laat! and many others (see posters)! University of Amsterdam! more data! Speed! Volume! Internet!
More informationLearning with Purpose
Network Measurement for 100Gbps Links Using Multicore Processors Xiaoban Wu, Dr. Peilong Li, Dr. Yongyi Ran, Prof. Yan Luo Department of Electrical and Computer Engineering University of Massachusetts
More informationHKG net_mdev: Fast-path userspace I/O. Ilias Apalodimas Mykyta Iziumtsev François-Frédéric Ozog
HKG18-110 net_mdev: Fast-path userspace I/O Ilias Apalodimas Mykyta Iziumtsev François-Frédéric Ozog Why userland I/O Time sensitive networking Developed mostly for Industrial IOT, automotive and audio/video
More informationAn Intelligent NIC Design Xin Song
2nd International Conference on Advances in Mechanical Engineering and Industrial Informatics (AMEII 2016) An Intelligent NIC Design Xin Song School of Electronic and Information Engineering Tianjin Vocational
More informationCA AppLogic and Supermicro X9 Equipment Validation
CA AppLogic and Supermicro X9 Equipment Validation Opening Statement Last Updated August 8, 2013 The Boston X9 servers described below are compatible and validated with both CA AppLogic 3.1.14 Xen and
More informationRonald van der Pol
Ronald van der Pol Outline! Goal of this project! 40GE demonstration setup! Application description! Results! Conclusions Goal of the project! Optimize single server disk to network I/O!
More informationFreeBSD Network Performance Tuning
Sucon 2004 Zurich, Switzerland Hendrik Scholz hscholz@raisdorf.net http://www.wormulon.net/ Agenda Motivation Overview Optimization approaches sysctl() tuning Measurement NIC comparision Conclusion Motivation
More informationLinux Kernel Hacking Free Course
Linux Kernel Hacking Free Course 3 rd edition G.Grilli, University of me Tor Vergata IRQ DISTRIBUTION IN MULTIPROCESSOR SYSTEMS April 05, 2006 IRQ distribution in multiprocessor systems 1 Contents: What
More informationLow-Overhead Flash Disaggregation via NVMe-over-Fabrics Vijay Balakrishnan Memory Solutions Lab. Samsung Semiconductor, Inc.
Low-Overhead Flash Disaggregation via NVMe-over-Fabrics Vijay Balakrishnan Memory Solutions Lab. Samsung Semiconductor, Inc. 1 DISCLAIMER This presentation and/or accompanying oral statements by Samsung
More informationXilinx Answer QDMA Performance Report
Xilinx Answer 71453 QDMA Performance Report Important Note: This downloadable PDF of an Answer Record is provided to enhance its usability and readability. It is important to note that Answer Records are
More informationOn the cost of tunnel endpoint processing in overlay virtual networks
J. Weerasinghe; NVSDN2014, London; 8 th December 2014 On the cost of tunnel endpoint processing in overlay virtual networks J. Weerasinghe & F. Abel IBM Research Zurich Laboratory Outline Motivation Overlay
More informationAnalysis of CPU Pinning and Storage Configuration in 100 Gbps Network Data Transfer
Analysis of CPU Pinning and Storage Configuration in 100 Gbps Network Data Transfer International Center for Advanced Internet Research Northwestern University Se-young Yu Jim Chen, Joe Mambretti, Fei
More information5-Speed NBASE-T Network. Controller Card
5-Speed NBASE-T Network Controller Card User Manual Ver. 1.00 All brand names and trademarks are properties of their respective owners. Contents: Chapter 1: Introduction... 3 1.1 Product Introduction...
More informationPCIe 10G SFP+ Network Card
PCIe 10G SFP+ Network Card User Manual Ver. 1.00 All brand names and trademarks are properties of their respective owners. Contents: Chapter 1: Introduction... 3 1.1 Product Introduction... 3 1.2 Features...
More informationStacked Vlan: Performance Improvement and Challenges
Stacked Vlan: Performance Improvement and Challenges Toshiaki Makita NTT Tokyo, Japan makita.toshiaki@lab.ntt.co.jp Abstract IEEE 802.1ad vlan protocol type was introduced in kernel 3.10, which has encouraged
More informationPerformance Characteristics on Fast Ethernet and Gigabit networks
Version 2.5 Traffic Generator and Measurement Tool for IP Networks (IPv4 & IPv6) FTTx, LAN, MAN, WAN, WLAN, WWAN, Mobile, Satellite, PLC, etc Performance Characteristics on Fast Ethernet and Gigabit networks
More information