FPGAs and Networking
Marc Kelly & Richard Hughes-Jones
University of Manchester
12th July 2007
Overview of Work
- Looking into the use of FPGAs to connect directly to Ethernet for DAQ readout purposes.
- Testing both 1 Gig and 10 Gig systems.
- Evaluating the new generation of PCI Express 10 Gig Ethernet cards.
- Bringing it all together to form a test system.
Network Virtex4 Test Board
1 Gig FPGA Work
- Implemented a MAC + PHY layer inside a Xilinx Virtex4 FPGA.
- Demonstrated working Ethernet between the FPGA and remote hosts; however, the learning curve is steep, with issues in the Xilinx CoreGen design.
- Adaptable testing design, with the Network and User FPGA logic separated.
- Once working it has proved reliable: the network code has survived many re-spins of the design without failing.
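As a rough illustration of what the MAC layer in the network block has to produce, here is a minimal Python sketch of Ethernet II frame assembly (header, minimum-size padding, FCS). `build_frame` is a hypothetical helper for illustration only, not the Virtex4 HDL:

```python
import struct
import zlib

def build_frame(dst_mac, src_mac, ethertype, payload):
    """Assemble a raw Ethernet II frame, as a MAC layer would before
    handing the bits to the PHY. Sketch only, not the FPGA design."""
    header = dst_mac + src_mac + struct.pack("!H", ethertype)
    # Pad the payload up to the 46-byte Ethernet minimum.
    body = payload + b"\x00" * max(0, 46 - len(payload))
    frame = header + body
    # Ethernet FCS is CRC-32, transmitted least-significant byte first.
    fcs = struct.pack("<I", zlib.crc32(frame))
    return frame + fcs
```

A minimum-length frame (short payload) comes out at 64 bytes including the FCS, matching the Ethernet minimum frame size.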
Overview of Design
[Block diagram: User block and Network block]
Receiver Frame Jitter and Packet Loss
[Plots: frame-jitter histograms (N vs frame spacing in µs) and packet-loss counts at 12 µs (line speed), 13 µs and 25 µs frame spacing; at 25 µs spacing the peaks are separated by ~4-5 µs, with no coalescence. Data set: fpga man2_7mar7.]
Plots by Richard, stolen from one of his talks.
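Jitter plots like these are built by histogramming the spacing between consecutive frame arrivals at the receiver. A minimal sketch of that analysis, assuming arrival timestamps in microseconds (`interarrival_histogram` is an illustrative helper, not the actual measurement code):

```python
def interarrival_histogram(arrival_times_us, bin_width_us=1.0):
    """Histogram the spacings between consecutive frame arrivals.

    Returns {bin_start_us: count}, i.e. N(n) vs frame spacing as in
    the jitter plots. arrival_times_us must be sorted.
    """
    hist = {}
    for prev, cur in zip(arrival_times_us, arrival_times_us[1:]):
        spacing = cur - prev
        b = int(spacing // bin_width_us) * bin_width_us
        hist[b] = hist.get(b, 0) + 1
    return hist
```

For example, arrivals at 0, 12, 24 and 37 µs give two 12 µs spacings and one 13 µs spacing.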
1 Gig FPGA Work
- FPGAs can drive Ethernet; it is easy once the configuration hurdle is passed.
- More testing under way: request-response style operations to pull data out of the FPGA, simulating an event-building scenario.
- Planned upgrade to 10Gig Ethernet; repeat all tests at 10Gig.
- Perform some initial testing to try to determine the stability of the RocketIO TX/RX latency.
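The request-response pull can be sketched over UDP as follows; the 4-byte sequence-number request format and the `pull_events` helper are assumptions for illustration, not the actual FPGA protocol:

```python
import socket

def pull_events(server_addr, n_requests, timeout_s=1.0):
    """Send one request datagram per event block and collect the replies.

    Models an event-building pull: the host asks, the data source
    answers. Lost requests or replies show up as None entries.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.settimeout(timeout_s)
    replies = []
    for seq in range(n_requests):
        sock.sendto(seq.to_bytes(4, "big"), server_addr)
        try:
            data, _ = sock.recvfrom(65535)
            replies.append(data)
        except socket.timeout:
            replies.append(None)  # request or reply lost
    sock.close()
    return replies
```

Because each request is only issued after the previous reply (or timeout), the pull rate is bounded by the round-trip latency, which is one reason the latency measurements later in the talk matter.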
10Gig Ethernet Work
- Richard has nice new 10Gig PCI Express cards made by Myricom.
- 8-lane design; in theory that has more than enough bandwidth to handle a 10Gig link.
- Using loaned high-end servers as host machines.
- Performing standard network testing operations using his normal tools.
- Aim is to determine the suitability of 10Gig systems.
High-end Server PCs
- Boston/Supermicro X7DBE
- Two dual-core Intel Xeon (Woodcrest) 5130, 2 GHz
- Independent 1.33 GHz front-side buses
- 533 MHz FB-DIMM (serial) memory, parallel access to 4 banks
- Chipsets: Intel 5000P MCH (PCIe & memory), ESB2 (PCI-X, GE etc.)
- 3× 8-lane PCIe buses
- 3× 133 MHz PCI-X
- 2× Gigabit Ethernet
- SATA
10 GigE Back2Back: UDP Throughput
- Kernel 2.6.20-web100_pktd-plus; Myricom 10G-PCIE-8A-R fibre NIC; rx-usecs=25, coalescence ON; MTU 9000 bytes.
- Max throughput 9.4 Gbit/s.
- Notice the rate for 8972-byte packets: ~0.2% packet loss over 1M packets in the receiving host.
- Sending host: 3 CPUs idle; for packet spacings < 8 µs, 1 CPU is > 90% in kernel mode, including ~10% soft interrupts.
- Receiving host: 3 CPUs idle; for packet spacings < 8 µs, 1 CPU is 70-80% in kernel mode, including ~15% soft interrupts.
[Plots: received wire rate (Mbit/s) and sender/receiver kernel-mode CPU (%) vs spacing between frames (µs), for message sizes from 1000 to 8972 bytes; data set gig6 5_myri10GE.]
By Richard, stolen from one of his talks.
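The "wire rate" on plots like these counts the full on-the-wire cost of each frame, not just the UDP payload. A small sketch of the expected rate for one frame sent every `spacing_us` microseconds; the 66-byte overhead figure (UDP + IP + Ethernet headers, FCS, preamble and inter-frame gap) is my assumption, not a number from the talk:

```python
def wire_rate_mbps(payload_bytes, spacing_us, overhead_bytes=66):
    """Predicted wire rate in Mbit/s for one frame every spacing_us.

    overhead_bytes approximates UDP (8) + IP (20) + Ethernet header (14)
    + FCS (4) + preamble (8) + inter-frame gap (12) = 66 bytes (assumed).
    """
    bits = (payload_bytes + overhead_bytes) * 8
    # bits per microsecond is numerically equal to Mbit/s.
    return bits / spacing_us
```

For example, 8972-byte packets every 8 µs give (8972 + 66) * 8 / 8 ≈ 9038 Mbit/s, consistent with the ~9.4 Gbit/s ceiling seen at the smallest spacings.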
10 GigE Back2Back: UDP Latency
- Latency ~22 µs and very well behaved.
- Fit: y = 0.0028x + 21.937, i.e. a latency slope of 0.0028 µs/byte.
- Expect 0.00268 µs/byte B2B: Mem 0.0004 + PCI-e 0.00054 + 10GigE 0.0008 + PCI-e 0.00054 + Mem 0.0004.
- Histogram FWHM ~1-2 µs.
- Setup: Motherboard Supermicro X7DBE; chipset Intel 5000P MCH; CPU 2× dual-core Intel Xeon 5130, 2 GHz, 4096k L2 cache; 2 independent 1.33 GHz memory buses; 8-lane PCI-e; Linux kernel 2.6.20-web100_pktd-plus; Myricom NIC 10G-PCIE-8A-R fibre, myri10ge v1.2.0 + firmware v1.4.1; rx-usecs=0, coalescence OFF, MSI=1, checksums ON, tx_boundary=4096, MTU 9000 bytes.
[Plots: latency (µs) vs message length (bytes), plus latency histograms for 64-byte, 3000-byte and 8900-byte messages; data set gig6 5_Myri10GE_rxcoal=0.]
By Richard, stolen from one of his talks.
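The µs/byte slope quoted above comes from a straight-line fit of measured latency against message size. A minimal least-squares sketch (`fit_latency` is an illustrative helper, not the tool actually used):

```python
def fit_latency(sizes_bytes, latencies_us):
    """Least-squares fit of latency = slope * size + intercept.

    The slope (µs/byte) is the per-byte transfer cost through memory,
    PCI-e and the link; the intercept is the fixed per-message latency.
    """
    n = len(sizes_bytes)
    mx = sum(sizes_bytes) / n
    my = sum(latencies_us) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(sizes_bytes, latencies_us))
    sxx = sum((x - mx) ** 2 for x in sizes_bytes)
    slope = sxy / sxx
    return slope, my - slope * mx
```

The expected slope is then just the sum of the per-byte costs along the path: 0.0004 + 0.00054 + 0.0008 + 0.00054 + 0.0004 = 0.00268 µs/byte, against the fitted 0.0028 µs/byte.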
B2B UDP with memory access
- Send UDP traffic B2B over 10GE; on the receiver, run an independent memory-write task (L2 cache 4096 kByte; write 8 kByte blocks in a loop, 100% user mode).
- Achievable UDP throughput: mean 9.39 Gb/s (sigma 16) for UDP alone; 9.21 Gb/s (sigma 37) with the write task on cpu1; 9.2 Gb/s (sigma 3) with it on cpu3.
- Packet loss: mean 0.4% (UDP alone), 1.4% (UDP + cpu1), 1.8% (UDP + cpu3).
- CPU load: Cpu0: 6.0% us, 74.7% sy, 0.3% id, 1.3% wa, 17.7% si; Cpu1: 100.0% id; Cpu2: 100.0% id; Cpu3: 100.0% us.
[Plots: achievable UDP throughput (received wire rate, Mbit/s) and % packet loss vs trial number; data set gig6 5_udpmon_membw.]
By Richard, stolen from one of his talks.
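The independent memory-write task can be approximated as a tight loop writing 8 kByte blocks through a buffer larger than the 4096 kByte L2 cache, so the writes hit main memory rather than cache. A rough Python sketch of that load generator (the real test was presumably a compiled loop; all names here are illustrative):

```python
import time

def memory_write_load(block_kb=8, total_mb=64):
    """Write block_kb-sized blocks across a buffer much larger than L2.

    Returns the achieved write rate in MB/s. Run alongside a UDP
    receiver, this competes with the NIC for memory bandwidth.
    """
    block = bytearray(block_kb * 1024)
    buf = bytearray(total_mb * 1024 * 1024)  # > 4 MB L2, so writes miss cache
    t0 = time.perf_counter()
    for off in range(0, len(buf), len(block)):
        buf[off:off + len(block)] = block
    dt = time.perf_counter() - t0
    return total_mb / dt
```

Pinning this loop to cpu1 vs cpu3 (e.g. with `taskset`) is what distinguishes the "UDP + cpu1" and "UDP + cpu3" curves: the cost depends on which core, and hence which bus, the load shares with the receive path.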
10Gig Ethernet Work
- The new generation of servers is capable of supporting 10Gig Ethernet, doing real work and NOT being overloaded.
- The new generation of cards is very capable of supporting 10Gig Ethernet.
- Things are, however, chipset/server-design dependent: you have to make sure the architecture is sound. High-bandwidth, low-contention designs are needed.
- Need a modern host OS, the latest drivers, etc.