RiceNIC. Prototyping Network Interfaces. Jeffrey Shafer Scott Rixner

Similar documents
RiceNIC. A Reconfigurable Network Interface for Experimental Research and Education. Jeffrey Shafer Scott Rixner

A software platform to support dynamically reconfigurable Systems-on-Chip under the GNU/Linux operating system

Compute Node Design for DAQ and Trigger Subsystem in Giessen. Justus Liebig University in Giessen

SECURE PARTIAL RECONFIGURATION OF FPGAs. Amir S. Zeineddini Kris Gaj

Avnet, Xilinx ATCA PICMG Design Kit Hardware Manual

NetFPGA Hardware Architecture

Hardware Implementation of TRaX Architecture

High Speed Data Transfer Using FPGA

Enabling success from the center of technology. Interfacing FPGAs to Memory

AN OCM BASED SHARED MEMORY CONTROLLER FOR VIRTEX 4. Bas Breijer, Filipa Duarte, and Stephan Wong

Introducing the Spartan-6 & Virtex-6 FPGA Embedded Kits

Programmable Logic Design Grzegorz Budzyń Lecture. 15: Advanced hardware in FPGA structures

6.9. Communicating to the Outside World: Cluster Networking

A ONE CHIP HARDENED SOLUTION FOR HIGH SPEED SPACEWIRE SYSTEM IMPLEMENTATIONS

ProtoFlex: FPGA Accelerated Full System MP Simulation

Spartan-6 and Virtex-6 FPGA Embedded Kit FAQ

INT G bit TCP Offload Engine SOC

L2: FPGA HARDWARE : ADVANCED DIGITAL DESIGN PROJECT FALL 2015 BRANDON LUCIA

Full Linux on FPGA. Sven Gregori

Intelop. *As new IP blocks become available, please contact the factory for the latest updated info.

Implementation and Analysis of Large Receive Offload in a Virtualized System

FPGA. Agenda 11/05/2016. Scheduling tasks on Reconfigurable FPGA architectures. Definition. Overview. Characteristics of the CLB.

Designing a Multi-Processor based system with FPGAs

A Next Generation Home Access Point and Router

ML410 BSB DDR2 Design Creation Using 8.2i SP1 EDK Base System Builder (BSB) April

ATLAS: A Chip-Multiprocessor. with Transactional Memory Support

INTRODUCTION TO FPGA ARCHITECTURE

A Framework for Rule Processing in Reconfigurable Network Systems

FPGA Solutions: Modular Architecture for Peak Performance

Basic FPGA Architectures. Actel FPGAs. PLD Technologies: Antifuse. 3 Digital Systems Implementation Programmable Logic Devices

FPGA: What? Why? Marco D. Santambrogio

Massively Parallel Processor Breadboarding (MPPB)

Spartan-6 LX9 MicroBoard Embedded Tutorial. Tutorial 1 Creating an AXI-based Embedded System

Benchmarking the Performance of the Virtex-4 10/100/1000 TEMAC System Author: Kris Chaplin

Alternative Ideas for the CALICE Back-End System

FPGA Implementation and Validation of the Asynchronous Array of simple Processors

System Software Design for Multimedia Networking. Jonathan C.L. Liu, Ph.D. CISE Department University of Florida

RUN-TIME PARTIAL RECONFIGURATION SPEED INVESTIGATION AND ARCHITECTURAL DESIGN SPACE EXPLORATION

Complexity and Performance Evaluation of Two Partial Reconfiguration Interfaces on FPGAs: a Case Study 1

Network Interface Architecture and Prototyping for Chip and Cluster Multiprocessors

Design of a Network Camera with an FPGA

AC : INTRODUCING LABORATORIES WITH SOFT PROCES- SOR CORES USING FPGAS INTO THE COMPUTER ENGINEERING CURRICULUM

Data Side OCM Bus v1.0 (v2.00b)

INT-1010 TCP Offload Engine

Connect Tech Inc. Александр Баковкин Инженер отдела сервисов SWD Software

Prototyping NGC. First Light. PICNIC Array Image of ESO Messenger Front Page

A Device-Controlled Dynamic Configuration Framework Supporting Heterogeneous Resource Management

Supporting the Linux Operating System on the MOLEN Processor Prototype

Virtualization, Xen and Denali

Development of Monitoring Unit for Data Acquisition from Avionic Bus 1 Anjana, 2 Dr. N. Satyanarayan, 3 M.Vedachary

VXS-610 Dual FPGA and PowerPC VXS Multiprocessor

Implementation of Offloading the iscsi and TCP/IP Protocol onto Host Bus Adapter

DRPM architecture overview

GigaX API for Zynq SoC

Product Overview. Programmable Network Cards Network Appliances FPGA IP Cores

FPGA VHDL Design Flow AES128 Implementation

ESA Contract 18533/04/NL/JD

FPGA for Complex System Implementation. National Chiao Tung University Chun-Jen Tsai 04/14/2011

Introduction to Field Programmable Gate Arrays

FreeBSD support for Stanford NetFPGA. Wojciech A. Koszek

In-chip and Inter-chip Interconnections and data transportations for Future MPAR Digital Receiving System

Hardware Design. University of Pannonia Dept. Of Electrical Engineering and Information Systems. MicroBlaze v.8.10 / v.8.20

Building an Embedded Processor System on Xilinx NEXYS3 FPGA and Profiling an Application: A Tutorial

Virtex-4 PowerPC Example Design. UG434 (v1.2) January 17, 2008

LEON4: Fourth Generation of the LEON Processor

S2C K7 Prodigy Logic Module Series

VHX - Xilinx - FPGA Programming in VHDL

A Simulation: Improving Throughput and Reducing PCI Bus Traffic by. Caching Server Requests using a Network Processor with Memory

Dynamically Reconfigurable Coprocessors in FPGA-based Embedded Systems

System Debug. This material exempt per Department of Commerce license exception TSU Xilinx, Inc. All Rights Reserved

Simplify System Complexity

An NVMe-based Offload Engine for Storage Acceleration Sean Gibb, Eideticom Stephen Bates, Raithlin

VXS-621 FPGA & PowerPC VXS Multiprocessor

Outline Introduction System development Video capture Image processing Results Application Conclusion Bibliography

Reconfigurable Hardware Implementation of Mesh Routing in the Number Field Sieve Factorization

CSP PROJECT VIRTUAL FPGA. Working with Microblaze on Alpha Data board

INT 1011 TCP Offload Engine (Full Offload)

ISE Design Suite Software Manuals and Help

PS2 VGA Peripheral Based Arithmetic Application Using Micro Blaze Processor

Motivation to Teach Network Hardware

Virtex-II Architecture. Virtex II technical, Design Solutions. Active Interconnect Technology (continued)

Co-Design and Co-Verification using a Synchronous Language. Satnam Singh Xilinx Research Labs

Gate Estimate. Practical (60% util)* (1000's) Max (100% util)* (1000's)

FPGA architecture and design technology

XMC Products. High-Performance XMC FPGAs, XMC 10gB Ethernet, and XMC Carrier Cards. XMC FPGAs. FPGA Extension I/O Modules.

High Capacity and High Performance 20nm FPGAs. Steve Young, Dinesh Gaitonde August Copyright 2014 Xilinx

Design of a Gigabit Distributed Data Multiplexer and Recorder System

Digital Integrated Circuits

Basic FPGA Architecture Xilinx, Inc. All Rights Reserved

Designing with the Xilinx 7 Series PCIe Embedded Block. Tweet this event: #avtxfest

A Hardware Cache memcpy Accelerator

Field Programmable Gate Array (FPGA) Devices

Zynq Architecture, PS (ARM) and PL

ReconOS: An RTOS Supporting Hardware and Software Threads

PowerPC on NetFPGA CSE 237B. Erik Rubow

Introduction to Partial Reconfiguration Methodology

Optimizing TCP Receive Performance

Soft-Core Embedded Processor-Based Built-In Self- Test of FPGAs: A Case Study

Simplify System Complexity

ProtoFlex: FPGA-Accelerated Hybrid Simulator

Transcription:

RiceNIC Prototyping Network Interfaces Jeffrey Shafer Scott Rixner

RiceNIC Overview Gigabit Ethernet Network Interface Card RiceNIC - Prototyping Network Interfaces 2

RiceNIC Overview Reconfigurable and programmable Can alter both FPGA design and program embedded processors Built on commercial development board Reference design is freely available Targeted at research and education applications RiceNIC - Prototyping Network Interfaces 3

RiceNIC Target Applications Experimental research into network server architectures Explore hardware/software interface between OS and NIC Explore NIC architectures and desired functions Examples using RiceNIC TCP offload (at Rice) Run TCP stack on NIC Moved connection management from host OS to NIC Low Power Networking (at University of Florida) Adaptive MAC varies performance versus energy efficiency Virtual Machines (at Rice, EPFL and HP Labs) Each virtual machine can communicate independently with NIC RiceNIC - Prototyping Network Interfaces 4

Outline RiceNIC Introduction Competing Development Methods NIC Architecture and Implementation Applications Education and Research Lessons Learned Conclusions RiceNIC - Prototyping Network Interfaces 5

How to Build the RiceNIC Prototyping Tool? Buy a commercial software programmable NIC? Straightforward to rewrite firmware Used for previous research with mixed results Insufficient performance / flexibility for prototyping Build new board from scratch? Highly customizable Expensive / time consuming Use commercial prototyping board? Inexpensive / fast We Chose a Commercial Board. How Did it Turn Out? RiceNIC - Prototyping Network Interfaces 6

Avnet Development Board Serial Port DDR Virtex FPGA RJ-45 Port SRAM Spartan FPGA Ethernet PHY FPGA Development Board Virtex and Spartan FPGAs 2 PowerPC 405 processors 300 MHz Separate 16kB I and D-cache Serial Port PCI Interface 256 MB DDR Memory 2 MB SRAM Memory 56kB on-fpga memory Gigabit Ethernet PHY 64-bit / 66 MHz PCI bus RiceNIC - Prototyping Network Interfaces 7

System Architecture Avnet Hardware Avnet Memory Custom Design DDR-SDRAM (256 MB) DDR Controller Processor Local Bus (PLB) Avnet Development Board Serial Port Full Duplex Ethernet 2 GBits/sec UART RS-232 Ethernet PHY (10/100/1000) MAC Control DMA Unit BRAM Block BRAM Block Cache PowerPC 300MHz Cache PowerPC 300MHz SRAM (2MB) PCI Bridge (Spartan FPGA) PROM PCI Bus (64b / 66 MHz) RiceNIC - Prototyping Network Interfaces 8

FPGA Utilization Implemented with Xilinx tools (ISE and EDK) Used ChipScope on-chip logic analyzer Component Virtex FPGA Spartan FPGA Slice Flip Flops 9,089 / 27,392 33% 2,361 / 6,144 38% 4 input LUTs (Logic) 11,811 / 27,392 43% 2,504 / 6,144 40% BRAMs 51 / 136 37% 6 / 16 37% Occupied Slices 9,164 / 13,696 66% 3,070 / 3,072 99% Global Clocks 10 / 16 62% 2 / 4 81% Digital Clock Managers 5 / 8 62% N/A N/A RiceNIC - Prototyping Network Interfaces 9

Virtex FPGA Placement Ethernet MAC DDR Controller PowerPC Processors PLB & Control PCI DMA Engine RiceNIC - Prototyping Network Interfaces 10

Debugging Support Software Debugging Serial Console (RS-232 port) Command line interface to PowerPC processors Runtime debugging / configuration changes / bootstrapping Firmware Profiler Timer based statistical profiler (similar to Oprofile) Exports results via serial console Hardware Debugging Xilinx Chipscope on Virtex/Spartan FPGAs Essential Features Not Found in Other NICs RiceNIC - Prototyping Network Interfaces 11

Performance TCP streaming test (netperf) RiceNIC firmware using 1 processor High throughput for packets larger than 1000 bytes Spare capacity for researchers to prototype new systems RiceNIC - Prototyping Network Interfaces 12

RiceNIC Application - Education Added Network Address Translation module to RiceNIC firmware NIC still runs at full Ethernet speeds Project time 1 day Feasible for a class project Could have implemented similar module on any commercial software programmable NIC, but Debugging of NAT module by non-experts greatly assisted by RiceNIC serial console which printed packet traces RiceNIC - Prototyping Network Interfaces 13

RiceNIC Application Low Power MAC Energy Efficient Internet Project (at USF and UF) Replacing MAC core on RiceNIC FPGA with a custom low-power variant that supports adaptive link rates This research could not be done on any software-programmable NIC! The Energy Efficient Internet Project: http://www.csee.usf.edu/~christen/energy/main.html RiceNIC - Prototyping Network Interfaces 14

RiceNIC Application Networking in Virtual Machines Each guest OS can talk directly to single shared NIC Separate interfaces for concurrent guests Experimental research project with many modifications FPGA provides isolated contexts in memory + event notification Firmware provides packet multiplexing / demultiplexing Xen / OS modifications Used 1 PowerPC processor and 12MB of RAM Guest OS Guest OS Guest OS Driver Driver Driver Xen Virtual Machine Monitor Context 1 Context 2 Context 3 RiceNIC (with modifications) Ethernet P. Willmann, J. Shafer, et.al, Concurrent Direct Network Access for Virtual Machine Monitors, The International Symposium on High Performance Computer Architecture (HPCA 07), Phoenix, AZ, (Feb 2007) RiceNIC - Prototyping Network Interfaces 15

RiceNIC Application - Virtualization Prototyping (with RiceNIC) critical to project success Could not use software programmable NIC Needed to change hardware architecture Software emulation too slow RiceNIC prototype ran at full speed Used RiceNIC to experimentally determine key architectural features Minimum on-nic packet buffer size per virtual machine Could not obtain equivalent results via simulation in a timely fashion Guest OS Guest OS Guest OS Driver Driver Driver Xen Virtual Machine Monitor Context 1 Context 2 Context 3 RiceNIC (with modifications) Ethernet RiceNIC - Prototyping Network Interfaces 16

Lessons Learned Use a commercial development board Avnet card perfect for RiceNIC Contains large FPGA, PCI, GigE, and memory Board arrived fully tested / documented Implementation can begin immediately Reduced up-front project expenses For low volume RiceNIC, might never be cheaper to fabricate a custom board RiceNIC - Prototyping Network Interfaces 17

Lessons Learned Design Time Design time Hardware (FPGA) design 1 year / 1 graduate student Software (firmware / driver) design 1 month / 1 graduate student Significant prior experience in writing NIC firmware Development stages Learning Xilinx ISE / EDK tools 1 month Time spent would be comparable with competing tools Design / Implementation 9 months Testing 3 months RiceNIC - Prototyping Network Interfaces 18

Lessons Learned Testing PCI core required extensive iterative testing Must work correctly on a wide variety of host computer systems with non-deterministic bus transfer and error timing characteristics Simulator can ignore these low-level issues but working prototype must function perfectly at even the most detailed level Saving grace - Fully-functional prototype lends confidence to experimental research RiceNIC - Prototyping Network Interfaces 19

Lessons Learned FPGAs well matched with requirements Clock rates required for gigabit NIC are achievable with modern FPGAs 66 MHz PCI interface, 100 MHz general logic, 300 MHz embedded processors Saved $$ and time versus creating an ASIC RiceNIC - Prototyping Network Interfaces 20

Lessons Learned How would you build a 10 Gigabit NIC? Make same design choice to use a commercial prototyping board Newer Avnet boards have larger FPGAs, PCI Express links, and 10G interfaces Prototyping boards closely follow the commercial state-of-the-art RiceNIC - Prototyping Network Interfaces 21

RiceNIC Usage RiceNIC is free for research and education applications Platform includes the FPGA configuration (bitstream), VHDL source code, embedded PowerPC firmware, and Linux device driver Purchase Avnet Virtex-II Pro development board Additional requirements to modify FPGA designs Xilinx development software (ISE, EDK, Chipscope) New licenses for the Xilinx MAC and PCI cores Can still modify firmware without any FPGA changes or additional licenses RiceNIC - Prototyping Network Interfaces 22

Conclusions Research into network system demands experimental prototypes Complex, asynchronous interactions among system components Simulation insufficient for new architectures Prototyping tools such as RiceNIC are critical for the development and understanding of future computer systems Development method was successful RiceNIC already used in several research projects at several institutions Would we use a commercial prototyping board to build the RiceNIC again? Yes! RiceNIC - Prototyping Network Interfaces 23

Questions? RiceNIC website: http://www.cs.rice.edu/cs/architecture/ricenic/ RiceNIC - Prototyping Network Interfaces 24