RiceNIC: A Reconfigurable Network Interface for Experimental Research and Education. Jeffrey Shafer, Scott Rixner


Introduction
Networking is critical to modern computer systems
The role of the network interface is changing
  New communication mechanisms with the host (TCP offloading, RDMA, ...)
  New responsibilities (data caching, encryption, ...)
Simulation is not sufficient to explore new architectures
  Research into network systems demands experimental prototypes
RiceNIC was built as a fully capable prototyping platform
  In active use for experimental research at several institutions (Rice, EPFL, UF, USF, HP Labs)

Outline
Experimental research into network server architectures
  Simulation with whole-system simulators
    Challenges when applied to network systems
  Prototyping with software-programmable NICs
    Better approach, but limited by existing tools
Introduce RiceNIC
  Fully capable prototyping platform that is also practical to build
Experimental research with RiceNIC
Conclusions

Simulation Challenges
Network server architecture performance depends on interactions between
  System components (processor, memory, I/O interconnects)
  User applications, OS, and device drivers
  Multiple computing systems separated by a network
The network (Internet) can be inherently chaotic
All of these elements interact asynchronously
A simulator must model all elements with high fidelity

Adaptive Algorithms in Simulators
High-fidelity simulation is critical when evaluating adaptive algorithms
  Software control flow adjusts based on underlying system performance
TCP is a widely used adaptive algorithm
  Small inaccuracies can progressively distort TCP performance as a feedback loop develops
  Example: a buffer drains slightly too slowly, eventually fills and drops packets, and TCP throttles bandwidth to compensate
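The buffer example on this slide can be illustrated with a toy model. The sketch below is not from the slides: it assumes a simplified additive-increase / multiplicative-decrease sender feeding a single bottleneck queue, and the function name, buffer size, and drain rates are all hypothetical. It only shows the mechanism: when the modeled drain rate is slightly too low, the sender itself behaves differently, so the error is not confined to the buffer.

```python
def run_aimd(drain_per_tick, buffer_pkts=50, ticks=10_000):
    """Toy AIMD sender over one bottleneck queue (illustrative, not real TCP).

    Returns (average packets delivered per tick, total packets dropped).
    """
    window = 1      # packets the sender injects per tick
    queue = 0       # packets waiting in the bottleneck buffer
    delivered = 0
    drops = 0
    for _ in range(ticks):
        queue += window                       # sender injects its window
        sent = min(queue, drain_per_tick)     # bottleneck drains what it can
        queue -= sent
        delivered += sent
        if queue > buffer_pkts:               # overflow: drop the excess
            drops += queue - buffer_pkts
            queue = buffer_pkts
            window = max(1, window // 2)      # multiplicative decrease
        else:
            window += 1                       # additive increase
    return delivered / ticks, drops


# Hypothetical rates: the "simulated" buffer drains 5% slower than the real one.
accurate_rate, accurate_drops = run_aimd(drain_per_tick=100)
slow_rate, slow_drops = run_aimd(drain_per_tick=95)
print(f"accurate model:    {accurate_rate:.1f} pkts/tick, {accurate_drops} drops")
print(f"5% too-slow model: {slow_rate:.1f} pkts/tick, {slow_drops} drops")
```

In this toy model the sender never sustains the full drain rate in either case, because every overflow halves its window. A real TCP stack has many more feedback paths (timeouts, retransmissions, slow start), which is where the slide's concern about progressive distortion comes from.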

TCP Simulation
TCP (and other adaptive algorithms) reacts poorly to standard simulation techniques
  Warm-up stage: functional simulator
  Data collection stage: complete simulator
Subtle pitfall
  Accidentally measuring TCP slow-start performance during data collection
Solution
  Continuously execute the simulator until TCP reaches steady state, then measure performance
  Minimum slow-start time is 150 million cycles (with no network delay)
  Realistic slow-start time with network delay is an order of magnitude higher

Simulation is Expensive
One recent research project
  Shortest tests are 120 seconds long (in real time)
    Minimum for accurate data collection at TCP steady state
  Hundreds of tests are run at different settings
How long would simulation take?
  5 orders of magnitude slower (or more!) for cycle-accurate simulation
  138 days per test
  How many compute machines are needed?
We built a prototype so we could run in real time, not simulator time
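The 138-day figure follows directly from the two numbers on the slide (a 120-second test and a slowdown of 10^5). The short calculation below reproduces it. The campaign size of 200 tests is an assumption for illustration only, since the slide says just "hundreds of tests".

```python
SECONDS_PER_DAY = 24 * 60 * 60


def simulated_wall_clock_days(real_seconds, slowdown):
    """Wall-clock days needed to simulate `real_seconds` of real time."""
    return real_seconds * slowdown / SECONDS_PER_DAY


test_length_s = 120      # shortest test, from the slide
slowdown = 10 ** 5       # "5 orders of magnitude slower"
days_per_test = simulated_wall_clock_days(test_length_s, slowdown)
print(f"{days_per_test:.1f} days per test")   # 138.9, which the slide gives as 138

# Assumed campaign size; the slide does not give an exact number of tests.
num_tests = 200
machine_years = num_tests * days_per_test / 365
print(f"about {machine_years:.0f} machine-years for {num_tests} tests")
```

Under that assumption the campaign would need dozens of machine-years of simulation, which is the cost the prototype avoids by running in real time.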

Prototyping Advantages
Performance
  Prototype can run in real time
Expense
  Cheaper to build a few prototypes than to buy a compute cluster for simulation
Accuracy
  A simulator models the entire computer (and has inherent assumptions and approximations throughout the system)
  A prototype models the specific device being investigated (i.e., the NIC) but uses a real system otherwise
  Experimental error is constrained to the device being studied

Prototyping Challenges
Existing prototyping platforms are insufficient
  Can purchase software-programmable NICs for Ethernet and specialized interconnects (Myrinet, InfiniBand)
  Hardware architecture (including control systems) is fixed
    May constrain new software design
    Software emulation of a new hardware architecture might be very compute-intensive
  Intellectual property issues may limit the range of available customizations

Past Experience with Prototyping
Projects implemented on a software-programmable NIC
  TCP connection offloading
    Move TCP stack to NIC for key connections
    Implementation limited by processor and available memory (throughput limited to 100 Mbps on a Tigon2 gigabit NIC)
  NIC data caching
    Save frequent packet data on NIC
    Cache size severely limited by Tigon2 memory capacity
These prior experiences with experimental prototyping motivated the creation of RiceNIC

RiceNIC Overview
Gigabit Ethernet Network Interface Card

RiceNIC Overview
Reconfigurable and programmable
  FPGAs implement hardware architecture
  Embedded processors provide high-level NIC control
Performs at Ethernet line rate
Reference design is freely available
Targeted at research and education applications

RiceNIC Board
[Board photo with labels: Serial Port, DDR, Virtex FPGA, RJ-45 Port, Spartan FPGA, Ethernet PHY]
2 FPGAs
2 PowerPC processors (300 MHz)
Serial port
PCI interface
On-NIC memory (256 MB DDR)
Gigabit Ethernet PHY
64-bit / 66 MHz PCI bus
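As a rough check on the bus listed above, the calculation below compares the theoretical peak of a 64-bit / 66 MHz PCI bus with the traffic a full-duplex Gigabit Ethernet link can generate. It is a back-of-the-envelope sketch, not a measurement from the slides: it assumes one transfer per clock with no arbitration or protocol overhead, so achievable throughput will be lower.

```python
PCI_WIDTH_BITS = 64
PCI_CLOCK_HZ = 66_000_000       # nominal 66 MHz, as listed on the slide
GIGE_BPS = 1_000_000_000        # one direction of Gigabit Ethernet

# Peak assumes a full-width transfer on every clock cycle (idealized).
pci_peak_bps = PCI_WIDTH_BITS * PCI_CLOCK_HZ
gige_full_duplex_bps = 2 * GIGE_BPS
headroom = pci_peak_bps / gige_full_duplex_bps

print(f"PCI peak: {pci_peak_bps / 1e9:.3f} Gb/s ({pci_peak_bps // 8 // 10**6} MB/s)")
print(f"Full-duplex GigE: {gige_full_duplex_bps / 1e9:.0f} Gb/s")
print(f"Headroom: {headroom:.2f}x")
```

Under these idealized assumptions the bus has roughly twice the bandwidth needed to carry line-rate traffic in both directions.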

Debugging Tools
Software debugging
  Serial console (RS-232 port)
    Command-line interface to PowerPC processors
    Runtime debugging / configuration changes / bootstrapping
  Firmware profiler
    Timer-based statistical profiler (similar to OProfile)
    Exports results via serial console
Hardware debugging
  Logic analyzer (Xilinx ChipScope) on FPGAs
Debugging tools are essential for experimentation

RiceNIC Design Timeline
Development time
  Hardware (FPGA) design: 1 year / 1 graduate student
  Software (firmware / driver) design: 1 month / 1 graduate student
Development stages
  Learning Xilinx FPGA tools: 1 month
  Design / implementation: 9 months
  Testing: 3 months
Prototype can be reused by other institutions, just like a simulator
  Current users include Rice, EPFL, UF, USF, and HP Labs
Constructing an experimental prototype is practical!

RiceNIC Applications
RiceNIC supports a wide range of experimental research projects
Scope of changes, from component level to system level
  Modify firmware (Network Address Translation on NIC)
  Modify FPGA (low-power networking with adaptive MAC)
  Modify firmware, FPGA, device driver, and OS (networking in virtual machines)
  Modify systems beyond the NIC (reconfigurable networking lab)

Network Address Translation
Scope focused on NIC firmware
Added NAT module to RiceNIC
  RiceNIC still runs at full Ethernet speeds
  Implementation took 1 day
Debugging of the NAT module was greatly assisted by the RiceNIC serial console, which printed packet traces
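The slides do not describe how the NAT module is implemented. As a reminder of what such a module has to do per packet, the sketch below shows a generic port-translating NAT table. Every name in it (Packet, Nat, outbound, inbound, the starting port) is hypothetical, and it omits checksum updates, protocol handling, and mapping timeouts that a real implementation needs.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Packet:
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


class Nat:
    """Generic port-translating NAT table (illustrative only)."""

    def __init__(self, public_ip, first_port=40000):
        self.public_ip = public_ip
        self.next_port = first_port
        self.out_map = {}   # (private ip, private port) -> public port
        self.in_map = {}    # public port -> (private ip, private port)

    def outbound(self, pkt):
        """Rewrite the source of a packet leaving the private network."""
        key = (pkt.src_ip, pkt.src_port)
        if key not in self.out_map:
            self.out_map[key] = self.next_port
            self.in_map[self.next_port] = key
            self.next_port += 1
        return replace(pkt, src_ip=self.public_ip, src_port=self.out_map[key])

    def inbound(self, pkt):
        """Rewrite the destination of a reply, or return None to drop it."""
        key = self.in_map.get(pkt.dst_port)
        if key is None:
            return None
        return replace(pkt, dst_ip=key[0], dst_port=key[1])


nat = Nat(public_ip="203.0.113.1")
request = Packet("192.168.0.10", 5555, "198.51.100.7", 80)
translated = nat.outbound(request)
reply = Packet("198.51.100.7", 80, translated.src_ip, translated.src_port)
restored = nat.inbound(reply)
print(translated)
print(restored)
```

Two dictionary lookups per packet is a small amount of work, which is at least consistent with the slide's observation that the NIC still ran at full Ethernet speed with NAT enabled.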

Low Power MAC
Energy Efficient Internet Project (at USF and UF)
Scope focused on a single component: the MAC
Modified FPGA hardware
  Replaced the MAC core on the RiceNIC FPGA with a custom low-power variant that supports adaptive link rates
This research could not be done on any software-programmable NIC!
The Energy Efficient Internet Project: http://www.csee.usf.edu/~christen/energy/main.html

Networking in Virtual Machines
Each guest OS can talk directly to a single shared NIC
  Separate interfaces for concurrent guests
Experimental research project with many modifications
  FPGA provides isolated contexts in memory + event notification
  Firmware provides packet multiplexing / demultiplexing
  Xen / OS modifications
Used 1 PowerPC processor and 12 MB of RAM
[Diagram: three guest OSes with drivers, the Xen Virtual Machine Monitor, Contexts 1 to 3 on RiceNIC (with modifications), Ethernet]
P. Willmann, J. Shafer, et al., Concurrent Direct Network Access for Virtual Machine Monitors, The International Symposium on High Performance Computer Architecture (HPCA 07), Phoenix, AZ (Feb 2007)
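To make the demultiplexing step concrete, the sketch below steers received frames into one bounded queue per guest context. It is an illustration under stated assumptions rather than the design from the slides or the cited paper: it assumes contexts are selected by destination MAC address, and the class name, method names, and queue size are hypothetical.

```python
from collections import deque


class ContextDemux:
    """Steer received frames into one bounded queue per guest context."""

    def __init__(self, slots_per_context):
        self.slots_per_context = slots_per_context
        self.queues = {}    # destination MAC -> deque of frames
        self.drops = {}     # destination MAC -> frames dropped on overflow

    def add_context(self, mac):
        self.queues[mac] = deque()
        self.drops[mac] = 0

    def receive(self, dst_mac, frame):
        """Queue a frame for its guest; return False if it was not queued."""
        queue = self.queues.get(dst_mac)
        if queue is None:
            return False                  # no guest owns this address
        if len(queue) >= self.slots_per_context:
            self.drops[dst_mac] += 1      # only this guest's buffer is full
            return False
        queue.append(frame)
        return True


GUEST_A = "02:00:00:00:00:01"
GUEST_B = "02:00:00:00:00:02"
demux = ContextDemux(slots_per_context=2)
demux.add_context(GUEST_A)
demux.add_context(GUEST_B)

# Guest A receives three frames but has room for two; guest B is unaffected.
results = [demux.receive(GUEST_A, f) for f in (b"a1", b"a2", b"a3")]
other = demux.receive(GUEST_B, b"b1")
print(results, other, demux.drops[GUEST_A])
```

Keeping the queues separate means one guest overflowing its buffer does not cost another guest any frames, which is the isolation property the contexts are meant to provide.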

Virtual Machines Networking
Prototyping (with RiceNIC) was critical to project success
  Could not use a software-programmable NIC
    Needed to change the hardware architecture
    Software emulation was too slow
  RiceNIC prototype ran at full speed
Used RiceNIC to experimentally determine key architectural features
  Minimum on-NIC packet buffer size per virtual machine
  Could not obtain equivalent results via simulation in a timely fashion
[Diagram: three guest OSes with drivers, the Xen Virtual Machine Monitor, Contexts 1 to 3 on RiceNIC (with modifications), Ethernet]

Reconfigurable Network Ecosystem
Scope of experimental prototyping extends beyond a single system
Create a reconfigurable and programmable network ecosystem
  All endpoints and routing fabric are modifiable
  Use RiceNIC as the interface with host computers
  Use NetFPGA (from Stanford) as routers / switches
Experimentally explore new networking architectures
[Diagram: Fully Customizable Networking Lab, showing multiple RiceNIC endpoints and two NetFPGA routers]

RiceNIC in Education
Simple class project (undergraduate)
  NAT on NIC implementation (1 day)
Intermediate project (undergraduate)
  Perform deep packet inspection in FPGA hardware
Advanced project (graduate)
  Use the fully customizable networking lab (with NICs, routers, and switches) to explore new network protocols and architectures
Debugging tools used for research (serial console, profiler, logic analyzer) are equally valuable in the classroom for use by non-experts

Obtaining RiceNIC
RiceNIC is free for research and education applications
  Platform includes the FPGA configuration (bitstream), VHDL source code, embedded PowerPC firmware, and Linux device driver
Purchase the Avnet Virtex-II Pro development board
Additional requirements to modify FPGA designs
  Xilinx development software (ISE, EDK, ChipScope)
  New Xilinx licenses for low-level hardware cores
Can still modify firmware without any FPGA changes or additional licenses

Conclusions
Research into network systems demands experimental prototypes
  Complex, asynchronous interactions among system components
  Simulation is insufficient for new architectures
Prototyping tools such as RiceNIC are critical for the development and understanding of future computer systems
  Constructing an experimental prototype is practical
RiceNIC is awesome!
  Provides hardware and software flexibility for a highly capable experimental platform
  Currently employed in research projects at several institutions

Questions?
RiceNIC website: http://www.cs.rice.edu/cs/architecture/ricenic/