Implementing Gigabit Routers with NetFPGA


Prof. Sasu Tarkoma

Overview
  The NetFPGA is a low-cost platform for teaching networking hardware and router design, and a tool for networking researchers.
  The NetFPGA offloads processing from a host processor.
  The host's CPU has access to main memory and can use DMA to read and write registers and memories on the NetFPGA.
  A hardware-accelerated datapath.
  Four Gigabit ports and multiple banks of local memory are installed on the card.
  Uses Verilog and a cross-compilation environment.

Basic Architectural Components of an IP Router
  Control plane (software): protocols
  Datapath (hardware): per-packet processing

Per-packet processing in an IP Router
  1. Accept packet arriving on an incoming link.
  2. Look up packet destination address in the forwarding table, to identify outgoing port(s).
  3. Manipulate packet header: e.g., decrement TTL, update header checksum.
  4. Send packet to the outgoing port(s).
  5. Buffer packet in the queue.
  6. Transmit packet onto outgoing link.

Generic Router Architecture
  [Figure: header processing with an IP address to next hop table (~1M prefixes, off-chip DRAM), followed by a packet queue (~1M packets, off-chip DRAM).]
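Step 3 of the per-packet processing list (decrement TTL, update header checksum) can be sketched in software. The following is a minimal Python model, not NetFPGA code: the function names and the `make_header` helper are hypothetical, and it assumes a plain 20-byte IPv4 header without options. It recomputes the checksum from scratch, whereas a hardware datapath would more likely update it incrementally.

```python
import struct

def ipv4_checksum(header: bytes) -> int:
    """Return the 16-bit one's-complement checksum of an IPv4 header."""
    words = struct.unpack(f"!{len(header) // 2}H", header)
    total = sum(words)
    while total >> 16:                      # fold carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def make_header(src: bytes, dst: bytes, ttl: int) -> bytes:
    """Build a minimal 20-byte IPv4 header (hypothetical helper for the demo)."""
    hdr = bytearray(struct.pack("!BBHHHBBH4s4s",
                                0x45, 0, 20, 0, 0, ttl, 6, 0, src, dst))
    struct.pack_into("!H", hdr, 10, ipv4_checksum(bytes(hdr)))
    return bytes(hdr)

def manipulate_header(header: bytes) -> bytes:
    """Decrement the TTL (byte 8) and rewrite the checksum (bytes 10-11)."""
    hdr = bytearray(header)
    if hdr[8] <= 1:
        raise ValueError("TTL expired")     # not forwarded on the fast path
    hdr[8] -= 1
    struct.pack_into("!H", hdr, 10, 0)      # checksum field is zero while summing
    struct.pack_into("!H", hdr, 10, ipv4_checksum(bytes(hdr)))
    return bytes(hdr)

packet = make_header(bytes([10, 0, 0, 1]), bytes([10, 0, 0, 2]), ttl=64)
forwarded = manipulate_header(packet)
print(forwarded[8])                # 63
print(ipv4_checksum(forwarded))    # 0, i.e. the header verifies
```

Summing a header that already contains a correct checksum yields zero, which is how the receiving side verifies it.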

Generic Router Architecture
  [Figure: generic router architecture, showing three IP header processing paths.]

Rule of thumb
  Buffer size is important
    Small queues reduce delay
    Large buffers are expensive
  A router needs a buffer size of B = 2T * C
    2T is the two-way propagation delay (typically 250 ms)
    C is the capacity of the bottleneck link
    For example, with 2T = 250 ms and C = 10 Gb/s, B = 2.5 Gbit (about 312 MB)
  Appears in IETF architectural guidelines
  TCP flows are a key input for buffer sizing
    The number of flows is large enough that flows are independent and unsynchronized

Lookup Algorithms
  Linear search: slow
  Direct lookup: requires memory; a prefix update may lead to many changes
  Tries: deterministic lookup time, but require multiple memory references
  TCAM: efficient parallel evaluation, but requires energy

Lookup Algorithms (continued)
  CAM: Content Addressable Memories
    Associative memory
    Compares all entries in parallel
    Binary CAM: exact matching
    Ternary CAM: partial matching
  T-CAM: Ternary Content-Addressable Memories
    Partial matching in a single cycle
    Reports the index of the first match
    TCAM holds the prefix, SRAM holds the next hop address
  Algorithmic methods
    Bloom filter, ...

T-CAM
  Fast, cost-effective, simple to manage
  High power consumption
  Hardware compares the query word to all stored words (prefixes) in parallel
  Each bit of a word can be 0, 1, or X (don't care)
  If multiple entries match, the one at the lowest address is returned, so for longest-prefix matching the prefixes are stored in order of decreasing length

CAM and T-CAM Applications
  CAM
    Translation lookaside buffer (TLB): a CPU cache used by memory management hardware to improve the speed of virtual address translation
    Cache memories
    Data compression
    Image processing
    Packet forwarding
  T-CAM
    Packet forwarding
    Packet classification
    L4 switching
    Intrusion detection
    Pattern matching
    Database operations
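The T-CAM behaviour described above (value/mask matching, lowest matching address wins, next hop read from a separate SRAM) can be modelled in a few lines. This is a rough software sketch under stated assumptions: the class name `TcamModel` and its methods are hypothetical, and the loop visits entries one by one, whereas real hardware compares all entries in parallel in a single cycle.

```python
import ipaddress

class TcamModel:
    """Software model of a TCAM (prefix match) plus SRAM (next hop)."""

    def __init__(self, routes):
        # Store longer prefixes at lower addresses so that the first
        # (lowest-address) match is also the longest-prefix match.
        ordered = sorted(routes,
                         key=lambda r: ipaddress.ip_network(r[0]).prefixlen,
                         reverse=True)
        self.tcam = []   # (value, mask); mask bit 0 means "don't care"
        self.sram = []   # next hop stored at the same index as the TCAM entry
        for prefix, next_hop in ordered:
            net = ipaddress.ip_network(prefix)
            self.tcam.append((int(net.network_address), int(net.netmask)))
            self.sram.append(next_hop)

    def lookup(self, address):
        """Return (match index, next hop), or None if nothing matches."""
        key = int(ipaddress.ip_address(address))
        for index, (value, mask) in enumerate(self.tcam):
            if key & mask == value:          # ternary compare of one entry
                return index, self.sram[index]
        return None

table = TcamModel([("10.0.0.0/8", "port1"),
                   ("10.1.0.0/16", "port2"),
                   ("0.0.0.0/0", "port0")])
print(table.lookup("10.1.2.3"))     # (0, 'port2')
print(table.lookup("10.9.9.9"))     # (1, 'port1')
print(table.lookup("192.0.2.1"))    # (2, 'port0')
```

Sorting at load time is what makes "first match" equal "longest prefix"; it is also why a prefix update in a real TCAM can force entries to be moved.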

Bloom Filters
  A Bloom filter is a probabilistic set membership test (lookup function)
    Does item x exist in a set or a multiset?
  Conceived by Burton H. Bloom in 1970
  Various applications
  There are no false negatives, but false positives are possible
  Encoding attributes a ∈ U, where n = |U| is the number of items to be stored
  Maintain a bit vector V of size m
  Use k hash functions (h_1..h_k), h_i: U → [1..m]
  Insert: for item x, set bits V[h_1(x)]..V[h_k(x)].
  Lookup: test bits V[h_1(x)]..V[h_k(x)]. If all are 1, return Probably Yes. Else No.

Bloom Filter
  [Figure: bit vector V_0..V_(m-1); the positions selected by h_1(x), h_2(x), h_3(x), ..., h_k(x) are set to 1.]

Bloom Filter Tradeoffs
  Three factors: m, k and n
  Typically n and m are given, and k is selected
  k is optimal when the hit ratio (the fraction of bits set in the array) is 0.5, which gives k = (m/n) ln 2
  The false positive probability is then (1/2)^k ≈ 0.6185^(m/n)

[Figure: project timeline in six stages: 1. Build basic router; 2. Command Line Interface; 3. Protocol (PWOSPF); 4. Integrate with H/W; 5. Interoperability; 6. Wow us!]

[Figure: project timeline: Learning Environment (modular design, testing); 4-port non-learning switch; 4-port learning switch; IPv4 router; Integrate with S/W; Interoperability; Wow us! Final stage: innovate and add, presentations, judges.]
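The Bloom filter insert and lookup operations, and the choice of k, can be illustrated with a short Python sketch. The class and function names are hypothetical, and deriving the k hash functions from SHA-256 with an index prefix is an assumption made for the demo; a hardware implementation would use much cheaper hash functions. Bit positions run from 0 to m-1 here.

```python
import hashlib
import math

class BloomFilter:
    """Bit vector V of size m with k hash functions."""

    def __init__(self, m: int, k: int):
        self.m = m
        self.k = k
        self.bits = [0] * m

    def _positions(self, item: str):
        # Assumption: h_i(x) is SHA-256 of "i:x", reduced modulo m.
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def insert(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos] = 1

    def lookup(self, item: str) -> bool:
        """True means "probably yes"; False means "definitely no"."""
        return all(self.bits[pos] for pos in self._positions(item))

def optimal_k(m: int, n: int) -> int:
    """k = (m/n) ln 2, rounded to a usable integer (at least 1)."""
    return max(1, round(m / n * math.log(2)))

def false_positive_estimate(m: int, n: int) -> float:
    """Approximate false positive probability at the optimal k."""
    return 0.6185 ** (m / n)

m, n = 1024, 100
bloom = BloomFilter(m, optimal_k(m, n))
for i in range(n):
    bloom.insert(f"10.0.0.{i}")

print(optimal_k(m, n))                                     # 7
print(all(bloom.lookup(f"10.0.0.{i}") for i in range(n)))  # True
print(false_positive_estimate(m, n))
```

Every inserted item is found again, which is the "no false negatives" property; an item that was never inserted may still return True with roughly the estimated probability.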

Verilog
  Verilog is a hardware description language (HDL) used to model electronic systems.
  The language supports the design, verification, and implementation of analog, digital, and mixed-signal circuits at various levels of abstraction.
  The concept of time is important.
  Statements are executed concurrently.
  The language is case-sensitive, has a preprocessor like C, and the major control flow keywords, such as "if" and "while", are similar.
  Verilog uses begin/end instead of curly braces to define a block of code.

Verilog II
  The definition of constants in Verilog requires a bit width along with their base.
  A Verilog design consists of a hierarchy of modules.
  Modules are defined with a set of input, output, and bidirectional ports.
  Internally, a module contains a list of wires and registers.
  Concurrent and sequential statements define the behaviour of the module by defining the relationships between the ports, wires, and registers.
  Sequential statements are placed inside a begin/end block and executed in sequential order within the block.
  But all concurrent statements and all begin/end blocks in the design are executed in parallel, qualifying Verilog as a dataflow language.

Keywords
  The always keyword indicates a free-running process that triggers on the accompanying event-control (@) clause (similar to while(1) {..} in C).
      always @(posedge a) a <= b;   // Run whenever reg a has a low to high change
      always @(a or b)              // Whenever a or b changes
  The initial keyword indicates a process that executes exactly once.
  The fork/join pair is used by Verilog to create parallel processes.
  Also the forever keyword
  Delays with #
  Non-blocking operators, for example <=

Synthesizable
  A subset of statements in the language is synthesizable.
  If the modules in a design contain only synthesizable statements, software can be used to transform, or synthesize, the design into a netlist that describes the basic components and connections to be implemented in hardware (ASIC, FPGA).

Mux
  Three equivalent descriptions of a 2-to-1 multiplexer:

      // 1: continuous assignment
      wire out;
      assign out = sel ? a : b;

      // 2: case statement
      reg out;
      always @(a or b or sel)
        case (sel)
          1'b0: out = b;
          1'b1: out = a;
        endcase

      // 3: if/else
      reg out;
      always @(a or b or sel)
        if (sel) out = a;
        else out = b;
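The Keywords slide lists the non-blocking operator <= without saying how it differs from blocking assignment (=). The difference can be modelled roughly in Python with two one-bit registers that exchange values on a clock edge. This is only a behavioural sketch with hypothetical function names, not a description of how a Verilog simulator is implemented.

```python
def clock_edge_nonblocking(flop1: int, flop2: int):
    """Model of: flop1 <= flop2; flop2 <= flop1;

    All right-hand sides are sampled first, then both registers update,
    so the two values are exchanged.
    """
    next_flop1, next_flop2 = flop2, flop1
    return next_flop1, next_flop2

def clock_edge_blocking(flop1: int, flop2: int):
    """Model of: flop1 = flop2; flop2 = flop1;

    Each statement takes effect immediately, so the second statement
    reads the value the first one just wrote.
    """
    flop1 = flop2
    flop2 = flop1
    return flop1, flop2

print(clock_edge_nonblocking(0, 1))   # (1, 0): the values swap
print(clock_edge_blocking(0, 1))      # (1, 1): the old flop1 value is lost
```

With non-blocking assignment the statement order inside the block does not matter, which is why it is the usual choice for clocked registers.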

FlipFlop
      module toplevel(clock, reset);
        input clock;
        input reset;

        reg flop1;
        reg flop2;

        // Asynchronous reset; otherwise the two flops swap values on each clock edge
        always @(posedge reset or posedge clock)
          if (reset) begin
            flop1 <= 0;
            flop2 <= 1;
          end else begin
            flop1 <= flop2;
            flop2 <= flop1;
          end
      endmodule

FlipFlops
  Building block for logic
    One-bit storage
    Counters
    Finite state machines
  With a Schmitt trigger, can be used to implement an arbiter in asynchronous circuits
    Selects the order of access to a shared resource
    Note metastability issues

Procedural Interface
  The Verilog Procedural Interface (VPI) is an interface primarily intended for the C programming language.
  It allows behavioral Verilog code to invoke C functions, and C functions to invoke standard Verilog system tasks.

Applications
  IDS/IDP, pattern matching, firewalls
  Content processing and string matching
  IP lookup and packet classification
  Buffering and queueing
  Protocol processing
  TCP/IP flow processing
  Semantic processing
  Classification and clustering
  Reconfigurable hardware platforms
  Soft-core CPUs on FPGAs