Automatic Generation of 100 Gbps Packet Parsers from P4 Description


Automatic Generation of 100 Gbps Packet Parsers from P4 Description
Pavel Benáček (1), Viktor Puš (1), Hana Kubátová (2)
(1) Liberouter, CESNET
(2) Faculty of Information Technology, Czech Technical University in Prague
H2RC Workshop, November 2015
Benáček, Puš, Kubátová: 100 Gbps Packet Parser from P4, H2RC 2015

Introduction
OpenFlow is a very popular protocol for realizing SDN (Software Defined Networking).
Pros and cons:
  + Allows us to decouple the control and data planes
  + Provides a way to fill the match tables of switches at runtime
  - The set of supported protocols cannot be changed (parsers are fixed)
Researchers are looking for a way to overcome this disadvantage.
The P4 language is the next step in realizing the SDN concept.
Our paper introduces a generator that transforms P4 source into an FPGA parser architecture.

P4: Programming Protocol-independent Packet Processors
Language with relatively simple syntax.
Provides a way to define the packet processing functionality of network devices.
Defines the following aspects of packet processing:
  1. Header Formats
  2. Packet Parser
  3. Table Specification
  4. Control Program
  5. Action Specification
[Figure labels: packet parser definition; match to action mapping]
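A header format, the first aspect listed above, is essentially an ordered list of named fields with bit widths. The following Python sketch (not P4 syntax, and not the authors' generator) shows how such a description determines where each field sits and how it can be extracted. The names `ETHERNET_FORMAT`, `field_offsets`, and `extract` are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical field list in the spirit of a header format:
# (field name, width in bits), in wire order.
ETHERNET_FORMAT: List[Tuple[str, int]] = [
    ("dst", 48),
    ("src", 48),
    ("ethertype", 16),
]

def field_offsets(fmt: List[Tuple[str, int]]) -> Dict[str, Tuple[int, int]]:
    """Map each field name to (bit offset from header start, bit width)."""
    offsets = {}
    pos = 0
    for name, width in fmt:
        offsets[name] = (pos, width)
        pos += width
    return offsets

def extract(fmt: List[Tuple[str, int]], data: bytes, byte_offset: int = 0) -> Dict[str, int]:
    """Extract all fields of one header that starts at byte_offset."""
    total_bits = sum(width for _, width in fmt)
    nbytes = (total_bits + 7) // 8
    # Treat the header as one big-endian integer and slice fields out of it.
    value = int.from_bytes(data[byte_offset:byte_offset + nbytes], "big")
    fields = {}
    for name, (pos, width) in field_offsets(fmt).items():
        shift = nbytes * 8 - pos - width
        fields[name] = (value >> shift) & ((1 << width) - 1)
    return fields

header = bytes.fromhex("ffffffffffff" "001122334455" "0800")
print(extract(ETHERNET_FORMAT, header))
```

In hardware the same offsets would select fixed wires or multiplexer inputs rather than shift an integer; the sketch only illustrates that the field positions follow mechanically from the format description.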

Parser structure
Two basic types of modules are connected into a pipeline:
  1. Protocol Analyzer: understands one protocol from the protocol stack
     Data extraction
     Computes the next protocol type in the stack
     Computes the starting offset of the next protocol
  2. Pipeline: used to tune the final frequency and latency
Unified interface for easy connection of modules.
[Figure: an input Ethernet frame (Eth, IP, TCP) passes through PIPE 0, Ethernet Analyzer, PIPE 1, IP Analyzer, PIPE 2, TCP/UDP Analyzer, PIPE 3.]
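The module chain on this slide can be modelled in software to make the unified interface concrete. The sketch below is a behavioural model in Python, not the generated HDL: each analyzer takes the frame, the input protocol, and the input offset, and returns extracted data, the next protocol, and the next offset. The pipeline registers are omitted because they affect only timing. All identifiers (`ETH`, `STAGES`, `parse`, and the analyzer functions) are assumptions made for this example, and only IPv4 over untagged Ethernet is handled.

```python
import struct
from typing import Callable, Dict, List, Set, Tuple

# Hypothetical protocol identifiers, used only by this sketch.
ETH, IPV4, TCP, UDP, UNKNOWN = "eth", "ipv4", "tcp", "udp", "unknown"

# Unified analyzer interface: (frame, input protocol, input offset)
# -> (extracted data, output protocol, output offset)
Result = Tuple[Dict[str, int], str, int]
Analyzer = Callable[[bytes, str, int], Result]

def ethernet_analyzer(frame: bytes, proto: str, offset: int) -> Result:
    (ethertype,) = struct.unpack_from("!H", frame, offset + 12)
    next_proto = IPV4 if ethertype == 0x0800 else UNKNOWN
    return {"eth.type": ethertype}, next_proto, offset + 14

def ipv4_analyzer(frame: bytes, proto: str, offset: int) -> Result:
    header_len = (frame[offset] & 0x0F) * 4  # IHL is counted in 32-bit words
    ip_proto = frame[offset + 9]
    next_proto = {6: TCP, 17: UDP}.get(ip_proto, UNKNOWN)
    return {"ipv4.proto": ip_proto}, next_proto, offset + header_len

def l4_analyzer(frame: bytes, proto: str, offset: int) -> Result:
    sport, dport = struct.unpack_from("!HH", frame, offset)
    # UDP headers are 8 bytes; TCP carries its length in the data-offset field.
    header_len = 8 if proto == UDP else (frame[offset + 12] >> 4) * 4
    return {"l4.sport": sport, "l4.dport": dport}, UNKNOWN, offset + header_len

# One entry per analyzer stage: the protocols it accepts and its function.
STAGES: List[Tuple[Set[str], Analyzer]] = [
    ({ETH}, ethernet_analyzer),
    ({IPV4}, ipv4_analyzer),
    ({TCP, UDP}, l4_analyzer),
]

def parse(frame: bytes) -> Tuple[Dict[str, int], int]:
    """Run the frame through every stage; a stage that does not recognise
    the current protocol passes protocol and offset through unchanged."""
    fields: Dict[str, int] = {}
    proto, offset = ETH, 0
    for accepted, analyzer in STAGES:
        if proto in accepted:
            extracted, proto, offset = analyzer(frame, proto, offset)
            fields.update(extracted)
    return fields, offset
```

Because every stage has the same signature, adding a protocol means adding one entry to `STAGES`, which mirrors the role the slide assigns to the unified interface.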

Transformation Algorithm
The structure of each protocol analyzer is generated from the P4 Header Format.
The parser's structure is inferred from the P4 Packet Parser.
Basic idea of the parser's structure identification:
  We have to identify the latest usage of each protocol in the parser.
  Algorithm for identifying the latest usage: DFS (Depth First Search)
This can be done in the Parser Graph Representation (PGR):
  Directed acyclic graph
  Represents relations between protocols
  Created from the P4 description
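One way to read "latest usage" is as the deepest position at which a protocol can appear on any path from the root of the graph. The Python sketch below computes that depth with a depth-first search over a small directed acyclic graph. This interpretation, the function name `latest_usage`, and the edge list are assumptions made for illustration; the slides do not give the exact algorithm or the graph edges.

```python
from typing import Dict, List

def latest_usage(graph: Dict[str, List[str]], root: str) -> Dict[str, int]:
    """Return, for every protocol reachable from root, the largest depth
    at which it occurs on any path (terminates because the graph is a DAG)."""
    level: Dict[str, int] = {}

    def dfs(node: str, depth: int) -> None:
        # Skip the node if it was already reached at this depth or deeper.
        if level.get(node, -1) >= depth:
            return
        level[node] = depth
        for successor in graph.get(node, []):
            dfs(successor, depth + 1)

    dfs(root, 0)
    return level

# Assumed example graph: edges mean "may be followed by".
PGR = {
    "eth": ["vlan", "ipv4", "ipv6"],
    "vlan": ["ipv4", "ipv6"],
    "ipv4": ["tcp", "udp", "icmp"],
    "ipv6": ["tcp", "udp", "icmpv6"],
}

print(latest_usage(PGR, "eth"))
```

With this graph, IPv4 can follow either Ethernet or VLAN, so its latest usage is depth 2; an analyzer placed at that depth can serve both paths.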

Transformation Algorithm - Example
[Figure: logical model (PGR) and the corresponding physical model. Node labels with their numbers as shown: Eth 0, VLAN 1, IPv4 2, IPv6 2, TCP 3, UDP 3, ICMP 3, ICMPv6 3; the remaining branches end in Unknown.]

Results
We tested two protocol stacks:
  Simple L2: Ethernet, IPv4/IPv6, TCP/UDP, ICMP/ICMPv6
  Full: Ethernet, 2x VLAN, 2x MPLS, IPv4/IPv6, TCP/UDP, ICMP/ICMPv6
Possible optimizations:
  Automatic: optimizations of the parser's internal structure
    O1: offset bus optimization
    O2: multiplexer optimization
  Manual: tweaked P4 program (O3)
We performed synthesis for our implementation platform:
  Equipped with a Xilinx Virtex-7 FPGA
  Supports 100 Gbps

Results - Latency
[Plot: latency [ns] versus throughput [Gbps]. Series: hand, Pareto, simple L2; P4, O2, Pareto, simple L2; hand, Pareto, full; P4, O2, Pareto, full.]

Results - Resources
[Plot: Slice LUTs versus throughput [Gbps]. Series: hand, Pareto, simple L2; P4, O2, Pareto, simple L2; hand, Pareto, full; P4, O2, Pareto, full.]

Conclusion
We implemented and evaluated our parser generator.
The generated parsers are capable of processing 100 Gbps.
From the presented work we can infer the following important results:
  1. Ability to generate parsers with equal functionality in a shorter time
  2. Generated parsers aren't significantly worse than hand-optimized versions created by a professional with many years of experience in HDL coding
Future work:
  Deparser: construction of packets
  Match+Action tables: general processing of extracted data

Thank you for your attention
Pavel Benáček
www.liberouter.org
@liberouter
Visit our business partner in booth #3011

Protocol Analyzer
[Figure: block diagram of a Protocol Analyzer. Inputs: (1) Input Data, (2) Input Offset, (3) Input Protocol. Internal blocks: Data Extractor, Next Protocol Decoder, Data Extractor + Length Gen. Outputs: (4) Extracted Data, (5) Output Offset, (6) Output Protocol.]