Towards Effective Packet Classification. J. Li, Y. Qi, and B. Xu Network Security Lab RIIT, Tsinghua University Dec, 2005



Outline
1 Algorithm Study
    1.1 Understanding Packet Classification
    1.2 Worst-case Complexity Analysis
    1.3 Existing Algorithmic Solutions
    1.4 Our Novel Algorithms
2 Network Processor Implementation
    2.1 Current Hardware Limits
    2.2 External & Internal Traffics
    2.3 Intel IXP Implementation
3 Summary


1.1 Understanding Packet Classification
Packet Classification Overview
[Figure: the header of an incoming packet enters the forwarding engine, where packet classification matches it against the policy/rule database (predicate, action) and outputs the action to apply.]

1.1 Understanding Packet Classification
An Example Rule Set

Rule     Field 1            Field 2           ...  Field k  Action
Rule 1   152.163.190.69/21  152.163.80.11/32  ...  UDP      A1
Rule 2   152.168.3.0/24     152.163.0.0/16    ...  TCP      A2
Rule N   152.168.0.0/16     152.0.0.0/8       ...  ANY      An

E.g. a packet P(152.168.3.32, 152.163.171.71, ..., TCP) would have action A2 applied to it (it also matches Rule N, but Rule 2 has higher priority).
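The priority rule in this example can be sketched as a first-match linear search. This is only an illustration: it uses the three fields visible in the table (source prefix, destination prefix, protocol), leaves out the elided middle fields, and the function name `classify` is a hypothetical choice, not something from the slides.

```python
import ipaddress

# Rule set from the slide; earlier rules have higher priority.
# Only Field 1, Field 2 and Field k are modelled (an assumption of this sketch).
RULES = [
    ("152.163.190.69/21", "152.163.80.11/32", "UDP", "A1"),
    ("152.168.3.0/24", "152.163.0.0/16", "TCP", "A2"),
    ("152.168.0.0/16", "152.0.0.0/8", "ANY", "An"),
]


def classify(src, dst, proto, rules=RULES):
    """Return the action of the first (highest-priority) matching rule."""
    src_ip = ipaddress.ip_address(src)
    dst_ip = ipaddress.ip_address(dst)
    for rule_src, rule_dst, rule_proto, action in rules:
        # strict=False accepts the slide's notation, where host bits are set.
        if (src_ip in ipaddress.ip_network(rule_src, strict=False)
                and dst_ip in ipaddress.ip_network(rule_dst, strict=False)
                and rule_proto in ("ANY", proto)):
            return action
    return None  # no rule matched


print(classify("152.168.3.32", "152.163.171.71", "TCP"))  # A2
```

The packet matches both Rule 2 and Rule N, and the search returns A2 because Rule 2 is examined first.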

1.1 Understanding Packet Classification
Rules In the Search Space

Rule  X range  Y range
R1    0-31     0-255
R2    0-255    128-131
R3    64-71    128-255
R4    67-67    0-127
R5    64-71    0-15
R6    128-191  4-131
R7    192-192  0-255

[Figure: rules R1-R7 and a packet P drawn as regions in the X-Y plane; both axes run from 0 to 255.]

- Spaces: single/multiple dimensions (fields); span of each dimension.
- Rules: prefix/range matching; structural characteristics.
- Packets: dynamic characteristics.

1.2 Worst-case Complexity Analysis
- Point Location: among N non-overlapping regions in F dimensions takes either O(log N) time with O(N^F) space, or O(log^(F-1) N) time with O(N) space. E.g. N = 1000, F = 4: 1000G space, or 1000 accesses.
- De-overlapping: N overlapping regions need up to (2N-1)^F non-overlapping regions to represent.
- Range-to-Prefix: N rules in range [0, 2^W] need up to (N(2W-1))^F prefixes to represent.
F: number of fields; W: bit length of each field.
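The figures quoted for N = 1000, F = 4 can be checked with a few lines of arithmetic, and the cost of range-to-prefix conversion can be seen on a single field. The sketch below assumes base-2 logarithms and uses hypothetical helper names (`point_location_bounds`, `range_to_prefixes`); it splits one inclusive range into prefixes and is not taken from the slides.

```python
import math


def point_location_bounds(n, f):
    """Return (space of the fast scheme, accesses of the compact scheme)."""
    return n ** f, math.log2(n) ** (f - 1)


def range_to_prefixes(lo, hi, width):
    """Split the inclusive range [lo, hi] of width-bit values into prefixes.

    Each prefix is returned as (value, prefix_length).
    """
    prefixes = []
    while lo <= hi:
        # Largest power-of-two block that is aligned at lo ...
        size = lo & -lo if lo else 1 << width
        # ... shrunk until it fits inside what is left of the range.
        while size > hi - lo + 1:
            size >>= 1
        prefixes.append((lo, width - size.bit_length() + 1))
        lo += size
    return prefixes


space, accesses = point_location_bounds(1000, 4)
print(space, round(accesses))          # 10^12 entries, roughly 1000 accesses
print(range_to_prefixes(1, 14, 4))     # a 4-bit range that needs 6 prefixes
```

The range [1, 14] on a 4-bit field already expands to six prefixes, which is why rule sets with port ranges grow quickly when stored in prefix-only structures.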

1.2 Worst-case Complexity Analysis
Conclusion
"The theoretical bounds tell us that it is not possible to arrive at a practical worst case solution. Fortunately, we don't have to; No single algorithm will perform well for all cases. Hence a hybrid scheme might be able to combine the advantages of several different approaches." (P. Gupta)

1.3 Existing Algorithmic Solutions Algorithm Categorization (1) Categorization Based on Packet Search Data-structures [17]

1.3 Existing Algorithmic Solutions Algorithm Categorization (2) Categorization Based on Space Partition

1.4 Novel Algorithms: D-Cuts
Dynamic Cuttings: Ideas
If most traffic matches {R1, R3, R4}, i.e. goes through subspace {X(000:111), Y(000:001)}, then we can rebuild the HiCuts tree to cut down the worst-case depth.
[Figure: the rules, the original HiCuts tree, and the rebuilt D-Cuts tree.]

1.4 Novel Algorithms: D-Cuts Dynamic Cuttings: Performance Sopt: Optimized for space Topt: Optimized for time

1.4 Novel Algorithms: HSM
Hierarchical Space Mapping: Ideas
AMT: Address Mapping Table; PMT: Port Mapping Table; PLT: Policy Lookup Table
[Figure: the packet fields SA, DA, SP and DP are each searched (indexed search or binary search?); the results are combined through AMT and PMT and finally PLT.]
- (HSM) Binary Search: not slow, and avoids large index tables.
- (RFC) Indexed Search: index tables are too large.
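The trade-off between binary search and indexed search can be sketched on a two-field example. The code below is a simplified illustration, not the HSM data layout: it reuses the X/Y rule set from section 1.1, assumes 8-bit fields and list-order priority, and uses hypothetical names (`segment_starts`, `build`, `lookup`). Each field is binary-searched over its segment boundaries, and the two segment numbers then index a small lookup table.

```python
from bisect import bisect_right

# (x_lo, x_hi, y_lo, y_hi); list order is priority, R1 highest (assumption).
RULES = [
    (0, 31, 0, 255), (0, 255, 128, 131), (64, 71, 128, 255), (67, 67, 0, 127),
    (64, 71, 0, 15), (128, 191, 4, 131), (192, 192, 0, 255),
]


def segment_starts(ranges, max_value=255):
    """Start points of the elementary segments induced by the range endpoints."""
    points = {0}
    for lo, hi in ranges:
        points.update((lo, hi + 1))
    return sorted(p for p in points if p <= max_value)


def build(rules):
    xs = segment_starts([(r[0], r[1]) for r in rules])
    ys = segment_starts([(r[2], r[3]) for r in rules])
    # One entry per (x segment, y segment): index of the best matching rule.
    # Testing the segment start is enough, since matches are constant per segment.
    table = [[next((i for i, (x0, x1, y0, y1) in enumerate(rules)
                    if x0 <= x <= x1 and y0 <= y <= y1), None)
              for y in ys] for x in xs]
    return xs, ys, table


def lookup(x, y, xs, ys, table):
    # One binary search per field, then a single indexed table access.
    return table[bisect_right(xs, x) - 1][bisect_right(ys, y) - 1]


XS, YS, TABLE = build(RULES)
print(len(XS), len(YS))              # table dimensions, far below 256 x 256
print(lookup(67, 100, XS, YS, TABLE))  # 3, i.e. R4
```

A directly indexed table over the same two 8-bit fields would need 256 x 256 entries, while the binary-searched version needs only one entry per pair of segments at the cost of a logarithmic search per field.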

1.4 Novel Algorithms: HSM
Hierarchical Space Mapping: Performance

Classifier  Number of rules  RFC (kB)  HSM (kB)  Percentage improved
FW1         68               802       41        95%
FW2         136              838       111       87%
FW3         340              1,186     262       78%
CR1         500              1,060     119       89%
CR2         1,000            2,122     923       61%
CR3         1,535            3,454     1,947     44%
CR4         1,945            6,320     3,957     37%

1.4 Novel Algorithms: sbits
Shifting Bits: Ideas
- Pointer array: too large.
- Replace the pointers with a bit string.
- Replace the pointers with indexes.
Note:
- 32:1 compression rate
- No additional memory access
- Hardware supported
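One way to replace a pointer array with a bit string is run-based bitmap compression, sketched below. The slides do not spell out the exact sbits encoding, so this is an assumed scheme shown for illustration: one bit per slot marks where a new run of identical pointers starts, and a population count over the bit string yields the index into a compact pointer array. The names `compress` and `lookup` are hypothetical.

```python
def compress(pointers):
    """Return (bitmap, compact): bit i is set when slot i starts a new run."""
    bitmap, compact = 0, []
    for i, ptr in enumerate(pointers):
        if i == 0 or ptr != pointers[i - 1]:
            bitmap |= 1 << i
            compact.append(ptr)
    return bitmap, compact


def lookup(bitmap, compact, i):
    """Recover pointers[i] by counting the set bits at positions 0..i."""
    mask = (1 << (i + 1)) - 1
    return compact[bin(bitmap & mask).count("1") - 1]


# A 32-slot child array with three runs of identical pointers.
pointers = ["A"] * 10 + ["B"] * 20 + ["A"] * 2
bitmap, compact = compress(pointers)
print(compact)                       # ['A', 'B', 'A']
print(lookup(bitmap, compact, 15))   # B
```

With 32 slots the bit string fits in one 32-bit word, so each 32-bit pointer slot is represented by a single bit, and the bit count maps to a single instruction on processors that provide population count.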

1.4 Novel Algorithms: sbits
Shifting Bits: Performance

Classifier  No. of rules  HiCuts       HyperCuts  sbits
FW1         68            5,443        35,401     420
FW2         136           10,779       69,782     924
FW3         340           24,645       172,932    2,331
CR1         500           29,409       89,005     3,612
CR2         1000          979,736      871,541    28,287
CR3         1530          13,606,858*  480,225    29,204
CR4         1945          5,928,724*   672,442    43,183

sbits vs. HiCuts/HyperCuts: memory usage (unit: 32-bit long-word)

1.4 Novel Algorithms: sbits Shifting Bits: Performance sbits vs. HyperCuts: memory usage against rulesets of different size

1.4 Novel Algorithms: Summary
- Tree-based Algorithms (HiCuts, D-Cuts): memory efficient; no explicit worst-case bound, not fast enough.
- Table-based Algorithms (RFC, HSM): fast search speed; not memory efficient.
- Hybrid Algorithms (sbits): combine the advantages of several different approaches; may be hard to implement (too complicated).


2.1 Current Hardware Limits
- TCAM: board area; power; range matching.
- ASIC/FPGA: R&D cost; update.
- General Purpose CPU: continuity of both time and space.
- Network Processor: highly integrated processing units; data plane & control plane; handles rarely associative network traffic.

2.2 External & Internal Traffics
Traffic In a Router
[Figure: external traffic passes through the processor; internal traffic flows between the processor and memory.]
Example:
- Assume: 1 rule = 64 Byte in memory.
- Assume: 1 packet = 64 Byte going through the processor.
- By linear search: processing 1 packet needs to read 1K rules in the worst case.
- (Internal Traffic) : (External Traffic) = 1000:1

2.2 External & Internal Traffics
Traffic In a Router
Example (continued):
- Assume SRAM bandwidth in NP = 20 GByte/s.
- If (Internal Traffic) : (External Traffic) = 1000:1, then External Traffic < (20G/1000) = 20 MByte/s (160 Mbps).
- Else if (Internal Traffic) : (External Traffic) = 40:1, then External Traffic < (20G/40) = 0.5 GByte/s (4 Gbps).
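The arithmetic of this example can be reproduced directly. The sketch below assumes, as the slides do, that SRAM bandwidth is the only bottleneck and that 1K rules is rounded to 1000; `max_external_rate` is a hypothetical helper name.

```python
def max_external_rate(sram_bytes_per_s, internal_per_external):
    """Upper bound on external traffic (bytes/s) when SRAM is the bottleneck."""
    return sram_bytes_per_s / internal_per_external


SRAM_BW = 20e9  # 20 GByte/s, as assumed on the slide

# Linear search: 1000 rules of 64 bytes read for every 64-byte packet.
linear_ratio = (1000 * 64) / 64

for ratio in (linear_ratio, 40):
    rate = max_external_rate(SRAM_BW, ratio)
    print(f"{ratio:.0f}:1 -> {rate / 1e6:.0f} MByte/s = {rate * 8 / 1e9:.2f} Gbps")
```

Lowering the internal-to-external ratio from 1000:1 to 40:1 raises the sustainable line rate by a factor of 25, which is the motivation for the ratios compared on the next slide.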

2.2 External & Internal Traffics
Existing Algorithms (dealing with 2,000 rules)
- Table-based Algorithms: (Internal Traffic) : (External Traffic) = 1:1~5:1; best temporal performance; require up to 30MB SRAM.
- Tree-based Algorithms: (Internal Traffic) : (External Traffic) = 20:1~30:1; require less than 10MB SRAM; unstable performance, no worst-case bound.

2.3 Intel IXP Implementation
Intel Network Processor Architecture
[Figure: IXP2800 block diagram showing the XScale core (32K instruction cache, 32K data cache), 16 MEv2 microengines, 3 RDRAM channels, 4 QDR SRAM channels, PCI (64b, 66 MHz), the SPI4 or CSIX media interface, Rbuf and Tbuf (64 @ 128B each), the hash unit (64/48/128), 16KB scratch memory, and CSRs (Fast_wr, UART, Timers, GPIO, BootROM/SlowPort).]

2.3 Intel IXP Implementation
IXP2xxx Packet Processing Stages
[Figure: packet processing stages between the SPI4 and CSIX interfaces: Packet Rx, Ethernet Decap, Range Matching, IPv4 Forwarding, Queue Managing, Scheduling, Packet Tx.]
Packet Processing Stages of the Packet Classification Application. Packet classification algorithms run in the Range Matching PPS.

2.3 Intel IXP Implementation
Simulation Result: Linear Search
[Figure: linear search throughput (% of line speed, 0% to 100%) against number of rules (1, 3, 5, 8, 10).]
Performance Evaluation of Linear Search Algorithm. Each incoming packet matches only the default rule, so the worst-case performance is obtained. Deterministic worst-case bound: O(N).

2.3 Intel IXP Implementation
Simulation Result: HSM
[Figure: HSM throughput (% of line speed, 0% to 100%) against number of rules (69, 341, 1001).]
Performance Evaluation of HSM Algorithm. Deterministic worst-case bound: O(log N).

2.3 Intel IXP Implementation
Simulation Result: HiCuts (worst-case path)
[Figure: HiCuts simulation with 1 rule in leaf nodes; throughput (% of line speed, 0% to 100%) against tree depth (1, 3, 5, 8, 10).]
Performance Evaluation of HiCuts Algorithm. Non-deterministic worst-case bound. 1k rules often need a 10-level decision tree.

2.3 Intel IXP Implementation
Simulation Result: HiCuts (worst-case path)
[Figure: HiCuts simulation with 10 rules in leaf nodes; throughput (% of line speed, 0% to 100%) against tree depth (1, 3, 5, 8, 10).]
What's more, in the worst case a lookup often needs up to 10 linear-search comparisons after tracing down the decision tree.
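The two cost components measured here, tree depth and leaf linear search, can be seen in a small decision-tree sketch. This is a simplified HiCuts-style tree, not the simulated implementation: it reuses the X/Y rule set from section 1.1, alternates the cut dimension and always makes four equal cuts, whereas real HiCuts chooses the dimension and number of cuts heuristically. `BINTH`, `CUTS`, `build` and `lookup` are hypothetical names and settings.

```python
# (x_range, y_range) per rule; list order is priority, R1 highest (assumption).
RULES = [
    ((0, 31), (0, 255)), ((0, 255), (128, 131)), ((64, 71), (128, 255)),
    ((67, 67), (0, 127)), ((64, 71), (0, 15)), ((128, 191), (4, 131)),
    ((192, 192), (0, 255)),
]
BINTH = 2   # hypothetical maximum number of rules kept in a leaf
CUTS = 4    # hypothetical fixed number of equal cuts per internal node


def build(rule_ids, box, depth=0, max_depth=8):
    """Recursively cut `box`, keeping only the rules that overlap it."""
    rule_ids = [i for i in rule_ids
                if all(RULES[i][d][0] <= box[d][1] and RULES[i][d][1] >= box[d][0]
                       for d in range(2))]
    dim = depth % 2                      # alternate X and Y (a simplification)
    lo, hi = box[dim]
    if len(rule_ids) <= BINTH or depth == max_depth or hi - lo + 1 < CUTS:
        return ("leaf", rule_ids)
    step = (hi - lo + 1) // CUTS
    children = []
    for c in range(CUTS):
        sub = list(box)
        sub[dim] = (lo + c * step, lo + (c + 1) * step - 1)
        children.append(build(rule_ids, tuple(sub), depth + 1, max_depth))
    return ("node", dim, lo, step, children)


def lookup(tree, point):
    """Return (matching rule index or None, tree levels visited, rules compared)."""
    levels = compared = 0
    while tree[0] == "node":
        _, dim, lo, step, children = tree
        tree = children[(point[dim] - lo) // step]
        levels += 1
    for i in tree[1]:                    # linear search inside the leaf
        compared += 1
        if all(RULES[i][d][0] <= point[d] <= RULES[i][d][1] for d in range(2)):
            return i, levels, compared
    return None, levels, compared


TREE = build(range(len(RULES)), ((0, 255), (0, 255)))
print(lookup(TREE, (67, 100)))
```

Every level visited and every rule compared corresponds to a memory access, so the per-packet cost is the sum of the two counters, and it varies with where the packet falls in the search space.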


3 Summary
- No single algorithm will perform well for all cases:
    - We search for algorithms that are fast enough and do not use too much memory.
    - Search speed should be guaranteed in the worst case.
- Hardware limits require flexible algorithms:
    - Designing an effective algorithm should consider the features and limits of the hardware, e.g. SRAM size.
    - Implementation of an algorithm should make full use of all hardware units, e.g. Local Memory for port indexing.

References
[1] P. Gupta and N. McKeown, Packet Classification Using Hierarchical Intelligent Cuttings, Proc. Hot Interconnects, 1999.
[2] M.H. Overmars and A.F. van der Stappen, Range Searching and Point Location among Fat Objects, J. of Algorithms, 21(3), 1996.
[3] P. Gupta and N. McKeown, Packet Classification on Multiple Fields, Proc. ACM SIGCOMM 99, 1999.
[4] B. Xu, D. Jiang, J. Li, HSM: A Fast Packet Classification Algorithm, Proc. IEEE 19th International Conference on Advanced Information Networking and Applications (AINA), 2005.
[5] V. Srinivasan, et al., Fast and Scalable Layer Four Switching, Proc. ACM SIGCOMM, 1998.
[6] F. Baboescu and G. Varghese, Scalable Packet Classification, Proc. ACM SIGCOMM, 2001.
[7] S. Singh, F. Baboescu, and G. Varghese, Packet Classification Using Multidimensional Cutting, Proc. ACM SIGCOMM, 2003.
[8] F. Baboescu, S. Singh, and G. Varghese, Packet Classification for Core Routers: Is There An Alternative To CAMs?, Proc. INFOCOM, 2003.
[9] V. Srinivasan, S. Suri, and G. Varghese, Packet Classification using Tuple Space Search, Proc. SIGCOMM, 1999.
[10] S. Singh and F. Baboescu, Packet Classification Repository, http://ial.ucsd.edu/classification
[11] A. Feldmann and S. Muthukrishnan, Tradeoffs for Packet Classification, Proc. INFOCOM, 2000.
[12] T. Lakshman and D. Stiliadis, High Speed Policy-based Packet Forwarding using Efficient Multi-dimensional Range Matching, Proc. SIGCOMM, 1998.
[13] T.Y.C. Woo, A Modular Approach to Packet Classification: Algorithms and Results, Proc. IEEE INFOCOM, 2000.
[14] F. Geraci, M. Pellegrini, and P. Pisati, Packet Classification via Improved Space Decomposition Techniques, Proc. IEEE INFOCOM, 2005.
[15] Y. Qi and J. Li, Dynamic Cuttings: Packet Classification with Network Traffic Statistics, Proc. 3rd International Trusted Internet Workshop (TIW), 2004.
[16] P. Gupta and N. McKeown, Algorithms for Packet Classification, IEEE Network, March/April 2001.
[17] D. E. Taylor, Survey & Taxonomy of Packet Classification Techniques, Washington University in Saint-Louis, US, 2004.
[18] M.E. Kounavis, A. Kumar, H. Vin, R. Yavatkar and A.T. Campbell, Directions in Packet Classification for Network Processors, Proc. Second Workshop on Network Processors (NP2), 2003.

Thanks junl@tsinghua.edu.cn