Forwarding and Routers
15-744: Computer Networking
L-9: Router Algorithms
IP lookup, longest prefix matching, classification, flow monitoring

Readings:
- [EVF3] Bitmap Algorithms for Counting Active Flows on High Speed Links
- [BV1] Scalable Packet Classification (4 pages)
Optional:
- [D+97] Small Forwarding Tables for Fast Routing Lookups
- [VVV1] EffiCuts: Optimizing Packet Classification for Memory and Throughput

Outline
- IP router design: architectures, scheduling, limits, active networking
- IP route lookup: variable prefix match algorithms
- Packet classification
- Flow monitoring

Original IP Route Lookup
Address classes:
- A: 0 + 7-bit network, 24-bit host (16M hosts each)
- B: 10 + 14-bit network, 16-bit host (64K each)
- C: 110 + 21-bit network, 8-bit host (256 each)
The address class specifies the prefix used for the forwarding table, so lookup is simple.
Original IP Route Lookup Example
- www.cmu.edu has address 128.2.11.43
- Class B address: class + network is 128.2
- Lookup 128.2 in the forwarding table
- Prefix: the part of the address that really matters for routing
- Forwarding table contains a list of class+network entries; only a few fixed prefix lengths (8/16/24)
- Large tables: 2 million possible class C networks
- 32 bits does not give enough space to encode network location information inside the address, i.e., to create a structured hierarchy

CIDR Revisited
- Supernets: assign adjacent network addresses to the same org
- Classless routing (CIDR)
- How does this help the routing table? Combine routing table entries whenever all networks with the same prefix share the same next hop
- Routing protocols carry the prefix along with the destination network address
- Forwarding uses longest prefix match

CIDR Illustration
A provider is given 201.10.0.0/21 and subdivides it among customers: 201.10.0.0/22, 201.10.4.0/24, 201.10.5.0/24, and 201.10.6.0/23.

CIDR Shortcomings
- Multi-homing
- Customer selecting a new provider: the customer holding 201.10.6.0/23 can either keep its block when it moves from Provider 1 to Provider 2 (punching a hole in Provider 1's aggregate), or renumber into a Provider 2 address
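The longest prefix match rule that CIDR forwarding requires can be sketched with a linear scan. The table entries and next-hop names here are hypothetical; a /24 carved out of a covering /16 shows why the longest match must win.

```python
import ipaddress

def longest_prefix_match(table, dst):
    """Return the next hop of the longest prefix containing dst (linear scan)."""
    dst = ipaddress.ip_address(dst)
    best = None
    for prefix, next_hop in table:
        net = ipaddress.ip_network(prefix)
        if dst in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, next_hop)
    return best[1] if best else "default"

# Hypothetical table: a customer /24 punched out of a provider /16
table = [("128.2.0.0/16", "provider"), ("128.2.11.0/24", "customer")]
print(longest_prefix_match(table, "128.2.11.43"))   # -> customer
print(longest_prefix_match(table, "128.2.200.1"))   # -> provider
```

A real router cannot afford this linear scan at line rate; the trie-based algorithms that follow exist to answer the same query in a few memory accesses.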
Outline
- IP route lookup: variable prefix match algorithms
- Packet classification
- Flow monitoring

Trie Using Sample Database
[Figure: binary trie built from the sample database; 0 = left branch, 1 = right branch, with nodes labeled P1-P8]
Sample database:
- P1 = 101*, P2 = 111*, P3 = 11001*, P4 = 1*
- P5 = 0*, P6 = 1000*, P7 = 100000*, P8 = 100*

How To Do Variable Prefix Match
Traditional method: Patricia tree
- Arrange route entries into a series of bit tests; bit to test: 0 = go to left child, 1 = go to right child
- Worst case = 32 bit tests
- Problem: memory speed is a bottleneck, and lookup may need to backtrack
Optimal expanded tries: pick a stride s for the root and solve recursively
[Figure: tree of prefixes with the default route 0/0 at the root and 128.2/16, 128.32/16, 128.32.13/24, 128.32.15/24 below, each mapped to an output port]
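The trie walk can be sketched as follows. This is a minimal unibit trie (stride 1), not the Patricia or expanded-stride variants; the inserted prefixes echo two of the sample database's P-labels, and the walk remembers the last matching route so the longest prefix wins.

```python
class TrieNode:
    def __init__(self):
        self.children = [None, None]  # index 0 = 0-branch, index 1 = 1-branch
        self.route = None             # route stored at this prefix, if any

def insert(root, prefix, route):
    """Insert a route under a bit-string prefix like '111'."""
    node = root
    for bit in prefix:
        i = int(bit)
        if node.children[i] is None:
            node.children[i] = TrieNode()
        node = node.children[i]
    node.route = route

def lookup(root, addr_bits):
    """Walk the trie bit by bit, remembering the last (longest) matching route."""
    node, best = root, root.route
    for bit in addr_bits:
        node = node.children[int(bit)]
        if node is None:
            break
        if node.route is not None:
            best = node.route
    return best

root = TrieNode()
insert(root, "1", "P4")      # P4 = 1*
insert(root, "111", "P2")    # P2 = 111*
print(lookup(root, "1110"))  # longest match is 111* -> P2
print(lookup(root, "1011"))  # only 1* matches -> P4
```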
Speeding up Prefix Match (P+98)
Prefix Tree
Cut the prefix tree at 16-bit depth, giving a 64K-bit mask:
- Bit = 1 if the tree continues below the cut (root head)
- Bit = 1 if a leaf sits at depth 16 or less (genuine head)
- Bit = 0 if the position is part of a range already covered by a leaf
[Figure: a small example mask over positions 0-15, with leaves mapped to Port 1, Port 5, Port 7, Port 3, Port 9, Port 5]

Prefix Tree
[Figure: the same mask with Subtree 1, Subtree 2, and Subtree 3 hanging below the cut]
Leaf pushing: entries that have both a pointer and a prefix have their prefixes pushed down to the leaves
Speeding up Prefix Match (P+98)
The Lulea trick speeds up lookup:
- Each 1 in the mask corresponds to either a route or a subtree
- Keep an array of routes/pointers to subtrees
- Need an index into the array, i.e., a count of the 1s before a given bit
- Keep a running count up to each 16-bit word in a base index plus a code word (6 bits)
- Then only the 1s within the last 16-bit word must be counted, using clever tricks
- Subtrees are handled separately

Speeding up Prefix Match (P+98)
- Scaling issues: how would it handle IPv6?
- Update issues
- Other possibilities: Why were the cuts done at 16/24/32 bits? Improve the data structure by shuffling bits

Speeding up Prefix Match - Alternatives
Route caches
- Temporal locality: many packets go to the same destination
Other algorithms
- Waldvogel (SIGCOMM 97): binary search on prefix lengths; works well for larger addresses
- Bremler-Barr (SIGCOMM 99): clue = prefix length matched at the previous hop. Why is this useful? The next router can start its search near that length instead of from scratch.
- Lampson (INFOCOM 98): binary search on ranges
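The base-index idea can be sketched as follows, assuming 16-bit mask words and MSB-first bit numbering (both illustrative choices): the precomputed running count supplies the 1s in all earlier words, so a lookup only popcounts a slice of the final word.

```python
def build_base_index(bitmask_words):
    """Precompute, for each 16-bit word, the number of 1s in all earlier words."""
    base, total = [], 0
    for w in bitmask_words:
        base.append(total)
        total += bin(w).count("1")
    return base

def index_of(bitmask_words, base, bit_pos):
    """Index into the packed route/pointer array for the 1 at bit_pos."""
    word, offset = divmod(bit_pos, 16)
    # count 1s strictly before bit_pos inside the final word (MSB-first)
    mask = bitmask_words[word] >> (16 - offset)
    return base[word] + bin(mask).count("1")

# Two illustrative 16-bit words with 1s at global positions 0, 2, and 31
words = [0b1010000000000000, 0b0000000000000001]
base = build_base_index(words)      # [0, 2]
print(index_of(words, base, 2))     # second 1 overall -> array index 1
```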
Speeding up Prefix Match - Alternatives
Content addressable memory (CAM)
- Hardware-based route lookup: input = tag, output = value associated with that tag
- Requires an exact match with the tag
- Single CAM: multiple cycles (1 per prefix length searched); multiple CAMs (1 per prefix length) can be searched in parallel
- Ternary CAM: 0, 1, and "don't care" values in the tag match
- Priority (i.e., longest prefix) enforced by the order of entries in the CAM

Outline
- IP route lookup: variable prefix match algorithms
- Packet classification (see slides by Zhuo)
- Flow monitoring

Packet Classification
Typical uses
- Identify flows for QoS
- Firewall filtering
Requirements
- Match on multiple fields
- Strict priority among rules, e.g.: 1. no traffic from 128.2.*; 2. OK traffic on port 80
Complexity: with N rules and k header fields, for k > 2 the known bounds are O(log^(k-1) N) time with O(N) space, or O(log N) time with O(N^k) space
- Special case k = 2 (source and destination): O(log N) time and O(N) space solutions exist
How many rules? Largest databases are for firewalls & similar (~1700); Diffserv/QoS databases may be much larger (~100K?)
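Ternary matching semantics can be sketched in software. The entries, tag width, and port names here are hypothetical; a real TCAM compares every entry in parallel in hardware, while this sketch scans them in priority order, so the first (highest-priority) matching entry wins, just as entry order determines priority in the CAM.

```python
def tcam_lookup(entries, key):
    """entries: list of (value, mask, result); first match wins.
    A bit position where mask = 0 is a 'don't care'."""
    for value, mask, result in entries:
        if (key & mask) == (value & mask):
            return result
    return None

# Hypothetical 8-bit tags; the longer prefix is listed first so it wins
entries = [
    (0b10110000, 0b11110000, "port A"),  # matches 1011****
    (0b10000000, 0b10000000, "port B"),  # matches 1*******
]
print(tcam_lookup(entries, 0b10110101))  # -> port A
print(tcam_lookup(entries, 0b10001111))  # -> port B
```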
Bit Vectors
[Figure: worked example with two fields. For each field, build a trie of that field's prefixes; each trie node stores a bit vector with one bit per rule, where bit i = 1 if rule i's prefix for that field matches at the node. The example uses a small database of rules 0-3 with short prefixes over Field 1 and Field 2.]

Aggregating Rules [BV1]
Common case: very few 1s in each bit vector
- Aggregate bits: OR together A bits at a time, giving an N/A-bit-long vector; A is typically chosen to match the word size
- Can be done hierarchically: aggregate the aggregates
- AND of aggregate bits indicates which groups of A rules have a possible match; hopefully only a few 1s survive in the ANDed vector
- The AND of aggregated bit vectors may have false positives, so fetch and AND only the full bit vectors associated with the positive entries

Rearranging Rules [BV1]
- Problem: false positives may be common
- Solution: reorder rules to minimize false positives
- What about the priority order of rules?
- How to rearrange? Heuristic: sort rules based on a single field's values, first by prefix length and then by value
- This moves similar rules close together, which reduces false positives
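Once the per-field vectors are in hand, combining them is just an AND plus a priority scan, and the aggregation step from [BV1] is an OR over A-bit groups. This sketch assumes Python ints as bit vectors with bit i standing for rule i (lower bit = higher priority); the sizes and vector values are illustrative.

```python
def match_rules(field_vectors):
    """AND the per-field rule bit vectors; bit i survives iff rule i matched every field."""
    result = ~0  # all-ones; Python ints extend as needed
    for v in field_vectors:
        result &= v
    return result

def first_rule(vector):
    """Lowest set bit = highest-priority matching rule, or None if no match."""
    if vector == 0:
        return None
    return (vector & -vector).bit_length() - 1

def aggregate(vector, nrules, A=4):
    """OR together A bits at a time: bit j set iff some rule in group j may match."""
    agg = 0
    for j in range((nrules + A - 1) // A):
        if (vector >> (j * A)) & ((1 << A) - 1):
            agg |= 1 << j
    return agg

# Rules 0,1,3 match field 1; rules 1,2 match field 2 -> only rule 1 matches both
print(first_rule(match_rules([0b1011, 0b0110])))  # -> 1
```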
Summary: Addressing/Classification
- Router architecture carefully optimized for IP forwarding
- Key challenges: speed of forwarding lookup/classification; power consumption
- Some good examples of common-case optimization: routing with a clue, classification with few matching rules, not checksumming packets

Outline
- IP route lookup: variable prefix match algorithms
- Packet classification
- Flow monitoring (based on the IMC 2003 talk by Estan, Varghese, and Fisk)

Why count flows?
- Detect port/IP scans
- Identify DoS attacks
- Estimate the spreading rate of a worm
- Packet scheduling

Existing flow counting solutions
[Figure: Dave Plonka's FlowScan. The router exports NetFlow data over a fast link to an analysis server at the Network Operations Center, which produces traffic reports. Costs: router memory, network bandwidth for the NetFlow data, and server memory size & bandwidth.]
Motivating question
Can we count flows at line speed at the router?
- Wrong solution: per-flow counters
- Naïve solution: use hash tables (like NetFlow)
- Our approach: use bitmaps

Bitmap counting algorithms
A family of algorithms that can be used as building blocks in various systems
- Algorithms can be adapted to the application
- Low memory and per-packet processing
- Generalize "flows" to distinct header patterns: count flows or source addresses to detect an attack; count destination address + port pairs to detect a scan

Outline
- Direct bitmap
- Virtual bitmaps
- Multi-resolution bitmaps
- Some results

Bitmap counting - direct bitmap
Set bits in the bitmap using a hash of the flow ID of incoming packets
Bitmap counting - direct bitmap
- Different flows have different hash values
- Packets from the same flow always hash to the same bit
- Collisions are OK: the estimates compensate for them
Bitmap counting - direct bitmap
- As the bitmap fills up, the estimates get inaccurate
- Solution: use more bits
- Problem: memory then scales with the number of flows
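The direct bitmap can be sketched as follows. The estimator n ≈ b·ln(b/z), where z is the number of zero bits, is the standard occupancy-based correction for hash collisions; the hash choice (SHA-1) and the bitmap size are illustrative.

```python
import hashlib
import math

def direct_bitmap_count(flow_ids, b=64):
    """Set bit hash(flow) mod b per packet; estimate flows as b * ln(b / zero_bits)."""
    bitmap = 0
    for f in flow_ids:
        h = int(hashlib.sha1(f.encode()).hexdigest(), 16) % b
        bitmap |= 1 << h
    zeros = b - bin(bitmap).count("1")
    if zeros == 0:
        return float("inf")  # bitmap full: the estimate is unusable
    return b * math.log(b / zeros)

flows = [f"flow{i}" for i in range(20)]
print(direct_bitmap_count(flows * 3))  # repeated packets don't change the estimate
```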
Bitmap counting - virtual bitmap
- Solution: (a) store only a portion of the bitmap, and (b) multiply the estimate by a scaling factor
- Problem: the estimate is inaccurate when few flows are active

Bitmap counting - multiple bitmaps
- Solution: use many bitmaps, each accurate for a different range of flow counts
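A sketch of the virtual bitmap: hash into a large virtual bitmap but store only one slice of it, estimate the flows that landed in the slice with the same occupancy formula, and scale up by the ratio of virtual to physical size. The slice position and the sizes here are illustrative.

```python
import hashlib
import math

def virtual_bitmap_count(flow_ids, virtual_b=4096, physical_b=64):
    """Store only the first physical_b bits of a virtual_b-bit bitmap,
    then scale the slice's estimate by virtual_b / physical_b."""
    bitmap = 0
    for f in flow_ids:
        h = int(hashlib.sha1(f.encode()).hexdigest(), 16) % virtual_b
        if h < physical_b:            # packet falls in the stored slice
            bitmap |= 1 << h
    zeros = physical_b - bin(bitmap).count("1")
    if zeros == 0:
        return float("inf")
    slice_estimate = physical_b * math.log(physical_b / zeros)
    return slice_estimate * (virtual_b / physical_b)

flows = [f"src{i}" for i in range(2000)]
print(virtual_bitmap_count(flows))  # scaled estimate, near 2000 in expectation
```

With 2000 flows only ~1/64 of them hash into the stored slice, which is why memory no longer scales with the flow count, and also why the estimate degrades when few flows are active.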
Bitmap counting - multiple bitmaps
- Use the bitmap whose occupancy falls in its accurate range to estimate the number of flows
Bitmap counting - multi-resolution bitmap
- Problem: with multiple bitmaps, up to three bitmaps must be updated per packet
- Solution: combine the bitmaps into one (OR their components together), so each packet updates a single multi-resolution bitmap

Error of virtual bitmap
[Figure: average (relative) error vs. flow density (flows/bit)]
Memory needed for 1 million flows at 1% error:
- Hash table*: 1.21 GBytes
- Direct bitmap: 1.29 MBytes
- Virtual bitmap*: 1.88 KBytes
- Multiresolution bitmap: 1.33 KBytes

Adaptive bitmap
- A virtual bitmap measures the number of flows accurately if the range is known in advance
- Often the number of flows does not change rapidly, and the measurement is repeated
- Can use the previous measurement to tune the virtual bitmap
- Use a small multi-resolution bitmap for tuning; combine into a single bitmap (single update per packet)

Triggered bitmap
- Need multiple instances of the counting algorithm (e.g., for port scan detection), and many instances count only a few flows
- Triggered bitmap: allocate a small direct bitmap to each new source; if the number of bits set exceeds a trigger value, allocate a large multiresolution bitmap

Scan detection memory usage:
Interval length | Snort (naïve) | Probabilistic counting | Triggered bitmap
12 seconds      | 1.94 MB       | 2.42 MB                | 0.37 MB
60 seconds      | 49.6 MB       | 22.34 MB               | 5.59 MB