Encoding Short Ranges in TCAM Without Expansion: Efficient Algorithm and Applications
Yotam Harchol, The Hebrew University of Jerusalem, Israel
Joint work with: Anat Bremler-Barr, David Hay, Yacov Hel-Or (The Hebrew University of Jerusalem, Israel; IDC Herzliya, Israel)
To appear in ACM SPAA 2016
This research was supported by the European Research Council under the European Union's Seventh Framework Programme (FP7/2007-2013)/ERC Grant agreement no 25985.
TCAM: Ternary Content Addressable Memory
- TCAM is an associative memory module
- Useful for parallel multidimensional prefix lookup
- Each entry may consist of 0, 1, and * (don't-care) bits
- Widely used in high-speed networking devices
- Example: 2-dimensional IP lookup on source IP and destination IP
- What about ranges (e.g., port numbers)?
[Figure: TCAM entries holding ternary source and destination IP patterns, with the per-entry actions (such as Drop) stored in SRAM]
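The lookup described above can be sketched in software as a first-match search over ternary patterns. This is a minimal illustration only: the entries, the action names, and the helper names `matches` and `tcam_lookup` are hypothetical, and a real TCAM compares all rows in parallel rather than in a loop.

```python
def matches(pattern: str, key: str) -> bool:
    """True if every pattern symbol is '*' or equals the corresponding key bit."""
    return len(pattern) == len(key) and all(
        p in ("*", k) for p, k in zip(pattern, key)
    )


def tcam_lookup(entries, key):
    """Return the index of the first (highest-priority) matching entry, or None."""
    for index, pattern in enumerate(entries):
        if matches(pattern, key):
            return index
    return None


# Hypothetical 8-bit table: 4 bits of "source" followed by 4 bits of "destination".
entries = ["10**01**", "10******", "********"]
actions = ["P1", "Drop", "P3"]  # hypothetical action table, standing in for the SRAM

hit = tcam_lookup(entries, "10111111")
print(hit, actions[hit])
```

The matching row index is then used to read the action, which mirrors the TCAM-plus-SRAM split on the slide.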
Range Encoding in Binary Representation
- A single ternary word can encode only certain intervals
- Otherwise we need more than one entry (expansion)
- Exponential space blow-up when using multiple range fields
[Figure: values 0 to 15 in binary, with ranges that fit a single ternary word and ranges that do not]
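To make the expansion concrete, here is a sketch of the classic prefix expansion of a range over plain binary values. It is the standard textbook technique, not something taken from the slides, and the function name `range_to_prefixes` is hypothetical.

```python
def range_to_prefixes(lo: int, hi: int, w: int):
    """Split [lo, hi] (w-bit values) into a minimal list of ternary prefixes."""
    out = []
    while lo <= hi:
        # Largest power-of-two block that is aligned at lo ...
        size = lo & -lo if lo else 1 << w
        # ... shrunk until it no longer overshoots hi.
        while size > hi - lo + 1:
            size >>= 1
        k = size.bit_length() - 1          # number of trailing '*' bits
        prefix = format(lo >> k, f"0{w - k}b") if k < w else ""
        out.append(prefix + "*" * k)
        lo += size
    return out


print(range_to_prefixes(1, 14, 4))
```

For w = 4 the range [1, 14] needs 2w - 2 = 6 entries. With several range fields in one rule, the per-field counts multiply, which is the blow-up the slide refers to.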
Encoding Short Ranges
- Ranges of arbitrary length: infeasible without expansion
- Ranges with bounded length: feasible!
Useful for:
- Short TCP/UDP ranges (>6% of the ranges in real life)
- QoS mechanisms (IP ToS/DSCP)
- Packet size classification (by categories)
- Timestamps and counters
- IP spoofing detection (using IP TTL)
- SDX AS numbering
Also useful for applications outside the networking domain
Range Encoding on TCAM: Two Approaches
Database-independent encoding: [Lakshminarayanan et al., 2005], [Bremler-Barr et al., 2007]
- f_enc([x, y]) = a ternary word that depends only on the range
- Easy to update
- Lower bound for a w-bit field without expansion: 2^w - 1 bits per range [Lakshminarayanan et al., 2005]
Database-dependent encoding: [Liu, 2002], [Van Lunteren & Engbersen, 2003], [Chang & Su, 2007], [Che et al., 2008], [Bremler-Barr et al., 2009], [Rottenstreich & Keslassy, 2010], [Rottenstreich et al., 2013], [Kogan et al., 2014]
- f_enc(DB, [x, y]) = a ternary word that also depends on the database
- Compact codes
Both schemes require expansion for a feasible code length
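To give a feel for why avoiding expansion for arbitrary ranges is so expensive, here is a sketch of a unary ("fence"-style) encoding, a well-known database-independent construction that uses 2^w - 1 bits per field. It is offered as an illustration under that assumption and is not claimed to be the exact construction of the cited works; all function names are hypothetical.

```python
def fence_value(v: int, w: int) -> str:
    """Encode value v as (2^w - 1 - v) zeros followed by v ones."""
    n = 2**w - 1
    return "0" * (n - v) + "1" * v


def fence_range(lo: int, hi: int, w: int) -> str:
    """Encode [lo, hi] as one ternary word: zeros, then don't-cares, then ones."""
    n = 2**w - 1
    return "0" * (n - hi) + "*" * (hi - lo) + "1" * lo


def matches(pattern: str, key: str) -> bool:
    return all(p in ("*", k) for p, k in zip(pattern, key))


w = 3
print(fence_range(2, 5, w), fence_value(4, w), matches(fence_range(2, 5, w), fence_value(4, w)))
```

Every range fits in a single entry, but the word is exponentially long in w, which is impractical for a 16-bit port field.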
RENÉ: Range Encoding with No Expansion
Encoding function for short ranges: f_enc([x, y]) = a single ternary word
- Database independent: easy to update
- No row expansion
- Near-optimal TCAM space usage
- Useful for packet classification applications
- Useful for high-dimensional nearest neighbor search
Binary-Reflected Gray Code (BRGC)
- Built recursively by reflecting the binary code
- Hamming distance between every two adjacent points = 1
[Figure: values 0 to 15 with their BRGC codewords]
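The reflection step can be sketched in a few lines. The function name `brgc` is hypothetical; the closed form `i ^ (i >> 1)` used in the check is the standard formula for the binary-reflected Gray code.

```python
def brgc(w: int):
    """Build the w-bit binary-reflected Gray code by repeated reflection."""
    codes = [""]
    for _ in range(w):
        # Prefix the current list with 0, then its mirror image with 1.
        codes = ["0" + c for c in codes] + ["1" + c for c in reversed(codes)]
    return codes


def hamming(a: str, b: str) -> int:
    return sum(x != y for x, y in zip(a, b))


codes = brgc(4)
print(codes[:8])
# Adjacent codewords differ in exactly one bit.
print(all(hamming(a, b) == 1 for a, b in zip(codes, codes[1:])))
```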
Range Encoding with Binary-Reflected Gray Code
- With ternary BRGC we can encode some of the ranges of length h = 2^k (k ∈ N)
- What about the other (red) ranges?
[Figure: values 0 to 15 with the BRGC-encodable (blue) ranges and their ternary words, and the remaining (red) ranges]
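Which ranges fit a single ternary BRGC word can be checked by brute force. The sketch below computes the tightest ternary word covering the Gray codes of a range and tests whether it covers nothing else; the helper names and the choice w = 4, h = 4 are assumptions made for illustration.

```python
def gray(v: int) -> int:
    return v ^ (v >> 1)


def ternary_cover(lo: int, hi: int, w: int) -> str:
    """Tightest ternary word matching the BRGC code of every value in [lo, hi]."""
    codes = [gray(v) for v in range(lo, hi + 1)]
    word = ""
    for bit in reversed(range(w)):
        seen = {(c >> bit) & 1 for c in codes}
        word += "*" if len(seen) == 2 else str(seen.pop())
    return word


def is_exact(lo: int, hi: int, w: int) -> bool:
    """True if the cover matches exactly the values of the range and no others."""
    return 2 ** ternary_cover(lo, hi, w).count("*") == hi - lo + 1


w, h = 4, 4
encodable = [s for s in range(2**w - h + 1) if is_exact(s, s + h - 1, w)]
print(encodable)                 # start points of single-word ranges of length 4
print(ternary_cover(6, 9, w))    # a range that straddles the reflection point
```

In this small case the single-word ranges are exactly those starting at a multiple of h/2, while plain binary prefixes would only handle starts at multiples of h.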
Layers
- Layer: a set of disjoint consecutive ranges
- Two (blue) layers can be encoded with a BRGC-based ternary word
- Other (red) layers need more bits
[Figure: values 0 to 15 with the blue and red layers and the ternary words of the blue ranges]
Cover Ranges
- Cover range of R: the smallest blue range that fully contains R
- In red layers: no two consecutive ranges in the same layer are fully contained in the same cover range
- Given the cover range, one bit is enough to differentiate ranges in the same layer
[Figure: a red range R and its blue cover range cover(R)]
Single Bit Range Index
- Red ranges require a different encoding
- Add one bit for each red layer
- The bit value alternates between consecutive ranges in the same layer
- Bits of other red layers are set to *
- For blue layers, these bits are always *
[Figure: values 0 to 15 with the extra per-layer bits of each red range]
Encoding Scheme for Ranges
Encoding of a single range R of length h = 2^k:
- Blue ranges: BRGC(R)
- Red ranges: BRGC(cover(R))
Range code layout: [ternary BRGC: w bits] [bit of L_1 ... bit of L_{h/2-1}, bit of L_{h/2+1} ... bit of L_{h-1}: h-2 bits]
Value encoding is now: [BRGC: w bits] [bit of L_1 ... bit of L_{h/2-1}, bit of L_{h/2+1} ... bit of L_{h-1}: h-2 bits]
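The layout above can be sketched end to end as follows. This is one reading of the slides, not the authors' implementation: it assumes that layer L_j holds the ranges whose start is congruent to j modulo h, that L_0 and L_{h/2} are the blue layers, that the cover of a red range is the enclosing blue range of length 2h, and that the layer bit is the parity of the range's index within its layer. All function names are hypothetical.

```python
def gray(v: int) -> int:
    return v ^ (v >> 1)


def ternary_cover(lo: int, hi: int, w: int) -> str:
    """Tightest ternary word matching the BRGC code of every value in [lo, hi]."""
    codes = [gray(v) for v in range(lo, hi + 1)]
    word = ""
    for bit in reversed(range(w)):
        seen = {(c >> bit) & 1 for c in codes}
        word += "*" if len(seen) == 2 else str(seen.pop())
    return word


def red_layers(h: int):
    """Layer offsets that need an extra bit (all except 0 and h/2)."""
    return [j for j in range(1, h) if j != h // 2]


def encode_range(start: int, h: int, w: int) -> str:
    """Ternary code of [start, start + h - 1]; assumes h = 2^k and 2h <= 2^w."""
    j = start % h
    extra = ["*"] * len(red_layers(h))
    if j in (0, h // 2):                       # blue range: BRGC(R)
        base = ternary_cover(start, start + h - 1, w)
    else:                                      # red range: BRGC(cover(R)) plus one bit
        a = start - j                          # assumed cover: [a, a + 2h - 1]
        base = ternary_cover(a, a + 2 * h - 1, w)
        extra[red_layers(h).index(j)] = str((start // h) % 2)
    return base + "".join(extra)


def encode_value(v: int, h: int, w: int) -> str:
    """Binary code of a value: its BRGC word plus one parity bit per red layer."""
    bits = [str(((v - j) // h) % 2) for j in red_layers(h)]
    return format(gray(v), f"0{w}b") + "".join(bits)


def matches(pattern: str, key: str) -> bool:
    return all(p in ("*", k) for p, k in zip(pattern, key))


print(encode_range(1, 4, 4), encode_value(4, 4, 4))
print(matches(encode_range(1, 4, 4), encode_value(4, 4, 4)))
```

Under these assumptions each range occupies one entry of w + h - 2 bits, and a value matches the entry exactly when it lies in the range.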
Toy Example
Optimization
- For ranges of length h = 2^k, the k-1 LSBs of the BRGC part are always *
- The code can be shortened to w - log(h) + h - 1 bits
[Figure: values 0 to 15 with range codes whose trailing BRGC bits are all *]
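The shortened length follows from simple counting, assuming the unoptimized code has w BRGC bits plus h - 2 red-layer bits and that h = 2^k:

```latex
\begin{align*}
\underbrace{w}_{\text{BRGC part}}
  + \underbrace{(h-2)}_{\text{red-layer bits}}
  - \underbrace{(k-1)}_{\text{dropped LSBs}}
  &= w + h - 2 - (\log_2 h - 1) \\
  &= w - \log_2 h + h - 1 .
\end{align*}
```

As a hypothetical instance, a 16-bit field with h = 4 would take 16 - 2 + 4 - 1 = 17 bits per range.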
Encoding Multiple Range Lengths
- For ranges of length 3, intersect two ranges of length 4 (e.g., [5, 7] = [4, 7] ∩ [5, 8])
- We present the conjunction operator Π, applied bit by bit: 0 Π 0 = 0, 1 Π 1 = 1, * Π b = b Π * = b, and 0 Π 1 = 1 Π 0 = ⊥
- ⊥ means undefined, and if a_i Π b_i = ⊥ for some bit i then a Π b = ⊥
- For two ranges R_1, R_2: tcode(v) matches tcode(R_1) Π tcode(R_2) if and only if v ∈ R_1 ∩ R_2
- Also: if tcode(R_1) Π tcode(R_2) = ⊥ then R_1 ∩ R_2 = ∅
[Figure: values 0 to 15 with the codes of two overlapping ranges and their conjunction]
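A bitwise conjunction of two ternary words can be sketched as below. The function name `conj` and the use of `None` for the undefined result are choices made for this illustration.

```python
def conj(a: str, b: str):
    """Bitwise conjunction of two ternary words; None stands for 'undefined'."""
    out = []
    for x, y in zip(a, b):
        if x == "*":
            out.append(y)          # a don't-care takes the other operand's symbol
        elif y == "*" or x == y:
            out.append(x)
        else:
            return None            # a 0 meets a 1: no key can match both words
    return "".join(out)


print(conj("0*1*", "*01*"))
print(conj("0***", "1***"))
```

A key matches the conjunction exactly when it matches both operands, which is why one entry can stand for the intersection of two encoded ranges.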
[Figure: TCAM bits used per range field (log 2 scale) versus maximal range length (log 2 scale), compared with the theoretical bound, for w=6]
The Nearest Neighbor Search Problem
Input:
- A set of data points
- A query point (or a series of those)
(points are in a discrete space)
Output: the data point closest to the query point
The Curse of Dimensionality: a hard problem in high dimensions (even over 10 or so)
The Nearest Neighbor Search Problem
- Encode cubes around the data points
- Given a query point, a single TCAM lookup returns the smallest cube that contains it
- No TCAM entry expansion in high dimensions
[Figure: data points with nested cubes and a query point]
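The idea can be mimicked in software with a priority-ordered table of cubes. The sketch below uses plain coordinate comparisons instead of ternary codes, and the set of radii, the sample points, and the function names are all assumptions made for illustration. Smaller cubes come first, so the first match plays the role of the highest-priority TCAM hit.

```python
def build_table(points, radii):
    """One entry per (radius, point), ordered so that smaller cubes come first."""
    return [(r, p) for r in sorted(radii) for p in points]


def lookup(table, q):
    """Return (point, radius) of the first cube containing q, or None."""
    for r, p in table:
        # A cube of radius r around p contains q iff every coordinate is within r.
        if all(abs(qi - pi) <= r for qi, pi in zip(q, p)):
            return p, r
    return None


points = [(2, 3), (10, 10), (7, 1)]          # hypothetical 2-dimensional data set
table = build_table(points, radii=[1, 2, 4, 8])
print(lookup(table, (8, 3)))
```

Because cubes are used, closeness here is measured by the maximum coordinate difference, and the answer is only as fine-grained as the chosen set of radii.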
The Nearest Neighbor Search Problem
- Or instead, save TCAM space: encode cubes around the query point
- Query the TCAM with growing cubes
- Find the first data point to match
[Figure: data points and growing cubes around the query point]
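The space-saving variant trades table size for several lookups. A minimal software sketch, again using plain coordinate comparisons rather than ternary codes and with hypothetical names, points, and radii:

```python
def in_cube(p, q, r) -> bool:
    """True if p lies in the cube of radius r centred on q."""
    return all(abs(pi - qi) <= r for pi, qi in zip(p, q))


def growing_cube_search(points, q, radii):
    """Try cubes of increasing radius around q; return (point, radius, lookups)."""
    for lookups, r in enumerate(sorted(radii), start=1):
        for p in points:                      # one stored entry per data point
            if in_cube(p, q, r):
                return p, r, lookups
    return None


points = [(2, 3), (10, 10), (7, 1)]           # hypothetical 2-dimensional data set
print(growing_cube_search(points, (8, 3), radii=[1, 2, 4, 8]))
```

Only one entry per data point is stored, at the cost of up to one lookup per cube size.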
Experiment on a Real TCAM
- Currently there is no evaluation board for TCAM
- Instead, we used a commodity network switch with a TCAM
- The switch has 48 ports of 1 Gbps each (1.5 million packets per second)
- The TCAM is 92 bits wide
- d-dimensional cube representation: the dimensions (1st through 7th in the figure) are packed into the Src. MAC, Dest. MAC, Src. IP, and Dest. IP header fields
- Single port: 1.5 million queries per second!
[Figure: experiment setup with OpenFlow flow_mod messages, queries sent as packets, and OpenFlow counters]
Conclusions
Encoding function for short ranges: f_enc([x, y]) = a single ternary word
- Database independent: easy to update
- No row expansion
- Near-optimal TCAM space usage
- Useful for packet classification applications
- Useful for high-dimensional nearest neighbor search
Questions? Thank you.