CHAPTER 4

BLOOM FILTER

4.1 INTRODUCTION

The Bloom filter was formulated by Bloom (1970) and is widely used today for purposes including web caching, intrusion detection, content-based routing, databases and computer networks (Broder and Mitzenmacher 2005). This chapter first describes the theory behind the Bloom filter and then explains its enhancement to meet the requirements of string detection.

A Bloom filter offers an attractive choice for string matching. It is a randomized technique for testing the membership of a string in a given group of strings. The group of strings is first compressed by computing multiple hash functions over each string, and the compressed set is then stored in memory. This set can be queried to find out whether a given string belongs to it. Two properties make the Bloom filter a viable solution for string matching:

Scalability: A Bloom filter uses a constant amount of memory to compress each string, irrespective of the length of the original string. Thus, long strings can be stored in a small memory space, which makes the filter highly scalable in terms of memory usage.

Speed: The amount of computation involved in detecting a string using a Bloom filter is constant. This computation is a calculation of hash
functions and the corresponding memory lookups. Efficient hash functions can be implemented in hardware easily with little resource consumption. Hence, a hardware implementation of a Bloom filter can perform string matching at high speed.

Bloom filters need little memory to store the compressed strings. The amount of memory depends on the number of strings being compressed and is typically a few megabits. For instance, storing 10,000 strings requires around 200 kbits. Almost all modern FPGAs come with multi-port embedded memory blocks which can be used for constructing Bloom filters. However, the real reason for using FPGAs stems from the need to reconfigure the memory of the Bloom filters. Each Bloom filter is maintained for detecting strings of one particular length. If the database of strings to be detected has a non-uniform number of strings for each unique string length, then the Bloom filters need to be tuned to accommodate this non-uniformity and achieve optimal performance. Moreover, since the string length distribution can change over time, the Bloom filters need to be retuned to maintain optimality, which involves reallocating the block memories and hash functions. When this is done, the underlying hardware needs to change. Hence, FPGAs prove to be extremely effective in such a scenario.

4.2 BLOOM FILTER THEORY

The theory behind the Bloom filter is described in this section. Given a string x, the Bloom filter computes k hash functions on it, producing hash values ranging from 1 to m. It then sets k bits in an m-bit long vector at the addresses corresponding to the k hash values. The same procedure is repeated for all the members of the set. This process is called programming the filter. The query process is similar to programming: a string whose membership is to be verified is given as input to the filter. The Bloom filter generates k hash values using the same hash functions which are used to
program the filter. The bits in the m-bit long vector at the locations corresponding to the k hash values are looked up. If at least one of these k bits is found not set, then the string is declared to be a non-member of the set. If all the bits are found to be set, then the string is said to belong to the set with a certain probability. This uncertainty in the membership comes from the fact that those k bits in the m-bit vector can be set by any of the n members. Thus, finding a bit set does not necessarily imply that it was set by the particular string being queried. Subsequent sections explain the programming and querying processes in detail.

4.2.1 Programming a Bloom Filter

A Bloom filter is essentially a bit vector of length m which is used to efficiently represent a set of bit-strings. Given a set of strings S with n members, a Bloom filter is programmed as follows. For each bit-string x in S, k hash functions, $h_1(), \ldots, h_k()$, are computed on x, producing k values each ranging from 1 to m. Each of these values addresses a single bit in the m-bit vector; hence each bit-string x causes k bits in the m-bit vector to be set to 1. If one of the k hash values addresses a bit that is already set to 1, that bit is not changed. Figures 4.1 and 4.2 illustrate Bloom filter programming. Two bit-strings, x and y, are programmed in the Bloom filter with k = 3 hash functions and m = 16 bits in the array. Note that different strings can have overlapping bit patterns. Pseudo-code for adding a bit-string, x, to a Bloom filter is given in Table 4.1.

Table 4.1 Pseudo-code for programming the Bloom filter

BF Prog (x)
i.  for (i = 1 to k)
ii.     Vector[hi(x)] ← 1
Figure 4.1 Programming a string x in the Bloom filter where k=3 and m=16

Figure 4.2 Programming a string y in the Bloom filter where k=3 and m=16
4.2.2 Querying a Bloom Filter

Querying the Bloom filter for set membership of a given bit-string, x, is similar to the programming process. Given bit-string x, k hash values are generated using the same hash functions used to program the filter. The bits in the m-bit vector at the locations corresponding to the k hash values are checked. If at least one of the k bits is 0, then the bit-string is declared to be a non-member of the set, as shown in Figure 4.3. If all the bits are found to be 1, then the bit-string is said to belong to the set with a certain probability, as shown in Figure 4.4. If all the k bits are found to be set and x is not a member of S, then the result is said to be a false positive. Pseudo-code for querying the Bloom filter is given in Table 4.2.

Table 4.2 Pseudo-code for querying the Bloom filter

BF Query (x)
i.   for (i = 1 to k)
ii.      if (Vector[hi(x)] = 0) return false
iii. return true

Figure 4.3 Querying a string z in the Bloom filter
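The programming and query procedures of Tables 4.1 and 4.2 can be sketched in a few lines of Python. This is a minimal illustration and not the hardware design of this chapter: the class name `BloomFilter`, the salted SHA-256 hash family and the 0-based indexing (0 to m-1 instead of 1 to m) are assumptions made only for the example.

```python
import hashlib


class BloomFilter:
    """m-bit vector programmed and queried with k hash functions."""

    def __init__(self, m: int, k: int) -> None:
        self.m = m
        self.k = k
        self.vector = [0] * m  # all bits start unset

    def _indexes(self, x: bytes):
        # Stand-in hash family (an assumption): salt SHA-256 with the
        # hash-function number i and reduce the digest modulo m.
        for i in range(self.k):
            digest = hashlib.sha256(bytes([i]) + x).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def program(self, x: bytes) -> None:
        # Table 4.1: set Vector[h_i(x)] to 1 for every hash function.
        for idx in self._indexes(x):
            self.vector[idx] = 1

    def query(self, x: bytes) -> bool:
        # Table 4.2: a single 0 bit proves x was never programmed.
        for idx in self._indexes(x):
            if self.vector[idx] == 0:
                return False
        return True  # a member, or a false positive


bf = BloomFilter(m=16, k=3)   # same k and m as Figures 4.1 and 4.2
bf.program(b"x")
bf.program(b"y")
print(bf.query(b"x"), bf.query(b"y"))  # programmed strings always match
```

A `True` result for a string that was never programmed is exactly the false positive case described above; a `False` result is always correct.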
Figure 4.4 Querying a string w in the Bloom filter

Figure 4.5 False positive probability

The ambiguity in membership comes from the fact that the k bits in the m-bit vector can be set by any of the n members of S. For instance, as shown in Figure 4.5, q maps to bits which were all set by x and y. Although $q \notin S$, the filter shows a match. Thus, finding a bit set does not necessarily
imply that it was set by the particular bit-string being queried. However, finding a 0 bit certainly implies that the bit-string does not belong to the set; if it were a member, all k bits would have been set when the Bloom filter was programmed.

4.2.3 False Positive Probability

This section derives the false positive probability, i.e., the probability of finding all k lookup bits set for a bit-string that was not programmed. The probability that a given bit of the m-bit vector is set to 1 by a single hash function is simply $1/m$. The probability that it is not set is $1 - 1/m$. The probability that it is not set by any of the n members of S is $(1 - 1/m)^{n}$. Since each bit-string sets k bits in the vector, this probability becomes $(1 - 1/m)^{nk}$. The probability that the bit is 1 therefore becomes $1 - (1 - 1/m)^{nk}$. For a bit-string to be detected as a possible member of the set, all k bit locations generated by the hash functions need to be 1. The probability that this happens, f, is given by Equation (4.1).

$$f = \left(1 - \left(1 - \frac{1}{m}\right)^{nk}\right)^{k} \qquad (4.1)$$

For large values of m the above equation reduces to Equation (4.2).

$$f \approx \left(1 - e^{-nk/m}\right)^{k} \qquad (4.2)$$
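As a quick numerical check of Equations (4.1) and (4.2), the following sketch evaluates both forms. The function names are illustrative, and the sizes (10,000 strings in 200,000 bits with k = 14) are assumed example values borrowed from the memory estimate in Section 4.1.

```python
import math


def fp_exact(m: int, n: int, k: int) -> float:
    """Equation (4.1): f = (1 - (1 - 1/m)^(n*k))^k."""
    return (1.0 - (1.0 - 1.0 / m) ** (n * k)) ** k


def fp_approx(m: int, n: int, k: int) -> float:
    """Equation (4.2): f ~ (1 - e^(-n*k/m))^k, valid for large m."""
    return (1.0 - math.exp(-n * k / m)) ** k


# Assumed example sizes: n = 10,000 strings, m = 200,000 bits, k = 14.
m, n, k = 200_000, 10_000, 14
print(f"exact  f = {fp_exact(m, n, k):.3e}")
print(f"approx f = {fp_approx(m, n, k):.3e}")
```

For vectors of this size the two expressions agree closely, which is why the exponential form is used in the rest of the analysis.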
This probability is independent of the input bit-string and is termed the false positive probability. The false positive probability can be reduced by choosing appropriate values for m and k for a given size of the member set, n. It is clear that the size of the bit-vector, m, needs to be much larger than the size of the bit-string set, n. For a given ratio $m/n$, the false positive probability can be reduced by increasing the number of hash functions, k, up to an optimal value. In the optimal case, when the false positive probability is minimized with respect to k, the following relationship is obtained.

$$k = \frac{m}{n} \ln 2 \qquad (4.3)$$

The false positive probability at this optimal point is given by Equation (4.4).

$$f = \left(\frac{1}{2}\right)^{k} \qquad (4.4)$$

It should be noted that if the false positive probability is to be fixed, then the size of the filter, m, needs to scale linearly with the size of the bit-string set, n. In the optimally configured Bloom filter, the probability of finding a bit set is 0.5. Tables 4.3, 4.4 and 4.5 and Figure 4.6 give the relationship between the false positive rate and combinations of m/n and k.

4.3 PRACTICAL HASH FUNCTIONS

A hash function is a well-defined procedure or mathematical function that converts a large, possibly variable-sized amount of data into a small datum, usually a single integer that may serve as an index to an array. The values returned by a hash function are called hash values, hash codes, hash sums, checksums or simply hashes.
Table 4.3 False positive rate under various m/n and k combinations (k = 1 to 8; the second column is the optimal k from Equation (4.3))

m/n  optimal k  k=1  k=2  k=3  k=4  k=5  k=6  k=7  k=8
2   1.39  0.393   0.4
3   2.08  0.283   0.237   0.253
4   2.77  0.221   0.155   0.147   0.16
5   3.46  0.181   0.109   0.092   0.092   0.101
6   4.16  0.154   0.0804  0.0609  0.0561  0.0578   0.0638
7   4.85  0.133   0.0618  0.0423  0.0359  0.0347   0.0364
8   5.55  0.118   0.0489  0.0306  0.024   0.0217   0.0216    0.0229
9   6.24  0.105   0.0397  0.0228  0.0166  0.0141   0.0133    0.0135    0.0145
10  6.93  0.0952  0.0329  0.0174  0.0118  0.00943  0.00844   0.00819   0.00846
11  7.62  0.0869  0.0276  0.0136  0.0086  0.0065   0.00552   0.00513   0.00509
12  8.32  0.08    0.0236  0.0108  0.0065  0.00459  0.00371   0.00329   0.00314
13  9.01  0.074   0.0203  0.0088  0.0049  0.00332  0.00255   0.00217   0.00199
14  9.7   0.0689  0.0177  0.0072  0.0038  0.00244  0.00179   0.00146   0.00129
15  10.4  0.0645  0.0156  0.006   0.003   0.00183  0.00128   0.001     0.00085
16  11.1  0.0606  0.0138  0.005   0.0024  0.00139  0.000935  0.0007    0.00057
17  11.8  0.0571  0.0123  0.0042  0.0019  0.00107  0.000692  0.0005    0.00039
18  12.5  0.054   0.0111  0.0036  0.0016  0.00084  0.000519  0.00036   0.00028
19  13.2  0.0513  0.01    0.0031  0.0013  0.00066  0.000394  0.00026   0.00019
20  13.9  0.0488  0.0091  0.0027  0.0011  0.00053  0.000303  0.0002    0.00014
21  14.6  0.0465  0.0083  0.0024  0.0009  0.00043  0.000236  0.00015   0.0001
22  15.2  0.0444  0.0076  0.0021  0.0008  0.00035  0.000185  0.00011   7.46E-05
23  15.9  0.0425  0.0069  0.0018  0.0006  0.00029  0.000147  8.56E-05  5.55E-05
24  16.6  0.0408  0.0064  0.0016  0.0006  0.00024  0.000117  6.63E-05  4.17E-05
25  17.3  0.0392  0.0059  0.0015  0.0005  0.0002   9.44E-05  5.18E-05  3.16E-05
26  18    0.0377  0.0055  0.0013  0.0004  0.00016  7.66E-05  4.08E-05  2.42E-05
27  18.7  0.0364  0.0051  0.0012  0.0004  0.00014  6.26E-05  3.24E-05  1.87E-05
28  19.4  0.0351  0.0048  0.0011  0.0003  0.00012  5.15E-05  2.59E-05  1.46E-05
29  20.1  0.0339  0.0044  0.0009  0.0003  9.96E-05  4.26E-05  2.09E-05  1.14E-05
30  20.8  0.0328  0.0042  0.0009  0.0002  8.53E-05  3.55E-05  1.69E-05  9.01E-06
31  21.5  0.0317  0.0039  0.0008  0.0002  7.33E-05  2.97E-05  1.38E-05  7.16E-06
32  22.2  0.0308  0.0037  0.0007  0.0002  6.33E-05  2.50E-05  1.13E-05  5.73E-06
Table 4.4 False positive rate under various m/n and k combinations (k = 9 to 16; the second column is the optimal k from Equation (4.3))

m/n  optimal k  k=9  k=10  k=11  k=12  k=13  k=14  k=15  k=16
11  7.62  0.00531
12  8.32  0.00317   0.00334
13  9.01  0.00194   0.00198   0.0021
14  9.7   0.00121   0.0012    0.00124
15  10.4  0.00078   0.00074   0.00075   0.00078
16  11.1  0.00051   0.00047   0.00046   0.00047   0.00049
17  11.8  0.00034   0.0003    0.00029   0.00028   0.00029
18  12.5  0.00023   0.0002    0.00018   0.00018   0.00018   0.00018
19  13.2  0.00016   0.00013   0.00012   0.00011   0.00011   0.00011   0.00011
20  13.9  0.00011   8.89E-05  7.77E-05  7.12E-05  6.79E-05  6.71E-05  6.84E-05
21  14.6  7.59E-05  6.09E-05  5.18E-05  4.63E-05  4.31E-05  4.17E-05  4.16E-05  4.27E-05
22  15.2  5.42E-05  4.23E-05  3.50E-05  3.05E-05  2.78E-05  2.63E-05  2.57E-05  2.59E-05
23  15.9  3.92E-05  2.97E-05  2.40E-05  2.04E-05  1.81E-05  1.68E-05  1.61E-05  1.59E-05
24  16.6  2.86E-05  2.11E-05  1.66E-05  1.38E-05  1.20E-05  1.08E-05  1.02E-05  9.87E-06
25  17.3  2.11E-05  1.52E-05  1.16E-05  9.42E-06  8.01E-06  7.10E-06  6.54E-06  6.22E-06
26  18    1.57E-05  1.10E-05  8.23E-06  6.52E-06  5.42E-06  4.70E-06  4.24E-06  3.96E-06
27  18.7  1.18E-05  8.07E-06  5.89E-06  4.56E-06  3.70E-06  3.15E-06  2.79E-06  2.55E-06
28  19.4  8.96E-06  5.97E-06  4.25E-06  3.22E-06  2.56E-06  2.13E-06  1.85E-06  1.66E-06
29  20.1  6.85E-06  4.45E-06  3.10E-06  2.29E-06  1.79E-06  1.46E-06  1.24E-06  1.09E-06
30  20.8  5.28E-06  3.35E-06  2.28E-06  1.65E-06  1.26E-06  1.01E-06  8.39E-07  7.26E-07
31  21.5  4.10E-06  2.54E-06  1.69E-06  1.20E-06  8.93E-07  7.00E-07  5.73E-07  4.87E-07
32  22.2  3.20E-06  1.94E-06  1.26E-06  8.74E-07  6.40E-07  4.92E-07  3.95E-07  3.30E-07

Table 4.5 False positive rate under various m/n and k combinations (k = 17 to 24; the second column is the optimal k from Equation (4.3))

m/n  optimal k  k=17  k=18  k=19  k=20  k=21  k=22  k=23  k=24
22  15.2  2.67E-05
23  15.9  1.61E-05
24  16.6  9.84E-06  1.00E-05
25  17.3  6.08E-06  6.11E-06  6.27E-06
26  18    3.81E-06  3.76E-06  3.80E-06  3.92E-06
27  18.7  2.41E-06  2.34E-06  2.33E-06  2.37E-06
28  19.4  1.54E-06  1.47E-06  1.44E-06  1.44E-06  1.48E-06
29  20.1  9.96E-07  9.35E-07  9.01E-07  8.89E-07  8.96E-07  9.21E-07
30  20.8  6.50E-07  6.00E-07  5.69E-07  5.54E-07  5.50E-07  5.58E-07
31  21.5  4.29E-07  3.89E-07  3.63E-07  3.48E-07  3.41E-07  3.41E-07  3.48E-07
32  22.2  2.85E-07  2.55E-07  2.34E-07  2.21E-07  2.13E-07  2.10E-07  2.12E-07  2.17E-07
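The entries of Tables 4.3 to 4.5 follow from Equation (4.2) written in terms of the ratio m/n, and the second column follows from Equation (4.3). The short sketch below shows one way to recompute them; the function names are illustrative, and restricting the search to the two integers around the optimum is an assumption of this example.

```python
import math


def false_positive(m_over_n: float, k: int) -> float:
    """Equation (4.2) expressed through the ratio m/n."""
    return (1.0 - math.exp(-k / m_over_n)) ** k


def optimal_k(m_over_n: float) -> float:
    """Equation (4.3): k = (m/n) ln 2."""
    return m_over_n * math.log(2)


for ratio in (8, 16, 32):
    k_opt = optimal_k(ratio)
    # Only an integer k is realisable, so compare the two neighbours of k_opt.
    best_k = min((math.floor(k_opt), math.ceil(k_opt)),
                 key=lambda k: false_positive(ratio, k))
    print(f"m/n={ratio:2d}  k_opt={k_opt:5.2f}  best integer k={best_k:2d}"
          f"  f={false_positive(ratio, best_k):.3g}")
```

Because the optimal k is rarely an integer, a practical design picks one of its neighbours; the tables show that the false positive rate is fairly flat around the optimum.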
Figure 4.6 False positive probability (f) Vs Number of hash functions (k)

Hash functions are mostly used to speed up table lookup or data comparison tasks such as signature detection, which finds a broad range of applications in the network domain. Although the idea was conceived in the 1950s, the design of good hash functions is still a topic of active research (Knuth 1973). In this section, the effects of using different hash functions in Bloom filters are analyzed. The performance of different hash functions in hardware was investigated by Ramakrishna et al (1997). Three different types of hash functions were used in the Bloom filters implemented on FPGA.
4.3.1 H3 Class of Universal Hash Function

Universal classes of hash functions were first introduced by Carter et al (2004). They defined a special class of hash functions known as class H3. The definition of the H3 class is as follows. Given any string X consisting of b bits,

$$X = \langle x_1, x_2, x_3, \ldots, x_b \rangle$$

the i-th hash function over the string X is defined as

$$h_i(X) = d_{i1} \cdot x_1 \oplus d_{i2} \cdot x_2 \oplus d_{i3} \cdot x_3 \oplus \cdots \oplus d_{ib} \cdot x_b \qquad (4.5)$$

where the $d_{ij}$ are random coefficients uniformly distributed between 1 and the size of the lookup vector, m, and $x_k$ is the k-th bit of the input string. Here $\cdot$ is a bit-by-bit AND operation and $\oplus$ is a logical Exclusive OR (XOR) operation. A block diagram of the implemented H3 class of hash functions is given in Figure 4.7.

Figure 4.7 A block diagram of a H3 class of universal hash function

The input is shifted left one bit at a time until all 16 bits have been handled. Each bit is logically AND-ed with the corresponding random number. At the end, all AND results are XOR-ed together to obtain the hash value. Hash functions of this type are linear transformations and, as a result, they distribute the index values randomly.
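Equation (4.5) can be sketched in software as follows. This is a minimal illustration rather than the FPGA implementation: the helper name `make_h3`, the use of Python's `random` module for the coefficients, and the assumption that the lookup vector size is a power of two (m = 2^out_bits, with 0-based indexes) are all choices made only for this example.

```python
import random


def make_h3(b: int, out_bits: int, rng: random.Random):
    """Return one H3 hash function for b-bit inputs (Equation 4.5)."""
    # d[j] is the random coefficient paired with input bit x_j.
    d = [rng.getrandbits(out_bits) for _ in range(b)]

    def h(x: int) -> int:
        value = 0
        for j in range(b):
            if (x >> j) & 1:      # d_j AND x_j keeps d_j only when x_j = 1
                value ^= d[j]     # XOR the surviving coefficients together
        return value

    return h


rng = random.Random(1)            # fixed seed, for repeatability only
hashes = [make_h3(b=16, out_bits=4, rng=rng) for _ in range(3)]
signature = 0xBEEF
print([h(signature) for h in hashes])   # three indexes into a 16-bit vector
```

Each call to `make_h3` draws a fresh set of coefficients, so the k hash functions of a Bloom filter differ only in their random numbers, which is what makes the class convenient in hardware.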
Implementation of this type of hash function requires sixteen 2-input AND gates and a single 16-input XOR gate for a 16-bit signature. These functions produce key values of the same size as the input. Pseudo-code to implement the H3 class of hash functions is given in Table 4.6.

Table 4.6 Pseudo-code for H3 class of universal hash function

for each signature:
i.   generate as many random numbers as the bits in the signature
ii.  left shift the signature to get to the specified bit
iii. AND each shifted signature with the random number
iv.  XOR all the results of the ANDs

4.3.2 Bit Extraction Hash Function

This type of hash function selects j bits out of the b bits of the signature. Depending on how these bits are selected from the input signature, such functions are classified as regular or randomized bit extraction hash functions. Since regular bit extraction hash functions are constrained in number by the input length, randomized bit extraction hash functions are used. A randomized bit extraction hash function is defined as follows. Given any string X consisting of b bits,

$$X = \langle x_1, x_2, x_3, \ldots, x_b \rangle$$

the i-th hash function over the string X is defined as

$$h_i(X) = \langle x_{l_1}, x_{l_2}, x_{l_3}, \ldots, x_{l_j} \rangle \qquad (4.6)$$

where the $l_j$ are random bit positions uniformly distributed between 1 and the size of the input signature, b bits, and $x_{l_j}$ is the input bit located at $l_j$. A block diagram of the implemented randomized bit extraction hash function is
illustrated in Figure 4.8. Implementation of this type of hash function requires eight 2-input AND gates and a single 8-input XOR gate for a 16-bit signature. A shifter is necessary to left shift the input bits as specified by the random number $l_j$.

Figure 4.8 A block diagram of bit extraction hash function

Hash functions of this type produce key values shorter in bits than the signature. They distribute keys randomly, since the bit positions to extract are chosen by random numbers. Pseudo-code to simulate this hash function is given in Table 4.7.

Table 4.7 Pseudo-code for bit extraction hash function

for each signature:
i.   generate as many random numbers as the bits in the indices
ii.  right shift the signature to get to the random bit position
iii. adjust the bit at the random position to the correct position in the index by left or right shifting
iv.  XOR all the results of shifting
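The steps of Table 4.7 can be sketched in Python as below. The helper name `make_bit_extraction`, the 0-based bit positions and the choice to draw positions independently (so a position may repeat) are assumptions of this example, not details taken from the hardware design.

```python
import random


def make_bit_extraction(b: int, j: int, rng: random.Random):
    """Return a hash that concatenates j randomly chosen bits of a b-bit input."""
    positions = [rng.randrange(b) for _ in range(j)]   # l_1 ... l_j

    def h(x: int) -> int:
        value = 0
        for out_pos, in_pos in enumerate(positions):
            bit = (x >> in_pos) & 1       # right shift to reach the random bit
            value ^= bit << out_pos       # move it to its place in the index
        return value

    return h


h = make_bit_extraction(b=16, j=8, rng=random.Random(2))
print(h(0xBEEF))   # an 8-bit index derived from a 16-bit signature
```

Since every extracted bit lands in a different output position, the final XOR of Table 4.7 simply assembles the index; no two terms ever overlap.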
4.3.3 Hash Functions from XOR Method

Hash functions of this type partition the b-bit long input signature into segments of j bits. The segments are XOR-ed to obtain the hash value. The segments can be formed either in a regular manner or randomly, as with bit extraction hash functions. To obtain random indices, hash functions with randomly formed segments are used. The hash functions from the XOR method are defined as follows. Given any string X consisting of b bits,

$$X = \langle x_1, x_2, x_3, \ldots, x_b \rangle$$

the i-th hash function over the string X is defined as

$$h_i(X) = \langle (x_{s_1} \oplus x_{s_2}), (x_{s_3} \oplus x_{s_4}), \ldots, (x_{s_{j-1}} \oplus x_{s_j}) \rangle \qquad (4.7)$$

where the $s_j$ are uniformly distributed random bit positions in the input string and the $x_{s_j}$ are the bits at the positions specified by $s_j$. Two segments, each j bits long, are formed and XOR-ed. Figure 4.9 illustrates a block diagram of a hash function from the XOR method. Implementation of this type of hash function requires a shifter to get to the bit at the random position, plus eight 2-input XOR gates and an 8-input XOR gate. The length of the resulting hash value is smaller in bits than the input.

Figure 4.9 A block diagram of hash function using XOR method
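A software sketch of Equation (4.7) is given below, under the reading that each output bit is the XOR of two randomly selected input bits, which is also what Table 4.8 suggests by drawing twice as many random numbers as index bits. The helper name `make_xor_hash`, the 0-based positions and the requirement that the two positions of a pair differ are assumptions made for this example.

```python
import random


def make_xor_hash(b: int, j: int, rng: random.Random):
    """Return a hash whose j output bits are XORs of random input-bit pairs."""
    # Two distinct random positions per output bit: 2*j random numbers in all.
    pairs = [tuple(rng.sample(range(b), 2)) for _ in range(j)]

    def h(x: int) -> int:
        value = 0
        for out_pos, (s1, s2) in enumerate(pairs):
            bit = ((x >> s1) ^ (x >> s2)) & 1   # x_s1 XOR x_s2
            value |= bit << out_pos             # shift into the index position
        return value

    return h


h = make_xor_hash(b=16, j=8, rng=random.Random(3))
print(h(0xBEEF))   # an 8-bit index derived from a 16-bit signature
```

Like the H3 class, this construction is linear over XOR, so an all-zero input always hashes to index 0.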
Despite the shorter output, these hash functions map the inputs to the hash values in a completely random manner, owing to the random selection of bits from the input. Pseudo-code to implement this type of hash function is given in Table 4.8.

Table 4.8 Pseudo-code for XOR method hash function

for each signature:
i.   generate twice as many random numbers as the bits in the indices
ii.  right shift the signature to get the random bit positions for two segments
iii. XOR the bits at each segment
iv.  right shift the XOR result to get the correct position

4.3.4 FPGA Implementation of hash functions

For today's high-speed networks with line speeds of 10 Gbps and beyond, FPGA implementation is a feasible solution. The performance of three different hash functions in hardware was investigated.

Table 4.9 FPGA implementation of hash functions

Hash Function    LUTs          Flip Flops  Block RAMs
Universal        2990 (4.4%)   2295        6
Bit Extraction   4550 (9%)     3998        7
XOR Method       3050 (4.5%)   2567        6

This work uses three different types of hash functions in Bloom filters to examine their effect on the performance of the low power architecture. Logical designs of the low power lookup Bloom filter with respect to the types of
hash functions were implemented on a Xilinx XCV2000E FPGA, and the utilization of LUTs, flip-flops and block Random Access Memories (RAMs) is summarized in Table 4.9. Device utilization is highest for the bit extraction hash function. Based on these results, the Universal H3 hash function is selected for further power analysis of the low power lookup Bloom filter.

4.4 TYPICAL BLOOM FILTER ARCHITECTURE

A block diagram of a typical Bloom filter is illustrated in Figure 4.10. Given a string X which is a member of the signature set, a Bloom filter computes k hash values on the input X using random coefficients d, which are uniformly distributed between 1 and the size of the lookup vector, m. It then uses these hash values as indexes into the m-bit long lookup vector and sets the bits at the indexes given by the computed hash values. It repeats this procedure for each member of the signature set. For an input string Y, the Bloom filter computes k hash values using the same hash functions that were used in programming the Bloom filter.

Figure 4.10 Typical Bloom Filter
The Bloom filter looks up the bit values located at the offsets (the computed hash values) in the bit vector. If it finds any bit unset at those addresses, it declares the input string to be a non-member of the signature set, which is called a mismatch. Otherwise, it finds that all the bits are set and concludes that the input string may be a member of the signature set, subject to the false positive probability; this is called a match.

4.5 DRAWBACK OF DSLT BLOOM FILTER

A Bloom filter never produces false negatives. If it finds that an input certainly does not belong to the signature set, then it decides that the input is a non-member. However, it may produce false positives, in which a non-member input is reported as a member of the set. Following the analysis of Dharmapurikar et al (2004), the false positive probability f is calculated by Equation (4.2). In order to minimize the false positive probability, the value of m must be considerably larger than n. For a fixed value of m/n, k must be large enough for f to be minimized. Since the number of hash functions in Bloom filters is made large to reduce the false positive probability, it is intuitive that their total power consumption is large. During the programming phase of the Bloom filter, not much can be done to reduce the power consumption; otherwise the Bloom filter will produce many false positives. However, while performing lookups over the Bloom filter, the number of hash functions used to produce a decision can be reduced significantly. This is because a Bloom filter never produces false negatives, and it is enough to find a single zero in the m-bit long lookup vector to conclude that there is a mismatch. Ilhan Kaya and Taskin Kocak (2006) call this type of lookup operation the low power lookup technique. The architecture to support such a lookup operation for a DSLT is illustrated in Figure 4.11, where the number of hash functions per stage (r) is k/2.
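The staged lookup idea can be sketched in software as follows: the k hash functions are evaluated in groups of r, and the lookup stops at the first group that finds a zero bit. This is only a behavioural illustration; the function name `staged_lookup`, the toy modular hash functions and the sizes m = 16 and k = 4 are hypothetical values chosen for the example, and the count of computed hashes is used as a stand-in for consumed power.

```python
from typing import Callable, Sequence, Tuple


def staged_lookup(vector: Sequence[int],
                  hashes: Sequence[Callable[[int], int]],
                  x: int, r: int) -> Tuple[bool, int]:
    """Query x in stages of r hash functions; return (match, hashes computed)."""
    computed = 0
    for start in range(0, len(hashes), r):
        stage = hashes[start:start + r]
        computed += len(stage)                 # the whole stage consumes power
        if any(vector[h(x)] == 0 for h in stage):
            return False, computed             # a zero bit: certain mismatch
    return True, computed                      # every stage matched


# Toy example (all values hypothetical): m = 16, k = 4, stages of r = k/2.
m = 16
hashes = [lambda x, a=a: (a * x + a) % m for a in (3, 5, 7, 11)]
vector = [0] * m
for h in hashes:                               # program the single member 9
    vector[h(9)] = 1

print(staged_lookup(vector, hashes, 9, r=2))   # (True, 4): both stages run
print(staged_lookup(vector, hashes, 4, r=2))   # (False, 2): stops after stage 1
```

For the non-member input, only the first stage is exercised, which is where the power saving of the low power lookup comes from; a member always costs all k hash computations.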
The drawback of the DSLT scheme presented by Ilhan Kaya and Taskin Kocak (2006) is that it does not investigate lookup schemes with more than two stages. This research work continues the investigation of the lookup technique with further stages, where the number of hash functions per stage (r) is 1, k/2, k/4 and k/8.

Figure 4.11 DSLT Bloom filter architecture where hash per stage r = k/2

4.6 MULTI STAGE LOOK UP TECHNIQUE BASED BLOOM FILTER ARCHITECTURE

A low power Bloom filter architecture with r = k/4 is illustrated in Figure 4.12. If a mismatch is found in the first stage itself, 3/4 of the hash calculations are avoided, whereas the DSLT avoids only half of them. In a similar fashion, a low power architecture with r = k/8 is considered for power analysis. Figure 4.13 illustrates the architecture where the number of hash functions per stage is r = 1. The H3 class of universal hash functions was used in the hash calculations of the MSLT.
Figure 4.12 MSLT Bloom filter architecture where hash per stage r = k/4

Figure 4.13 MSLT Bloom filter architecture where hash per stage r = 1

4.7 POWER ANALYSIS OF MSLT ARCHITECTURES

With reference to the discussion in Section 4.3.4, the Universal hash function is selected for the implementation of MSLTs. The basic functional
module of the Bloom filter using the Universal H3 hash function was implemented in 60 nm technology (Figure 4.14) with the parameters shown in Table 4.10 in order to derive the power consumption. The average power calculated is used in the power analysis of the low power Bloom filter architecture in this section.

Table 4.10 Design specifications

Technology     CMOS 60 nm
Power Supply   5 V
Metal Layers   6
Avg. Power     0.801 W @ 5 ns

Figure 4.14 Physical layout of basic functional module of Bloom filter using H3 hash function

A theoretical approach is followed to analyze and compare the power consumption of the different lookup operations available through the Bloom filter architectures presented in Sections 4.5 and 4.6. A single Bloom filter, shown in Figure 4.10, uses k hash functions in order to make a decision on the given input. Hence, the power consumption of a Bloom filter when performing a regular lookup operation is a summation of the power
consumptions of each of the hash value computations, $P_{H_i}$, plus the power consumed in accessing the memory for each hash value computed, $P_Q$, plus the power consumed by an AND gate, $P_{AND}$.

$$P_{BFreg} = \sum_{i=1}^{k} \left(P_{H_i} + P_Q\right) + P_{AND} \qquad (4.8)$$

The power consumption of the AND gate is ignored hereafter, since it is minimal compared with the power used by the hash functions. The power required to query the m-bit vector is approximately constant for each index calculated by any of the hash functions. The power equation for a single Bloom filter then simply becomes the total power used by the hash functions plus the power consumed by querying the m-bit vector for each hash value calculation.

$$P_{BFreg} = \sum_{i=1}^{k} \left(P_{H_i} + P_Q\right) \qquad (4.9)$$

The power consumption of a regular lookup low power architecture presented in Figure 4.13 is compared with a 16-bit implementation of hash functions. In Section 4.3.4, the results of hardware implementations of all practical hash functions are presented; they indicate that the universal class of hash functions called H3 is suitable for hardware applications. Hence, all of the k hash functions are of the 8-bit H3 class. Equation (4.9) then becomes

$$P_{BFreg} = k \cdot \left(P_{H8} + P_Q\right) \qquad (4.10)$$

To derive the power consumption of the proposed architecture, a mathematical analysis similar to that of Mitzenmacher (2002) is followed. At first, the probability of a match in the first stage is derived. The
probability that a bit is still unset after all the signatures are programmed into the Bloom filter using k independent hash functions is

$$\left(1 - \frac{1}{m}\right)^{kn} \approx e^{-kn/m} \qquad (4.11)$$

where $1/m$ is the probability that a given one of the m bits is set by a single hash function operating on a single signature. Then $(1 - 1/m)$ is the probability that the bit is unset after a single hash value computation with a single signature. To remain unset, it must not be set by any of the k hash functions, each operating on all signatures in the signature set. Consequently, the probability that a given bit is set is

$$1 - e^{-kn/m} \qquad (4.12)$$

In order to produce a match in the first stage, the bits indexed by all r of the independent random hash functions must be set. So the match probability of the first stage, represented as p, is

$$p = \prod_{i=1}^{r} \left(1 - e^{-kn/m}\right) = \left(1 - e^{-kn/m}\right)^{r} \qquad (4.13)$$

The mismatch probability of the first stage is $1 - p$,

$$1 - p = 1 - \left(1 - e^{-kn/m}\right)^{r} \qquad (4.14)$$

With a probability of $(1 - p)$, the first stage of hash functions in the Bloom filter produces a mismatch when performing a lookup operation. Otherwise, the first stage produces a match, and then the second stage is used
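to compare the input with the signature sought, as suggested by the proposed architecture. Therefore the power consumption of the Bloom filter shown in Figure 4.11, where r = k/2, is given by

$$P_{BF_{r=k/2}} = P_{1st\,stage} + P(\text{Match}) \cdot P_{2nd\,stage} \qquad (4.15)$$

$$P_{BF_{r=k/2}} = \sum_{i=1}^{k/2} \left(P_{H_i} + P_Q\right) + p \sum_{j=k/2+1}^{k} \left(P_{H_j} + P_Q\right) \qquad (4.16)$$

$$P_{BF_{r=k/2}} = \left(P_{H8} + P_Q\right) \frac{k}{2} \left(1 + p^{k/2}\right) \qquad (4.17)$$

The power consumption of the Bloom filter where r = k/4 and r = k/8 is given by Equations (4.18) and (4.19) respectively.

$$P_{BF_{r=k/4}} = \left(P_{H8} + P_Q\right) \frac{k}{4} \left(1 + p^{k/4} + p^{k/2} + p^{3k/4}\right) \qquad (4.18)$$

$$P_{BF_{r=k/8}} = \left(P_{H8} + P_Q\right) \frac{k}{8} \left(1 + p^{k/8} + p^{k/4} + p^{3k/8} + p^{k/2} + p^{5k/8} + p^{3k/4} + p^{7k/8}\right) \qquad (4.19)$$

In Equations (4.17) to (4.19), the exponent on p counts the bits that must all be found set for the lookup to reach the corresponding stage; p is read there as the per-bit set probability of Equation (4.12), so that, for example, $p^{k/2}$ is the first-stage match probability of Equation (4.13) with r = k/2.

The Power Saving Ratio (PSR) of a Bloom filter implemented with the presented architectures, which operate with two different lookup techniques, can be calculated using Equation (4.20) (Ilhan Kaya and Taskin Kocak 2006):

$$PSR = \frac{P_{BFreg} - P_{BF_{r=k/n}}}{P_{BFreg}} \qquad (4.20)$$

where n in the subscript r = k/n denotes the number of stages (2, 4 or 8), not the number of signatures.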
Using Equation (4.20), with reference to the power consumption of BF reg, the PSR of BF r=k/2, BF r=k/4 and BF r=k/8 is calculated for various k values using the specifications given in Table 4.11.

Table 4.11 Design specifications

m/n ratio                     21
Number of signatures, n       1024
Size of the m bit vector, m   21504
Width of the signature, i     8
P_H8 + P_Q                    0.801 W

where $P_{H8} + P_Q$ is the average power consumption of the basic functional module. As illustrated in section 5, $P_{H8} + P_Q$ comprises the power consumption of both the hash value calculation and the match query for a single hash function. As described in Section 4.7, the basic functional module was implemented in Complementary Metal Oxide Semiconductor (CMOS) 60 nm technology using a back-end tool, and the average power consumption calculated there has been used in the power analysis of the proposed low power architectures. When the number of hash functions per stage (r) decreases, the power consumption reduces. The PSR of BF r=k/2, BF r=k/4 and BF r=k/8 is calculated with reference to BF reg and plotted in Figure 4.15. The power consumption of the Bloom filter architectures BF r=k/2, BF r=k/4 and BF r=k/8 for different values of the number of hash functions (k) is illustrated in Figure 4.16. When the number of hash functions per stage (r) decreases, the PSR increases. When k increases beyond 128, the PSR of all three architectures converges.
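The power model of Equations (4.10) and (4.17) to (4.20) can be evaluated numerically with a short script. This is a sketch under stated assumptions: the function names are illustrative, p is taken as the per-bit set probability of Equation (4.12), every looked-up input is treated as a non-member whose bits are set independently, and the defaults m/n = 21 and 0.801 W come from Table 4.11. It is not expected to reproduce Figures 4.15 and 4.16 exactly.

```python
import math


def bit_set_probability(k: int, m_over_n: float) -> float:
    """Equation (4.12): probability that a given bit of the vector is set."""
    return 1.0 - math.exp(-k / m_over_n)


def staged_power(k: int, stages: int, m_over_n: float, unit_power: float) -> float:
    """Expected lookup power with r = k/stages hash functions per stage."""
    r = k / stages
    q = bit_set_probability(k, m_over_n)
    # Stage j (j = 0, 1, ...) runs only if the j*r bits checked so far were set.
    reach = sum(q ** (j * r) for j in range(stages))
    return unit_power * r * reach


def psr(k: int, stages: int, m_over_n: float = 21.0, unit_power: float = 0.801) -> float:
    """Equation (4.20): power saving relative to the regular k-hash lookup."""
    regular = unit_power * k                     # Equation (4.10)
    return (regular - staged_power(k, stages, m_over_n, unit_power)) / regular


for k in (8, 16, 32, 64, 128):
    print(k, [round(psr(k, s), 3) for s in (2, 4, 8)])
```

With `stages=2`, `staged_power` reduces to Equation (4.17), and with 4 and 8 stages to Equations (4.18) and (4.19). Under this model the saving grows with the number of stages and shrinks as k becomes large for a fixed m/n, because almost every bit of the vector is then set.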
Figure 4.15 PSR Vs Number of hash functions (k)

Figure 4.16 Power Vs Number of hash functions (k)
Observation shows that an increase in k increases the number of basic functional modules used in the design, which increases the device density. As Figure 4.16 shows, power consumption grows with device density, and this growth cannot be compensated by the proposed parallel lookup techniques. This work suggests that selecting a smaller number of hash functions for the Bloom filter architecture, at the cost of the m/n ratio, results in a better PSR.

4.8 FPGA IMPLEMENTATION OF MSLT ARCHITECTURES

Results of the hardware implementation in Xilinx 10.1i are presented in this section. Each pattern set was synthesized, placed, and routed on the Virtex5 XC5VLX85 (Xilinx, 2009) chip, where the package and speed grade are FF676 and -3, respectively.

Table 4.12 FPGA Implementation of MSLT Architectures

Design  Device          Size of the Signature  No. of Signatures  Slice   No. of Registers  No. of LUT
MSLT    Virtex5-LX85T   32                     16028              7635    6550              30387
MSLT    Virtex5-LX85T   16                     16028              5626    9690              15375
DSLT    Virtex5-LX85T   32                     16028              17239   8775              42632
DSLT    Virtex5-LX85T   16                     16028              8852    13758             28574

The proposed implementations are evaluated on the following criteria:

Size of the signature (bits): Each signature is 16-bit or 32-bit wide data. The more bits processed per cycle, the better the throughput.

Slice: The slice is the basic FPGA resource in a Xilinx FPGA chip. The number of logic elements in a slice is dependent on the FPGA device.
The number of slices represents the area cost. In Virtex-5, each FPGA slice contains four LUTs and four flip-flops.

Clock period: The clock period is the delay of the longest (critical) path in the FPGA. The period can be obtained from the synthesis report of the Xilinx software. The smaller the clock period, the faster the implementation.

Table 4.12 shows the experimental results. The number of signatures in these pattern sets is 16028. 16-bit and 32-bit designs are simulated for each pattern set. The number of registers and the number of LUTs show the device utilization of the proposed architectures. The proposed MSLT architectures consume 29% fewer device resources than DSLT in this implementation.

4.9 SUMMARY

In this chapter, low power Bloom filter architectures are proposed for network applications on a hardware platform. To this end, a suitable hash function is selected for hardware implementation. Further, the average power consumption of the basic functional module of the Bloom filter using the H3 universal hash function is derived using CMOS 60 nm technology. Mathematical analysis is carried out to calculate the power consumption and PSR of the low power Bloom filter architectures for different values of the number of hash functions per stage (r). The power analysis has shown that reducing the number of hash functions per stage reduces the power consumption of the proposed architecture. FPGA implementation results and comparison with similar Bloom filter based signature detection techniques used in NIDS show the hardware compatibility of the proposed architecture. The design parameters, namely the number of hash functions (k), the width of the filter (m), the number of hash functions per stage (r) and the false positive probability (f), can be determined for the proposed architecture from the results shown in Figures 4.15
and 4.16. If k is smaller, the power consumption decreases because fewer hash functions are used, but the probability of a false positive increases. If m is larger, the false positive rate is reduced, but the search time and power in the filtering stage increase. Hence, the design parameters must be selected carefully, with an understanding of the trade-offs among them. The proposed MSLT architecture involves parallel k-stage hash functions. A pipelined multi-stage architecture was also considered and discussed at an earlier stage of the research. Even though a pipelined architecture reduces the computation time, it introduces more hardware complexity, which directly affects the system's performance.