Real-Time Detection of Invisible Spreaders

MyungKeun Yoon, Shigang Chen
Department of Computer & Information Science & Engineering, University of Florida, Gainesville, FL 3, USA
{yoon,

Abstract: Detecting spreaders can help an intrusion detection system identify potential attackers. Existing work can only detect aggressive spreaders that scan a large number of distinct addresses in a short period of time. However, stealthy spreaders may deliberately scan at a low rate. We observe that these spreaders can easily evade detection because their small traffic footprint is covered by the large amount of normal background traffic, which frequently flushes any spreader information out of the intrusion detection system's memory. We propose a new streaming scheme to detect stealthy spreaders that are invisible to current systems. The new scheme stores information about normal traffic within a limited portion of the allocated memory, so that it does not interfere with the spreader information stored elsewhere in the memory. The proposed scheme is lightweight; it can detect invisible spreaders in high-speed networks while residing in SRAM. Through experiments using real Internet traffic traces, we demonstrate that our new scheme detects invisible spreaders efficiently while keeping both false positives (normal sources misclassified as spreaders) and false negatives (spreaders misclassified as normal sources) low.

Index Terms: stealthy spreader, network security, intrusion detection, streaming algorithm.

I. INTRODUCTION

Monitoring and analyzing network logs is at the heart of identifying attackers at an early stage [1]. These logs can be low-level packet traces generated by routers or high-level audit records from network/host intrusion detection systems. In high-speed networks, such logs can arrive in large volume. To process them in real time, a fast and lightweight streaming algorithm is required, one that works with limited memory and continuously processes incoming logs.
This paper studies the problem of detecting spreaders based on incoming logs that are tuples of source/destination addresses. We call an external source address a spreader if it connects to more than a threshold number of distinct internal destination addresses during a period of time (such as a day). We define the cardinality of a source to be the number of distinct destinations that the source has contacted. Similarly, we define the destination cardinality to be the number of distinct sources that have contacted the destination.

The reason for detecting spreaders is that attackers often begin with a reconnaissance phase of finding vulnerable systems before launching the actual attack. Suppose an attacker knows how to compromise a specific type of web server. Its first step is to locate such web servers on the Internet. The attacker may probe a TCP port on all addresses in a target network by using Nmap [2]. To obtain more specific information, it may run an application-level vulnerability scanner such as Nessus [3] or Paros [4]. An intrusion detection system can inspect the incoming traffic and catch the reconnaissance packets, from which the spreaders are identified as potential attackers that demand extra attention.

It is not possible for a network security administrator to manually analyze the huge volume of logs produced by routers and intrusion detection systems in order to find spreaders. An automatic log-analyzing system is required. In fact, some intrusion detection systems have already implemented functions for identifying spreaders. For example, Snort [5] keeps track of the distinct destinations each source contacts in a recent period, and the length of the period is constrained by the amount of memory allocated to this function. The problem is that the existing systems are designed to catch "elephants", that is, aggressive spreaders whose cardinalities are so large that they easily stand out from the background of normal traffic.
In response, a wily attacker will slow down the rate of its reconnaissance packets and let the normal traffic dilute the footprint of its activity. In the Snort case, past records must be deleted to free memory once the allocated space is filled up by logs. If the attacker contacts fewer than the threshold number of destinations in each period during which logs of normal traffic fill up the allocated space, it stays undetected. We note that even state-of-the-art intrusion detection systems cannot detect stealthy spreaders if they send their packets at a low rate. We call these spreaders invisible spreaders. To catch them, we must make the detection system more sensitive.

It is a cat-and-mouse game between attackers and defenders. As we build more and more sensitive detectors, the attackers are forced to continuously reduce their reconnaissance rate in order to stay undetected. This gives network administrators more time to take action (such as patching systems) against the outbreak of new attacks. The attacks become less effective if it takes the attackers an exceedingly long time (e.g., months) to complete the reconnaissance phase over the Internet.

To design our new real-time detector for invisible spreaders, we observe (based on real Internet traffic traces) that normal traffic is strongly skewed, especially in an enterprise (or university campus) network. In particular, most inbound traffic is headed to a small number of servers for web, DNS, email, and business application services. Utilizing this skewness, we propose a new spreader detection scheme that largely segregates the space used to store normal-traffic logs from the space used to store logs of potential spreaders. Due to this segregation, a large volume of normal-traffic logs will not cause the logs of spreaders to be flushed out of the memory. Furthermore, with a compact two-dimensional bit array based on Bloom filters, the new scheme can store a much larger amount of information about the spreaders, allowing previously invisible spreaders to be detected. We perform experiments based on real Internet traffic traces, and the results show that the proposed scheme is able to detect spreaders that are invisible to the existing detection systems and, at the same time, keep both false positives (normal sources misclassified as spreader sources) and false negatives (spreader sources misclassified as normal sources) low.

The rest of the paper is organized as follows. Section II presents the design of our spreader detection scheme. Section III evaluates the proposed scheme via experiments using real Internet traffic traces. Section IV discusses the related work. Section V draws the conclusion.

This full text paper was peer reviewed at the direction of IEEE Communications Society subject matter experts for publication in the IEEE "GLOBECOM" proceedings.

II. INVISIBLE-SPREADER DETECTION

In this section, we propose a new scheme for detecting invisible spreaders. Our main technique is a novel streaming algorithm based on an invisible-spreader detection filter.

A. Invisible-Spreader Detection Filter (ISD)

Consider an intrusion detection system that is deployed to catch all external spreaders whose cardinality exceeds a threshold θ. In order to detect stealthy spreaders, one must set the value of θ small. However, as θ becomes smaller, the spreaders to be detected have a smaller traffic footprint, and it becomes increasingly hard for the intrusion detection system to catch these spreaders without causing significant false-positive and false-negative ratios. When the small footprint of the stealthy spreaders is sufficiently diluted by normal traffic, the spreaders may even become invisible to the current detection system. To catch these invisible spreaders, more sensitive detection systems must be designed. Below we propose a new detector that can catch spreaders invisible to today's detectors (such as [6]).
Our invisible-spreader detection filter (ISD) uses an n × m bit array as its main data structure, which is initialized to all zeros. Each bit B(x, y) in the array is referenced by a row index x and a column index y. Bits are set to one to record the incoming connections made from external sources to internal destinations. A row is empty if none of its bits is set to one, and non-empty otherwise. There is a row counter c(x) for each row x, storing the number of bits in the row that are set to one. The fullness ratio R of the filter is defined as

$$R = \frac{\sum_{x} c(x)}{nm},$$

the fraction of bits in the array that are set to one. Similarly, the fullness ratio of row x is defined as $c(x)/m$, the fraction of bits in the row that are set to one. We define a system parameter α, specifying the desirable fullness. If R > α, we reset the bit array to zeros.

Next we describe the operations of ISD. When receiving an input source/destination tuple (a, b), the filter computes k row indexes, $x_1 = h_1(a), \ldots, x_k = h_k(a)$, and one column index, $y = h_{k+1}(b)$, where $h_1, \ldots, h_k$ are hash functions whose ranges are $[0..n-1]$ and $h_{k+1}$ is a hash function whose range is $[0..m-1]$. The filter sets k bits, $B(x_1, y), \ldots, B(x_k, y)$, to one. Note that each column is actually a Bloom filter [7], [8]. The column index y selects a Bloom filter, and the row indices specify the bits that together represent the tuple (a, b). For each $i \in [1..k]$, if $B(x_i, y)$ was changed from zero to one, the filter increases the row counter $c(x_i)$ by one.

Rows indexed by $x_1$ through $x_k$ are called the representative rows of source a in the filter. Bits $B(x_1, y)$ through $B(x_k, y)$ are called the representative bits of a. If the fullness of every representative row of source a is above a threshold β (whose value will be determined in the next subsection), ISD executes the following procedure to determine whether a is a spreader.

1) For the jth column, let $I_j$ be one if $B(x_i, j) = 1$ for all the representative rows of a. Otherwise, $I_j = 0$. We define $a_r = \frac{1}{m} \sum_{j=0}^{m-1} I_j$.

2) The cardinality of a, denoted as $\hat{a}_c$, can be estimated based on the following formula given in [9]:

$$\hat{a}_c = -m \ln(1 - a_r) \tag{1}$$

3) If $\hat{a}_c$ is above θ, we consider source a to be a spreader.

Our column index, $y = h_{k+1}(b)$, is different from that of [6], which computes the column index from both a and b. This subtle yet critical difference helps ISD minimize the diluting effect of normal traffic on the small traffic footprint of invisible spreaders. Suppose a destination address b represents a busy web server in an enterprise network, and millions of client users connect to b. If the column index depends on both a and b, these clients will fill up the whole bit array with ones, since the source address a randomizes the column index y. By contrast, only one column of the bit array will be set to ones if $y = h_{k+1}(b)$ is used. Our Internet trace shows that the vast majority of normal traffic is directed to a small number of servers. Our scheme concentrates such normal traffic in a small number of columns in the bit array, leaving the rest of the array for detecting spreaders.

Hash collisions may cause false positives. By tuning the system parameters, we can control the level of false positives, as well as the level of false negatives.

B. Parameter Configuration

The goal of ISD is to identify spreaders whose cardinality values are larger than θ, which is given as a user requirement. Let M (= n × m) be the size of the allocated memory. The performance of ISD is affected by the selection of the following system parameters: α, β, m, and n. Below we discuss how to set these parameters.

1) We first set the values for β and m. According to the previous subsection, a spreader will be detected when the following condition is satisfied:

$$-m \ln(1 - a_r) \ge \theta \tag{2}$$
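The update and detection steps of Section II-A, ending in condition (2), can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the class name `ISDFilter`, the use of salted SHA-256 in place of the unspecified hash functions $h_1, \ldots, h_{k+1}$, the Python list-of-lists storage, and the choice to check the fullness ratio R after every insertion are all assumptions made for the sketch.

```python
import hashlib
import math


class ISDFilter:
    """Illustrative n-by-m bit array with per-row counters (hypothetical sketch)."""

    def __init__(self, n, m, k, alpha, beta, theta):
        self.n, self.m, self.k = n, m, k
        self.alpha, self.beta, self.theta = alpha, beta, theta
        self.reset()

    def reset(self):
        # All bits zero, all row counters c(x) zero.
        self.bits = [[0] * self.m for _ in range(self.n)]
        self.row_count = [0] * self.n
        self.ones = 0

    def _hash(self, salt, value, modulus):
        # Assumption: salted SHA-256 stands in for the paper's hash functions.
        digest = hashlib.sha256(f"{salt}:{value}".encode()).digest()
        return int.from_bytes(digest[:8], "big") % modulus

    def rows(self, a):
        # Representative rows x_1..x_k depend on the source only.
        return [self._hash(i, a, self.n) for i in range(1, self.k + 1)]

    def column(self, b):
        # Column index y depends on the destination only.
        return self._hash(self.k + 1, b, self.m)

    def estimate(self, a):
        # a_r: fraction of columns whose bits are one in every representative row.
        rows = self.rows(a)
        hits = sum(all(self.bits[x][j] for x in rows) for j in range(self.m))
        a_r = hits / self.m
        if a_r >= 1.0:
            return math.inf
        return -self.m * math.log(1.0 - a_r)  # Eq. (1)

    def insert(self, a, b):
        """Record tuple (a, b); return True if a is reported as a spreader."""
        rows, y = self.rows(a), self.column(b)
        for x in rows:
            if not self.bits[x][y]:
                self.bits[x][y] = 1
                self.row_count[x] += 1
                self.ones += 1
        # Run the estimation only when every representative row is fuller than beta.
        triggered = all(self.row_count[x] / self.m > self.beta for x in rows)
        is_spreader = triggered and self.estimate(a) > self.theta
        # Assumption: the fullness ratio R is checked after each insertion.
        if self.ones / (self.n * self.m) > self.alpha:
            self.reset()
        return is_spreader
```

Because the column index is computed from the destination alone, repeated tuples towards one busy server touch a single column, whatever the number of distinct clients.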

[TABLE I: Parameter Configuration Examples for a fixed c. Columns: θ, m, β, α.]

Based on their definitions, we can approximate β as $a_r$. Applying this approximation to (2), we have the following formula for setting the values of β and m:

$$\beta = 1 - e^{-\theta/m} \tag{3}$$

Recall that the parameter β is used to trigger the procedure for determining a possible spreader. When the value of β is set by the above formula, the procedure, once triggered, is likely to find a spreader. The problem is that there are two undecided parameters in the formula. We observe that a small m is desirable for ISD, because a small m allows a large n, which reduces hash collisions among row indices. Since a smaller m yields a larger β in (3), a large β is preferable. However, if β is very close to one, ISD may suffer from hash collisions among column indices. In this paper, we choose β to be below 0.95, but it can be adjusted according to any specific application or deployment environment. Once β is chosen, m follows from the same relation. Alternatively, we may set m first and then calculate β. For example, it is natural to choose m as a multiple of the word size, which makes it easy to fit the bit array in memory. For each m (= 3, , ...), we compute β and choose the largest β below 0.95. Table I shows some examples of parameter configuration, giving β and m for several values of θ. After m is determined, n is calculated as M/m.

2) We now determine the value of α. First we examine how α affects the detection of spreaders. When α is too large, the bit array of ISD becomes overly populated with ones, causing frequent hash collisions and resulting in false positives: a non-spreader is reported as a spreader because its representative rows are populated with ones by tuples of other sources (due to hash collisions). If α is too small, false positives hardly happen, but the filter is frequently reset to zeros, losing the already-recorded information about spreaders and resulting in false negatives, that is, failures to detect spreaders. Next we use some statistical properties to determine the value of α.
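Returning to step 1), the rule for choosing m, β, and n can be sketched as below. This is only an illustration under stated assumptions: the function names `beta_for` and `choose_columns` are hypothetical, candidate values of m are taken to be multiples of a 32-bit word, and the θ and memory sizes in the usage note are illustrative rather than values from the paper.

```python
import math


def beta_for(theta, m):
    """Row-fullness threshold from Eq. (3): beta = 1 - exp(-theta / m)."""
    return 1.0 - math.exp(-theta / m)


def choose_columns(theta, memory_bits, word_bits=32, beta_cap=0.95):
    """Pick the smallest m (a multiple of word_bits) whose beta is below beta_cap.

    Since beta decreases as m grows, the smallest admissible m also gives
    the largest beta below the cap. Returns (m, n, beta) with n = M / m.
    """
    m = word_bits
    while beta_for(theta, m) >= beta_cap:
        m += word_bits
    n = memory_bits // m
    return m, n, beta_for(theta, m)
```

For instance, `choose_columns(100, 2**20)` rejects m = 32 (its β exceeds the cap) and settles on m = 64, so a 2^20-bit array is split into 16,384 rows.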
Suppose ISD only receives normal traffic for a period of time and its bit array is mostly set by the normal traffic. Let Y be a random variable that represents c(x) for row x in the bit array. The expectation and the variance of Y are given below. We omit the derivation due to the page limit.

$$E(Y) = m\alpha \tag{4}$$

$$V(Y) = m\alpha(1 - \alpha) \tag{5}$$

From E(Y) and V(Y), we can define a statistical upper bound for Y as follows:

$$U(Y) = E(Y) + c\sqrt{V(Y)} \tag{6}$$

where the statistical error is small if the constant c is large. Eq. (6) means that there is a high probability that c(x) is below U(Y) if x is a representative row for only normal traffic. By contrast, if x is a representative row for any spreader, c(x) should be larger than U(Y). Hence, equating U(Y) with the row trigger threshold mβ and solving for α (taking the root with α < β), we can set the value of α as follows.

$$\alpha = \frac{2m\beta + c^2}{2(m + c^2)} - \sqrt{\left(\frac{2m\beta + c^2}{2(m + c^2)}\right)^2 - \frac{m\beta^2}{m + c^2}} \tag{7}$$

Table I lists the values of β, m, and α produced by this heuristic configuration method for a fixed c.

III. EXPERIMENT

We evaluate ISD using real-world Internet traffic traces. We implemented not only ISD but also the advanced scheme from [6], which we call the online streaming module (OSM). We compare their false positives and false negatives. The experimental results confirm that ISD detects invisible spreaders while minimizing the negative impact of normal traffic.

A. Traffic Trace and Implementation Details

In these experiments, we set k to 3. A large k helps differentiate sources, but it increases processing time and fills up the bit table quickly. A good argument for k = 3 can be found in [6]. We use packet header traces gathered at the gateway routers of the University of Florida in the U.S. The trace was collected over a period of hours, and we take only the inbound sessions from the Internet. It contains 75, distinct source IP addresses, ,9 distinct destination IP addresses, and ,7,37 distinct source/destination tuples. Note that we denote the source IP address of a packet as a and the destination IP address as b in our notation of packet (a, b).
In this sense, the goal of the experiment is to find heavy spreaders of horizontal network scans [10]. Figures 1 and 2 illustrate the traffic pattern with respect to source and destination cardinality, respectively. Each point on the x-axis groups the sources (destinations) whose cardinality lies in a range determined by x. Each figure has two curves of cumulative ratios: one for the number of distinct sources (destinations) and one for the number of distinct source/destination tuples. In Figure 1, we see that % of the total sources contact fewer than distinct destinations and 99% of them contact fewer than 3 distinct destinations. Figure 1 shows that the number of source/destination tuples increases just as the number of sources does. Therefore, we cannot see any skewness in that figure. However, we can see a different pattern in Figure 2, where the two curves diverge. The figure shows that a small fraction of the destinations accounts for most of the source/destination tuples. For example, at x = , the accumulated number of destinations is above 97%, but their aggregated source/destination tuples are below 27%. This means that fewer than the top 3% of servers account for more than 73% of the total source/destination tuples. Exploiting this skewness, ISD has the edge on other intrusion detection systems.

[Fig. 1: Cumulative ratios of the numbers of distinct sources and distinct source/destination tuples with respect to source cardinality.]

[Fig. 2: Cumulative ratios of the numbers of distinct destinations and distinct source/destination tuples with respect to destination cardinality.]

[Fig. 3: Number of false negatives when M = 5KB. Curves: OSM (total), OSM (slow scans), ISD (total), ISD (slow scans).]

For all the experiments, we set θ to be 5. This means that we take any source of cardinality above 5 as a spreader or scan source. With θ = 5, the original traffic trace already includes 75 spreaders. We also generate some artificial scan packets to simulate invisible spreaders. For each experiment, we add artificial slow scan sources to the original traffic trace. These source addresses are carefully chosen so that none of them already appears as a source address in the original traffic trace. Each artificial scan source sends a total of λ distinct scan packets. It generates one scan packet after every μ normal source/destination tuples. The default parameters for the experiments are as follows: M = 5KB, m = 5, n = ,9, α = .57, α_O = ., θ = 5, λ = , μ = ,, k = 3. Note that β and α are determined by equations (3) and (7).

For comparison purposes, we also implemented OSM from [6]. For a fair comparison, the bit tables of OSM and ISD have the same memory size M. To optimize OSM, the maximum ratio of one-bits for OSM is set to α_O, which is different from α. Through the experiments, we observe that OSM degrades if α_O is set too large or too small. The default value of α_O is .. Once the ratio of one-bits is above α_O, the decoding process runs and OSM restarts in a clean state. For each experiment, we compare the false negative and false positive sets of OSM and ISD. We use FN_O (FN_R) to denote the false negative set of OSM (ISD).
Similarly, we use FP_O (FP_R) to denote the false positive set of OSM (ISD). Let RS be the set of real spreaders, which has 95 sources (75 spreaders from the original traffic trace and artificial scan sources). Let D_O (D_R) be the set of sources detected by OSM (ISD). We define FN_O, FN_R, FP_O, and FP_R as follows:

$$FN_O = RS \setminus D_O, \quad FN_R = RS \setminus D_R, \quad FP_O = D_O \setminus RS, \quad FP_R = D_R \setminus RS.$$

B. Experimental Results

Figures 3 and 4 compare the numbers of false negatives and false positives between ISD and OSM. The x-axis of each figure is μ, the number of normal source/destination tuples between two slow scan packets. A large value of μ implies that the attacker further slows down in sending the scan packets. In Figure 3, we have four curves. OSM (total) is the number of false negatives of OSM, so it equals |FN_O| as μ ranges from to ,3. OSM (slow scans) is also a number of false negatives, but it counts only the artificial slow scan sources that are not detected. Therefore, its maximum value is , as we have artificial slow scan sources. The same notation is used for ISD, namely ISD (total) and ISD (slow scans). Note that ISD (total) plots |FN_R|.

Figure 3 shows that ISD catches most spreaders until μ becomes ,9. Even when μ = ,3, ISD catches 7 artificial spreaders out of . By contrast, OSM misses many more spreaders than ISD. Even when μ = , it misses 7 non-artificial spreaders. It starts missing artificial scan sources at μ = 5. At μ = ,9, OSM cannot detect any slow scan sources while ISD detects out of .

Note that we trade false positives for false negatives when designing ISD, but false positives should be controlled by setting α tightly. Figure 4 shows this. Even at μ = ,3, ISD only triggers 9 false positives. Considering that the number of source/destination tuples is above two million, this number of false positives may be acceptable in most applications.

We repeat the same experiment with a different n. Figures 5 and 6 show the results with n = 3,7, which means M = MB. In this experiment, ISD does not miss any spreaders, including slow scan sources, except one at μ = ,3.
Note that both ISD (total) and ISD (slow scans) remain zero until μ = ,9. By contrast, OSM still misses some spreaders, as shown in the figure. It cannot detect out of slow scan sources at μ = ,3 even though the memory size has quadrupled. It is encouraging that ISD achieves better detection accuracy even when M is as small as 5KB. Figure 6 shows that ISD triggers only a small number of false positives.

[Fig. 4: Number of false positives when M = 5KB. Curves: OSM, ISD.]

[Fig. 5: Number of false negatives when M = MB. Curves: OSM (total), OSM (slow scans), ISD (total), ISD (slow scans).]

[Fig. 6: Number of false positives when M = MB. Curves: OSM, ISD.]

IV. RELATED WORK

Snort is a world-famous network-based intrusion detection system. To detect scan sources, it simply keeps track of each source and the set of distinct destinations it contacts within a specified time window. Thus, the memory requirement is at least the total number of distinct source-destination pairs within the monitoring period, which is not feasible for detecting invisible spreaders [5], [].

Venkataraman et al. define the heavy distinct-hitters problem [11] as follows: given a stream of (x, y) pairs, find all the x's that are paired with a large number of distinct y's. The heavy distinct-hitters problem can be used to define the scan detection problem. The authors also define a more specific problem, superspreaders, and propose two techniques to detect superspreaders. Superspreaders are heavy distinct-hitters, but they scan victims quickly. In other words, the monitoring period is short. Therefore, the proposed techniques can detect network scans only if the scans finish within the short monitoring period.

Recently, an efficient streaming algorithm was proposed by Zhao et al. [6], which we call the Online Streaming Module (OSM) in this paper. They approached the issue by devising a traffic measurement tool that approximately estimates the cardinality of a source. It uses a two-dimensional bit table, which utilizes the memory space compactly. However, it suffers from the volume of normal traffic, as other IDSes do. Our proposed scheme also uses a two-dimensional bit table as the basic data structure to save memory space, but we use a new indexing method to overcome the problem of huge normal traffic, which no previous work achieves.

Recently, Gao et al. proposed to detect stealthy spreaders by using online outdegree histograms in the context of change detection [12]. Their scheme is also based on two alternating time windows. We emphasize that their definition of stealthy spreaders is different from ours.
In their definition, stealthy spreaders are a group of sources that together send scanning packets at a constant rate. In effect, they aim to detect scans from botnets. By contrast, we focus on detecting heavy spreaders that may seem invisible in that they can evade detection thanks to the large amount of normal traffic.

All of the above techniques work fine for a short monitoring period. However, none of them detects invisible spreaders if the malicious sources send scanning packets slowly and steadily.

V. CONCLUSION

We defined the invisible spreader detection problem and showed the negative effects of normal traffic. To the best of our knowledge, none of the current IDSes is free from these problems. We proposed a novel streaming algorithm to detect invisible spreaders and to mitigate the negative effects of normal traffic. Through experiments on real-world Internet traffic traces, we confirmed the advantages of the proposed scheme. It is expected to help network and security management staff in practice to detect general slow attacks, including invisible spreaders, which have been considered a difficult problem in network security.

REFERENCES

[1] B. Schneier, SIMS: Solution, or Part of the Problem? IEEE Security and Privacy, vol. , no. 5, October .
[2] nmap,
[3] nessus,
[4] paros,
[5] Snort,
[6] Q. Zhao, J. Xu, and A. Kumar, Detection of Super Sources and Destinations in High-Speed Networks: Algorithms, Analysis and Evaluation, IEEE JSAC, vol. 24, no. 10, October 2006.
[7] B. H. Bloom, Space/Time Trade-offs in Hash Coding with Allowable Errors, Communications of the ACM, vol. 13, no. 7, pp. 422-426, 1970.
[8] A. Broder and M. Mitzenmacher, Network Applications of Bloom Filters: A Survey, Internet Mathematics, vol. , no. , June .
[9] K.-Y. Whang, B. Vander-Zanden, and H. Taylor, A linear-time probabilistic counting algorithm for database applications, ACM Transactions on Database Systems, vol. 15, no. 2, June 1990.
[10] S. Staniford, J. Hoagland, and J. McAlerney, Practical Automated Detection of Stealthy Portscans, Journal of Computer Security, vol. 10, pp. 105-136, 2002.
[11] S. Venkataraman, D. Song, P. Gibbons, and A. Blum, New Streaming Algorithms for Fast Detection of Superspreaders, Proc. of NDSS 2005, Feb. 2005.
[12] Y. Gao, Y. Zhao, R. Schweller, S. Venkataraman, Y. Chen, D. Song, and M. Kao, Detecting Stealthy Spreaders Using Online Outdegree Histograms, Proc. of IEEE International Workshop on Quality of Service 2007, pp. 5 53, June 2007.