FlockStream: a Bio-inspired Algorithm for Clustering Evolving Data Streams

Size: px
Start display at page:

Download "FlockStream: a Bio-inspired Algorithm for Clustering Evolving Data Streams"

Transcription

1 FlockStream: a Bo-nspred Algorthm for Clusterng Evolvng Data Streams Agostno Forestero, Clara Pzzut, Gandomenco Spezzano Insttute for Hgh Performance Computng and Networkng, ICAR-CNR Va Petro Bucc, 41C Rende (CS), Italy {folno,pzzut,spezzano}@car.cnr.t Abstract Exstng densty-based data stream clusterng algorthms use a two-phase scheme approach consstng of an onlne phase, n whch raw data s processed to gather summary statstcs, and an offlne phase that generates the clusters by usng the summary data. In ths paper we propose a data stream clusterng method based on a mult-agent system that uses a decentralzed bottom-up self-organzng strategy to group smlar data ponts. Data ponts are assocated wth agents and deployed onto a 2D space, to work smultaneously by applyng a heurstc strategy based on a bo-nspred model, known as flockng model. Agents move onto the space for a fxed tme and, when they encounter other agents nto a predefned vsblty range, they can decde to form a flock f they are smlar. Flocks can jon to form swarms of smlar groups. Ths strategy allows to merge the two phases of densty-based approaches and thus to avod the offlne cluster computaton, snce a swarm represents a cluster. Expermental results show the capablty of the bo-nspred approach to obtan very good results on real and synthetc data sets. 1. Introducton In recent years, many organzatons are collectng tremendous amount of data, often generated contnuously as a sequence of events and comng from dfferent locatons. Credt card transactonal flows, telephone records, sensor network data, network event logs are just some examples of data streams. The desgn and development of fast, effcent, and accurate technques, able to extract knowledge from these huge data sets too large to ft nto the man memory of computers, pose sgnfcant challenges. Frst of all, data can be examned only once, as t arrves. Second, usng an entre data stream of a long perod could mslead the analyss results due to the weght of outdated data consdered n the same way as more recent data. Snce data streams are contnuos sequences of nformaton, the underlyng clusters could change wth tme, thus gvng dfferent results wth respect to the tme horzon over whch they are computed. Tradtonal clusterng approaches are not suffcently flexble to deal wth data that contnuously evolves wth tme, thus, n the last few years, many proposals to data stream clusterng have been presented [2, 3, 4, 5, 11, 12, 10, 1, 14]. Aggarwal et al. [1] have been the frst to address the problem of the mpossblty to revse a data stream durng the computaton. They suggested that a stream clusterng algorthm should be separated n an onlne mcro-clusterng component, that gathers approprate summary statstcs on the data stream, and an offlne macro-clusterng component that makes use of the nformaton stored to provde the clusterng results. One of the man drawback of ther CluStream algorthm s that t s unable to dscover clusters of arbtrary shapes, furthermore the number k of clusters must be fxed a-pror. A more amenable approach to clusterng data streams s that based on the concept of densty ntroduced by DBSCAN [9]. Densty-based approach [4, 5] apply the two-phase scheme [1] descrbed above n a densty-based framework. In partcular, the former, Den- Stream, extensvely descrbed n the next secton, extends the concept of core pont ntroduced n DBSCAN and employs the noton of mcro-cluster to store an approxmate representaton of the data ponts n a damped wndow model. The latter apples analogous concepts by assocatng a decay factor to the densty of each data pont and by parttonng the data space nto dscretzed grds where new data ponts are mapped. The authors study the relaton between tme horzon, decay factor, and data densty to guarantee the generaton of hgh qualty clusters. The man drawback of these methods s the executon tme requred when dealng wth large data sets of hgh dmensonalty. In fact, both the methods, when a new data pont arrves, n order to determne whether t should be merged nto an exstng cluster or consder t as the seed for a new group, need to do a comparson wth all the clusters generated so far.

2 In ths paper a densty-based data stream clusterng method, named FlockStream, that employs a mult-agent system usng a decentralzed bottom-up self-organzng strategy to group smlar data ponts, s proposed. Each data pont s assocated wth an agent. Agents are deployed onto a 2D space, called the vrtual space, and work smultaneously by applyng a heurstc strategy based on a bonspred model known as flockng model [8]. Agents move onto the space for a fxed tme and when they encounter other agents nto a predefned vsblty radus they can decde to form a flock (.e a mcro-cluster) f they are smlar. However, the partcpaton of an agent to a group s not defntve snce, f durng the space exploraton a more smlar agent s encountered, the current flock can be dropped and the agent can jon to the nearest agents flock. The movement of the agents n the 2D space s not random, but t s guded by the smlarty functon that aggregates the agents to ther closer neghbors. As dfferent smlar mcroclusters can be created, by applyng the flockng rules, they are aggregated to form swarms of close mcro-clusters. FlockStream ntroduces two man noveltes. Frst, t replaces the exhaustve search of the nearest neghbor of a pont, necessary to assgn t to the approprate mcro-cluster, wth a local stochastc mult-agent search that works n parallel. The method s thus completely decentralzed as each agent acts ndependently from each other and communcates only wth ts mmedate neghbors n an asynchronous way. Localty and asynchronsm mples that the algorthm s scalable to very large data sets. Second, snce flocks of agents can jon together nto swarms of smlar groups, the two-phase scheme of data stream clusterng methods mentoned above s replaced by a unque onlne phase, n whch the clusterng results are always avalable. Ths means that the clusterng generaton on demand by the user can be satsfed at any tme by smply showng all the swarms computed so far. The paper s organzed as follows. The next secton descrbes the DenStream algorthm. Secton 3 ntroduces the Flockng model. Secton 4 descrbes our approach. In secton 5, fnally, the results of the method on synthetc and real lfe data sets are presented. 2. DenStream Algorthm DenStream [4] s a densty-based clusterng algorthm for evolvng data streams that uses summary statstcs to capture synopss nformaton about the nature of the data stream. These statstcs are exploted to generate clusters wth arbtrary shape. The algorthm assumes the damped wndow model to cluster data streams. In ths model the weght of each data pont decreases exponentally wth tme t va a fadng functon f(t) = 2 λt, where λ > 0. The weght of the data stream s a constant W = v, where 1 2 λ v s the speed of the stream,.e. the number of ponts arrved n one unt tme. Hstorcal data dmnshes ts mportance when λ assumes hgher values. The authors extend the concept of core pont ntroduced n DBSCAN [9] and employ the noton of mcro-cluster to store an approxmate representaton of the data ponts. A core pont s an object n whose ɛ neghborhood the overall weght of the ponts s at least an nteger µ. A clusterng s a set of core objects havng cluster labels. Three defntons of mcro-clusters are then ntroduced: the coremcro-cluster, the potental core-mcro-cluster, and the outler mcro-cluster. A core-mcro-cluster (abbrevated c-mcro-cluster) at tme t for a group of close ponts p 1,..., p n wth tme stamps T 1,..., T n s defned as CMC(w, c, r), where w s the weght, c s the center, and r s the radus of the c- mcro-cluster. The weght w = n j=1 f(t T j) must be such that w µ. The center s defned as n j=1 c = f(t T j)p j w and the radus n j=1 r = f(t T j)dst(p j, c) w s such that r ɛ. dst(p j, c) s the Eucldean dstance between the pont p j and the center c. Note that the weght of a mcro-cluster must be above a predefnes threshold µ n order to be consdered core. The authors assume that clusters wth arbtrary shape n a data stream can be descrbed by a set of c-mcro-clusters. However, snce as data flows t can change, structures apt to ncremental computaton, smlar to those proposed by [1] are ntroduced. A potental c-mcro-cluster, abbrevated p-mcro-cluster, at tme t for a group of close ponts p 1,..., p n wth tme stamps T 1,..., T n s defned as {CF 1, CF 2, w}, where the weght w, as defned above, must be such that w βµ. β, 0 < β 1 s a parameter defnng the outlerness threshold relatve to c-mcro-clusters. CF 1 = n j=1 f(t T j)p j s the weghted lnear sum of the ponts, CF 2 = n j=1 f(t T j)p 2 j s the weghed squared sum of the ponts. The center of a p-mcro-cluster s c = CF 1 and CF 2 the radus r ɛ s r = w ( CF 1 w )2 A p-mcro-cluster s a set of ponts that could become a mcro-cluster. An outler mcro-cluster, abbrevated o-mcro-cluster, at tme t for a group of close ponts p 1,..., p n wth tme stamps T 1,..., T n s defned as {CF 1, CF 2, w, t 0 }. The defnton of w, CF 1, CF 2, center and radus are the same of the p-mcro-cluster. t 0 = T 1 denotes the creaton of the o-mcro-cluster. In an outler mcro-cluster the weght w must be below the fxed threshold, thus w < βµ. However w

3 t could grow nto a potental mcro-cluster when, addng new ponts, ts weght exceeds the threshold. The algorthms conssts of two phases: the onlne phase n whch the mcro-clusters are mantaned and updated as new ponts arrve onlne; the off-lne phase n whch the fnal clusters are generated, on demand, by the user. Durng the onlne phase, when a new pont p arrves, DenStream tres to merge p nto ts nearest p-mcro-cluster c p. Ths s done only f the the radus r p of c p does not augment above ɛ,.e. r p ɛ. If ths constrant s not satsfed, the algorthm tres to merge p nto ts nearest o-mcro-cluster c o, provded that the new radus r o ɛ. The weght w s then checked f w βµ. In such a case c o s promoted to p- mcro-cluster. Otherwse a new o-mcro-cluster s generated by p. Note that for an exstng p-mcro-cluster c p, f no new ponts are added to t, ts weght wll decay gradually. When t s below βµ, c p becomes an outler. The off-lne part of the algorthm uses a varant of the DBSCAN algorthm n whch the potental mcro-clusters are consdered as vrtual ponts. The concepts of denstyconnectvty and densty reachable, adopted n DBSCAN, are used by DenStream to generate the fnal result. Den- Stream cannot be used to handle huge amounts of data avalable n large-scale networks of autonomous data sources snce t needs to fnd the closest mcro-cluster for each newly arrved data pont and t assumes that all data s located at the same ste where t s processed. In the next secton a computatonal model that overcome these dsadvantages s descrbed. 3 The Flockng model The flockng model [8] s a bologcally nspred computatonal model for smulatng the anmaton of a flock of enttes. In ths model each ndvdual (also called brd) makes movement decsons wthout any communcaton wth others. Instead, t acts accordng to a small number of smple rules, dependng only upon neghborng members n the flock and envronmental obstacles. These smple rules generate a complex global behavor of the entre flock. The basc flockng model was frst proposed by Crag Reynolds [13], n whch he referred to each ndvdual as a bod. Ths model conssts of three smple steerng rules that a bod needs to execute at each nstance over tme: separaton (steerng to avod collson wth neghbors); algnment (steerng toward the average headng and matchng the velocty of neghbors); coheson (steerng toward the average poston of neghbors). These rules descrbe how a bod reacts to other bods movement n ts local neghborhood. The degree of localty s determned by the vsblty range of the bod s sensor. The bod does not react to the flockmates outsde ts sensor range. A mnmal dstance must also be mantaned among them to avod collson. (a) Fgure 1. Algnment rule. (a) Fgure 2. Separaton rule. A Multple Speces Flockng (MSF) model [7] has been developed to more accurately smulate flockng behavor among an heterogeneous populaton of enttes. MSF ncludes a feature smlarty rule that allows each bod to dscrmnate among ts neghbors and to group only wth those smlar to tself. The addton of ths rule enables the flock to organze groups of heterogeneous mult-speces nto homogeneous subgroups consstng only of ndvduals of the same speces. The basc behavor rules [6] of a sngle entty of the MSF model are llustrated n fgures 1, 2, 3. Let R 1 and R 2, wth R 1 > R 2, be the radus ndcatng the vsblty range of the bods and the mnmum dstance that must be mantaned among them respectvely, and d(f, A c ) the dstance between the current agent A c and a flockmate F. The algnment rule means that a bod tends to move n the same drecton of the nearby bods,.e. t tres to algn ts velocty vector wth the average velocty vector of the flocks n ts local neghborhood. Ths rule s depcted n fgure 1 and t s formally descrbed as f d(f, A c ) R 1 d(f, A c ) R 2 v ar = 1 n (b) (b) v where v ar s the velocty drven by the algnment rule, d(f, A c ) s the dstance between the bod A c and ts neghbor flockmate F, v s the velocty of the bod F, and n s the number of neghbors. The separaton rule avods that a bod be too close to another bod, thus f d(f, A c ) 2R 2 v sr = v + v c d(f, A c )

4 where v sr s the separaton velocty, v c and v are the veloctes of the current bod and of the -th flockmate, respectvely. Ths rule s shown n fgure 2. The advantage of the flockng algorthm s the heurstc prncple of the flock s searchng mechansm. The heurstc searchng mechansm helps bods to quckly form a flock. Snce the bods contnuously fly n the vrtual space and jon the flock consttuted by bods more smlar to them, new results can be quckly re-generate when addng enttes bods or deletng part of bods at run tme. Ths feature allows the flockng algorthm to be appled to clusterng to analyze dynamcally changng nformaton stream. (a) (b) 4 The FlockStream algorthm Fgure 3. Coheson rule. The coheson rule moves a bod towards other nearby bods (unless another bod s too close) orentng the velocty vector of the bod n the drecton of the centrod of the local flock. d(f, A c ) R 1 d(f, A c ) R 2 v cr = (P P c ) where V cr s the coheson velocty and (P P c ) calculates a drectonal vector pont, beng P and P c the postons of the current bod A c and a neghbor bod F. Fgure 3 depcts the coheson rule. When two bods are too close, the separaton rule overrdes the other two, whch are deactvated untl the mnmum separaton s acheved. Smlar bods try to stay close usng the strength of the attractng force, that s proportonal to the dstance between the bods and the smlarty between the bods feature values. Ths strength s computed as v sm = (Sm(F, A c ) d(p, P c )) where v sm s the velocty drven by feature smlarty, Sm(F, A c ) s the smlarty value between the features of bods A c and F, and d(p, P c ) s the dstance between ther postons n the 2D space. Dssmlar bods try to stay away from other bods that have dssmlar features by a repulsve force that s nversely proportonal to the dstance between the bods and the smlarty between the bods. v dsm = 1 Sm(F, A c ) d(p, P c ) where v dsm s the velocty drven by feature dssmlarty. The overall flockng behavor can be expressed by a weghted lnear combnaton of the veloctes calculated by all the rules that represent the net velocty vector v of the bod n the vrtual space. FlockStream s a heurstc densty-based data stream clusterng algorthm bult on the Multple Speces Flockng model. The algorthm uses agents wth dstnct smple functonaltes to mmc the flockng behavor. Each multdmensonal data tem s assocated wth an agent. In our approach, n addton to the standard acton rules of the flockng model, we ntroduce an extenson to the flockng model by consderng the type of an agent. The agents can be of three types: basc (representng a new pont arrvng n one tme unt), p-representatve and o-representatve (correspondng to p- or o- mcro-clusters). FlockStream dstngushes between the ntalzaton phase, n whch the vrtual space s populated of only basc agents, and the mcrocluster mantenance and clusterng, n whch all the three types of agents are present. Intalzaton. At the begnnng a set of basc agents,.e. a set of ponts, s deployed randomly onto the vrtual space. The basc agents work n parallel for a predefned number of teratons and move accordng to the MSF heurstc. Analogously to brds n the real world, agents that share smlar object vector features wll group together and become a flock, whle dssmlar brds wll be moved away from the flock. Agents uses a proxmty functon to dentfy objects that are smlar. In our algorthm we use the Eucldean dstance to measure the dssmlarty between data ponts A and B, and assume that A and B are smlar f ther Eucldean dstance d(a, B) ɛ. Whle teratng, the behavor (velocty) of each agent A wth poston P a s nfluenced by all the agents X wth poston P x n ts neghborhood. The agent s velocty s computed by applyng the local rules of Reynolds and the smlarty rule. The smlarty rule nduces an adaptve behavor to the algorthm snce the agents can leave the group they partcpate for another group contanng agents wth hgher smlarty. Thus, durng ths predefned number of teratons, the ponts jon and leave the groups formng dfferent flocks. At the end of the teratons, for each created group, summary statstcs are computed and the stream of data s dscarded. As result of ths ntal phase we have the two other types of agents: p-representatve and o-representatve agents. Representatve Mantenance and Clusterng. When a

5 new data stream bulk of agents s nserted nto the vrtual space, at a fxed stream speed, the mantenance of the p- and o- representatve agents and onlne clusterng are performed for a fxed number of teratons. Analogously to DenStream, dfferent cases can occur (see fgure 4) : a p-representatve c p or an o-representatve c o encounters another representatve agent. If the dstance between them s below ɛ then they compute the velocty vector by applyng the Reynolds and smlarty rules (step 5), and jon to form a swarm (.e. a cluster) of smlar representatves (step 6). A basc agent A meets ether a p-representatve c p or an o-representatve c o n ts vsblty range. The smlarty between A and the representatve s computed and, f the new radus of c p (c o respectvely) s below or equal to ɛ, A s absorbed by c p (c o ) (step 9). Note that at ths stage FlockStream does not update the summary statstcs because the aggregaton of the basc agent A to the mcro-cluster could be dropped f A, durng ts movement on the vrtual space, encounters another agent more smlar to t. A basc agent A meets another basc agent B. The smlarty between the two agents s calculated and, f d(a, B) ɛ, then the velocty vector s computed (step 11) and A s joned wth B to form an o- representatve (step 12). At the end of the maxmum number of teratons allowed, for each swarm, the summary statstcs of the representatve agents t contans are updated and, f the weght w of a p-representatve dmnshes below βµ, t s degraded to become an o-representatve. On the contrary, f the weght w of an o-representatve becomes above βµ, a new p-representatve s created. It s worth to note that, the swarms of representatve agents avod the off-lne phase of DenStream to apply a clusterng algorthm to get the fnal result. In fact a swarm represents a cluster, thus the clusterng generaton on demand by the user can be satsfed at any tme by smply showng all the swarms computed so far. Fgure 5 llustrates the vrtual space at a generc teraton. The fgure ponts out the presence of swarms of agents, as well as the presence of solated agents not yet aggregated wth other smlar agents. In the next secton we show that our approach successfully detects clusters n evolvng data streams. 5. Expermental Results In ths secton we study the effectveness of FlockStream on real and synthetc datasets. The algorthm has been mplemented n Java and all the experments have been performed on an Intel(R) Core(TM) havng 2 Gb 1. for =1... MaxIteratons 2. foreach agent (all) 3. f (typeagent s (p-representatve o-representatve)) 4. then{ 5. computeveloctyvector(flockmates, MSF rules); 6. moveagentandformswarm( v);} 7. else{ 8. f (typeagent s (basc n neghborhood of a smlar swarm) ) 9. then agent temporary absorbed n the swarm; 10. else{ 11. computeveloctyvector(flockmates, MSF rules); 12. moveagentandformswarm( v);}} 13. end foreach 14. end for Fgure 4. The pseudo-code of the Flock- Stream algorthm. of memory. The synthetc datasets used, named DS1, DS2 and DS3, are showed n Fgure 6(a). Each of them contans 10,000 ponts and they are smlar to those employed by Cao et al. n [4] to evaluate DenStream. For a far comparson, the evolvng data stream, denoted EDS, has been created by adoptng the same strategy of Cao et al. Each dataset has been randomly chosen 10 tmes, thus generatng an evolvng data stream of total length 100,000 havng a 10,000 ponts block for each tme unt. The real dataset used to test the performances of FlockStream s the KDD Cup 1999 Data set ( Ths data set comes from the 1998 DARPA Intruson Detecton Evaluaton Data and contans tranng data consstng of 7 weeks of network-based ntrusons nserted n the normal data, and 2 weeks of network-based ntrusons and normal data for a total of 4,999,000 connecton records descrbed by 41 characterstcs. The man categores of ntrusons are four: Dos (Denal Of Servce), R2L (unauthorzed access from a remote machne), U2R (unauthorzed access to a local super-user prvleges by a local un-prvleged user), PROBING (survellance and probng). The data set has been transformed nto a data stream by takng the nput order as the streamng order. For all the datasets the true class label s known, thus the clusterng qualty can be evaluated by the average purty of the clusters. The average purty of a clusterng s defned as: purty = K C d =1 C K 100%; where K ndcates the number of clusters, C d denotes the

6 Fgure 5. Vsualzaton of swarms of agents and solated agents on the vrtual space at a generc teraton. DS1 DS2 DS3 (a) DS1 DS2 DS3 (b) (c) Fgure 6. (a) Synthetc data sets.(b) Clusterng performed by FlockStream on the synthetc datasets. (c) Clusterng performed by FlockStream on the evolvng data stream EDS. number of ponts wth the domnant class label n cluster, and C denotes the number of ponts n cluster. The purty s calculated only for the ponts arrvng n a predefned wndow (horzon), snce the weght of ponts dmnshes contnuously. The parameters used by FlockStream n the experments are analogous to those adopted by Cao et al., that s ntal ponts/agents Na = 1000, stream speed v = 1000, decay factor λ = 0.25, ɛ = 16, µ = 10, outler threshold β = 0.2 and MaxIteratons = Intally we evaluated the FlockStream algorthm on the non-

7 Fgure 7. Clusterng qualty for evolvng data stream EDS wth horzon=2 and stream speed=2000. Fgure 8. Clusterng qualty for evolvng data stream EDS wth horzon=10 and stream speed=1000. evolvng datasets DS1, DS2 and DS3, to check the ablty of the method to get the shape of each cluster. The results are reported n Fgure 6(b). In ths fgure the crcles ndcate the mcro-cluster detected by the algorthm. We can see that FlockStream exactly recovers the cluster shape. The results obtaned by FlockStream on the evolvng data stream EDS, at dfferent tmes, are shown n fgure 6(c). In the fgure, ponts ndcate the raw data whle crcles denote the mcro-clusters. It can be seen that FlockStream captures the shape of each cluster as the data streams evolve. The purty results of FlockStream compared to DenStream on the EDS data stream are shown n fgure 7. Here, the horzon s set to 2 and the stream speed s set to 2000 ponts per tme unt. We can see the very good clusterng qualty of FlockStream, n fact t s always hgher than 95% and comparable to DenStream. Fgure 8 shows the results of FlockStream when the stream speed s set to 1000 ponts per tme unt and the horzon s set to 10 on the EDS data stream. The results show that FlockStream, smlarly to DenStream, s nsenstve to the horzon. The comparson between FlockStream and DenStream on the Network Intruson data set s shown n Fgures 9 and 10. We selected the same tme ponts, when some partcular attacks happen, chosen by DenStream, and we report the results obtaned at these moments. In the former fgure the horzon s set to 1, whereas n the latter the horzon s set to 5; the stream speed s set to 1000 for both. We can see how FlockStream clearly ouperforms DenStream n almost all the tme unts chosen and the very hgh clusterng qualty acheved also on ths dataset. The purty of FlockStream compared to DenStream n a dataset wth nose s calculated and the clusterng purty results of EDS wth 1% and 5% nose are shown n fgures 11 and 12 respectvely. The results demonstrate that FlockStream acheves hgh clusterng qualty, comparable to DenStream, also when nose s present. Fgure 9. Clusterng qualty for Network Intruson dataset wth horzon=1 and stream speed= Conclusons A heurstc densty-based data stream clusterng algorthm, bult on the Multple Speces Flockng model, has been presented. The method employs a local stochastc mult-agent search strategy that allows agents to act ndependently from each other and to communcate only wth mmedate neghbors n an asynchronous way. Decentralzaton and asynchronsm makes the algorthm scalable to very large data sets. Another man novelty of the approach s that the two-phase scheme of densty-based data stream clusterng methods s replaced by a unque onlne phase, n whch the clusterng results are always avalable. Ths means that the clusterng generaton on demand by the user can be satsfed at any tme by smply showng all the swarms computed so far. Expermental results on real and synthetc data sets confrm the valdty of the approach proposed. Future work ams at extendng the method to a dstrbuted framework, more apt to real lfe applcatons.

8 References Fgure 10. Clusterng qualty for Network Intruson dataset wth horzon=5 and stream speed=1000. Fgure 11. Clusterng qualty for evolvng data stream EDS wth horzon=2, stream speed=2000 and wth 1% nose. Fgure 12. Clusterng qualty for evolvng data stream EDS wth horzon=10, stream speed=1000 and wth 5% nose. [1] C. C. Aggarwal, J. Han, J. Wang, and P. Yu. On clusterng massve data streams: a summarzaton paradgm. In n Data Streams- Models and Algorthms, Charu C. Aggarwal eds., pages Sprnger, [2] B. Babock, M. Datar, R. Motwan, and L. O Callaghan. Mantanng varance and k-medans over data stream wndows. In Proceedngs of the 22nd ACM Symposum on Prncples of Data base Systems (PODS 2003), pages , [3] D. Barbará. Requrements for clusterng data streams. SIGKDD Exploratons Newsletter, 3(2):23 27, [4] F. Cao, M. Ester, W. Qan, and A. Zhou. Densty-based clusterng over evolvng data stream wth nose. In Proceedngs of the Sxth SIAM Internatonal Conference on Data Mnng (SIAM 2006), pages , [5] Y. Chen and L. Tu. Densty-based clusterng for real-tme stream data. In Proceedngs of the 13th ACM SIGKDD Internatonal conference on Knowledge dscovery and data mnng (KDD 07), pages ACM, [6] X. Cu, J. Gao, and T. E. Potok. A flockng based algorthm for document clusterng analyss. Journal of Systems Archtecture, 52(8-9): , [7] X. Cu and T. E. Potok. A dstrbuted agent mplementaton of multple speces flockng model for document parttonng clusterng. In Cooperatve Informaton Agents, pages , [8] R. C. Eberhart, Y. Sh, and J. Kennedy. Swarm Intellgence (The Morgan Kaufmann Seres n Artfcal Intellgence). Morgan Kaufmann, March [9] M. Ester, H.-P. Kregel, J. Sander, and X. Xu. A denstybased algorthm for dscoverng clusters n large spatal databases wth nose. In Proceedngs of the Second ACM SIGKDD Internatonal conference on Knowledge dscovery and data mnng (KDD 96), pages, [10] S. Gua, A. Meyerson, N. Mshra, R. Motwan, and L. O Callaghan. Clusterng data strams: Theory and practse. IEEE Transactons on Knowledge and Data Engneerng, 15(3): , [11] S. Gua, N. Mshra, R. Motwan, and L. O Callaghan. Clusterng data streams. In Proceedngs of the Annual IEEE Symposum on Foundatons of Computer Scence, pages , [12] L. O Callaghan, N. Mshra, N. Mshra, and S. Gua. Streamng-data algorthms for hgh qualty clusterng. In Proceedngs of the 18th Internatonal Conference on Data Engneerng (ICDE 01), pages , [13] C. W. Reynolds. Flocks, herds and schools: A dstrbuted behavoral model. In SIGGRAPH 87: Proceedngs of the 14th annual conference on Computer graphcs and nteractve technques, pages 25 34, New York, NY, USA, ACM. [14] A. Zhou, F. Cao, W. Qan, and C. Jn. Trackng clusters n evolvng data streams over sldng wndows. Knowledge and Informaton Systems, 15(2): , 2007.

Study of Data Stream Clustering Based on Bio-inspired Model

Study of Data Stream Clustering Based on Bio-inspired Model , pp.412-418 http://dx.do.org/10.14257/astl.2014.53.86 Study of Data Stream lusterng Based on Bo-nspred Model Yngme L, Mn L, Jngbo Shao, Gaoyang Wang ollege of omputer Scence and Informaton Engneerng,

More information

Subspace clustering. Clustering. Fundamental to all clustering techniques is the choice of distance measure between data points;

Subspace clustering. Clustering. Fundamental to all clustering techniques is the choice of distance measure between data points; Subspace clusterng Clusterng Fundamental to all clusterng technques s the choce of dstance measure between data ponts; D q ( ) ( ) 2 x x = x x, j k = 1 k jk Squared Eucldean dstance Assumpton: All features

More information

Outline. Self-Organizing Maps (SOM) US Hebbian Learning, Cntd. The learning rule is Hebbian like:

Outline. Self-Organizing Maps (SOM) US Hebbian Learning, Cntd. The learning rule is Hebbian like: Self-Organzng Maps (SOM) Turgay İBRİKÇİ, PhD. Outlne Introducton Structures of SOM SOM Archtecture Neghborhoods SOM Algorthm Examples Summary 1 2 Unsupervsed Hebban Learnng US Hebban Learnng, Cntd 3 A

More information

An Optimal Algorithm for Prufer Codes *

An Optimal Algorithm for Prufer Codes * J. Software Engneerng & Applcatons, 2009, 2: 111-115 do:10.4236/jsea.2009.22016 Publshed Onlne July 2009 (www.scrp.org/journal/jsea) An Optmal Algorthm for Prufer Codes * Xaodong Wang 1, 2, Le Wang 3,

More information

Cluster Analysis of Electrical Behavior

Cluster Analysis of Electrical Behavior Journal of Computer and Communcatons, 205, 3, 88-93 Publshed Onlne May 205 n ScRes. http://www.scrp.org/ournal/cc http://dx.do.org/0.4236/cc.205.350 Cluster Analyss of Electrcal Behavor Ln Lu Ln Lu, School

More information

Parallelism for Nested Loops with Non-uniform and Flow Dependences

Parallelism for Nested Loops with Non-uniform and Flow Dependences Parallelsm for Nested Loops wth Non-unform and Flow Dependences Sam-Jn Jeong Dept. of Informaton & Communcaton Engneerng, Cheonan Unversty, 5, Anseo-dong, Cheonan, Chungnam, 330-80, Korea. seong@cheonan.ac.kr

More information

Hierarchical clustering for gene expression data analysis

Hierarchical clustering for gene expression data analysis Herarchcal clusterng for gene expresson data analyss Gorgo Valentn e-mal: valentn@ds.unm.t Clusterng of Mcroarray Data. Clusterng of gene expresson profles (rows) => dscovery of co-regulated and functonally

More information

Steps for Computing the Dissimilarity, Entropy, Herfindahl-Hirschman and. Accessibility (Gravity with Competition) Indices

Steps for Computing the Dissimilarity, Entropy, Herfindahl-Hirschman and. Accessibility (Gravity with Competition) Indices Steps for Computng the Dssmlarty, Entropy, Herfndahl-Hrschman and Accessblty (Gravty wth Competton) Indces I. Dssmlarty Index Measurement: The followng formula can be used to measure the evenness between

More information

A Binarization Algorithm specialized on Document Images and Photos

A Binarization Algorithm specialized on Document Images and Photos A Bnarzaton Algorthm specalzed on Document mages and Photos Ergna Kavalleratou Dept. of nformaton and Communcaton Systems Engneerng Unversty of the Aegean kavalleratou@aegean.gr Abstract n ths paper, a

More information

A Deflected Grid-based Algorithm for Clustering Analysis

A Deflected Grid-based Algorithm for Clustering Analysis A Deflected Grd-based Algorthm for Clusterng Analyss NANCY P. LIN, CHUNG-I CHANG, HAO-EN CHUEH, HUNG-JEN CHEN, WEI-HUA HAO Department of Computer Scence and Informaton Engneerng Tamkang Unversty 5 Yng-chuan

More information

Compiler Design. Spring Register Allocation. Sample Exercises and Solutions. Prof. Pedro C. Diniz

Compiler Design. Spring Register Allocation. Sample Exercises and Solutions. Prof. Pedro C. Diniz Compler Desgn Sprng 2014 Regster Allocaton Sample Exercses and Solutons Prof. Pedro C. Dnz USC / Informaton Scences Insttute 4676 Admralty Way, Sute 1001 Marna del Rey, Calforna 90292 pedro@s.edu Regster

More information

An Iterative Solution Approach to Process Plant Layout using Mixed Integer Optimisation

An Iterative Solution Approach to Process Plant Layout using Mixed Integer Optimisation 17 th European Symposum on Computer Aded Process Engneerng ESCAPE17 V. Plesu and P.S. Agach (Edtors) 2007 Elsever B.V. All rghts reserved. 1 An Iteratve Soluton Approach to Process Plant Layout usng Mxed

More information

SLAM Summer School 2006 Practical 2: SLAM using Monocular Vision

SLAM Summer School 2006 Practical 2: SLAM using Monocular Vision SLAM Summer School 2006 Practcal 2: SLAM usng Monocular Vson Javer Cvera, Unversty of Zaragoza Andrew J. Davson, Imperal College London J.M.M Montel, Unversty of Zaragoza. josemar@unzar.es, jcvera@unzar.es,

More information

User Authentication Based On Behavioral Mouse Dynamics Biometrics

User Authentication Based On Behavioral Mouse Dynamics Biometrics User Authentcaton Based On Behavoral Mouse Dynamcs Bometrcs Chee-Hyung Yoon Danel Donghyun Km Department of Computer Scence Department of Computer Scence Stanford Unversty Stanford Unversty Stanford, CA

More information

Tsinghua University at TAC 2009: Summarizing Multi-documents by Information Distance

Tsinghua University at TAC 2009: Summarizing Multi-documents by Information Distance Tsnghua Unversty at TAC 2009: Summarzng Mult-documents by Informaton Dstance Chong Long, Mnle Huang, Xaoyan Zhu State Key Laboratory of Intellgent Technology and Systems, Tsnghua Natonal Laboratory for

More information

Machine Learning: Algorithms and Applications

Machine Learning: Algorithms and Applications 14/05/1 Machne Learnng: Algorthms and Applcatons Florano Zn Free Unversty of Bozen-Bolzano Faculty of Computer Scence Academc Year 011-01 Lecture 10: 14 May 01 Unsupervsed Learnng cont Sldes courtesy of

More information

Simulation Based Analysis of FAST TCP using OMNET++

Simulation Based Analysis of FAST TCP using OMNET++ Smulaton Based Analyss of FAST TCP usng OMNET++ Umar ul Hassan 04030038@lums.edu.pk Md Term Report CS678 Topcs n Internet Research Sprng, 2006 Introducton Internet traffc s doublng roughly every 3 months

More information

A Fast Visual Tracking Algorithm Based on Circle Pixels Matching

A Fast Visual Tracking Algorithm Based on Circle Pixels Matching A Fast Vsual Trackng Algorthm Based on Crcle Pxels Matchng Zhqang Hou hou_zhq@sohu.com Chongzhao Han czhan@mal.xjtu.edu.cn Ln Zheng Abstract: A fast vsual trackng algorthm based on crcle pxels matchng

More information

An Entropy-Based Approach to Integrated Information Needs Assessment

An Entropy-Based Approach to Integrated Information Needs Assessment Dstrbuton Statement A: Approved for publc release; dstrbuton s unlmted. An Entropy-Based Approach to ntegrated nformaton Needs Assessment June 8, 2004 Wllam J. Farrell Lockheed Martn Advanced Technology

More information

Network Intrusion Detection Based on PSO-SVM

Network Intrusion Detection Based on PSO-SVM TELKOMNIKA Indonesan Journal of Electrcal Engneerng Vol.1, No., February 014, pp. 150 ~ 1508 DOI: http://dx.do.org/10.11591/telkomnka.v1.386 150 Network Intruson Detecton Based on PSO-SVM Changsheng Xang*

More information

Clustering Algorithm Combining CPSO with K-Means Chunqin Gu 1, a, Qian Tao 2, b

Clustering Algorithm Combining CPSO with K-Means Chunqin Gu 1, a, Qian Tao 2, b Internatonal Conference on Advances n Mechancal Engneerng and Industral Informatcs (AMEII 05) Clusterng Algorthm Combnng CPSO wth K-Means Chunqn Gu, a, Qan Tao, b Department of Informaton Scence, Zhongka

More information

A mathematical programming approach to the analysis, design and scheduling of offshore oilfields

A mathematical programming approach to the analysis, design and scheduling of offshore oilfields 17 th European Symposum on Computer Aded Process Engneerng ESCAPE17 V. Plesu and P.S. Agach (Edtors) 2007 Elsever B.V. All rghts reserved. 1 A mathematcal programmng approach to the analyss, desgn and

More information

The Research of Support Vector Machine in Agricultural Data Classification

The Research of Support Vector Machine in Agricultural Data Classification The Research of Support Vector Machne n Agrcultural Data Classfcaton Le Sh, Qguo Duan, Xnmng Ma, Me Weng College of Informaton and Management Scence, HeNan Agrcultural Unversty, Zhengzhou 45000 Chna Zhengzhou

More information

Classifier Selection Based on Data Complexity Measures *

Classifier Selection Based on Data Complexity Measures * Classfer Selecton Based on Data Complexty Measures * Edth Hernández-Reyes, J.A. Carrasco-Ochoa, and J.Fco. Martínez-Trndad Natonal Insttute for Astrophyscs, Optcs and Electroncs, Lus Enrque Erro No.1 Sta.

More information

Outline. Type of Machine Learning. Examples of Application. Unsupervised Learning

Outline. Type of Machine Learning. Examples of Application. Unsupervised Learning Outlne Artfcal Intellgence and ts applcatons Lecture 8 Unsupervsed Learnng Professor Danel Yeung danyeung@eee.org Dr. Patrck Chan patrckchan@eee.org South Chna Unversty of Technology, Chna Introducton

More information

6.854 Advanced Algorithms Petar Maymounkov Problem Set 11 (November 23, 2005) With: Benjamin Rossman, Oren Weimann, and Pouya Kheradpour

6.854 Advanced Algorithms Petar Maymounkov Problem Set 11 (November 23, 2005) With: Benjamin Rossman, Oren Weimann, and Pouya Kheradpour 6.854 Advanced Algorthms Petar Maymounkov Problem Set 11 (November 23, 2005) Wth: Benjamn Rossman, Oren Wemann, and Pouya Kheradpour Problem 1. We reduce vertex cover to MAX-SAT wth weghts, such that the

More information

Virtual Machine Migration based on Trust Measurement of Computer Node

Virtual Machine Migration based on Trust Measurement of Computer Node Appled Mechancs and Materals Onlne: 2014-04-04 ISSN: 1662-7482, Vols. 536-537, pp 678-682 do:10.4028/www.scentfc.net/amm.536-537.678 2014 Trans Tech Publcatons, Swtzerland Vrtual Machne Mgraton based on

More information

Overview. Basic Setup [9] Motivation and Tasks. Modularization 2008/2/20 IMPROVED COVERAGE CONTROL USING ONLY LOCAL INFORMATION

Overview. Basic Setup [9] Motivation and Tasks. Modularization 2008/2/20 IMPROVED COVERAGE CONTROL USING ONLY LOCAL INFORMATION Overvew 2 IMPROVED COVERAGE CONTROL USING ONLY LOCAL INFORMATION Introducton Mult- Smulator MASIM Theoretcal Work and Smulaton Results Concluson Jay Wagenpfel, Adran Trachte Motvaton and Tasks Basc Setup

More information

Maintaining temporal validity of real-time data on non-continuously executing resources

Maintaining temporal validity of real-time data on non-continuously executing resources Mantanng temporal valdty of real-tme data on non-contnuously executng resources Tan Ba, Hong Lu and Juan Yang Hunan Insttute of Scence and Technology, College of Computer Scence, 44, Yueyang, Chna Wuhan

More information

Learning the Kernel Parameters in Kernel Minimum Distance Classifier

Learning the Kernel Parameters in Kernel Minimum Distance Classifier Learnng the Kernel Parameters n Kernel Mnmum Dstance Classfer Daoqang Zhang 1,, Songcan Chen and Zh-Hua Zhou 1* 1 Natonal Laboratory for Novel Software Technology Nanjng Unversty, Nanjng 193, Chna Department

More information

Load Balancing for Hex-Cell Interconnection Network

Load Balancing for Hex-Cell Interconnection Network Int. J. Communcatons, Network and System Scences,,, - Publshed Onlne Aprl n ScRes. http://www.scrp.org/journal/jcns http://dx.do.org/./jcns.. Load Balancng for Hex-Cell Interconnecton Network Saher Manaseer,

More information

Content Based Image Retrieval Using 2-D Discrete Wavelet with Texture Feature with Different Classifiers

Content Based Image Retrieval Using 2-D Discrete Wavelet with Texture Feature with Different Classifiers IOSR Journal of Electroncs and Communcaton Engneerng (IOSR-JECE) e-issn: 78-834,p- ISSN: 78-8735.Volume 9, Issue, Ver. IV (Mar - Apr. 04), PP 0-07 Content Based Image Retreval Usng -D Dscrete Wavelet wth

More information

Meta-heuristics for Multidimensional Knapsack Problems

Meta-heuristics for Multidimensional Knapsack Problems 2012 4th Internatonal Conference on Computer Research and Development IPCSIT vol.39 (2012) (2012) IACSIT Press, Sngapore Meta-heurstcs for Multdmensonal Knapsack Problems Zhbao Man + Computer Scence Department,

More information

BIN XIA et al: AN IMPROVED K-MEANS ALGORITHM BASED ON CLOUD PLATFORM FOR DATA MINING

BIN XIA et al: AN IMPROVED K-MEANS ALGORITHM BASED ON CLOUD PLATFORM FOR DATA MINING An Improved K-means Algorthm based on Cloud Platform for Data Mnng Bn Xa *, Yan Lu 2. School of nformaton and management scence, Henan Agrcultural Unversty, Zhengzhou, Henan 450002, P.R. Chna 2. College

More information

Sum of Linear and Fractional Multiobjective Programming Problem under Fuzzy Rules Constraints

Sum of Linear and Fractional Multiobjective Programming Problem under Fuzzy Rules Constraints Australan Journal of Basc and Appled Scences, 2(4): 1204-1208, 2008 ISSN 1991-8178 Sum of Lnear and Fractonal Multobjectve Programmng Problem under Fuzzy Rules Constrants 1 2 Sanjay Jan and Kalash Lachhwan

More information

Improvement of Spatial Resolution Using BlockMatching Based Motion Estimation and Frame. Integration

Improvement of Spatial Resolution Using BlockMatching Based Motion Estimation and Frame. Integration Improvement of Spatal Resoluton Usng BlockMatchng Based Moton Estmaton and Frame Integraton Danya Suga and Takayuk Hamamoto Graduate School of Engneerng, Tokyo Unversty of Scence, 6-3-1, Nuku, Katsuska-ku,

More information

A PATTERN RECOGNITION APPROACH TO IMAGE SEGMENTATION

A PATTERN RECOGNITION APPROACH TO IMAGE SEGMENTATION 1 THE PUBLISHING HOUSE PROCEEDINGS OF THE ROMANIAN ACADEMY, Seres A, OF THE ROMANIAN ACADEMY Volume 4, Number 2/2003, pp.000-000 A PATTERN RECOGNITION APPROACH TO IMAGE SEGMENTATION Tudor BARBU Insttute

More information

Term Weighting Classification System Using the Chi-square Statistic for the Classification Subtask at NTCIR-6 Patent Retrieval Task

Term Weighting Classification System Using the Chi-square Statistic for the Classification Subtask at NTCIR-6 Patent Retrieval Task Proceedngs of NTCIR-6 Workshop Meetng, May 15-18, 2007, Tokyo, Japan Term Weghtng Classfcaton System Usng the Ch-square Statstc for the Classfcaton Subtask at NTCIR-6 Patent Retreval Task Kotaro Hashmoto

More information

Unsupervised Learning

Unsupervised Learning Pattern Recognton Lecture 8 Outlne Introducton Unsupervsed Learnng Parametrc VS Non-Parametrc Approach Mxture of Denstes Maxmum-Lkelhood Estmates Clusterng Prof. Danel Yeung School of Computer Scence and

More information

Outline. Discriminative classifiers for image recognition. Where in the World? A nearest neighbor recognition example 4/14/2011. CS 376 Lecture 22 1

Outline. Discriminative classifiers for image recognition. Where in the World? A nearest neighbor recognition example 4/14/2011. CS 376 Lecture 22 1 4/14/011 Outlne Dscrmnatve classfers for mage recognton Wednesday, Aprl 13 Krsten Grauman UT-Austn Last tme: wndow-based generc obect detecton basc ppelne face detecton wth boostng as case study Today:

More information

EECS 730 Introduction to Bioinformatics Sequence Alignment. Luke Huan Electrical Engineering and Computer Science

EECS 730 Introduction to Bioinformatics Sequence Alignment. Luke Huan Electrical Engineering and Computer Science EECS 730 Introducton to Bonformatcs Sequence Algnment Luke Huan Electrcal Engneerng and Computer Scence http://people.eecs.ku.edu/~huan/ HMM Π s a set of states Transton Probabltes a kl Pr( l 1 k Probablty

More information

A Notable Swarm Approach to Evolve Neural Network for Classification in Data Mining

A Notable Swarm Approach to Evolve Neural Network for Classification in Data Mining A Notable Swarm Approach to Evolve Neural Network for Classfcaton n Data Mnng Satchdananda Dehur 1, Bjan Bhar Mshra 2 and Sung-Bae Cho 1 1 Soft Computng Laboratory, Department of Computer Scence, Yonse

More information

Parallel matrix-vector multiplication

Parallel matrix-vector multiplication Appendx A Parallel matrx-vector multplcaton The reduced transton matrx of the three-dmensonal cage model for gel electrophoress, descrbed n secton 3.2, becomes excessvely large for polymer lengths more

More information

Learning Semantics-Preserving Distance Metrics for Clustering Graphical Data

Learning Semantics-Preserving Distance Metrics for Clustering Graphical Data Learnng Semantcs-Preservng Dstance Metrcs for Clusterng Graphcal Data Aparna S. Varde, Elke A. Rundenstener Carolna Ruz Mohammed Manruzzaman,3 Rchard D. Ssson Jr.,3 Department of Computer Scence Center

More information

A single pass algorithm for clustering evolving data streams based on swarm intelligence

A single pass algorithm for clustering evolving data streams based on swarm intelligence Data Min Knowl Disc DOI 10.1007/s10618-011-0242-x A single pass algorithm for clustering evolving data streams based on swarm intelligence Agostino Forestiero Clara Pizzuti Giandomenico Spezzano Received:

More information

A Fast Content-Based Multimedia Retrieval Technique Using Compressed Data

A Fast Content-Based Multimedia Retrieval Technique Using Compressed Data A Fast Content-Based Multmeda Retreval Technque Usng Compressed Data Borko Furht and Pornvt Saksobhavvat NSF Multmeda Laboratory Florda Atlantc Unversty, Boca Raton, Florda 3343 ABSTRACT In ths paper,

More information

Problem Definitions and Evaluation Criteria for Computational Expensive Optimization

Problem Definitions and Evaluation Criteria for Computational Expensive Optimization Problem efntons and Evaluaton Crtera for Computatonal Expensve Optmzaton B. Lu 1, Q. Chen and Q. Zhang 3, J. J. Lang 4, P. N. Suganthan, B. Y. Qu 6 1 epartment of Computng, Glyndwr Unversty, UK Faclty

More information

CS 534: Computer Vision Model Fitting

CS 534: Computer Vision Model Fitting CS 534: Computer Vson Model Fttng Sprng 004 Ahmed Elgammal Dept of Computer Scence CS 534 Model Fttng - 1 Outlnes Model fttng s mportant Least-squares fttng Maxmum lkelhood estmaton MAP estmaton Robust

More information

An Image Fusion Approach Based on Segmentation Region

An Image Fusion Approach Based on Segmentation Region Rong Wang, L-Qun Gao, Shu Yang, Yu-Hua Cha, and Yan-Chun Lu An Image Fuson Approach Based On Segmentaton Regon An Image Fuson Approach Based on Segmentaton Regon Rong Wang, L-Qun Gao, Shu Yang 3, Yu-Hua

More information

Active Contours/Snakes

Active Contours/Snakes Actve Contours/Snakes Erkut Erdem Acknowledgement: The sldes are adapted from the sldes prepared by K. Grauman of Unversty of Texas at Austn Fttng: Edges vs. boundares Edges useful sgnal to ndcate occludng

More information

Problem Set 3 Solutions

Problem Set 3 Solutions Introducton to Algorthms October 4, 2002 Massachusetts Insttute of Technology 6046J/18410J Professors Erk Demane and Shaf Goldwasser Handout 14 Problem Set 3 Solutons (Exercses were not to be turned n,

More information

Keyword-based Document Clustering

Keyword-based Document Clustering Keyword-based ocument lusterng Seung-Shk Kang School of omputer Scence Kookmn Unversty & AIrc hungnung-dong Songbuk-gu Seoul 36-72 Korea sskang@kookmn.ac.kr Abstract ocument clusterng s an aggregaton of

More information

The Codesign Challenge

The Codesign Challenge ECE 4530 Codesgn Challenge Fall 2007 Hardware/Software Codesgn The Codesgn Challenge Objectves In the codesgn challenge, your task s to accelerate a gven software reference mplementaton as fast as possble.

More information

Constructing Minimum Connected Dominating Set: Algorithmic approach

Constructing Minimum Connected Dominating Set: Algorithmic approach Constructng Mnmum Connected Domnatng Set: Algorthmc approach G.N. Puroht and Usha Sharma Centre for Mathematcal Scences, Banasthal Unversty, Rajasthan 304022 usha.sharma94@yahoo.com Abstract: Connected

More information

Optimal Design of Nonlinear Fuzzy Model by Means of Independent Fuzzy Scatter Partition

Optimal Design of Nonlinear Fuzzy Model by Means of Independent Fuzzy Scatter Partition Optmal Desgn of onlnear Fuzzy Model by Means of Independent Fuzzy Scatter Partton Keon-Jun Park, Hyung-Kl Kang and Yong-Kab Km *, Department of Informaton and Communcaton Engneerng, Wonkwang Unversty,

More information

Machine Learning. Topic 6: Clustering

Machine Learning. Topic 6: Clustering Machne Learnng Topc 6: lusterng lusterng Groupng data nto (hopefully useful) sets. Thngs on the left Thngs on the rght Applcatons of lusterng Hypothess Generaton lusters mght suggest natural groups. Hypothess

More information

Concurrent Apriori Data Mining Algorithms

Concurrent Apriori Data Mining Algorithms Concurrent Apror Data Mnng Algorthms Vassl Halatchev Department of Electrcal Engneerng and Computer Scence York Unversty, Toronto October 8, 2015 Outlne Why t s mportant Introducton to Assocaton Rule Mnng

More information

Performance Evaluation of Information Retrieval Systems

Performance Evaluation of Information Retrieval Systems Why System Evaluaton? Performance Evaluaton of Informaton Retreval Systems Many sldes n ths secton are adapted from Prof. Joydeep Ghosh (UT ECE) who n turn adapted them from Prof. Dk Lee (Unv. of Scence

More information

RESEARCH about mining in the data stream environment

RESEARCH about mining in the data stream environment IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, VOL. 19, NO. 10, OCTOBER 2007 1349 Clusterng over Multple Evolvng Streams by Events and Correlatons M-Yen Yeh, B-Ru Da, and Mng-Syan Chen, Fellow, IEEE

More information

S1 Note. Basis functions.

S1 Note. Basis functions. S1 Note. Bass functons. Contents Types of bass functons...1 The Fourer bass...2 B-splne bass...3 Power and type I error rates wth dfferent numbers of bass functons...4 Table S1. Smulaton results of type

More information

Learning from Multiple Related Data Streams with Asynchronous Flowing Speeds

Learning from Multiple Related Data Streams with Asynchronous Flowing Speeds Learnng from Multple Related Data Streams wth Asynchronous Flowng Speeds Zh Qao, Peng Zhang, Jng He, Jnghua Yan, L Guo Insttute of Computng Technology, Chnese Academy of Scences, Bejng, 100190, Chna. School

More information

Natural Computing. Lecture 13: Particle swarm optimisation INFR /11/2010

Natural Computing. Lecture 13: Particle swarm optimisation INFR /11/2010 Natural Computng Lecture 13: Partcle swarm optmsaton Mchael Herrmann mherrman@nf.ed.ac.uk phone: 0131 6 517177 Informatcs Forum 1.42 INFR09038 5/11/2010 Swarm ntellgence Collectve ntellgence: A super-organsm

More information

Available online at Available online at Advanced in Control Engineering and Information Science

Available online at   Available online at   Advanced in Control Engineering and Information Science Avalable onlne at wwwscencedrectcom Avalable onlne at wwwscencedrectcom Proceda Proceda Engneerng Engneerng 00 (2011) 15000 000 (2011) 1642 1646 Proceda Engneerng wwwelsevercom/locate/proceda Advanced

More information

Fitting: Deformable contours April 26 th, 2018

Fitting: Deformable contours April 26 th, 2018 4/6/08 Fttng: Deformable contours Aprl 6 th, 08 Yong Jae Lee UC Davs Recap so far: Groupng and Fttng Goal: move from array of pxel values (or flter outputs) to a collecton of regons, objects, and shapes.

More information

Module Management Tool in Software Development Organizations

Module Management Tool in Software Development Organizations Journal of Computer Scence (5): 8-, 7 ISSN 59-66 7 Scence Publcatons Management Tool n Software Development Organzatons Ahmad A. Al-Rababah and Mohammad A. Al-Rababah Faculty of IT, Al-Ahlyyah Amman Unversty,

More information

MULTISPECTRAL IMAGES CLASSIFICATION BASED ON KLT AND ATR AUTOMATIC TARGET RECOGNITION

MULTISPECTRAL IMAGES CLASSIFICATION BASED ON KLT AND ATR AUTOMATIC TARGET RECOGNITION MULTISPECTRAL IMAGES CLASSIFICATION BASED ON KLT AND ATR AUTOMATIC TARGET RECOGNITION Paulo Quntlano 1 & Antono Santa-Rosa 1 Federal Polce Department, Brasla, Brazl. E-mals: quntlano.pqs@dpf.gov.br and

More information

X- Chart Using ANOM Approach

X- Chart Using ANOM Approach ISSN 1684-8403 Journal of Statstcs Volume 17, 010, pp. 3-3 Abstract X- Chart Usng ANOM Approach Gullapall Chakravarth 1 and Chaluvad Venkateswara Rao Control lmts for ndvdual measurements (X) chart are

More information

Type-2 Fuzzy Non-uniform Rational B-spline Model with Type-2 Fuzzy Data

Type-2 Fuzzy Non-uniform Rational B-spline Model with Type-2 Fuzzy Data Malaysan Journal of Mathematcal Scences 11(S) Aprl : 35 46 (2017) Specal Issue: The 2nd Internatonal Conference and Workshop on Mathematcal Analyss (ICWOMA 2016) MALAYSIAN JOURNAL OF MATHEMATICAL SCIENCES

More information

Image Alignment CSC 767

Image Alignment CSC 767 Image Algnment CSC 767 Image algnment Image from http://graphcs.cs.cmu.edu/courses/15-463/2010_fall/ Image algnment: Applcatons Panorama sttchng Image algnment: Applcatons Recognton of object nstances

More information

Support Vector Machines

Support Vector Machines /9/207 MIST.6060 Busness Intellgence and Data Mnng What are Support Vector Machnes? Support Vector Machnes Support Vector Machnes (SVMs) are supervsed learnng technques that analyze data and recognze patterns.

More information

Efficient Distributed File System (EDFS)

Efficient Distributed File System (EDFS) Effcent Dstrbuted Fle System (EDFS) (Sem-Centralzed) Debessay(Debsh) Fesehaye, Rahul Malk & Klara Naherstedt Unversty of Illnos-Urbana Champagn Contents Problem Statement, Related Work, EDFS Desgn Rate

More information

Hermite Splines in Lie Groups as Products of Geodesics

Hermite Splines in Lie Groups as Products of Geodesics Hermte Splnes n Le Groups as Products of Geodescs Ethan Eade Updated May 28, 2017 1 Introducton 1.1 Goal Ths document defnes a curve n the Le group G parametrzed by tme and by structural parameters n the

More information

Signature and Lexicon Pruning Techniques

Signature and Lexicon Pruning Techniques Sgnature and Lexcon Prunng Technques Srnvas Palla, Hansheng Le, Venu Govndaraju Centre for Unfed Bometrcs and Sensors Unversty at Buffalo {spalla2, hle, govnd}@cedar.buffalo.edu Abstract Handwrtten word

More information

ApproxMGMSP: A Scalable Method of Mining Approximate Multidimensional Sequential Patterns on Distributed System

ApproxMGMSP: A Scalable Method of Mining Approximate Multidimensional Sequential Patterns on Distributed System ApproxMGMSP: A Scalable Method of Mnng Approxmate Multdmensonal Sequental Patterns on Dstrbuted System Changha Zhang, Kongfa Hu, Zhux Chen, Lng Chen Department of Computer Scence and Engneerng, Yangzhou

More information

SAO: A Stream Index for Answering Linear Optimization Queries

SAO: A Stream Index for Answering Linear Optimization Queries SAO: A Stream Index for Answerng near Optmzaton Queres Gang uo Kun-ung Wu Phlp S. Yu IBM T.J. Watson Research Center {luog, klwu, psyu}@us.bm.com Abstract near optmzaton queres retreve the top-k tuples

More information

EVALUATION OF THE PERFORMANCES OF ARTIFICIAL BEE COLONY AND INVASIVE WEED OPTIMIZATION ALGORITHMS ON THE MODIFIED BENCHMARK FUNCTIONS

EVALUATION OF THE PERFORMANCES OF ARTIFICIAL BEE COLONY AND INVASIVE WEED OPTIMIZATION ALGORITHMS ON THE MODIFIED BENCHMARK FUNCTIONS Academc Research Internatonal ISS-L: 3-9553, ISS: 3-9944 Vol., o. 3, May 0 EVALUATIO OF THE PERFORMACES OF ARTIFICIAL BEE COLOY AD IVASIVE WEED OPTIMIZATIO ALGORITHMS O THE MODIFIED BECHMARK FUCTIOS Dlay

More information

Classifying Acoustic Transient Signals Using Artificial Intelligence

Classifying Acoustic Transient Signals Using Artificial Intelligence Classfyng Acoustc Transent Sgnals Usng Artfcal Intellgence Steve Sutton, Unversty of North Carolna At Wlmngton (suttons@charter.net) Greg Huff, Unversty of North Carolna At Wlmngton (jgh7476@uncwl.edu)

More information

Learning-Based Top-N Selection Query Evaluation over Relational Databases

Learning-Based Top-N Selection Query Evaluation over Relational Databases Learnng-Based Top-N Selecton Query Evaluaton over Relatonal Databases Lang Zhu *, Wey Meng ** * School of Mathematcs and Computer Scence, Hebe Unversty, Baodng, Hebe 071002, Chna, zhu@mal.hbu.edu.cn **

More information

The Greedy Method. Outline and Reading. Change Money Problem. Greedy Algorithms. Applications of the Greedy Strategy. The Greedy Method Technique

The Greedy Method. Outline and Reading. Change Money Problem. Greedy Algorithms. Applications of the Greedy Strategy. The Greedy Method Technique //00 :0 AM Outlne and Readng The Greedy Method The Greedy Method Technque (secton.) Fractonal Knapsack Problem (secton..) Task Schedulng (secton..) Mnmum Spannng Trees (secton.) Change Money Problem Greedy

More information

Classifier Swarms for Human Detection in Infrared Imagery

Classifier Swarms for Human Detection in Infrared Imagery Classfer Swarms for Human Detecton n Infrared Imagery Yur Owechko, Swarup Medasan, and Narayan Srnvasa HRL Laboratores, LLC 3011 Malbu Canyon Road, Malbu, CA 90265 {owechko, smedasan, nsrnvasa}@hrl.com

More information

SCALABLE AND VISUALIZATION-ORIENTED CLUSTERING FOR EXPLORATORY SPATIAL ANALYSIS

SCALABLE AND VISUALIZATION-ORIENTED CLUSTERING FOR EXPLORATORY SPATIAL ANALYSIS SCALABLE AND VISUALIZATION-ORIENTED CLUSTERING FOR EXPLORATORY SPATIAL ANALYSIS J.H.Guan, F.B.Zhu, F.L.Ban a School of Computer, Spatal Informaton & Dgtal Engneerng Center, Wuhan Unversty, Wuhan, 430079,

More information

Clustering Algorithm of Similarity Segmentation based on Point Sorting

Clustering Algorithm of Similarity Segmentation based on Point Sorting Internatonal onference on Logstcs Engneerng, Management and omputer Scence (LEMS 2015) lusterng Algorthm of Smlarty Segmentaton based on Pont Sortng Hanbng L, Yan Wang*, Lan Huang, Mngda L, Yng Sun, Hanyuan

More information

Comparison of Heuristics for Scheduling Independent Tasks on Heterogeneous Distributed Environments

Comparison of Heuristics for Scheduling Independent Tasks on Heterogeneous Distributed Environments Comparson of Heurstcs for Schedulng Independent Tasks on Heterogeneous Dstrbuted Envronments Hesam Izakan¹, Ath Abraham², Senor Member, IEEE, Václav Snášel³ ¹ Islamc Azad Unversty, Ramsar Branch, Ramsar,

More information

RAP. Speed/RAP/CODA. Real-time Systems. Modeling the sensor networks. Real-time Systems. Modeling the sensor networks. Real-time systems:

RAP. Speed/RAP/CODA. Real-time Systems. Modeling the sensor networks. Real-time Systems. Modeling the sensor networks. Real-time systems: Speed/RAP/CODA Presented by Octav Chpara Real-tme Systems Many wreless sensor network applcatons requre real-tme support Survellance and trackng Border patrol Fre fghtng Real-tme systems: Hard real-tme:

More information

VISUAL SELECTION OF SURFACE FEATURES DURING THEIR GEOMETRIC SIMULATION WITH THE HELP OF COMPUTER TECHNOLOGIES

VISUAL SELECTION OF SURFACE FEATURES DURING THEIR GEOMETRIC SIMULATION WITH THE HELP OF COMPUTER TECHNOLOGIES UbCC 2011, Volume 6, 5002981-x manuscrpts OPEN ACCES UbCC Journal ISSN 1992-8424 www.ubcc.org VISUAL SELECTION OF SURFACE FEATURES DURING THEIR GEOMETRIC SIMULATION WITH THE HELP OF COMPUTER TECHNOLOGIES

More information

CMPS 10 Introduction to Computer Science Lecture Notes

CMPS 10 Introduction to Computer Science Lecture Notes CPS 0 Introducton to Computer Scence Lecture Notes Chapter : Algorthm Desgn How should we present algorthms? Natural languages lke Englsh, Spansh, or French whch are rch n nterpretaton and meanng are not

More information

Summarization and Matching of Density-Based Clusters in Streaming Environments

Summarization and Matching of Density-Based Clusters in Streaming Environments Summarzaton and Matchng of Densty-Based Clusters n Streamng Envronments D Yang Oracle Corporaton 1 Oracle Drve Nashua, NH, USA d.yang@oracle.com Elke A. Rundenstener Worcester Polytechnc Insttute 100 Insttute

More information

Range images. Range image registration. Examples of sampling patterns. Range images and range surfaces

Range images. Range image registration. Examples of sampling patterns. Range images and range surfaces Range mages For many structured lght scanners, the range data forms a hghly regular pattern known as a range mage. he samplng pattern s determned by the specfc scanner. Range mage regstraton 1 Examples

More information

On the Network Partitioning of Large Urban Transportation Networks

On the Network Partitioning of Large Urban Transportation Networks On the etwor Parttonng of Large Urban Transportaton etwors Hamdeh Etemadna and Khaled Abdelghany Abstract Ths paper ams at developng a traffc networ parttonng mechansm for dstrbuted traffc management applcatons.

More information

VFH*: Local Obstacle Avoidance with Look-Ahead Verification

VFH*: Local Obstacle Avoidance with Look-Ahead Verification 2000 IEEE Internatonal Conference on Robotcs and Automaton, San Francsco, CA, Aprl 24-28, 2000, pp. 2505-25 VFH*: Local Obstacle Avodance wth Look-Ahead Verfcaton Iwan Ulrch and Johann Borensten The Unversty

More information

Quality Improvement Algorithm for Tetrahedral Mesh Based on Optimal Delaunay Triangulation

Quality Improvement Algorithm for Tetrahedral Mesh Based on Optimal Delaunay Triangulation Intellgent Informaton Management, 013, 5, 191-195 Publshed Onlne November 013 (http://www.scrp.org/journal/m) http://dx.do.org/10.36/m.013.5601 Qualty Improvement Algorthm for Tetrahedral Mesh Based on

More information

Smoothing Spline ANOVA for variable screening

Smoothing Spline ANOVA for variable screening Smoothng Splne ANOVA for varable screenng a useful tool for metamodels tranng and mult-objectve optmzaton L. Rcco, E. Rgon, A. Turco Outlne RSM Introducton Possble couplng Test case MOO MOO wth Game Theory

More information

Query Clustering Using a Hybrid Query Similarity Measure

Query Clustering Using a Hybrid Query Similarity Measure Query clusterng usng a hybrd query smlarty measure Fu. L., Goh, D.H., & Foo, S. (2004). WSEAS Transacton on Computers, 3(3), 700-705. Query Clusterng Usng a Hybrd Query Smlarty Measure Ln Fu, Don Hoe-Lan

More information

Application of Learning Machine Methods to 3 D Object Modeling

Application of Learning Machine Methods to 3 D Object Modeling Applcaton of Learnng Machne Methods to 3 D Object Modelng Crstna Garca, and José Al Moreno Laboratoro de Computacón Emergente, Facultades de Cencas e Ingenería, Unversdad Central de Venezuela. Caracas,

More information

An Improved Neural Network Algorithm for Classifying the Transmission Line Faults

An Improved Neural Network Algorithm for Classifying the Transmission Line Faults 1 An Improved Neural Network Algorthm for Classfyng the Transmsson Lne Faults S. Vaslc, Student Member, IEEE, M. Kezunovc, Fellow, IEEE Abstract--Ths study ntroduces a new concept of artfcal ntellgence

More information

KOHONEN'S SELF ORGANIZING NETWORKS WITH "CONSCIENCE"

KOHONEN'S SELF ORGANIZING NETWORKS WITH CONSCIENCE Kohonen's Self Organzng Maps and ther use n Interpretaton, Dr. M. Turhan (Tury) Taner, Rock Sold Images Page: 1 KOHONEN'S SELF ORGANIZING NETWORKS WITH "CONSCIENCE" By: Dr. M. Turhan (Tury) Taner, Rock

More information

CHAPTER 2 PROPOSED IMPROVED PARTICLE SWARM OPTIMIZATION

CHAPTER 2 PROPOSED IMPROVED PARTICLE SWARM OPTIMIZATION 24 CHAPTER 2 PROPOSED IMPROVED PARTICLE SWARM OPTIMIZATION The present chapter proposes an IPSO approach for multprocessor task schedulng problem wth two classfcatons, namely, statc ndependent tasks and

More information

Summarizing Data using Bottom-k Sketches

Summarizing Data using Bottom-k Sketches Summarzng Data usng Bottom-k Sketches Edth Cohen AT&T Labs Research 8 Park Avenue Florham Park, NJ 7932, USA edth@research.att.com Ham Kaplan School of Computer Scence Tel Avv Unversty Tel Avv, Israel

More information

an assocated logc allows the proof of safety and lveness propertes. The Unty model nvolves on the one hand a programmng language and, on the other han

an assocated logc allows the proof of safety and lveness propertes. The Unty model nvolves on the one hand a programmng language and, on the other han UNITY as a Tool for Desgn and Valdaton of a Data Replcaton System Phlppe Quennec Gerard Padou CENA IRIT-ENSEEIHT y Nnth Internatonal Conference on Systems Engneerng Unversty of Nevada, Las Vegas { 14-16

More information

Programming in Fortran 90 : 2017/2018

Programming in Fortran 90 : 2017/2018 Programmng n Fortran 90 : 2017/2018 Programmng n Fortran 90 : 2017/2018 Exercse 1 : Evaluaton of functon dependng on nput Wrte a program who evaluate the functon f (x,y) for any two user specfed values

More information