Summarization and Matching of Density-Based Clusters in Streaming Environments

Di Yang
Oracle Corporation, 1 Oracle Drive, Nashua, NH, USA
di.yang@oracle.com

Elke A. Rundensteiner
Worcester Polytechnic Institute, 100 Institute Road, Worcester, MA, USA
rundenst@cs.wpi.edu

Matthew O. Ward
Worcester Polytechnic Institute, 100 Institute Road, Worcester, MA, USA
matt@cs.wpi.edu

ABSTRACT

Density-based cluster mining is known to serve a broad range of applications ranging from stock trade analysis to moving object monitoring. Although methods for efficient extraction of density-based clusters have been studied in the literature, the problem of summarizing and matching such clusters with arbitrary shapes and complex cluster structures remains unsolved. Therefore, the goal of our work is to extend the state of the art of density-based cluster mining in streams from cluster extraction only to also supporting analysis and management of the extracted clusters. Our work solves three major technical challenges. First, we propose a novel multi-resolution cluster summarization method, called Skeletal Grid Summarization (SGS), which captures the key features of density-based clusters, covering both their external shape and internal cluster structures. Second, in order to summarize the extracted clusters in real time, we present an integrated computation strategy, C-SGS, which piggybacks the generation of cluster summarizations within the online clustering process. Lastly, we design a mechanism to efficiently execute cluster matching queries, which identify clusters similar to a given cluster of the analyst's interest among clusters extracted earlier in the stream history. Our experimental study using real streaming data shows the clear superiority of our proposed methods over other potential alternatives in both efficiency and effectiveness for cluster summarization and cluster matching queries.

This work is supported by the NSF, under grants CCF , IIS and IIS .
This work was done while the author was working at WPI.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Articles from this volume were invited to present their results at The 38th International Conference on Very Large Data Bases, August 27th - 31st 2012, Istanbul, Turkey. Proceedings of the VLDB Endowment, Vol. 5, No. 2. Copyright 2011 VLDB Endowment.

1. INTRODUCTION

Motivation. Mining complex patterns such as clusters and graphs from huge volumes of streaming data has been recognized as critical for numerous application domains. To facilitate such a complex pattern mining process, a streaming pattern mining system not only needs to be equipped with highly efficient pattern extraction algorithms but, more importantly, must also provide effective pattern analysis support, as motivated below:

1) Pattern feature abstraction. The key features of detected patterns may be complex and thus may not be easily comprehensible for human analysts without analytical assistance. For example, in real-time traffic monitoring, a cluster representing a congestion area in the traffic of Beijing may be composed of 10K or even more vehicles and may spread over more than 10 km². By simply looking at the information about individual cluster members (vehicles), such as their positions and moving speeds, an analyst may not be able to identify the key features of this cluster in real time, such as where the key bottleneck causing the congestion is.

2) Pattern compression. Some patterns need to be kept for long-term analysis, yet keeping the full representation of the complex patterns tends to be impractical in streaming environments.
In the previous example, storing the full representation of the detected traffic congestion patterns (arbitrarily shaped clusters), namely the individual cluster member tuples (tens of thousands of tuples for each cluster), would impose not only a huge burden on storage space but also low efficiency for pattern transmission.

3) Pattern retrieval (matching). For stream analysis, the archived patterns may need to be retrieved based on their features. Using the above example, when a new traffic congestion arises, the analysts may ask whether similar congestion patterns have been detected before. If so, rather than figuring out a new congestion-relief plan from scratch, the previous proven-to-work solution for such congestion patterns could be directly applied.

In short, an effective pattern summarization method is the key to complex pattern analysis and management. It is needed for many different aspects of pattern analysis, including feature abstraction, compression and pattern retrieval (as mentioned above). The pattern summarizations can also be used for approximate pattern representation. For example, one can design pattern visualization or full-representation re-generation techniques based on pattern summarizations. In this work, our goal is to design effective summarization and matching techniques for density-based clusters in streaming environments, which remain open problems for the database community.

Sliding Window Semantics. In this work, we focus on density-based cluster mining in sliding stream windows [7, 8, 16, 17]. In this query semantics, arbitrarily shaped clusters are continuously detected within the most recent portion of the stream. The traffic congestion monitoring task discussed above is an example that requires such query semantics. Other applications that require such query semantics include detecting intensive-transaction areas (clusters) in the most recent stock trades, and identifying malicious attacks (clusters) in current network traffic.

Challenges. Summarization and matching of density-based clusters is not only an unsolved but also a challenging problem. To serve real-time streaming applications, the proposed techniques must address the following challenges:

1) Cluster summarization must be sufficiently descriptive yet highly compact. The cluster structure of a density-based cluster is defined by a series of densely populated sub-regions as well as the connections among them (see Figure 1). Clearly, simple statistical aggregations, such as the centroid or minimum bounding rectangle of a cluster, are insufficient for describing such a complex pattern structure.

2) The cluster summarization process has to be highly efficient. A system conducting expensive online clustering can hardly afford additional system resources for summarizing clusters in real time.

3) The summarized cluster representation needs to be effectively retrievable ("matchable"). The matching process between cluster summarizations ought to faithfully reflect the similarity between the original clusters, yet be computationally efficient.

Proposed Solution. To address the above challenges, we first analyze density-based cluster structures and identify their key characteristics, namely position, shape, connectivity and density distribution. To capture these features, we investigate two commonly used summarization principles, namely the graph-based and the grid-based strategies. We discover that neither of them alone is capable of providing an effective summarization for density-based clusters. Therefore, we propose a hybrid solution, called Skeletal Grid Summarization (SGS). For descriptive power, SGS is shown to guarantee its fidelity to the original clusters on all key features.
For compactness, our experimental study in Section 8 confirms that even the SGS of the highest resolution achieves on average a 98% compression rate relative to the full representation of the clusters.

Empowered by the proposed SGS summarization, we design a framework to support both continuous cluster extraction and cluster matching queries. A continuous cluster extraction query in our system not only extracts clusters in their full representation (all cluster member objects) for online monitoring purposes, like other state-of-the-art techniques [3, 16], but also concurrently compacts them into the SGS summarization. The full and the summarized (SGS) representation formats are complementary to each other, providing a description of the clusters at the individual tuple level and the cluster feature level, respectively. To extract these two representation formats simultaneously and in a highly efficient manner, we propose an integrated cluster extraction + summarization algorithm, C-SGS. C-SGS incrementally maintains both the full representation and the corresponding SGS of the extracted clusters in an integrated manner. This results in almost free cluster summarization generation by piggybacking the summarization process onto the cluster extraction process itself. Our experimental study in Section 8 shows that C-SGS, which returns clusters in both full and summarized (SGS) representation, has negligible overhead compared with the state-of-the-art algorithm Extra-N [16], which computes the full representation of clusters only. In all our test cases, the extra response time of C-SGS compared with Extra-N is consistently less than 6% (Section 8.1).

For any to-be-matched cluster specified by the analyst, a cluster matching query identifies similar clusters extracted earlier in the same stream from a pattern archive. To support such queries, our framework first archives the SGS of the extracted clusters into a pattern archive. When executing a cluster matching query, our system deploys a filter-and-refine strategy.
First, the filter phase exploits a feature index to locate the potential matching candidates from the pattern store. Then, the refine phase conducts a more detailed cluster match against these promising candidates and returns those with similarity above a given threshold. Our experimental study shows that, efficiency-wise, our system takes only 3 seconds on average to answer a cluster matching query against 10K archived clusters (Section 8.2). Quality-wise, our user study, which invites human analysts to visually compare the similarity between matched clusters, shows that human analysts agree with a significantly larger percentage of the matched clusters found using our proposed matching mechanism than of those found by alternatives (Section 8.3).

Contributions. The main contributions of this work include:

1) We propose the first summarization method specifically designed for density-based clusters, namely the Skeletal Grid Summarization (SGS).

2) We present an integrated cluster mining and summarization algorithm, C-SGS, which efficiently computes the full representation and the SGS of the extracted clusters in one shot.

3) We develop a cluster matching mechanism based on SGS to efficiently process cluster matching queries in real time.

4) Our performance evaluation and user study using real streaming data confirm that our proposed techniques are clearly superior to other alternatives in all aspects, including summarization efficiency, cluster matching efficiency and matching quality.

2. RELATED WORK

The concept of density-based clustering was first proposed in [8]. It has drawn significant research attention [7, 16, 17, 12, 3, 4] because of its capability of identifying clusters with arbitrary shapes and specified density. Previous work mainly studied how to efficiently extract such clusters in static [8, 7, 12] or streaming environments [16, 17, 3, 4]. Also, given the prevalence of real-time monitoring tasks in stream applications, researchers have started to design visual platforms allowing human analysts to interactively explore such patterns in streams [14].
However, the fundamental problem of summarizing this important pattern type has not yet been studied in the literature. Without an effective yet compact summarization method, each density-based cluster has to be expressed by its full representation, namely its cluster member objects. Obviously, such a full representation is neither succinct nor does it explicitly reflect the features of each cluster. This causes serious inconvenience for both storage and analysis of density-based clusters.

Traditional clustering methods [10, 19], such as k-means style clustering, treat clusters as statistical phenomena. Therefore, many key features of the clusters, such as their shapes and densities, are summarized using a rather simplistic description. In particular, first, these works assume clusters are spherically shaped. Therefore, the shape of a cluster is usually described using a simple centroid + radius formula. Second, the previous work does not capture the internal features of the clusters, such as how their density is distributed. For example, the density of a cluster is either treated as uniform or as varying along the radius only. Obviously, such a simple formula cannot describe the complex cluster structure of density-based clusters well. This is because both the shapes and density distributions of density-based clusters can be arbitrary, not to mention the complex sub-region connectivities in each cluster. To the best of our knowledge, no summarization method has been specifically designed for density-based clusters.

For computing cluster summarizations in streaming environments, if the clusters are treated as statistical phenomena, they are considered to be aggregatable over time [1, 5]. For example, [1] used one Cluster Feature Vector (CFV) to represent each micro-cluster detected in the stream. They rely on the additivity property of the CFV to aggregate the cluster features over time and compare the features of the same cluster at different time points by subtracting its CFVs at the corresponding time points. However, the complex cluster structure of density-based clusters is not simply aggregatable over the sliding windows. The continuous expiration of old objects and arrival of new objects at each window may cause complex cluster structural changes, such as merges, splits and connectivity changes within the clusters. Clearly, these changes cannot be simply captured by aggregation results. Thus, these techniques cannot effectively capture the features of density-based clusters within sliding windows.

3. PRELIMINARIES

3.1 Density-Based Clustering in Windows

Density-based cluster detection [8, 7] uses a range threshold θ_r ≥ 0 to define the neighbor relationship between objects. For two objects p_i and p_j, if the distance between them is no larger than θ_r, p_i and p_j are said to be neighbors.
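As a minimal sketch, the neighbor relationship and the core/edge/noise labels that Definition 3.1 builds on it could be computed as follows. Euclidean distance, counting an object as its own neighbor, and the names `is_neighbor`, `num_neigh` and `classify` are assumptions made for illustration only; the brute-force scan is not the extraction method used by the system.

```python
import math


def is_neighbor(p, q, theta_r):
    """True if the (assumed Euclidean) distance between p and q is at most theta_r."""
    return math.dist(p, q) <= theta_r


def num_neigh(p, points, theta_r):
    """Count the objects in `points` within theta_r of p (p itself is counted, by assumption)."""
    return sum(1 for q in points if is_neighbor(p, q, theta_r))


def classify(points, theta_r, theta_c):
    """Label every object as 'core', 'edge' or 'noise'."""
    is_core = [num_neigh(p, points, theta_r) >= theta_c for p in points]
    labels = []
    for i, p in enumerate(points):
        if is_core[i]:
            labels.append("core")
        elif any(is_core[j] and is_neighbor(p, q, theta_r)
                 for j, q in enumerate(points)):
            labels.append("edge")   # not core, but a neighbor of some core object
        else:
            labels.append("noise")  # neither core nor attached to a core object
    return labels


if __name__ == "__main__":
    pts = [(0, 0), (0.5, 0), (1, 0), (1.9, 0), (5, 5)]
    print(classify(pts, theta_r=1.0, theta_c=3))
```

The quadratic scan keeps the sketch short; a real system would use an index to find neighbors.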
We use the function NumNeigh(p, θ_r) to denote the number of neighbors an object p has, given the θ_r threshold.

Definition 3.1. Density-Based Cluster: Given θ_r and a count threshold θ_c, an object p with NumNeigh(p, θ_r) ≥ θ_c is defined as a core point. Otherwise, if p is a neighbor of any core object, p is an edge point. p is a noise point if it is neither a core object nor an edge object. Two core objects p_0 and p_n are connected if they are neighbors of each other, or if there exists a sequence of core points p_0, p_1, ..., p_{n-1}, p_n where, for any i with 0 ≤ i ≤ n-1, each pair of core points p_i and p_{i+1} are neighbors of each other. Finally, a density-based cluster is defined as a maximum group of connected core objects and the edge objects attached to them. Any pair of core objects within a cluster are connected.

Figure 1: Definition of Density-Based Clusters

Figure 1 shows an example of a density-based cluster composed of 11 core objects (black) and 24 edge points (grey).

We focus on periodic sliding window semantics as proposed in CQL [2] and widely used in the literature [16, 17]. These semantics can be either time- or count-based. Each query has a window with a fixed window size win and a fixed slide size slide (either a time interval or a tuple count). Clusters are generated for each window W only based on those data points that fall into the same window W. Each cluster is returned as all its cluster member objects associated with the same cluster identification. We call this typical output format the full representation of each cluster.

3.2 Supported Queries and System Overview

Our system supports two types of analytical queries:

Continuous Clustering Queries. A Continuous Clustering Query returns both the full (Section 3.1) and the summarized representation of the extracted clusters (Figure 2). The design of our proposed cluster summarization format will be introduced in Section 4.
DETECT DensityBasedClusters f+s FROM stream
USING θ_range = r and θ_cnt = c
IN Windows WITH win = w and slide = s

Figure 2: Continuous Cluster Extraction Query Returning full (f) and summarized (s) representations of clusters

Cluster Matching Queries. Given a user-specified to-be-matched cluster C_i, a cluster matching query finds clusters similar to C_i that reside in the historical pattern archive. We show a template of such a query in Figure 3.

GIVEN DensityBasedCluster s C_i
SELECT DensityBasedCluster s C_j FROM History
WHERE Distance(C_i, C_j) ≤ sim_threshold

Figure 3: Cluster Matching Query finding Clusters Similar to To-Be-Matched Cluster Based on Cluster Summarization

The to-be-matched cluster can be any cluster specified by an analyst. Typically, it may be a cluster detected in the most recent portion of the stream that represents the newest characteristics of the stream. The matched clusters, if any, will be found in the historical pattern store, which archives the clusters extracted by the Continuous Clustering Query earlier in the stream.

3.3 System Overview

To support these two types of analytical queries, we design a framework composed of four major components (Figure 4). Here we give a brief overview of the functionalities of each component, while in-depth technical details are discussed later in Sections 5 to 7.

Figure 4: System Overview

The Pattern Extractor executes the Continuous Cluster Extraction Query (Figure 2) against the input stream. It outputs both full and summarized representations of the extracted clusters. Both representations are returned to the analyst for real-time monitoring. Meanwhile, the extracted clusters are also passed to the Pattern Archiver for storage, and to the Pattern Analyzer for cluster matching.

The Pattern Archiver selectively archives the newly detected clusters into the Pattern Base. These archived clusters constitute the Stream History available for subsequent Cluster Matching Queries (Figure 3). The Pattern Archiver controls which extracted clusters should be kept in the Pattern Base and at which resolution they should be archived.

The Pattern Base organizes the archived clusters. To facilitate cluster matching against historical clusters, it employs multiple feature indices to organize the archived clusters. This helps the Cluster Matching Queries to quickly locate the potential matching candidates.

The Pattern Analyzer executes the Cluster Matching Queries (Figure 3). If an analyst is interested in any newly extracted cluster and would like to learn whether similar clusters had been detected before in the Stream History, she can submit her Cluster Matching Query to the Pattern Analyzer to search for matches against the Pattern Base.

4. CLUSTER SUMMARIZATION

4.1 Features of Density-Based Clusters

Based on our analysis, we identify four key features that define each density-based cluster, which can be divided into two categories, namely external and internal features.

External Features:

Location: The location of a cluster indicates its position in the data space. It provides basic information about each cluster, such as where a congestion area (a cluster) arises in the traffic, or in which price range an intensive-transaction area, a cluster based on price, volume and transaction time, is detected in the stock transaction stream.
Shape: Density-based clusters can have arbitrary shapes. The shape is a key feature, because a certain shape of the cluster may convey specific meaning for an application. For example, for the clusters representing intensive-transaction areas in stock transactions, a cluster having a long spread on transaction price but a short range on transaction time conveys that a large number of transactions of a certain stock happened in a short time period while its price fluctuated dramatically within this time period.

Internal Features:

Connectivity: The connectivity of a density-based cluster describes how sub-regions within the cluster are connected. It is important for density-based clusters for both definition and application reasons. First, it defines the internal structure of each cluster. The definition of the density-based cluster (see Section 3.1) relies on the connectivities among sub-regions to define a cluster. Second, the connectivities among sub-regions may be relevant to applications. For example, if two sub-regions within a single cluster representing a group of moving troops are not directly connected, then this may indicate that the units in these two sub-regions cannot directly communicate with each other, because there are no connected Head Nodes (core objects) in these two sub-regions of their wireless network.

Density Distribution: Although the definition of density-based clusters imposes a minimal density requirement on objects in a cluster, the density of each cluster can be rather diverse across its sub-regions. The density distribution within each cluster may be of interest to an analyst in many applications. Using the earlier example, even in a single congestion area, the level of congestion (density of vehicles) may vary among sub-regions. Therefore, the density distribution in each sub-region may be the key to working out a congestion relief plan, as the super dense sub-regions may be the areas that cause the congestion.

4.2 Initial Effort: Graph-Based Summarization Method

Any effective summarized representation for density-based clusters has to capture the above four key features (Section 4.1).
Given that density-based clusters may vary arbitrarily in shape, connectivity and also density distribution, using any aggregative method to represent these features will have rather poor descriptive power. Therefore, we propose to leverage an alternative strategy, namely the divide-and-conquer approach. We divide each cluster into sub-regions, and then we describe not only the features in each sub-region but also the interrelationships among the sub-regions.

Given this divide-and-conquer strategy, we first introduce a possible summarization method based on graph theory. This method uses one representative object to represent each sub-region. We call it Skeletal Point Summation (SkPS):

Definition 4.1. For each cluster C, the SkPS summarization of C is a graph G(V,E) composed of a minimal set of connected core objects of C, called Skeletal Points, as vertices V, whose neighborhoods together cover all the objects in this cluster, and the connections among them as edges E.

The graph composed of all core objects in Figure 1 is an example of SkPS. SkPS captures most of the cluster features and also has good compactness. However, it suffers from several serious shortcomings. First, SkPS has limited descriptive power for a cluster's density distribution. Second, such an SkPS is not efficiently computable. For each cluster, identifying its SkPS is equivalent to the problem of identifying the connected dominating set in an undirected graph, which has been proven to be NP-complete [9]. Third, SkPS is not a viable solution for matching, because a single cluster may have multiple SkPSs with rather different graph structures. Based on our analysis, these limitations suffered by SkPS are caused by its overlapping and non-deterministic sub-region division strategy. In conclusion, SkPS does not constitute an ideal summarization for density-based clusters. A more detailed discussion of the SkPS method can be found in our technical report [18].

4.3 Proposed Solution: Skeletal Grid Summarization Method

Basics of Grid-Based Summarization. To overcome the limitations suffered by SkPS, we propose to adapt SkPS by dividing each cluster into non-overlapping sub-regions. In particular, we divide the whole data space into uniformly sized grid cells. For each cluster, its sub-region division is now determined by the grid cells into which its members fall. Therefore, a cluster C can be represented by all the grid cells containing at least one of C's cluster member objects.

Connectivity Preservation. However, this simplistic grid-based summarization lacks one key capability of the SkPS solution, namely it does not capture the connectivity within clusters. In SkPS, both the inner and inter sub-region connectivity information of each cluster is well preserved. First, each sub-region in SkPS itself is well connected, as all objects in a sub-region are neighbors of the same skeletal point. Second, the inter connections among different sub-regions are explicitly expressed by the edges in SkPS. In contrast, this simplistic grid-based summarization preserves neither of these two types of connectivity information.

Connectivities In Grid Cells. To solve this problem, we propose to integrate the concept of connectivities into the grid-based solution. As a foundation, we first introduce the concept of status for a grid cell. We divide the grid cells in each cluster's summarization into two categories, namely core cells and edge cells.

Definition 4.2. Core cells: a core cell of a cluster C contains at least one core object (see Def. 3.1) of C. Edge cells: an edge cell of a cluster C contains no core object, but at least one edge object (see Def. 3.1) of C. Noise cells: a noise cell contains neither core nor edge objects of any cluster.
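The cell statuses of Definition 4.2 can be illustrated with a small sketch. It assumes the objects handed in are already labeled 'core', 'edge' or 'noise' with respect to a single cluster, and that the grid is axis-aligned with a caller-supplied side length; `cell_of` and `cell_status` are hypothetical helper names, not part of the described system.

```python
from collections import defaultdict


def cell_of(point, side):
    """Index of the uniformly sized grid cell that contains `point`."""
    return tuple(int(x // side) for x in point)


def cell_status(points, labels, side):
    """Map each occupied cell to 'core', 'edge' or 'noise'.

    A cell holding at least one core object is a core cell; otherwise a cell
    holding at least one edge object is an edge cell; otherwise it is noise.
    """
    kinds_per_cell = defaultdict(set)
    for p, label in zip(points, labels):
        kinds_per_cell[cell_of(p, side)].add(label)

    status = {}
    for cell, kinds in kinds_per_cell.items():
        if "core" in kinds:
            status[cell] = "core"
        elif "edge" in kinds:
            status[cell] = "edge"
        else:
            status[cell] = "noise"
    return status


if __name__ == "__main__":
    pts = [(0.1, 0.1), (0.2, 0.3), (1.5, 0.2), (3.2, 3.9)]
    lbl = ["core", "edge", "edge", "noise"]
    print(cell_status(pts, lbl, side=1.0))
```

Only occupied cells are stored, so the mapping stays proportional to the number of cells a cluster actually touches rather than to the size of the data space.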
(Noise cells are used only in the cluster computation stage.)

For inner-sub-region connections, we follow the basic principle of the sub-region division strategy, which is to pursue homogeneity in each sub-region. In particular, we pick a fine grid size to guarantee that the objects that fall into the same grid cell are neighbors of each other. More precisely, the diagonal of each grid cell is set to be equal to the range threshold θ_r in the given clustering query (see Section 3.1). This grid cell size selection will be relaxed later in our discussion of the multi-resolution cluster summarization (Section 6). Under this fine grid size selection, the core and edge cells can be shown to have the following properties.

Lemma 4.1. All objects in a core cell belong to the same cluster.
Proof: Since each core cell contains at least one core object and all the objects in each core cell are now neighbors of each other, all objects in the same core cell are neighbors of at least one common core object. Based on the definition of a density-based cluster (see Def. 3.1), the neighbors of a core object belong to the same cluster.

Lemma 4.2. The number of objects in an edge cell must be less than the count threshold θ_c in the clustering query.
Proof: We prove this lemma by contradiction. Given that all objects in a grid cell are neighbors of each other, if there were at least θ_c objects in an edge cell, those objects would be core objects, as they would all have at least θ_c neighbors. This contradicts the definition of an edge cell (Def. 4.2).

Given these properties, each grid cell is well connected and constitutes a basic unit for expressing inter-grid connections, as defined below. For the inter-sub-region connections, we now define the connections between grid cells.

Definition 4.3. Two core cells ccl_1 and ccl_2 are directly connected if there exists at least one core object p_i in ccl_1 and one core object p_j in ccl_2 that are neighbors of each other.
Two core cells ccl_0 and ccl_n are connected if they are directly connected to each other, or if there exists a sequence of core cells ccl_0, ccl_1, ..., ccl_{n-1}, ccl_n where, for any i with 0 ≤ i ≤ n-1, each pair of core cells ccl_i and ccl_{i+1} are directly connected with each other. An edge cell ecl_i is attached to a core cell ccl_j if there exists at least one object p_i in ecl_i and one core object p_j in ccl_j that are neighbors of each other. Two edge cells are neither connected nor attached.

Given the connection definition for grid cells above, all core cells of a cluster C are connected to each other, and all edge cells are attached to at least one core cell of C.

Skeletal Grid Summarization. Based on the status and connections of grid cells, we now give the definition of our proposed Skeletal Grid Summarization (SGS) method.

Definition 4.4. A Skeletal Grid Summarization (SGS) of a density-based cluster C is composed of all grid cells that contain at least one cluster member object of C. We call each grid cell in an SGS a Skeletal Grid Cell (Sc) of C. SGS = {Sc_0, Sc_1, ..., Sc_n}. Each Sc_i has five attributes, namely Sc_i = (location[], sidelength, population, status, connection[]).

1) location vector: a sequence of values, each indicating the minimum value on one of the dimensions covered by Sc_i.

2) side length: the range of values on each dimension.

3) population: the number of objects contained by Sc_i.

4) status: whether Sc_i is a core or edge cell.

5) connection vector: a sequence of boolean connection indicators, each indicating Sc_i's connection to one of its adjacent skeletal grid cells. For any edge or noise cell, all connection indicators are false. For any core cell, a connection indicator is true if the corresponding adjacent skeletal grid cell Sc_j is a core cell and Sc_i and Sc_j are directly connected, or if Sc_j is an edge cell attached to Sc_i.

Figure 5 shows an example of our proposed Skeletal Grid Summarization (SGS) for a 2D cluster. SGS achieves our goal of preserving all four features, as shown below.

Lemma 4.3.
Fidelity to Location and Shape: The data space covered by C.SGS is larger than that covered by the cluster member objects of C by a bounded error. Namely, any point in the data space covered by C.SGS is at most θ_r away from a cluster member object in C.
Proof: The data space covered by C.SGS is composed of the union of the space covered by all its skeletal grid cells. Since all member objects of C fall into these grid cells, the data space covered by C.SGS is larger than that covered by C's member objects. Since each skeletal grid cell in C.SGS contains at least one member of C, and the diagonal of each cell is θ_r, any point in the data space covered by a skeletal grid cell is at most θ_r away from a member of C.

Figure 5: Example of full representation, basic SGS and compressed SGS of a 2D cluster

Lemma 4.4. Fidelity to Density Distribution: For any sub-region in a cluster C that is composed of n (n ≥ 1) grid cells, C.SGS can accurately express its density.
Proof: Since the skeletal grid cells in C.SGS do not overlap, the population recorded by each skeletal grid cell accurately reflects the number of objects in it. Therefore, for any sub-region covered by the n skeletal grid cells belonging to C, we can accurately calculate its density by dividing its total population by its total volume.

Lemma 4.5. Fidelity to Connectivity: If there are two sub-regions in C connected through a connected core object path composed of n core objects, there must exist a core cell path connecting these two sub-regions with at most n core cells on this path.
Proof: Since any skeletal grid cell containing a core object is a core cell, if there exists a core object path between two sub-regions, there must exist a core cell path between them. In the worst case, each core cell on this core cell path contains only one core object. Thus the length of the core cell path is at most equal to the length of the core object path.

In conclusion, SGS effectively captures all key features of density-based clusters using a compact description.

5. PATTERN EXTRACTOR

Next, we introduce the pattern extractor that executes the Continuous Clustering Query (Section 3.2), outputting clusters in both full and summarized (SGS) representations. To provide such functionalities, a straightforward approach would be a two-stage process, namely cluster extraction followed by summarization. However, this strategy causes a significant performance overhead compared to cluster extraction only. An in-depth analysis of such a two-phase strategy can be found in our technical report [18].

5.1 Proposed Solution: Integrated Process

To solve this problem, we instead propose an integrated strategy that incorporates cluster extraction and summarization into a single process. The key observation that motivates this integrated computation method is given below.

Observation 5.1. The main tasks for both density-based cluster extraction and SGS computation are the same, namely to first identify the connections (neighborships) among the objects and analyze them to form the cluster structures (in either the full or a summarized representation).

This observation reveals the key commonality among the cluster extraction and summarization processes. Based on it, we design an integrated extraction+summarization method to effectively share the neighborship identification and cluster formation processes.

5.2 Incremental Computation and Challenges

To avoid conducting the prohibitively expensive clustering process from scratch at each window, our proposed method incrementally maintains the cluster structures across the windows. To realize incremental computation, we need to find appropriate meta-data that can be maintained for both the full and summarized cluster representations. Our proposed solution is that, besides the raw data falling into each window, which needs to be maintained for cluster extraction in any case, we incrementally maintain the skeletal grid cells in the data space. With updated skeletal grid cells, we can easily output both the summarized and full representations of detected clusters. First, based on connections among the skeletal grid cells, we can easily determine the summarized representation SGS (a group of connected skeletal grid cells) for each cluster.
Second, given the SGS of a cluster C, C.SGS, we can determine the cluster member objects of C based on the objects falling into the respective skeletal grid cells belonging to C.SGS.

However, incrementally maintaining skeletal grid cells in an efficient manner is a challenging task. In particular, tracking the changes to the skeletal grid cells caused by expired objects can be extremely expensive in terms of system resource utilization, and thus constitutes the key performance bottleneck for skeletal grid cell maintenance. When an object p_exp expires, the connections at the object level are needed to update the connections among the skeletal grid cells. For example, when p_exp expires, we first need to know which objects are neighbors of p_exp, as their neighborships with p_exp end from now on. This may break the connections between the skeletal grid cell Sc in which p_exp resides and those in which p_exp's neighbors reside. However, considering the large number of pair-wise neighborships that may exist among the objects, maintaining all of them has been shown, analytically and experimentally, to be extremely expensive in terms of system resource utilization [16]. Therefore, the straightforward incremental maintenance method, which updates skeletal grid cells upon each insertion and deletion, is not practical.

5.3 "Lifespan" Analysis

To solve this computation bottleneck, we present a skeletal grid cell maintenance method using lifespan analysis. This method eliminates the need for handling the impact of expired objects on the skeletal grid cells. The solution is based on the observation that, under sliding window semantics, the lifespan of any object as well as the neighborships among objects are deterministic. Therefore, at the insertion stage, when we handle the impact of new objects on the skeletal grid cells, we take the lifespans of the objects into consideration. In particular, we pre-determine the changes that will happen to the skeletal grid cells when these

objects expire later. Then at the expiration stage, no further update is needed to handle the impact of expired objects. Thus we avoid the bottleneck discussed above.

Among the five attributes of a skeletal grid cell, location and side length are fixed over time, while the other three, namely population, status and connections, change over time as objects come and go with each window slide. The population of each skeletal grid cell is easily trackable with a simple object counter. Thus, we focus on the lifespan analysis of the status and the connections.

Basics for Lifespan Analysis. First, we start with analyzing the lifespan of individual objects.

Observation 5.2. Given the slide size Q.slide of a query Q and the starting time of the current window W_n.t_start, the lifespan of an object p in W_n with time stamp p.t is $p.\text{lifespan} = \left\lceil \frac{p.t - W_n.t_{\text{start}}}{Q.\text{slide}} \right\rceil$, indicating that p will participate in windows W_n to W_{n+p.lifespan-1}.

The number of windows that an object p can survive in is determined by the number of window slides after which p's time stamp will still be greater than the starting time of the window. Based on the lifespan of individual objects, we analyze the lifespan of the neighborship between two objects.

Observation 5.3. Given two objects p_i and p_j, the neighborship between them, Neighbor(p_i, p_j), will hold for $\text{Neighbor}(p_i, p_j).\text{lifespan} = \min(p_i.\text{lifespan},\, p_j.\text{lifespan})$ windows, namely, it will exist in all windows from W_n to W_{n+Neighbor(p_i, p_j).lifespan-1}, until either p_i or p_j expires.

Based on these observations, we can further analyze the lifespan of the different stages of an object's career.

Observation 5.4. Given an object p and all its neighbor objects p_nb1 to p_nbk, the number of windows in which p will be a core object is $p.\text{core\_lifespan} = \min(p.\text{lifespan},\, win_{\theta_c\text{-nei}})$, with $win_{\theta_c\text{-nei}}$ the number of windows in which at least θ_c objects among p_nb1 to p_nbk will participate.
The number of windows in which p will be an edge object is $p.\text{edge\_lifespan} = \min\left[p.\text{lifespan} - p.\text{core\_lifespan},\ \max_{1 \le j \le k}(p_{nb_j}.\text{core\_lifespan})\right]$.

Basically, an object will be a core object in all the windows in which it has at least θ_c neighbors. It will be an edge object when its core object career has ended (it no longer has enough neighbors) but at least one of its neighbors is still a core object.

Lifespan at Grid Cell Level. To tackle skeletal grid cell maintenance, we now extend the concept of lifespan from the object level to the grid cell level. In particular, we analyze how the lifespans of objects, their neighborships and their careers affect the lifespan of a skeletal grid cell's status and connections. For each skeletal grid cell Sc, we maintain one lifespan indicator for Sc.status and one for each entry of Sc.connections[]. Each lifespan indicates, based on the objects in the current window, for how many future windows the value of this attribute will persist. These indicators are updated as new objects arrive.

Lemma 5.1. Status lifespan. Given a skeletal grid cell Sc and all the objects p_0 to p_n in Sc, the number of windows in which Sc will be a core cell is $Sc.\text{core\_lifespan} = \max_{0 \le i \le n}(p_i.\text{core\_lifespan})$.

Lemma 5.1 can be deduced from the definition of a core cell (Def. 4.2). Namely, Sc is a core cell if it contains at least one core object.

Lemma 5.2. Connection lifespan. Given two skeletal grid cells Sc_i and Sc_j, all objects in Sc_i, $p^{sc_i}_0$ to $p^{sc_i}_n$, and all objects in Sc_j, $p^{sc_j}_0$ to $p^{sc_j}_m$, the number of windows in which Sc_i and Sc_j will be connected is defined as $\text{Connection}(Sc_i, Sc_j).\text{lifespan} = \max\left[\min\left(p^{sc_i}_a.\text{core\_lifespan},\ p^{sc_j}_b.\text{core\_lifespan},\ \text{Neighbor}(p^{sc_i}_a, p^{sc_j}_b).\text{lifespan}\right)\right],\ a \in [0,n],\ b \in [0,m]$.

This indicates that two skeletal grid cells remain connected if at least one pair of core objects, one from each skeletal grid cell, are neighbors of each other.

Auxiliary Meta-Data. To ensure that we only run one range query search (rqs) for each new object and never rerun rqs for existing objects, we maintain auxiliary meta-information for each object in the window.
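As a rough illustration of Observations 5.2 to 5.4 and Lemmas 5.1 and 5.2, the lifespan quantities can be computed as in the following Python sketch. All function names are hypothetical. Rounding the object lifespan up is an assumption (lifespans count windows, so some rounding is needed), and `core_lifespan` assumes that the remaining lifespan of every neighbor is already known from the single range query run at insertion time.

```python
def object_lifespan(t, win_start, slide):
    """Observation 5.2 (sketch): windows an object with time stamp t survives.

    Assumption: the ratio is rounded up; done here with integer-safe
    floor division instead of floating-point ceil.
    """
    return -((win_start - t) // slide)


def neighborship_lifespan(life_a, life_b):
    """Observation 5.3: a neighborship ends when either object expires."""
    return min(life_a, life_b)


def core_lifespan(own_life, neighbor_lives, theta_c):
    """Observation 5.4 (sketch): windows in which >= theta_c neighbors are alive."""
    if len(neighbor_lives) < theta_c:
        return 0  # never a core object
    # The theta_c-th longest-lived neighbor decides how long the
    # neighborhood stays dense enough.
    kth = sorted(neighbor_lives, reverse=True)[theta_c - 1]
    return min(own_life, kth)


def cell_core_lifespan(core_lives_in_cell):
    """Lemma 5.1: a cell stays core as long as its longest-lived core object."""
    return max(core_lives_in_cell, default=0)


def connection_lifespan(pairs):
    """Lemma 5.2: pairs holds (core_life_a, core_life_b, neighborship_life)."""
    return max((min(p) for p in pairs), default=0)
```

For example, with a window starting at time 0 and a slide of 10, an object stamped 35 gets a lifespan of 4 windows under the rounding assumption above.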
In particular, we maintain a non-core-career neighbor list for each object p to store all of p's neighbors in its non-core career. For example, p currently may have 100 neighbors. Based on the lifespan analysis, it will be a core object for 3 windows and then, because most of its neighbors expire, it will become an edge object for 2 windows before expiration. In this case, the non-core-career neighbor list of p only contains its neighbors in the last 2 windows of its lifespan, say 5 objects. The non-core-career neighbors of each object are maintained in a dynamic hash table. The hash table of each object p is initialized to have n buckets, with n the number of windows that p can survive. The hash key of the table is the number of windows that a neighbor object can survive. For example, when a data point p_i finds a non-core-career neighbor p_j, p_j is added to the k-th bucket of the hash table, with k the number of windows p_j can still survive (if k is larger than the number of buckets remaining for p_i, p_j is put in the last bucket). At each window slide, we can simply remove the whole first bucket of each remaining object, as all the neighbors in this bucket must have expired after the window slide.

The number of neighbors in such a non-core-career neighbor list is bounded by the constant θ_c. Namely, an object can never have more than θ_c neighbors in its non-core career; otherwise it would instead be a core object in those windows. This theoretical bound guarantees that this auxiliary meta-data is lightweight. Also, it provides all necessary access to the objects' neighbors needed in our cluster extraction process. It thus guarantees that we only run the minimum number of range query searches (one for each new object) during the clustering.

5.4 C-SGS Algorithm

We call our proposed algorithm, based on the maintenance of skeletal grid cells and lifespan analysis, C-SGS.

Initialization. For a continuous clustering query, at the initialization stage, C-SGS builds a grid-based index whose grid cell size is equal to the size of the finest skeletal grid size for this query (see Section 4).
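The bucketed non-core-career neighbor list described above can be sketched in Python as follows. This is only an illustration: the class name `NonCoreCareerNeighbors` and its methods are hypothetical, and a `deque` of lists stands in for the dynamic hash table keyed by remaining lifespan.

```python
from collections import deque


class NonCoreCareerNeighbors:
    """Sketch of one object's non-core-career neighbor list (hypothetical API)."""

    def __init__(self, owner_lifespan):
        if owner_lifespan < 1:
            raise ValueError("owner must survive at least one window")
        # One bucket per window the owner can still survive.
        self.buckets = deque([] for _ in range(owner_lifespan))

    def add(self, neighbor_id, neighbor_lifespan):
        # Bucket k holds neighbors that survive k more windows; neighbors
        # outliving the owner are clamped into the last bucket.
        k = max(1, min(neighbor_lifespan, len(self.buckets)))
        self.buckets[k - 1].append(neighbor_id)

    def slide(self):
        # After a window slide every neighbor in the first bucket has
        # expired, so the whole bucket is dropped in one step.
        return self.buckets.popleft() if self.buckets else []

    def neighbors(self):
        return [n for bucket in self.buckets for n in bucket]
```

Dropping the first bucket shifts every remaining neighbor one bucket closer to expiration, so no per-neighbor bookkeeping is needed at a slide.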
We assign to each grid cell in this index the same attributes as the skeletal grid cells, while we initially set their status to noise, density to 0, and connections to all false.

Handling Insertions. For each new object p_new inserted into the window, C-SGS first loads it into its corresponding skeletal grid cell based on its position in the data

space. Then, we run a range query search for p_new to identify p_new's neighbors. Based on the lifespan of p_new and its neighbors (Observation 5.2), we can determine the lifespan of the neighborships among them (Observation 5.3), as well as the lifespan of the different stages of p_new's career (Observation 5.4). Using this information, we can now update the status and connections of the skeletal grid cells in which p_new falls and in which its neighbors reside.

For the status of skeletal grid cells, the insertion of a new object may only cause two types of changes. Namely, it may promote skeletal grid cells to become core cells or prolong their core cell lifespans.

Status promotion: A new object p_new may promote the skeletal grid cell Sc that it resides in to become a core cell, if it becomes the first core object in Sc. In this case, we set the status of Sc to core cell and set its core cell lifespan equal to the core object lifespan of p_new. An example of this case is shown in Case 1 of status promotion in Figure 6. p_new may also cause a status change of a skeletal grid cell by upgrading its non-core-object neighbors, which reside in the affected skeletal grid cells, to core objects. In this case, for each upgraded neighbor p_upg of p_new, we first determine the lifespan of p_upg's career by analyzing p_upg itself and its neighbors. As every p_upg was a non-core object, the non-core-career neighbor list lets us quickly access all its neighbors without running a range query search again. Thus, we update the status of the skeletal grid cell in which p_upg resides to core cell and set its core cell lifespan equal to the core object lifespan of p_upg. Correspondingly, the non-core-career neighbor list of each p_upg also needs to be updated to exclude those objects that will only be neighbors of p_upg during its core object career. An example of this case is shown in Case 2 of status promotion in Figure 6.
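A minimal Python sketch of how a cell's status indicator might be updated when an object in it becomes (or stays) a core object is given below. The `Cell` class and `register_core_object` are hypothetical names, and encoding "not a core cell" as a core cell lifespan of 0 is an assumption made for brevity. The same update also covers the prolong case that the text describes next.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    # Assumption: a lifespan of 0 encodes "currently not a core cell".
    core_lifespan: int = 0

    @property
    def is_core(self):
        return self.core_lifespan > 0


def register_core_object(cell, obj_core_lifespan):
    """Apply one object's core lifespan to its cell.

    Returns 'promoted', 'prolonged' or 'unchanged'.
    """
    if obj_core_lifespan <= 0:
        return "unchanged"  # the object never becomes a core object
    if not cell.is_core:
        cell.core_lifespan = obj_core_lifespan  # first core object in the cell
        return "promoted"
    if obj_core_lifespan > cell.core_lifespan:
        cell.core_lifespan = obj_core_lifespan  # outlives all existing core objects
        return "prolonged"
    return "unchanged"
```

Because the indicator only ever grows at insertion time, it always equals the maximum core object lifespan in the cell, in line with Lemma 5.1.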
Status prolong: A new object p_new may prolong the core cell lifespan of the skeletal grid cell Sc in which it resides, if p_new's core object lifespan is longer than that of any existing object in Sc. In this case, we set Sc's core cell lifespan equal to the core object lifespan of p_new. An example of this case is shown in Case 1 of status prolong in Figure 6. p_new may also prolong the core cell lifespans of other skeletal grid cells by extending the core object lifespans of p_new's neighbors. For each neighbor of p_new whose core object lifespan is extended because of p_new's arrival, p_cole, we first determine by how much its core object lifespan is extended, by analyzing in how many more windows it will have at least θ_c neighbors once p_new has joined its neighborhood. Then, we update the core cell lifespan of the skeletal grid cell in which each p_cole resides to the core object lifespan of the corresponding p_cole, if the latter is longer. An example of this case is shown in Case 2 of status prolong in Figure 6.

For the connections of skeletal grid cells, the insertion of a new object may only cause two types of changes. Namely, it may build new connections between skeletal grid cells or prolong the lifespan of existing connections. The maintenance process of the connections follows the same principles used in the status maintenance logic (details are omitted here for space reasons but can be found in [18]).

Figure 6: Examples of updating cell status. θ_c = 4, grey circle = edge point, black circle = core point, number on each object = number of windows the object can survive.

Handling Expirations. By using the lifespan analysis technique introduced above, the impact on the skeletal grid cells that could be caused by expiring objects has been pre-handled when the objects arrive. Therefore, no maintenance effort is needed for handling cluster structure changes when individual objects expire. After the window slides, the only update needed for the attributes of skeletal grid cells is to check whether the new window is out of the lifespans.
If the new window is out of a cell's core cell lifespan, its status needs to be set back to edge cell. If the new window is out of the lifespan of any of its connections, the corresponding connection needs to be set back to false.

Output Stage. At the output stage, the updated skeletal grid cells can be viewed as the vertices V in a graph G, and the connections among them can be viewed as the edges E among the vertices. Therefore, we simply conduct a depth-first search on all the core cells to collect the different groups of connected core cells and the edge cells attached to them. Each connected group of skeletal grid cells constitutes the SGS summarization of a cluster C, C.SGS. Given C.SGS, the full representation of C can be easily derived by collecting all objects covered by core cells in C.SGS and those objects that are covered by the edge cells in C.SGS and connected to at least one core object in C.SGS's core cells.

6. PATTERN ARCHIVER

The pattern archiver handles two major tasks, namely pattern compression and selective pattern archival.

6.1 Multi-Resolution Cluster Summarization

Our proposed cluster summarization SGS supports multiple resolutions. In general, the SGSs at different levels of resolution follow the same design as presented in Section 4. An SGS of any resolution is composed of a sequence of skeletal grid cells, and each skeletal grid cell has the same 5 attributes introduced before. For any cluster C_x, the SGS of C_x formed by the Pattern Extractor is based on the finest granularity, namely the smallest skeletal grid cells. Thus it is of the finest resolution. We call such an SGS the Basic SGS of C_x. The SGSs in coarser resolutions are built by hierarchically combining the Basic SGS. For a cluster C_x, we say that the Basic SGS of C_x is at Level 0 of the resolution hierarchy, denoted as C_x.SGS^{L_0}. Any SGS in a coarser resolution is at a Level n, denoted as C_x.SGS^{L_n}. Each skeletal grid cell in C_x.SGS^{L_n} (n > 0), C_x.Sc^{L_n}, is formed by combining the skeletal grid cells within a certain (θ)-sized hypercube space in C_x.SGS^{L_{n-1}}.
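The depth-first search used at the output stage of C-SGS can be sketched in Python as follows. The function name `extract_sgs` and the input encoding (sets of cell ids plus a list of connected pairs) are hypothetical. The sketch assumes that edge cells are attached to a group but never used to reach further cells, mirroring the role of edge objects in density-based clustering.

```python
def extract_sgs(core_cells, edge_cells, connections):
    """Group connected core cells and attach their edge cells (sketch).

    core_cells, edge_cells: sets of orderable, hashable cell ids.
    connections: iterable of (cell_a, cell_b) pairs that are connected.
    """
    adjacency = {}
    for a, b in connections:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)

    seen, clusters = set(), []
    for start in sorted(core_cells):
        if start in seen:
            continue
        group, stack = set(), [start]
        seen.add(start)
        while stack:  # iterative depth-first search over core cells
            cell = stack.pop()
            group.add(cell)
            for nb in adjacency.get(cell, ()):
                if nb in edge_cells:
                    group.add(nb)  # attach, but do not expand through it
                elif nb in core_cells and nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
        clusters.append(group)
    return clusters
```

Each returned set plays the role of one C.SGS; collecting the objects stored in its cells would then yield the full representation.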
For example, a 2-dimensional cluster C_x has SGS in two resolutions (Figure 5). They are at Levels 0 and 1. If the compression rate

θ = 3, each skeletal grid cell of the SGS at Level 1 is made by combining 3^2 adjacent skeletal grid cells at Level 0. Both the number of resolutions allowed and the parameter θ are part of the configuration of our system.

The compression process of building C_x.SGS^{L_n} can be finished with a single scan of the skeletal grid cells in C_x.SGS^{L_{n-1}}. Given C_x.SGS^{L_{n-1}}, to build C_x.SGS^{L_n} we first generate a set of skeletal grid cells for C_x.SGS^{L_n} that covers the whole data space occupied by the corresponding cells in C_x.SGS^{L_{n-1}}. Then we set the five attributes for each C_x.Sc^{L_n}. The side length of any C_x.Sc^{L_n} is simply equal to the side length of a skeletal grid cell at Level n-1 times θ. Any C_x.Sc^{L_n} is a core cell if at least one C_x.Sc^{L_{n-1}} covered by it is a core cell. Otherwise, it is an edge cell. The population of any C_x.Sc^{L_n} is equal to the sum of the populations of the C_x.Sc^{L_{n-1}} cells covered by it. The connection vector of a C_x.Sc^{L_n} is decided by the connections between the boundary C_x.Sc^{L_{n-1}} cells covered by it and those covered by its adjacent cells at Level n-1.

Budget- and Accuracy-Aware Resolution Selection. Given the multiple resolution choices, the Pattern Archiver can decide in which resolution to archive the patterns based on both the system-resource budget and the accuracy required by the specific analytical tasks. In our SGS design, for a cluster summarization at a certain resolution, both its space consumption and conciseness are deterministic and easily calculable. For space consumption, given the basic SGS of an extracted cluster, we can easily determine the number of skeletal grid cells needed in any other resolution for the same cluster, by calculating how many grid cells at that resolution are needed to cover the same data space. Since the SGSs at different resolutions have the same design, the amount of information carried by each skeletal grid cell in any resolution is fixed. Thus, one can easily determine exactly how much storage space is needed for a given cluster in any resolution.
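The single-scan construction of a coarser level described above might look like the following Python sketch. The function name `coarsen` and the dictionary encoding (integer grid coordinates mapped to a population and a core flag) are assumptions for illustration, and the connection vector is left out to keep the example short.

```python
from collections import defaultdict


def coarsen(cells, theta):
    """Build level n from level n-1 in a single scan (sketch).

    cells: dict mapping integer grid coordinates (tuple) to
           (population, is_core) at level n-1.
    theta: compression rate, i.e. cells per side merged into one.
    """
    coarse = defaultdict(lambda: [0, False])
    for coord, (population, is_core) in cells.items():
        # Every theta-sized block of fine cells maps to one coarse cell.
        parent = tuple(c // theta for c in coord)
        entry = coarse[parent]
        entry[0] += population            # populations add up
        entry[1] = entry[1] or is_core    # core if any covered cell is core
    return {coord: (pop, core) for coord, (pop, core) in coarse.items()}
```

The length of the returned dictionary is the number of skeletal grid cells needed at the coarser resolution, which is the quantity a budget-aware resolution choice would look at.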
For accuracy, as the sizes of the skeletal grid cells at all resolutions are known, the analysts know exactly the granularity that their analytical task will be working on at a certain resolution.

6.2 Selective Pattern Archiving

The Pattern Archiver also selectively picks which clusters to archive. Currently, our system supports several simple but useful cluster selection mechanisms, including using sampling techniques to select a certain number of clusters to archive in a period of time, and using feature selection to archive only clusters with certain features (e.g., only the clusters reaching a certain population or volume). More sophisticated pattern selection techniques, such as evolution-driven techniques, will be studied in our future work.

7. PATTERN STORAGE AND MATCHING

7.1 Pattern Organization in Pattern Base

Our proposed cluster summarization method SGS empowers us to easily organize the extracted clusters based on their features. In particular, we build two indices for the archived clusters. One is based on the position of each cluster, and the second is based on all other features of each cluster captured in SGS. We call the first index the locational feature index. Since clusters are multi-dimensional objects, we express the position of each cluster using its minimum bounding rectangle (MBR). In our system, we employ one of the most widely used indices for MBRs, namely the R-tree index, to organize them. The second index, called the non-locational feature index, organizes the clusters based on their non-locational features. We use a four-dimensional grid index to organize the clusters' SGSs, with the four dimensions being the volume (number of skeletal grid cells), the status count (number of core cells), the average density and the average connectivity of each cluster.

7.2 Cluster Matching Process

The Cluster Matching Queries (see Figure 3) are executed by the Pattern Analyzer. To execute such queries, we first provide a distance metric (between 0 and 1) to measure the distance between two clusters. The metric is user-customizable based on application semantics.
$Dist(C_a, C_b) = ps \times Dist_{location} + \sum_i w_i \times Dist_{nlf_i}(C_a, C_b)$, with $ps, Dist_{location} \in \{0, 1\}$, $w_i, Dist_{nlf_i} \in [0, 1]$ and $\sum_i w_i = 1$.

In this distance metric, Dist_location indicates whether two clusters overlap (0) or not (1). ps indicates whether the matching is position-sensitive (1) or not (0). Dist_nlf_i represents the distance between two clusters on a specific non-locational feature and w_i represents the analyst-specified weight on this feature. To use this distance metric, the analyst needs to first specify whether the matching required by her application is position-sensitive, namely whether the matched clusters have to overlap in the data space. For position-sensitive applications, ps = 1. If two clusters do not overlap, Dist_location(C_a, C_b) = 1, the largest possible distance between two clusters, indicating that the two clusters are not similar and no further comparison on other features is needed. For non-position-sensitive applications, since ps = 0, the locational distance between two clusters is considered to be 0. The second part of the distance metric measures the distance between two clusters on the four non-locational features, namely volume, status, population and connectivity. The distances on these features are used in both the match candidate search and the detailed cell level cluster match.

Candidate Search. Given a to-be-matched cluster, a customized distance metric and a distance threshold specified by the analyst, our system first searches for the potential match candidates in the Pattern Base. In the position-sensitive case, the Pattern Analyzer first searches the locational feature index for the candidate clusters. If any overlapping clusters are found, it calculates their non-locational distance to the to-be-matched cluster, and returns the similar clusters if the distances are smaller than the threshold. In the non-position-sensitive case, the Pattern Analyzer directly searches the non-locational feature index for the candidates.
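A small Python sketch of this distance metric, together with the per-feature search range it implies, is given below. The function names are hypothetical. The per-feature distance uses the relative difference abs(x - y)/min(x, y) that appears in the candidate-search example of the text, capped at 1 so that it stays in [0, 1]; the cap and the handling of non-positive values are assumptions. The locational term is taken to be 1 when the clusters do not overlap, following the statement that this is the largest possible distance.

```python
def feature_distance(x, y):
    """Relative difference of one non-locational feature, capped at 1."""
    if x == y:
        return 0.0
    lo = min(x, y)
    if lo <= 0:
        return 1.0  # assumption: treat a non-positive feature as maximally distant
    return min(1.0, abs(x - y) / lo)


def cluster_distance(ps, overlap, feats_a, feats_b, weights):
    """ps * Dist_location + sum_i w_i * Dist_nlf_i (weights should sum to 1)."""
    dist_location = 0 if overlap else 1
    nlf = sum(w * feature_distance(a, b)
              for w, a, b in zip(weights, feats_a, feats_b))
    return ps * dist_location + nlf


def search_range(query_value, weight, threshold, candidates):
    """Feature values that alone do not already exceed the threshold."""
    eps = 1e-12  # guard against floating-point noise at the boundary
    return [x for x in candidates
            if weight * feature_distance(x, query_value) <= threshold + eps]
```

With a query volume of 20, a weight of 0.4 and a threshold of 0.2, `search_range(20, 0.4, 0.2, range(1, 101))` keeps exactly the volumes from 14 to 30.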
Given the distance metric and the distance threshold, the Pattern Analyzer can determine the range of the search on each dimension (feature). For example, if the volume of the to-be-matched cluster is 20, the weight on the size distance is 0.4, and the overall distance threshold is 0.2, the volume of the candidate clusters has to be between 14 and 30. This is because any other number x < 14 or x > 30 will make abs(x - 20)/min(x, 20) > (0.2/0.4), which will definitely not fulfill the search criteria. The same principle can be used on other features to determine the

range of search. Given the search ranges on all dimensions, the Pattern Analyzer can quickly narrow down the candidate clusters to a small subset by searching the feature index.

Grid Cell Level Cluster Match. Given a to-be-matched cluster and a match candidate cluster for it, grid cell level cluster match compares the features of the two clusters in their corresponding sub-regions (skeletal grid cells). In particular, grid cell level match uses the same customizable distance metric introduced earlier, while the distance between two clusters is now measured by aggregating the differences between all the corresponding skeletal grid cell pairs in these two clusters. More precisely, given a certain alignment between two clusters C_a and C_b,² each skeletal grid cell Sc_i in C_a may have a corresponding skeletal grid cell Sc_j in C_b, depending on whether its corresponding sub-region is also covered by C_b. If Sc_i has a corresponding skeletal grid cell Sc_j in C_b, their difference can be measured by comparing their status, density and connectivity features. Otherwise, Sc_i is assigned the maximum difference with its corresponding sub-region, which is not a part of C_b and thus can be viewed as an empty grid cell. When calculating the distance between two clusters C_a and C_b, we sum the differences between each Sc_i in C_a and its corresponding sub-region in C_b to form the overall distance between the two clusters.

In the position-sensitive cases, no alignment is needed, or in other words, the alignment vector is always equal to [0,0,...,0]. This is because such applications require any skeletal grid cell Sc_i in C_a to be matched with the skeletal grid cell Sc_j in C_b that has the same absolute position in the data space. Therefore, in such cases, we only need a single scan of the skeletal grid cells in the two clusters to calculate the distance between them. In the non-position-sensitive case, one or more alignments that minimize the distance between two clusters may exist.
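For a fixed alignment vector, the grid cell level comparison described above can be sketched in Python as follows. The encoding of a cell as an `(is_core, density, connectivity)` tuple, the equal weighting of the three features, and the division by the number of cells in C_a (so that the result stays in [0, 1] instead of being a raw sum) are all assumptions of this sketch, as are the function names.

```python
def rel_diff(x, y):
    """Relative difference in [0, 1]; an assumed per-feature measure."""
    if x == y:
        return 0.0
    lo = min(x, y)
    return 1.0 if lo <= 0 else min(1.0, abs(x - y) / lo)


def cell_difference(a, b):
    """a, b: (is_core, density, connectivity); features weighted equally."""
    status = 0.0 if a[0] == b[0] else 1.0
    return (status + rel_diff(a[1], b[1]) + rel_diff(a[2], b[2])) / 3.0


def grid_level_distance(cells_a, cells_b, alignment):
    """Distance of C_a to C_b under one alignment (location shifting vector).

    cells_a, cells_b: dicts mapping integer cell locations to feature tuples.
    A cell of C_a without a counterpart in C_b gets the maximum difference.
    """
    if not cells_a:
        return 0.0
    total = 0.0
    for coord, features in cells_a.items():
        shifted = tuple(c + s for c, s in zip(coord, alignment))
        other = cells_b.get(shifted)
        total += 1.0 if other is None else cell_difference(features, other)
    return total / len(cells_a)
```

In the position-sensitive case the alignment is the all-zero vector, so one call, and hence one scan of the cells, is enough. A search over alignments would call `grid_level_distance` repeatedly with different shift vectors and keep the smallest value.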
When given sufficient computation time, such as in an offline computation, one could apply an exhaustive search to find such an optimal alignment. In our system, for online computation, we use an A*-style anytime search algorithm to search for the best alignment within a certain computation time budget. In particular, we start with an alignment that makes the two clusters overlap well. Then we continuously search along the direction of the most promising nearby alignment, which gives the smallest distance so far. When the given computation time budget is reached, we stop searching and return the smallest distance found so far as the distance between the two clusters.

² An alignment for two Skeletal Grid Summarizations (SGS) is a location shifting vector. For example, given two three-dimensional clusters C_a and C_b, an alignment equal to [1,2,1] indicates that any skeletal grid cell in C_a with location vector equal to [x,y,z] corresponds to a skeletal grid cell in C_b with location vector equal to [x+1,y+2,z+1], if any.

8. EXPERIMENTAL EVALUATION

We conducted our experiments on a Dell desktop with an Intel Core2 2.2GHz processor and 3GB memory, running Windows 7 Professional. We implemented the algorithms in VC++.

Real Datasets. We used two real streaming datasets in our experiments. The first dataset, GMTI (Ground Moving Target Indicator) [6], records the real-time information on moving objects gathered by 24 different ground stations or aircraft in 6 hours from JointSTARS. It has around 100,000 records regarding the information on vehicles and helicopters (speed ranging from mph) moving in a certain geographic region. The second real dataset we use is the Stock Trading Traces data (STT) from [11], which has one million transaction records throughout the trading hours of a day. For the experiments that involve data sets larger than these two datasets, we append multiple rounds of the original data, varied by setting random differences on all attributes, until the desired size is reached.

Alternative Summarization Formats.
We compare our proposed Skeletal Grid Summarization (SGS) with three alternative cluster summarization formats. 1) The traditional Centroid-Radius-Density summarization (CRD). 2) Random Sampling Summarization (RSP). The RSP for each cluster is generated by sampling the cluster members at a certain sampling rate R. To compare RSP with our proposed SGS summarization, for each specific cluster in the experiment, R is always controlled so that its RSP has the same memory consumption as the SGS for the same cluster. 3) Skeletal Point Set (SkPS) summarization, our initial cluster summarization design proposed earlier in the paper.

8.1 Efficiency of Cluster Extraction + Summarization

In this experiment, we evaluate how many system resources are needed to generate each of the alternative cluster summarizations. Since our proposed solution, C-SGS, incorporates cluster extraction and summarization into a single process, we compare its performance with the following alternatives. 1) Extra-N: Extract clusters using the state-of-the-art algorithm Extra-N [16] but do not generate any cluster summarization. 2) Extra-N + CRD: Extract clusters using Extra-N and then generate the CRD for each extracted cluster. 3) Extra-N + RSP: Extract clusters using Extra-N and then generate the RSP for each extracted cluster. 4) Extra-N + SkPS: Extract clusters using the Extra-N algorithm and then generate the (approximated) SkPS for each cluster using the MG algorithm proposed in [9].

We first run each alternative method against the STT stream to extract clusters based on four dimensions, namely the transaction type (buy/sell), price, volume and time. To compare the performance of the alternatives when handling clusters with different characteristics, we use three different query parameter settings, namely case 1: (θ_r = 0.05, θ_c = 10), case 2: (θ_r = 0.1, θ_c = 8), case 3: (θ_r = 0.2, θ_c = 5). Also, for each case, we use three different window parameter settings, namely we fix the window size (win) for all three settings at 10K tuples, while varying the slide size (slide) to equal 0.1K, 1K and 5K tuples respectively.
For each case, we first verify the correctness³ of our proposed C-SGS cluster extraction method by comparing the clusters extracted by it in full representation with those extracted by the state-of-the-art technique Extra-N. In all the test cases, we found that the clusters extracted by C-SGS are identical to those extracted by Extra-N.

³ All clustering algorithms following the definition in [8] should produce the same clustering results given the same input object sequence.

For efficiency, we measure two major performance metrics for stream processing: 1) The average response time for each window (denoted as Response Time). For each window, we measure the average CPU time elapsed from the time that all new data have arrived to the time that all clusters have been output in both the full and summarized

representation. The average response times for each window shown in all cases are averaged over 10K windows. 2) The memory footprint, namely the peak memory utilization of each alternative, among the 10K windows.

As shown in Figure 7, compared to Extra-N, which extracts clusters only but does not generate any cluster summarization (the baseline case), the other four alternatives, each generating a specific type of cluster summarization, exhibit some overhead in terms of CPU time utilization. However, the overhead caused by C-SGS, Extra-N + CRD, and Extra-N + RSP is very modest, if not negligible. The reason for the modest overhead of Extra-N + CRD and Extra-N + RSP is obvious: CRD and RSP are very simple summarization formats that can easily be generated by at most two scans of the members of each cluster. The overhead caused by our proposed solution C-SGS is comparable with that of those two simple summarization methods. This is because the major computation needed for generating the SGS cluster summarization, namely determining the status of and connections among skeletal grid cells, is piggy-backed on the cluster extraction process itself. The CPU overhead of Extra-N + SkPS is significantly higher than that of the other alternatives. This is because generating SkPS is computationally very expensive [9]. For different window parameter settings, C-SGS has lower overhead for the settings with larger win/slide ratios. This is because the performance of Extra-N is affected by the increasing number of views that need to be maintained, which is equal to win/slide (see [16] for details), while the meta-data maintained by C-SGS and the corresponding maintenance effort are independent of this ratio.

Memory-wise, as shown in Figure 7, our proposed method C-SGS also exhibits very limited overhead in all test cases. This is because the process of generating SGS happens in place with the cluster extraction process. Similar performance is also observed in the same experiments using GMTI data.
We have also conducted an experiment showing the superiority of our proposed method when using time-based windows and under fluctuating input rates. The details of these experiments can be found in our technical report [18]. In conclusion, using our proposed C-SGS solution, we can efficiently generate the Skeletal Grid Summarization (SGS) for extracted clusters during the online clustering process, with very limited system resource overhead.

Figure 7: CPU time and Memory comparisons for generating alternative summarizations.

8.2 Efficiency of Cluster Matching Queries

Next, we study the performance of running the cluster matching queries using our proposed summarization format SGS and the other alternative summarization formats. We run three queries using the same pattern parameter settings as used in the previous experiment but with the same window parameter setting (Win = 10K, Slide = 1K) against the STT data using our proposed C-SGS method. We vary the size of the Pattern Base among 0.1K, 1K and 10K clusters. In each test case, we run each clustering query and archive all the clusters detected into the Pattern Base until the required number of archived clusters is reached. For each archived cluster, we also generate and keep the other three alternative cluster summarization formats for evaluating the other matching methods. Once the required number of clusters is archived, we stop archiving and randomly pick 100 newly detected clusters as to-be-matched clusters. For each to-be-matched cluster, we run four matching queries for it against the archived clusters, each using one alternative cluster summarization method and the corresponding distance metric. In particular, we implement a subtraction function to measure the distance between the CRDs of two clusters, which gives equal weight to the three cluster features captured in CRD, namely the centroid, radius and density. We use the subset matching algorithm presented in [15] to calculate the distance between the RSPs of two clusters.
We use the graph edit distance algorithm presented in [13] to calculate the distance between the SkPSs of two clusters. We give equal weight to all four features when measuring the distance between the SGSs of two clusters. For each Pattern Base size, we measure the average response time for all cluster matching queries and the memory space consumed by storing the cluster summarizations.

As shown in Figure 8, when matching against 0.1K clusters, the average response time for each cluster matching query using SGS is less than 0.1 second. For the 1K and 10K cases, the average response time for our solution is only around 0.5 seconds and 3 seconds. Such high efficiency is comparable with cluster matching using CRD, which is very fast because of its extremely simple matching mechanism (simply three subtraction operations). This is due to the design of SGS, which effectively summarizes the key features of each cluster on both the cluster and grid levels. In particular, by using our proposed two-phase matching strategy, the majority of the candidates in the pattern base are filtered out in the summarization matching phase. Thus, the more expensive grid level matching is only needed for a very small portion of the candidates. In our experiment, we found that on average only 6% of the candidate clusters necessitated the grid level match during the cluster matching process.

Memory-wise, SGS consumes only 0.12M, 1.38M and 12.24M of memory space to store 0.1K, 1K and 10K clusters respectively (Figure 8). In particular, each 4-dimensional skeletal grid cell only consumes 23 bytes: position: 16 bytes (4 integers), status: 1 byte (1 boolean), density: 4 bytes (1 integer), connection: 2 bytes (2^4 = 16 booleans). In all test cases, the average number of skeletal grid cells in each cluster is 68. Therefore, only about 1.5K of memory is needed to store the SGS of each cluster on average. Compared with the memory space needed for storing the full representation of the clusters, which needs 6.4M, 75.2M and 680.2M to store 0.1K,

12 1K and 10K clusters respectvely, the average compresson rate of SGS n our experment s around 98%. In concluson, our proposed soluton demonstrates hgh effcency for cluster matchng queres, whch s sgnfcantly better than matchng SkPS or RSP. Its performance s comparable wth matchng smple CRD cluster summarzaton. However, matchng CRD s shown to have a much worse cluster matchng qualty compared wth our proposed method of matchng SGS (see next experment below). Fgure 8: CPU tme and Memory comparson for cluster matchng queres usng alternatve cluster summarzaton methods 8.3 Qualty of Cluster Matchng To measure the qualty of cluster matchng usng alternatve summarzaton formats, we nvted 20 human analyts (all WPI graduate students) to vsually analyze the smlarty between the to-be-matched cluster and the matched clusters found for them usng one alternatve method. The analyss process s supported by VStream [14], a freeware multvarate data vsualzaton tool, whch has been shown to be effectve for helpng human analysts to observe and understand mult-dmenstonal clusters n streams. For each to-be-matched cluster, the analysts are asked to rate the top three smlar clusters found by each summarzaton format nto three categores, namely very smlar, smlar, and not smlar. Fgure 9: Smlar rate gven by users for matched clusters found by alternatve summarzatons As shown n Fgure 8.3, our proposed summarzaton method SGS demonstrates a hgh smlar rate, whch s sgnfcantly better than all the other alternatves. Ths ndcates that the human analysts agree wth most of the smlar clusters found usng SGS, whle dsagreeng on a large percentage of those found usng other alternatves. Ths shows the hgh effectveness of SGS summarzaton n terms of cluster matchng. Due to page lmt, the detaled expermental setup and result analyss of ths experment are omtted here but can be found n our techncal report [18]. 
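The memory figures reported in Section 8.2 can be cross-checked with a short calculation. The Python sketch below sums the per-cell byte layout, scales it by the average cell count, and derives the compression rate from the reported totals. All names are hypothetical, and packing the 16 connection flags into 2 bytes is inferred from the reported sizes rather than stated as the storage layout.

```python
# Per-cell layout as reported for 4-dimensional skeletal grid cells.
CELL_BYTES = {
    "position": 16,   # 4 integers, 4 bytes each
    "status": 1,      # 1 boolean
    "density": 4,     # 1 integer
    "connection": 2,  # 2^4 = 16 flags, assumed packed one bit each
}

AVG_CELLS_PER_CLUSTER = 68  # average reported across all test cases

# Reported totals in megabytes: (SGS summaries, full cluster representation).
REPORTED_MB = {
    "0.1K": (0.12, 6.4),
    "1K": (1.38, 75.2),
    "10K": (12.24, 680.2),
}


def bytes_per_cell() -> int:
    """Total size of one skeletal grid cell in bytes."""
    return sum(CELL_BYTES.values())


def bytes_per_cluster(cells: int = AVG_CELLS_PER_CLUSTER) -> int:
    """Size of one cluster's SGS, counting grid cells only."""
    return cells * bytes_per_cell()


def compression_rate(summary_mb: float, full_mb: float) -> float:
    """Fraction of memory saved by storing the summary instead of the full cluster."""
    return 1.0 - summary_mb / full_mb


if __name__ == "__main__":
    print(bytes_per_cell(), "bytes per cell")
    print(bytes_per_cluster(), "bytes per cluster")
    for size, (sgs_mb, full_mb) in REPORTED_MB.items():
        print(size, f"{compression_rate(sgs_mb, full_mb):.1%}")
```

Under these assumptions the per-cell total is 23 bytes and an average cluster needs about 1.5K, which matches the figures in the text; each of the three reported size pairs gives a compression rate of roughly 98%.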
We also conducted a series of experiments to confirm both the efficiency and effectiveness of cluster matching queries when using SGS with different resolutions. The details of those experiments can be found in our technical report [18].

9. CONCLUSION

In this work, we present a framework to support summarization and matching of density-based clusters in streaming environments. First, our work solves several open problems for density-based cluster analysis, namely, designing a descriptive yet compact summarization method for such clusters. Second, we present an efficient computation strategy to quickly summarize the detected clusters into SGS during the online clustering. Lastly, we design a cluster archiving and matching mechanism, which allows the analysts to submit cluster matching queries to find similar clusters detected earlier in the stream history. Our experimental study demonstrates the clear superiority of our proposed methods on both efficiency and effectiveness.

10. REFERENCES

[1] C. C. Aggarwal, J. Han, J. Wang, and P. S. Yu. A framework for clustering evolving data streams. In VLDB, pages 81-92.
[2] A. Arasu, S. Babu, and J. Widom. The CQL continuous query language: semantic foundations and query execution. VLDB J., 15(2).
[3] F. Cao, M. Ester, W. Qian, and A. Zhou. Density-based clustering over an evolving data stream with noise. In SDM.
[4] Y. Chen and L. Tu. Density-based clustering for real-time stream data. In KDD.
[5] B.-R. Dai, J.-W. Huang, M.-Y. Yeh, and M.-S. Chen. Adaptive clustering for multiple evolving streams. IEEE Trans. Knowl. Data Eng., 18(9).
[6] J. N. Entzminger, C. A. Fowler, and W. J. Kenneally. Jointstars and gmti: Past, present and future. IEEE Trans on Aero and Elec Sys, 35(2).
[7] M. Ester, H. Kriegel, J. Sander, M. Wimmer, and X. Xu. Inc. clustering for mining in a data warehousing environment. In VLDB.
[8] M. Ester, H. Kriegel, J. Sander, and X. Xu. A density-based algorithm for discovering clusters in large spatial databases with noise. In KDD.
[9] S. Guha and S. Khuller. Approx. algo. for connected dominating sets. Algorithmica, 20.
[10] J. A. Hartigan and M. A. Wong. A k-means clustering algorithm. Applied Statistics, 28(1).
[11] I. INETATS. Stock trade traces.
[12] L. Lelis and J. Sander. Semi-supervised density-based clustering. In ICDM.
[13] M. Neuhaus, K. Riesen, and H. Bunke. Fast suboptimal algorithms for the computation of graph edit distance. In SSSPR.
[14] D. Yang, Z. Guo, Z. Xie, E. A. Rundensteiner, and M. O. Ward. Interactive visual exploration of neighbor-based patterns in data streams. In SIGMOD.
[15] D. Yang, E. A. Rundensteiner, and M. O. Ward. Nugget discovery in visual exploration by query consolidation. In CIKM.
[16] D. Yang, E. A. Rundensteiner, and M. O. Ward. Neighbor-based pattern detection for windows over streaming data. In EDBT.
[17] D. Yang, E. A. Rundensteiner, and M. O. Ward. A shared execution strategy for multiple pattern mining requests over streams. PVLDB, 2(1).
[18] D. Yang, E. A. Rundensteiner, and M. O. Ward. Summarization and matching of complex patterns in streaming environment. WPI-CS-TR-11-04, dyang/wpicstr1104.pdf.
[19] T. Zhang, R. Ramakrishnan, and M. Livny. Birch: an efficient data clustering method for very large databases. In ACM SIGMOD.

Hierarchical clustering for gene expression data analysis

Hierarchical clustering for gene expression data analysis Herarchcal clusterng for gene expresson data analyss Gorgo Valentn e-mal: valentn@ds.unm.t Clusterng of Mcroarray Data. Clusterng of gene expresson profles (rows) => dscovery of co-regulated and functonally

More information

6.854 Advanced Algorithms Petar Maymounkov Problem Set 11 (November 23, 2005) With: Benjamin Rossman, Oren Weimann, and Pouya Kheradpour

6.854 Advanced Algorithms Petar Maymounkov Problem Set 11 (November 23, 2005) With: Benjamin Rossman, Oren Weimann, and Pouya Kheradpour 6.854 Advanced Algorthms Petar Maymounkov Problem Set 11 (November 23, 2005) Wth: Benjamn Rossman, Oren Wemann, and Pouya Kheradpour Problem 1. We reduce vertex cover to MAX-SAT wth weghts, such that the

More information

Machine Learning: Algorithms and Applications

Machine Learning: Algorithms and Applications 14/05/1 Machne Learnng: Algorthms and Applcatons Florano Zn Free Unversty of Bozen-Bolzano Faculty of Computer Scence Academc Year 011-01 Lecture 10: 14 May 01 Unsupervsed Learnng cont Sldes courtesy of

More information

An Optimal Algorithm for Prufer Codes *

An Optimal Algorithm for Prufer Codes * J. Software Engneerng & Applcatons, 2009, 2: 111-115 do:10.4236/jsea.2009.22016 Publshed Onlne July 2009 (www.scrp.org/journal/jsea) An Optmal Algorthm for Prufer Codes * Xaodong Wang 1, 2, Le Wang 3,

More information

Parallelism for Nested Loops with Non-uniform and Flow Dependences

Parallelism for Nested Loops with Non-uniform and Flow Dependences Parallelsm for Nested Loops wth Non-unform and Flow Dependences Sam-Jn Jeong Dept. of Informaton & Communcaton Engneerng, Cheonan Unversty, 5, Anseo-dong, Cheonan, Chungnam, 330-80, Korea. seong@cheonan.ac.kr

More information

Subspace clustering. Clustering. Fundamental to all clustering techniques is the choice of distance measure between data points;

Subspace clustering. Clustering. Fundamental to all clustering techniques is the choice of distance measure between data points; Subspace clusterng Clusterng Fundamental to all clusterng technques s the choce of dstance measure between data ponts; D q ( ) ( ) 2 x x = x x, j k = 1 k jk Squared Eucldean dstance Assumpton: All features

More information

SAO: A Stream Index for Answering Linear Optimization Queries

SAO: A Stream Index for Answering Linear Optimization Queries SAO: A Stream Index for Answerng near Optmzaton Queres Gang uo Kun-ung Wu Phlp S. Yu IBM T.J. Watson Research Center {luog, klwu, psyu}@us.bm.com Abstract near optmzaton queres retreve the top-k tuples

More information

R s s f. m y s. SPH3UW Unit 7.3 Spherical Concave Mirrors Page 1 of 12. Notes

R s s f. m y s. SPH3UW Unit 7.3 Spherical Concave Mirrors Page 1 of 12. Notes SPH3UW Unt 7.3 Sphercal Concave Mrrors Page 1 of 1 Notes Physcs Tool box Concave Mrror If the reflectng surface takes place on the nner surface of the sphercal shape so that the centre of the mrror bulges

More information

Problem Set 3 Solutions

Problem Set 3 Solutions Introducton to Algorthms October 4, 2002 Massachusetts Insttute of Technology 6046J/18410J Professors Erk Demane and Shaf Goldwasser Handout 14 Problem Set 3 Solutons (Exercses were not to be turned n,

More information

Steps for Computing the Dissimilarity, Entropy, Herfindahl-Hirschman and. Accessibility (Gravity with Competition) Indices

Steps for Computing the Dissimilarity, Entropy, Herfindahl-Hirschman and. Accessibility (Gravity with Competition) Indices Steps for Computng the Dssmlarty, Entropy, Herfndahl-Hrschman and Accessblty (Gravty wth Competton) Indces I. Dssmlarty Index Measurement: The followng formula can be used to measure the evenness between

More information

A Fast Content-Based Multimedia Retrieval Technique Using Compressed Data

A Fast Content-Based Multimedia Retrieval Technique Using Compressed Data A Fast Content-Based Multmeda Retreval Technque Usng Compressed Data Borko Furht and Pornvt Saksobhavvat NSF Multmeda Laboratory Florda Atlantc Unversty, Boca Raton, Florda 3343 ABSTRACT In ths paper,

More information

CSCI 104 Sorting Algorithms. Mark Redekopp David Kempe

CSCI 104 Sorting Algorithms. Mark Redekopp David Kempe CSCI 104 Sortng Algorthms Mark Redekopp Davd Kempe Algorthm Effcency SORTING 2 Sortng If we have an unordered lst, sequental search becomes our only choce If we wll perform a lot of searches t may be benefcal

More information

A Binarization Algorithm specialized on Document Images and Photos

A Binarization Algorithm specialized on Document Images and Photos A Bnarzaton Algorthm specalzed on Document mages and Photos Ergna Kavalleratou Dept. of nformaton and Communcaton Systems Engneerng Unversty of the Aegean kavalleratou@aegean.gr Abstract n ths paper, a

More information

TN348: Openlab Module - Colocalization

TN348: Openlab Module - Colocalization TN348: Openlab Module - Colocalzaton Topc The Colocalzaton module provdes the faclty to vsualze and quantfy colocalzaton between pars of mages. The Colocalzaton wndow contans a prevew of the two mages

More information

Cluster Analysis of Electrical Behavior

Cluster Analysis of Electrical Behavior Journal of Computer and Communcatons, 205, 3, 88-93 Publshed Onlne May 205 n ScRes. http://www.scrp.org/ournal/cc http://dx.do.org/0.4236/cc.205.350 Cluster Analyss of Electrcal Behavor Ln Lu Ln Lu, School

More information

Compiler Design. Spring Register Allocation. Sample Exercises and Solutions. Prof. Pedro C. Diniz

Compiler Design. Spring Register Allocation. Sample Exercises and Solutions. Prof. Pedro C. Diniz Compler Desgn Sprng 2014 Regster Allocaton Sample Exercses and Solutons Prof. Pedro C. Dnz USC / Informaton Scences Insttute 4676 Admralty Way, Sute 1001 Marna del Rey, Calforna 90292 pedro@s.edu Regster

More information

Course Introduction. Algorithm 8/31/2017. COSC 320 Advanced Data Structures and Algorithms. COSC 320 Advanced Data Structures and Algorithms

Course Introduction. Algorithm 8/31/2017. COSC 320 Advanced Data Structures and Algorithms. COSC 320 Advanced Data Structures and Algorithms Course Introducton Course Topcs Exams, abs, Proects A quc loo at a few algorthms 1 Advanced Data Structures and Algorthms Descrpton: We are gong to dscuss algorthm complexty analyss, algorthm desgn technques

More information

An Entropy-Based Approach to Integrated Information Needs Assessment

An Entropy-Based Approach to Integrated Information Needs Assessment Dstrbuton Statement A: Approved for publc release; dstrbuton s unlmted. An Entropy-Based Approach to ntegrated nformaton Needs Assessment June 8, 2004 Wllam J. Farrell Lockheed Martn Advanced Technology

More information

Content Based Image Retrieval Using 2-D Discrete Wavelet with Texture Feature with Different Classifiers

Content Based Image Retrieval Using 2-D Discrete Wavelet with Texture Feature with Different Classifiers IOSR Journal of Electroncs and Communcaton Engneerng (IOSR-JECE) e-issn: 78-834,p- ISSN: 78-8735.Volume 9, Issue, Ver. IV (Mar - Apr. 04), PP 0-07 Content Based Image Retreval Usng -D Dscrete Wavelet wth

More information

Simulation Based Analysis of FAST TCP using OMNET++

Simulation Based Analysis of FAST TCP using OMNET++ Smulaton Based Analyss of FAST TCP usng OMNET++ Umar ul Hassan 04030038@lums.edu.pk Md Term Report CS678 Topcs n Internet Research Sprng, 2006 Introducton Internet traffc s doublng roughly every 3 months

More information

Wishing you all a Total Quality New Year!

Wishing you all a Total Quality New Year! Total Qualty Management and Sx Sgma Post Graduate Program 214-15 Sesson 4 Vnay Kumar Kalakband Assstant Professor Operatons & Systems Area 1 Wshng you all a Total Qualty New Year! Hope you acheve Sx sgma

More information

SCALABLE AND VISUALIZATION-ORIENTED CLUSTERING FOR EXPLORATORY SPATIAL ANALYSIS

SCALABLE AND VISUALIZATION-ORIENTED CLUSTERING FOR EXPLORATORY SPATIAL ANALYSIS SCALABLE AND VISUALIZATION-ORIENTED CLUSTERING FOR EXPLORATORY SPATIAL ANALYSIS J.H.Guan, F.B.Zhu, F.L.Ban a School of Computer, Spatal Informaton & Dgtal Engneerng Center, Wuhan Unversty, Wuhan, 430079,

More information

A Fast Visual Tracking Algorithm Based on Circle Pixels Matching

A Fast Visual Tracking Algorithm Based on Circle Pixels Matching A Fast Vsual Trackng Algorthm Based on Crcle Pxels Matchng Zhqang Hou hou_zhq@sohu.com Chongzhao Han czhan@mal.xjtu.edu.cn Ln Zheng Abstract: A fast vsual trackng algorthm based on crcle pxels matchng

More information

Outline. Self-Organizing Maps (SOM) US Hebbian Learning, Cntd. The learning rule is Hebbian like:

Outline. Self-Organizing Maps (SOM) US Hebbian Learning, Cntd. The learning rule is Hebbian like: Self-Organzng Maps (SOM) Turgay İBRİKÇİ, PhD. Outlne Introducton Structures of SOM SOM Archtecture Neghborhoods SOM Algorthm Examples Summary 1 2 Unsupervsed Hebban Learnng US Hebban Learnng, Cntd 3 A

More information

Helsinki University Of Technology, Systems Analysis Laboratory Mat Independent research projects in applied mathematics (3 cr)

Helsinki University Of Technology, Systems Analysis Laboratory Mat Independent research projects in applied mathematics (3 cr) Helsnk Unversty Of Technology, Systems Analyss Laboratory Mat-2.08 Independent research projects n appled mathematcs (3 cr) "! #$&% Antt Laukkanen 506 R ajlaukka@cc.hut.f 2 Introducton...3 2 Multattrbute

More information

Mathematics 256 a course in differential equations for engineering students

Mathematics 256 a course in differential equations for engineering students Mathematcs 56 a course n dfferental equatons for engneerng students Chapter 5. More effcent methods of numercal soluton Euler s method s qute neffcent. Because the error s essentally proportonal to the

More information

Constructing Minimum Connected Dominating Set: Algorithmic approach

Constructing Minimum Connected Dominating Set: Algorithmic approach Constructng Mnmum Connected Domnatng Set: Algorthmc approach G.N. Puroht and Usha Sharma Centre for Mathematcal Scences, Banasthal Unversty, Rajasthan 304022 usha.sharma94@yahoo.com Abstract: Connected

More information

Video Proxy System for a Large-scale VOD System (DINA)

Video Proxy System for a Large-scale VOD System (DINA) Vdeo Proxy System for a Large-scale VOD System (DINA) KWUN-CHUNG CHAN #, KWOK-WAI CHEUNG *# #Department of Informaton Engneerng *Centre of Innovaton and Technology The Chnese Unversty of Hong Kong SHATIN,

More information

Tsinghua University at TAC 2009: Summarizing Multi-documents by Information Distance

Tsinghua University at TAC 2009: Summarizing Multi-documents by Information Distance Tsnghua Unversty at TAC 2009: Summarzng Mult-documents by Informaton Dstance Chong Long, Mnle Huang, Xaoyan Zhu State Key Laboratory of Intellgent Technology and Systems, Tsnghua Natonal Laboratory for

More information

Resource and Virtual Function Status Monitoring in Network Function Virtualization Environment

Resource and Virtual Function Status Monitoring in Network Function Virtualization Environment Journal of Physcs: Conference Seres PAPER OPEN ACCESS Resource and Vrtual Functon Status Montorng n Network Functon Vrtualzaton Envronment To cte ths artcle: MS Ha et al 2018 J. Phys.: Conf. Ser. 1087

More information

Petri Net Based Software Dependability Engineering

Petri Net Based Software Dependability Engineering Proc. RELECTRONIC 95, Budapest, pp. 181-186; October 1995 Petr Net Based Software Dependablty Engneerng Monka Hener Brandenburg Unversty of Technology Cottbus Computer Scence Insttute Postbox 101344 D-03013

More information

The Greedy Method. Outline and Reading. Change Money Problem. Greedy Algorithms. Applications of the Greedy Strategy. The Greedy Method Technique

The Greedy Method. Outline and Reading. Change Money Problem. Greedy Algorithms. Applications of the Greedy Strategy. The Greedy Method Technique //00 :0 AM Outlne and Readng The Greedy Method The Greedy Method Technque (secton.) Fractonal Knapsack Problem (secton..) Task Schedulng (secton..) Mnmum Spannng Trees (secton.) Change Money Problem Greedy

More information

A mathematical programming approach to the analysis, design and scheduling of offshore oilfields

A mathematical programming approach to the analysis, design and scheduling of offshore oilfields 17 th European Symposum on Computer Aded Process Engneerng ESCAPE17 V. Plesu and P.S. Agach (Edtors) 2007 Elsever B.V. All rghts reserved. 1 A mathematcal programmng approach to the analyss, desgn and

More information

A MOVING MESH APPROACH FOR SIMULATION BUDGET ALLOCATION ON CONTINUOUS DOMAINS

A MOVING MESH APPROACH FOR SIMULATION BUDGET ALLOCATION ON CONTINUOUS DOMAINS Proceedngs of the Wnter Smulaton Conference M E Kuhl, N M Steger, F B Armstrong, and J A Jones, eds A MOVING MESH APPROACH FOR SIMULATION BUDGET ALLOCATION ON CONTINUOUS DOMAINS Mark W Brantley Chun-Hung

More information

Lecture 5: Multilayer Perceptrons

Lecture 5: Multilayer Perceptrons Lecture 5: Multlayer Perceptrons Roger Grosse 1 Introducton So far, we ve only talked about lnear models: lnear regresson and lnear bnary classfers. We noted that there are functons that can t be represented

More information

A Deflected Grid-based Algorithm for Clustering Analysis

A Deflected Grid-based Algorithm for Clustering Analysis A Deflected Grd-based Algorthm for Clusterng Analyss NANCY P. LIN, CHUNG-I CHANG, HAO-EN CHUEH, HUNG-JEN CHEN, WEI-HUA HAO Department of Computer Scence and Informaton Engneerng Tamkang Unversty 5 Yng-chuan

More information

Analysis of Continuous Beams in General

Analysis of Continuous Beams in General Analyss of Contnuous Beams n General Contnuous beams consdered here are prsmatc, rgdly connected to each beam segment and supported at varous ponts along the beam. onts are selected at ponts of support,

More information

For instance, ; the five basic number-sets are increasingly more n A B & B A A = B (1)

For instance, ; the five basic number-sets are increasingly more n A B & B A A = B (1) Secton 1.2 Subsets and the Boolean operatons on sets If every element of the set A s an element of the set B, we say that A s a subset of B, or that A s contaned n B, or that B contans A, and we wrte A

More information

Problem Definitions and Evaluation Criteria for Computational Expensive Optimization

Problem Definitions and Evaluation Criteria for Computational Expensive Optimization Problem efntons and Evaluaton Crtera for Computatonal Expensve Optmzaton B. Lu 1, Q. Chen and Q. Zhang 3, J. J. Lang 4, P. N. Suganthan, B. Y. Qu 6 1 epartment of Computng, Glyndwr Unversty, UK Faclty

More information

The Codesign Challenge

The Codesign Challenge ECE 4530 Codesgn Challenge Fall 2007 Hardware/Software Codesgn The Codesgn Challenge Objectves In the codesgn challenge, your task s to accelerate a gven software reference mplementaton as fast as possble.

More information

Summarizing Data using Bottom-k Sketches

Summarizing Data using Bottom-k Sketches Summarzng Data usng Bottom-k Sketches Edth Cohen AT&T Labs Research 8 Park Avenue Florham Park, NJ 7932, USA edth@research.att.com Ham Kaplan School of Computer Scence Tel Avv Unversty Tel Avv, Israel

More information

Chapter 6 Programmng the fnte element method Inow turn to the man subject of ths book: The mplementaton of the fnte element algorthm n computer programs. In order to make my dscusson as straghtforward

More information

Learning from Multiple Related Data Streams with Asynchronous Flowing Speeds

Learning from Multiple Related Data Streams with Asynchronous Flowing Speeds Learnng from Multple Related Data Streams wth Asynchronous Flowng Speeds Zh Qao, Peng Zhang, Jng He, Jnghua Yan, L Guo Insttute of Computng Technology, Chnese Academy of Scences, Bejng, 100190, Chna. School

More information

MULTISPECTRAL IMAGES CLASSIFICATION BASED ON KLT AND ATR AUTOMATIC TARGET RECOGNITION

MULTISPECTRAL IMAGES CLASSIFICATION BASED ON KLT AND ATR AUTOMATIC TARGET RECOGNITION MULTISPECTRAL IMAGES CLASSIFICATION BASED ON KLT AND ATR AUTOMATIC TARGET RECOGNITION Paulo Quntlano 1 & Antono Santa-Rosa 1 Federal Polce Department, Brasla, Brazl. E-mals: quntlano.pqs@dpf.gov.br and

More information

FEATURE EXTRACTION. Dr. K.Vijayarekha. Associate Dean School of Electrical and Electronics Engineering SASTRA University, Thanjavur

FEATURE EXTRACTION. Dr. K.Vijayarekha. Associate Dean School of Electrical and Electronics Engineering SASTRA University, Thanjavur FEATURE EXTRACTION Dr. K.Vjayarekha Assocate Dean School of Electrcal and Electroncs Engneerng SASTRA Unversty, Thanjavur613 41 Jont Intatve of IITs and IISc Funded by MHRD Page 1 of 8 Table of Contents

More information

CMPS 10 Introduction to Computer Science Lecture Notes

CMPS 10 Introduction to Computer Science Lecture Notes CPS 0 Introducton to Computer Scence Lecture Notes Chapter : Algorthm Desgn How should we present algorthms? Natural languages lke Englsh, Spansh, or French whch are rch n nterpretaton and meanng are not

More information

arxiv: v2 [cs.db] 18 Sep 2017

arxiv: v2 [cs.db] 18 Sep 2017 Effcent Approxmate Query Answerng over Sensor Data wth Determnstc Error Guarantees arxv:1707.01414v2 [cs.db] 18 Sep 2017 ABSTRACT Jaquelne Brto UC San Dego jabrto@cs.ucsd.edu Yanns Katss UC San Dego katss@cs.ucsd.edu

More information

Reducing Frame Rate for Object Tracking

Reducing Frame Rate for Object Tracking Reducng Frame Rate for Object Trackng Pavel Korshunov 1 and We Tsang Oo 2 1 Natonal Unversty of Sngapore, Sngapore 11977, pavelkor@comp.nus.edu.sg 2 Natonal Unversty of Sngapore, Sngapore 11977, oowt@comp.nus.edu.sg

More information

Module Management Tool in Software Development Organizations

Module Management Tool in Software Development Organizations Journal of Computer Scence (5): 8-, 7 ISSN 59-66 7 Scence Publcatons Management Tool n Software Development Organzatons Ahmad A. Al-Rababah and Mohammad A. Al-Rababah Faculty of IT, Al-Ahlyyah Amman Unversty,

More information

Outline. Type of Machine Learning. Examples of Application. Unsupervised Learning

Outline. Type of Machine Learning. Examples of Application. Unsupervised Learning Outlne Artfcal Intellgence and ts applcatons Lecture 8 Unsupervsed Learnng Professor Danel Yeung danyeung@eee.org Dr. Patrck Chan patrckchan@eee.org South Chna Unversty of Technology, Chna Introducton

More information

Efficient Distributed File System (EDFS)

Efficient Distributed File System (EDFS) Effcent Dstrbuted Fle System (EDFS) (Sem-Centralzed) Debessay(Debsh) Fesehaye, Rahul Malk & Klara Naherstedt Unversty of Illnos-Urbana Champagn Contents Problem Statement, Related Work, EDFS Desgn Rate

More information

Classification Method in Integrated Information Network Using Vector Image Comparison

Classification Method in Integrated Information Network Using Vector Image Comparison Sensors & Transducers 2014 by IFSA Publshng, S. L. http://www.sensorsportal.com Classfcaton Method n Integrated Informaton Network Usng Vector Image Comparson Zhou Yuan Guangdong Polytechnc Normal Unversty

More information

Related-Mode Attacks on CTR Encryption Mode

Related-Mode Attacks on CTR Encryption Mode Internatonal Journal of Network Securty, Vol.4, No.3, PP.282 287, May 2007 282 Related-Mode Attacks on CTR Encrypton Mode Dayn Wang, Dongda Ln, and Wenlng Wu (Correspondng author: Dayn Wang) Key Laboratory

More information

Query Clustering Using a Hybrid Query Similarity Measure

Query Clustering Using a Hybrid Query Similarity Measure Query clusterng usng a hybrd query smlarty measure Fu. L., Goh, D.H., & Foo, S. (2004). WSEAS Transacton on Computers, 3(3), 700-705. Query Clusterng Usng a Hybrd Query Smlarty Measure Ln Fu, Don Hoe-Lan

More information

CE 221 Data Structures and Algorithms

CE 221 Data Structures and Algorithms CE 1 ata Structures and Algorthms Chapter 4: Trees BST Text: Read Wess, 4.3 Izmr Unversty of Economcs 1 The Search Tree AT Bnary Search Trees An mportant applcaton of bnary trees s n searchng. Let us assume

More information

Modeling, Manipulating, and Visualizing Continuous Volumetric Data: A Novel Spline-based Approach

Modeling, Manipulating, and Visualizing Continuous Volumetric Data: A Novel Spline-based Approach Modelng, Manpulatng, and Vsualzng Contnuous Volumetrc Data: A Novel Splne-based Approach Jng Hua Center for Vsual Computng, Department of Computer Scence SUNY at Stony Brook Talk Outlne Introducton and

More information

Overview. Basic Setup [9] Motivation and Tasks. Modularization 2008/2/20 IMPROVED COVERAGE CONTROL USING ONLY LOCAL INFORMATION

Overview. Basic Setup [9] Motivation and Tasks. Modularization 2008/2/20 IMPROVED COVERAGE CONTROL USING ONLY LOCAL INFORMATION Overvew 2 IMPROVED COVERAGE CONTROL USING ONLY LOCAL INFORMATION Introducton Mult- Smulator MASIM Theoretcal Work and Smulaton Results Concluson Jay Wagenpfel, Adran Trachte Motvaton and Tasks Basc Setup

More information

Private Information Retrieval (PIR)

Private Information Retrieval (PIR) 2 Levente Buttyán Problem formulaton Alce wants to obtan nformaton from a database, but she does not want the database to learn whch nformaton she wanted e.g., Alce s an nvestor queryng a stock-market

More information

Load-Balanced Anycast Routing

Load-Balanced Anycast Routing Load-Balanced Anycast Routng Chng-Yu Ln, Jung-Hua Lo, and Sy-Yen Kuo Department of Electrcal Engneerng atonal Tawan Unversty, Tape, Tawan sykuo@cc.ee.ntu.edu.tw Abstract For fault-tolerance and load-balance

More information

Object-Based Techniques for Image Retrieval

Object-Based Techniques for Image Retrieval 54 Zhang, Gao, & Luo Chapter VII Object-Based Technques for Image Retreval Y. J. Zhang, Tsnghua Unversty, Chna Y. Y. Gao, Tsnghua Unversty, Chna Y. Luo, Tsnghua Unversty, Chna ABSTRACT To overcome the

More information

An Iterative Solution Approach to Process Plant Layout using Mixed Integer Optimisation

An Iterative Solution Approach to Process Plant Layout using Mixed Integer Optimisation 17 th European Symposum on Computer Aded Process Engneerng ESCAPE17 V. Plesu and P.S. Agach (Edtors) 2007 Elsever B.V. All rghts reserved. 1 An Iteratve Soluton Approach to Process Plant Layout usng Mxed

More information

1. Introduction. Abstract

1. Introduction. Abstract Image Retreval Usng a Herarchy of Clusters Danela Stan & Ishwar K. Seth Intellgent Informaton Engneerng Laboratory, Department of Computer Scence & Engneerng, Oaland Unversty, Rochester, Mchgan 48309-4478

More information

PRÉSENTATIONS DE PROJETS

PRÉSENTATIONS DE PROJETS PRÉSENTATIONS DE PROJETS Rex Onlne (V. Atanasu) What s Rex? Rex s an onlne browser for collectons of wrtten documents [1]. Asde ths core functon t has however many other applcatons that make t nterestng

More information

Classifier Selection Based on Data Complexity Measures *

Classifier Selection Based on Data Complexity Measures * Classfer Selecton Based on Data Complexty Measures * Edth Hernández-Reyes, J.A. Carrasco-Ochoa, and J.Fco. Martínez-Trndad Natonal Insttute for Astrophyscs, Optcs and Electroncs, Lus Enrque Erro No.1 Sta.

More information

Signature and Lexicon Pruning Techniques

Signature and Lexicon Pruning Techniques Sgnature and Lexcon Prunng Technques Srnvas Palla, Hansheng Le, Venu Govndaraju Centre for Unfed Bometrcs and Sensors Unversty at Buffalo {spalla2, hle, govnd}@cedar.buffalo.edu Abstract Handwrtten word

More information

Detection of an Object by using Principal Component Analysis

Detection of an Object by using Principal Component Analysis Detecton of an Object by usng Prncpal Component Analyss 1. G. Nagaven, 2. Dr. T. Sreenvasulu Reddy 1. M.Tech, Department of EEE, SVUCE, Trupath, Inda. 2. Assoc. Professor, Department of ECE, SVUCE, Trupath,

More information

APPLICATION OF A COMPUTATIONALLY EFFICIENT GEOSTATISTICAL APPROACH TO CHARACTERIZING VARIABLY SPACED WATER-TABLE DATA

APPLICATION OF A COMPUTATIONALLY EFFICIENT GEOSTATISTICAL APPROACH TO CHARACTERIZING VARIABLY SPACED WATER-TABLE DATA RFr"W/FZD JAN 2 4 1995 OST control # 1385 John J Q U ~ M Argonne Natonal Laboratory Argonne, L 60439 Tel: 708-252-5357, Fax: 708-252-3 611 APPLCATON OF A COMPUTATONALLY EFFCENT GEOSTATSTCAL APPROACH TO

More information

SLAM Summer School 2006 Practical 2: SLAM using Monocular Vision

SLAM Summer School 2006 Practical 2: SLAM using Monocular Vision SLAM Summer School 2006 Practcal 2: SLAM usng Monocular Vson Javer Cvera, Unversty of Zaragoza Andrew J. Davson, Imperal College London J.M.M Montel, Unversty of Zaragoza. josemar@unzar.es, jcvera@unzar.es,

More information

TPL-Aware Displacement-driven Detailed Placement Refinement with Coloring Constraints

TPL-Aware Displacement-driven Detailed Placement Refinement with Coloring Constraints TPL-ware Dsplacement-drven Detaled Placement Refnement wth Colorng Constrants Tao Ln Iowa State Unversty tln@astate.edu Chrs Chu Iowa State Unversty cnchu@astate.edu BSTRCT To mnmze the effect of process

More information

Term Weighting Classification System Using the Chi-square Statistic for the Classification Subtask at NTCIR-6 Patent Retrieval Task

Term Weighting Classification System Using the Chi-square Statistic for the Classification Subtask at NTCIR-6 Patent Retrieval Task Proceedngs of NTCIR-6 Workshop Meetng, May 15-18, 2007, Tokyo, Japan Term Weghtng Classfcaton System Usng the Ch-square Statstc for the Classfcaton Subtask at NTCIR-6 Patent Retreval Task Kotaro Hashmoto

More information

Learning-Based Top-N Selection Query Evaluation over Relational Databases

Learning-Based Top-N Selection Query Evaluation over Relational Databases Learnng-Based Top-N Selecton Query Evaluaton over Relatonal Databases Lang Zhu *, Wey Meng ** * School of Mathematcs and Computer Scence, Hebe Unversty, Baodng, Hebe 071002, Chna, zhu@mal.hbu.edu.cn **

More information

Local Quaternary Patterns and Feature Local Quaternary Patterns

Local Quaternary Patterns and Feature Local Quaternary Patterns Local Quaternary Patterns and Feature Local Quaternary Patterns Jayu Gu and Chengjun Lu The Department of Computer Scence, New Jersey Insttute of Technology, Newark, NJ 0102, USA Abstract - Ths paper presents

More information

VRT012 User s guide V0.1. Address: Žirmūnų g. 27, Vilnius LT-09105, Phone: (370-5) , Fax: (370-5) ,

VRT012 User s guide V0.1. Address: Žirmūnų g. 27, Vilnius LT-09105, Phone: (370-5) , Fax: (370-5) , VRT012 User s gude V0.1 Thank you for purchasng our product. We hope ths user-frendly devce wll be helpful n realsng your deas and brngng comfort to your lfe. Please take few mnutes to read ths manual

More information
