Should SDBMS Support a Join Index?: A Case study from CrimeStat


Pradeep Mohan, Department of Computer Science, University of Minnesota, mohan@cs.umn.edu
Shashi Shekhar, Department of Computer Science, University of Minnesota, shekhar@cs.umn.edu
Ned Levine, Ned Levine and Associates, Houston, TX, ned@nedlevine.com
Ronald E. Wilson, National Institute of Justice, Washington D.C., Ronald.Wilson@usdoj.gov
Betsy George, Department of Computer Science, University of Minnesota, bgeorge@cs.umn.edu
Mete Celik, Department of Computer Science, University of Minnesota, mcelik@cs.umn.edu

ABSTRACT
Given a spatial crime data warehouse that is updated infrequently, a set of operations O, and constraints on storage and update overheads, the index type selection problem is to find a set of index types that can reduce the I/O cost of the set of operations. The index type selection problem is important for improving user experience and system resource utilization in crucial spatial statistics application domains such as mapping and analysis for public safety, public health, ecology, and transportation. This is because the response time of frequent queries based on the set of operations can be improved significantly by an effective choice of index types. Many spatial statistical queries in these application domains make use of a spatial neighborhood matrix, known as W in spatial statistics, which can be thought of as a spatial self-join in spatial database terminology. Currently supported index types such as the B-Tree and R-Tree families do not adequately support spatial statistical analysis because they require on-the-fly computation of the W-Matrix, slowing down spatial statistical analysis. In contrast, this paper argues that Spatial Database Management Systems (SDBMS) should support a join index to materialize the W-Matrix and eliminate on-the-fly computation of the common self-join. A detailed case study using the popular spatial statistical software package for public safety, namely CrimeStat, shows that join indices can significantly speed up spatial analysis such as the calculation of Ripley's K and the identification of hotspots.
Categories and Subject Descriptors
H.2.2 [PHYSICAL DESIGN]: Access methods

General Terms
Design, Experimentation

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. ACM GIS '08, November 5-7, 2008. Irvine, CA, USA (c) 2008 ACM ISBN /08/...$5.00

Keywords
Join Index, Spatial Statistics, W Matrix, Self-Join.

1. INTRODUCTION
Given a spatial crime data warehouse that is updated infrequently and a set of operations, the spatial index type selection problem is to find a set of spatial index types that can reduce the I/O cost of the set of operations under given constraints of storage and update overheads. The index type selection problem is important for improving user experience, response time, and system resource utilization. For example, in tools such as CrimeStat [6], the response time for identification of hotspots is 2 hours for a dataset size of 5 crime reports. This slow response time occurs because CrimeStat is a main memory tool. Using spatial index types, e.g., a join index family, may lower the response time to a few minutes, thereby enhancing the user experience.

Figure 1: Identification of Hotspots

This paper focuses on spatial statistical queries in the context of mapping and analysis for public safety. The application considers questions such as, "Are there spatial concentrations of crime that warrant increased police targeting at the community, city, and county levels?" to identify a set of spatially grouped instances defined as hotspots. For example, Figure 1 illustrates the identification of burglary hotspots in applications such as mapping and analysis of public safety. In these scenarios, law enforcement agencies normally have very limited resources, such as officers and patrol vehicles, to deploy in a concentrated manner. Given a large area such as a city, it would be very useful for law enforcement agencies to examine different possible configurations for distributing their limited resources to areas where there is increased crime activity. To perform this strategic placement, crime analysts and law enforcement agencies perform an exploratory analysis.

Similar spatial statistical queries are important in many other application domains such as public health, transportation, ecology, and consumer applications. For example, public health authorities may be interested in hotspots of diseases such as cancer clusters [4] in order to identify and remedy environmental factors such as contaminated soil or water. Transportation professionals may be interested in identifying and remedying spatial concentrations of traffic accidents by re-designing transportation networks, for instance via traffic calming. Ecologists may identify spatial concentrations of endangered species to promote their protection. Many of these queries make use of a spatial neighborhood matrix known as W in spatial statistics and perform repeated W-Matrix computation for different neighborhoods. We call such queries W-Queries.

Current spatial database management systems (SDBMS) provide a rich set of operations and spatial index structures such as B-Tree and R-Tree index families that can enhance the efficiency of processing queries in various applications [, 3, 4,,, 2, 7]. However, these SDBMS must perform repeated on-the-fly computation of the W-Matrix and are limited in their ability to support W-Query operations. This paper argues that SDBMS should support a self-join index. The paper aims to establish the utility of the join index for processing W-Queries efficiently by evaluating the idea of a self-join index.
Related Work: Research related to the index type selection problem can be classified into two categories: (1) spatial indices that make use of on-the-fly join computation strategies by computing joins as a part of the query evaluation process, and (2) spatial indices used for direct (lookup) join computation that compute joins by performing a sequence of lookups.

On-the-fly join computation techniques that are based on spatial indices, namely the R-Tree and its variants, are suitable for computing the spatial join for a single neighborhood relationship [, 3, 4, 5, 6, 7, 8,,, 7, 8, 9]. However, W-Queries are exploratory in nature and require repeated self-join computation, making on-the-fly join computation expensive. Spatial indices such as the R-Tree and its variants, Quad tree, Grid Files, etc., have been incorporated as a part of commercial SDBMS systems [7, 8, 9, 9, 6]. IBM Informix Spatial DataBlade makes use of the R-Tree, ESRI ArcSDE makes use of Grid Files, Oracle Spatial makes use of the R-Tree and Quad tree, and Microsoft SQL Server 2008 spatial data support makes use of multi-level Grid files. These commercial SDBMS tools retain the limitations of the corresponding spatial indices and hence do not provide support for W-Queries and their operations. A major issue that existing SDBMS tools face in supporting several spatial indices is that the choice of a spatial index type for a given set of workloads affects the strategy for I/O optimization, query optimization strategies, concurrency control, and recovery strategies.

Direct join computation techniques are based on spatial indices such as the (spatial) join index [8,3]. Join indices have been primarily used in the context of computing a spatial join between two different relations to speed up online query processing in infrequently updated databases. However, current join indices are represented as bi-partite graphs [9, 4]. By contrast, W-Queries are primarily focused on computing several spatial self-join operations. In self-join cases, the join index becomes a neighborhood graph rather than a bi-partite graph representation.
Hence, the current representation of join indices as bi-partite graphs needs re-consideration.

Our Contributions: First, we characterize the computational structure of W-Queries. We consider the computation of Ripley's K Function and the identification of hotspots [2,6] for modeling W-Queries. We propose a set of operations for handling these queries. We define the spatial index selection problem for handling the set of operations efficiently. We propose two variants of the self-join index, namely: (a) the Self-join Edge List Index (SJELI) and (b) the Self-join Adjacency List Index (SJALI). We also propose algorithms for processing W-Queries. We evaluate the I/O efficiency of the proposed variants of the self-join index using algebraic cost models for the operations. The cost model and the experimental results establish the utility of the self-join indices. Experimental results using real crime datasets indicate that the self-join indices decrease the user response time of W-Queries by a factor 4 compared to a single threaded version of CrimeStat and outperform an R-Tree based Tree Matching self-join algorithm. Based on these findings we believe that existing SDBMS should adopt the self-join indices to support spatial statistical queries such as W-Queries.

Scope: This paper primarily focuses on the selection of a suitable index type for a given set of operations. The join indices are materialized for the study area, a primary requirement in most spatial statistical analysis applications. Our propositions are mainly focused on multiple spatial neighborhood analysis queries within a particular study area. The aim is to reduce the response time of the proposed set of operations for W-Queries in a spatial crime data warehouse setting. We understand that adding new index types in a SDBMS is a complex decision due to the impact on issues such as concurrency control, recovery, and evaluation of storage costs. These issues are beyond the current scope of the paper.

Outline: The rest of the paper is organized as follows. Section 2 presents the basic concepts and the spatial index type selection problem.
Section 3 describes the proposed self-join index variations and design decisions. In Section 4, we propose algorithms for two example W-Queries, namely Ripley's K Function computation and identification of hotspots, and propose an algebraic cost model for the set of W-Query operations. The experimental evaluation is given in Section 5, and Section 6 outlines the conclusions and future work.

2. Basic Concepts and Problem Statement
In this section, we present some basic concepts required to model W-Queries. We model W-Queries by identifying two example queries, namely the computation of the Ripley K-Function and the identification of hotspots. We propose a set of W-Query operations based on the example queries and characterize their computational structure.

Figure 2: Sample dataset and the W-Matrix for different relations. (a) Neighborhood graph for neighborhood relation R1. (b) Neighborhood graph for relation R2. (c) W-Matrix for relation R1. (d) W-Matrix for relation R2.

In spatial statistics, the W-Matrix is a matrix-based representation of space and a measure of the adjacency, proximity, distance or level of spatial interaction between spatial instances [3]. Given a uniform spatial framework and a set of spatial instances, W-Queries re-compute the W-Matrix for different neighborhood relations. For example, Figures 2a and 2b represent the spatial neighborhood graph for a spatial dataset. Figure 2a corresponds to a neighborhood relation R1, and Figure 2b corresponds to a neighborhood relation R2. The corresponding W-Matrices for the neighborhood graphs are illustrated in Figures 2c and 2d respectively. Spatial instances are represented by N1, ..., N7 in a uniform spatial framework. In the W-Matrix, a 1 denotes that the two spatial instances satisfy the neighborhood relation and a 0 denotes that the two spatial instances do not satisfy the neighborhood relation.

Definition 2.1 Given two spatial instances S_i and S_j, where i ≠ j, in a spatial dataset S_D, a neighborhood relation R(S_i, S_j) can be defined as a measure of spatial interaction, distance or adjacency. For example, in Figure 2, R1 and R2 are two different spatial neighborhood relations.

Definition 2.2 Given a spatial framework S, the W-Matrix is defined as a set of values that quantify the spatial interaction, distance or adjacency. These values can be binary or real depending on the measure of spatial interaction used. Formally, the W-Matrix can be defined as follows [3]:

$$W(S_D, R) = \{\, R(S_i, S_j) \mid S_i, S_j \in S_D,\ R(S_i, S_j) \text{ is valid, and } i \neq j \,\}$$

Definition 2.3 Given a spatial instance S_i, the no_of_instances(S_i, R) of instance S_i is the number of spatial instances S_j ∈ S_D, i ≠ j, that satisfy the neighbor relation R.
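Definitions 2.1 to 2.3 can be made concrete with a small sketch. It assumes that the neighbor relation is a Euclidean distance threshold and that the W-Matrix is binary. The coordinates are hypothetical and are not the dataset of Figure 2, and the function names are illustrative only.

```python
import math


def w_matrix(points, radius):
    """Binary W-Matrix: entry [i][j] is 1 when i != j and the two
    instances lie within `radius` of each other, otherwise 0."""
    n = len(points)
    w = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(points[i], points[j]) <= radius:
                w[i][j] = w[j][i] = 1  # the relation is symmetric
    return w


def no_of_instances(w, i):
    """Number of instances that satisfy the relation with instance i."""
    return sum(w[i])


# Hypothetical coordinates (assumption, not taken from the paper).
points = [(0, 0), (1, 0), (0, 1), (3, 3), (4, 3)]
w_r1 = w_matrix(points, 1.0)  # smaller neighborhood
w_r2 = w_matrix(points, 1.5)  # larger neighborhood
```

Recomputing `w_matrix` for every new radius is the on-the-fly cost that the paper argues a materialized join index would avoid.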
As an example of Definition 2.3, in Figure 2, R1 and R2 are two different spatial neighborhood relationships; the figure shows an instance with no_of_instances = 3 under R1 and an instance with no_of_instances = 4 under R2.

Definition 2.4 Given a spatial instance S_i, the average edge weight (AEW) (or average weight) of a spatial instance is the sum of the values of R(S_i, S_j) divided by Frequency(S_i, R), where S_j ∈ S_D, i ≠ j, ranges over the instances that have a valid neighbor relation R with S_i. The term average edge weight is relevant only if the neighbor relation represents a value of distance or similarity.

2.1 Two Simple W-Queries

Figure 3: Computational structure of W-Queries under neighbor relations R1 and R2. (a) Ripley's K. (b) Hotspots.

To model W-Queries, we consider two spatial statistical queries that have been applied to compute statistics in CrimeStat [6].

Query I: Is data spatially clustered? Query I relates to the calculation of a well-known statistical measure called Ripley's K function [2, 4]. This measure calculates the cumulative number of spatial instances that are within a search radius of each spatial instance in the dataset. This cumulative count is computed for different neighborhood radii. Figure 3(a) illustrates the method of computing Ripley's K Function. In the figure, dark circles around the spatial instances represent neighborhood relationship R1, and dashed circles around the spatial instances represent neighborhood relationship R2. The Ripley K Function method computes the number of spatial instances around a particular spatial instance for a particular neighbor relation R2 and reports the cumulative sum of these frequencies over all spatial instances. The process is repeated after the neighbor relation is changed to R1, and so on until a significant number of levels are completed. The number of neighborhood relationships is of the order of in spatial statistics tools such as CrimeStat [6].

Query II: Are there concentrations of crime that warrant increased police targeting at the community, city, and county level? Query II relates to the identification of a spatially grouped set of instances defined as hotspots.
Figure 3(b) illustrates hotspots that can be extracted from the spatial dataset for multiple neighborhood definitions. In the figure, dark ellipses refer to hotspots that are identified for a neighborhood R1 and the dashed ellipse refers to hotspots that are identified for a neighborhood R2. The computational process begins with the computation of the W-Matrix for an initial neighborhood relation R1 and the selection of a set of representative points called seeds. Seeds are defined as spatial instances which have a minimal average edge weight compared to their neighbor spatial instances. For example, in Figure 3(b), the seed points are the instances that have minimum average edge weights for the neighbor relation R1. The hotspot identification process always maintains a list of potential seeds that is updated whenever a new hotspot is identified. The key challenge in the process is to identify non-overlapping hotspots so that spatial instances are not reconsidered in subsequent hotspots.
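The seed selection step described above can be sketched as follows, under stated assumptions: the neighbor relation is a Euclidean distance threshold, edge weights are distances, and ties are kept (an instance is a seed when its average edge weight is no larger than that of any neighbor). The coordinates and function names are hypothetical.

```python
import math


def neighbor_weights(points, radius):
    """Map each instance index to {neighbor index: distance} for all
    neighbors within `radius` (a distance-valued relation R)."""
    nbrs = {i: {} for i in range(len(points))}
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            d = math.dist(points[i], points[j])
            if d <= radius:
                nbrs[i][j] = d
                nbrs[j][i] = d
    return nbrs


def average_edge_weight(nbrs, i):
    """Sum of the edge weights of instance i divided by its neighbor count."""
    if not nbrs[i]:
        return math.inf  # isolated instance: never preferred as a seed
    return sum(nbrs[i].values()) / len(nbrs[i])


def seeds(nbrs):
    """Instances whose average edge weight is minimal among their neighbors."""
    aew = {i: average_edge_weight(nbrs, i) for i in nbrs}
    return [i for i in nbrs
            if nbrs[i] and all(aew[i] <= aew[j] for j in nbrs[i])]


# Hypothetical coordinates (assumption, not taken from the paper).
points = [(0, 0), (1, 0), (2, 0), (10, 0), (10, 2)]
nbrs = neighbor_weights(points, 2.0)
seed_list = seeds(nbrs)
```

In this toy layout the middle point of the left group becomes a seed, and the two points of the right group tie with each other, so both are kept.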

Table 1: W-Queries from CrimeStat [6]
Statistic | W(S_D,R) | Consecutive W Subsets | Frequency Based | Average Edge Weight Based | Join Computation: On the Fly | Join Computation: Look up
Ripley's K Function | Yes | Yes | Yes | NO | NO | Yes
Nearest Neighbor Statistic | Yes | Yes | Yes | NO | NO | Yes
Hotspots | Yes | Yes | Yes | Yes | NO | Yes
Moran's I | NO | NO | NO | NO | Yes | Yes
Geary's C | NO | NO | NO | NO | Yes | Yes
Local Moran (LISA) | Yes | NO | NO | NO | NO | Yes

2.2 Case Study: W-Queries from CrimeStat
Spatial statistical queries that can be classified as W-Queries and that mainly involve repeated computation of neighborhood relationships are drawn from crime analysis tools such as CrimeStat [6]. Table 1 lists some of these queries. CrimeStat has several spatial autocorrelation routines including Moran's I, Geary's C and LISA. These are global level statistics that determine if there is clustering or dispersion within a dataset across a study area. They are used as a guide to conduct local level hotspot analysis: if the results indicate there is no clustering or dispersion, then any hotspots found with local level techniques will likely be false positives. These spatial statistical measures can also be modeled as W-Queries.

2.3 Operations for W-Queries
W-Queries can be modeled as a set of operations that can be used to identify a suitable spatial index type to process them efficiently. Figure 4 illustrates the effect of the set of operations on the sample dataset. Since the spatial dataset is modeled as a neighborhood graph under a neighborhood relation, we make use of terminology used in the spatial network database literature such as predecessor and successor [7]. We make use of node coloring to distinguish a predecessor from a successor as the operations are applied on a neighborhood graph.

get-neighbors-in-relationship(S_i, R): Identify the neighbors of a spatial instance S_i. Given the spatial instance S_i, the get-neighbors-in-relationship() operation colors the spatial instance S_i and gives all the neighbors that satisfy the relationship R the same color as S_i.
For example, Figure 4(a) shows the effect of get-neighbors-in-relationship(S_i, R) on one spatial instance of the sample dataset: the instance and its neighbors under the relation are colored.

get-successors(S_i): Retrieve the successors of S_i. The successors of a spatial instance S_i are defined as the set of spatial instances that satisfy the neighbor relation R with S_i and have the same color. For example, Figure 4(b) shows the effect of the get-successors(S_i) operation on a spatial instance, where the neighboring instances are reported as successors since they have the same color as that instance.

get-successor(S_i): Retrieve the farthest unreported successor of S_i. This operation returns the spatial instance which is a successor of S_i and has the maximum value of the neighbor relation R with S_i. We call this the "farthest successor first" strategy. For example, Figure 4(c) shows the effect of the get-successor(S_i) operation on a spatial instance, where the successors that share its color are reported.

get-predecessors(S_i): Retrieve the predecessors of S_i. This operation retrieves the spatial instances that have a color different from that of spatial instance S_i. It is normally executed when the degree of spatial instances requires updating. For example, Figure 4(f) shows the result of get-predecessors(S_i) on a spatial instance; the operation reports two instances as the result.

get-predecessor-of-successor(S_i): Retrieve the predecessor of the successor of S_i. This operation returns the nearest uncolored spatial instance to the successor of S_i. A predecessor is a spatial instance S_j that does not have the same color as spatial instance S_i. For example, Figure 4(d) shows the result of get-predecessor-of-successor(S_i) applied two times on a spatial instance; the operation reports two instances as the results.

get-predecessors-of-successor(S_i): Retrieve the predecessors of the successor of S_i. This operation retrieves the predecessors of the successor of a spatial instance S_i.
This operation is important for updating the average edge weight of the spatial instances that neighbor the neighbors of S_i. For example, Figure 4(g) shows the result of this operation on a spatial instance: its first successor is taken and that successor's first predecessor is reported.

update-successors(S_i, <successors>): Un-color the successors of S_i. For each listed successor, the operation checks whether the spatial instance is colored; if it is colored, the operation un-colors it. <successors> represents a list of successors to be updated.
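The coloring-based operations described so far can be sketched on an in-memory adjacency list. This is a minimal illustration under assumptions, not the paper's disk-based implementation: the relation is taken to be a threshold on the edge weight, the color of a group is simply the identifier of the instance that colored it, and the class name, method names and sample graph are hypothetical.

```python
class NeighborhoodGraph:
    """Adjacency lists with edge weights plus one color per instance."""

    def __init__(self, edges):
        # edges: iterable of (u, v, weight) for an undirected graph
        self.adj = {}
        for u, v, w in edges:
            self.adj.setdefault(u, {})[v] = w
            self.adj.setdefault(v, {})[u] = w
        self.color = {u: None for u in self.adj}
        self.reported = set()  # (instance, successor) pairs already returned

    def get_neighbors_in_relationship(self, s, r):
        """Color s and every neighbor whose edge weight is within r."""
        self.color[s] = s
        for v, w in self.adj[s].items():
            if w <= r:
                self.color[v] = s

    def get_successors(self, s):
        """Neighbors of s that carry the same color as s."""
        if self.color[s] is None:
            return []
        return [v for v in self.adj[s] if self.color[v] == self.color[s]]

    def get_successor(self, s):
        """Farthest successor of s that was not reported yet, or None."""
        pending = [v for v in self.get_successors(s)
                   if (s, v) not in self.reported]
        if not pending:
            return None
        farthest = max(pending, key=lambda v: self.adj[s][v])
        self.reported.add((s, farthest))
        return farthest

    def get_predecessors(self, s):
        """Neighbors of s whose color differs from that of s."""
        return [v for v in self.adj[s] if self.color[v] != self.color[s]]

    def update_successors(self, s, successors):
        """Un-color the listed successors of s."""
        for v in successors:
            if self.color[v] == self.color[s]:
                self.color[v] = None


# Hypothetical weighted neighborhood graph (assumption).
g = NeighborhoodGraph([("A", "B", 1.0), ("A", "C", 2.0),
                       ("A", "D", 5.0), ("C", "E", 1.0)])
g.get_neighbors_in_relationship("A", 2.0)
```

After the last call, `g.get_successors("A")` lists the two neighbors within the threshold, and repeated calls to `g.get_successor("A")` return them farthest first.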

As an example of update-successors, Figure 4(e) shows the result of update-successors(S_i) on two of the spatial instances.

update-average-edge-weight(S_i): Update the average edge weight of a spatial instance. This operation updates (reduces) the average edge weight of a given spatial instance S_i. For example, this operation is applied to two instances shown in Figures 4(f) and 4(g), one of which is updated two times in this example.

Figure 4: Effect of W-Query operations on the sample dataset. (a) get-neighbors-in-relationship(). (b) get-successors(). (c) get-successor(). (d) get-predecessor-of-successor() applied two times. (e) update-successors(). (f) get-predecessors(). (g) get-predecessors-of-successor(). (h) get-predecessor(). Legend: colored spatial instances, successors, unmarked instance, requires update, R1, R2, predecessors.

2.4 Problem Statement
This section defines the spatial index type selection problem given a set of operations that are relevant to W-Queries.

Given:
A spatial crime data warehouse
A set of operations O
Find: A suitable secondary memory index structure type.
Objective: To minimize the I/O cost of the set of operations O.
Constraints:
Spatial datasets are updated infrequently.
Concurrency control and recovery considerations are addressed separately.
There are no storage overheads.
User response time is minimized.

Example: Consider computing a W-Query such as the Ripley K Function, given a spatial dataset and a set of operations, namely get-neighbors-in-relationship() and get-successors(). The objective of the above problem is to find a suitable spatial index type that minimizes the I/O cost of the operations get-neighbors-in-relationship() and get-successors() and the user response time of the W-Query. Different W-Queries may have different workloads, which are provided as an input to the query. For example, Ripley's K has parameters such as maximum neighborhood size and number of spatial neighborhoods.

3. Self-Join Index and Its Variants
In this section, we formally define a self-join index (SJI) and propose two variants, namely the Self-join edge list index (SJELI) and the Self-join adjacency list index (SJALI). We formally define the self-join index as:

$$SJI = \{\, \langle S_i, S_j, R(S_i, S_j) \rangle \mid S_i, S_j \in S_D \ \&\ (R \in R_S \ \&\ R(S_i, S_j) \text{ is valid}) \,\}$$

where S_D is the spatial dataset and R_S is a set of neighborhood relationships that are defined for a spatial framework S. For example, from Figure 5, R_S = {R1, R2}, and R(S_i, S_j) is either R1 or R2.

3.1 Representations of the SJI
Traditionally, the join index has been represented as a bi-partite graph. Since W-Queries repeatedly compute self-joins, the modeling of the self-join index as a bi-partite graph needs to be modified to that of an undirected neighborhood graph, G = (S_D, E). The neighborhood graph G consists of a set of spatial instances S_D and an edge set E. Each element S_i ∈ S_D is a spatial location in a uniform spatial framework S. The set of edges E is a subset of the cross product S_D × S_D. Each element (S_i, S_j) in E is an edge that joins instances S_i and S_j, where i ≠ j. Also, each edge has a weight, which is the level of spatial interaction, distance or adjacency.
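The definition of the self-join index can be illustrated with a small sketch. It assumes that every relation in R_S is a Euclidean distance threshold, so the index can be materialized once for the largest threshold and smaller neighborhoods become simple lookups on the stored weight. The coordinates, thresholds and function names are hypothetical.

```python
import math
from itertools import combinations


def build_self_join_index(points, max_radius):
    """Materialize one <i, j, distance> entry for every pair of distinct
    instances that lies within `max_radius`."""
    index = []
    for i, j in combinations(range(len(points)), 2):
        d = math.dist(points[i], points[j])
        if d <= max_radius:
            index.append((i, j, d))
    return index


def pairs_in_relation(index, radius):
    """Look up the pairs that satisfy a smaller neighborhood relation
    without touching the coordinates again."""
    return [(i, j) for i, j, d in index if d <= radius]


# Hypothetical coordinates and thresholds (assumption).
points = [(0, 0), (1, 0), (0, 1), (3, 3), (4, 3)]
sji = build_self_join_index(points, max_radius=1.5)  # plays the role of R2
r1_pairs = pairs_in_relation(sji, 1.0)               # plays the role of R1
```

The entries of `sji` are the weighted edges of the undirected neighborhood graph G = (S_D, E).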

Figure 5: Self-join index representations. (a) Neighborhood graph for relation R1. (b) Neighborhood graph for relation R2. (c) Self-join edge list index (SJELI). (d) Self-join adjacency list index (SJALI).

This neighborhood graph can be represented in two different ways, namely, the edge list and the adjacency list. Figures 5(a) and 5(b) are the neighborhood graphs for the relations R1 and R2 respectively. We present the design of the two representations and evaluate the effect of the operations on the two variants.

3.1.1 Self Join Index: Edge List Representation (SJELI)
The edge list representation of the self-join index is illustrated in Figure 5(c). In this representation, the join index is ordered by column and, within a column, by the value of the relation R(S_i, S_j). As is evident from the figure, this representation does not provide any information on the successors or the predecessors of a spatial instance S_i. A clear challenge with this representation is to determine an optimal partitioning of the SJELI that minimizes the I/O costs of the set of operations.

3.1.2 Self Join Index: Adjacency List Representation (SJALI)
The adjacency list representation of the self-join index is illustrated in Figure 5(d). The adjacency list representation has clear advantages compared to the edge list representation. First, the adjacency list representation maintains a list of successors and predecessors that are critical for processing W-Queries. Second, the coloring scheme used by the set of operations can easily exploit the adjacency list representation to retrieve the successors or predecessors with less I/O. Processing updates on the adjacency list is also easier for the same reasons.

3.1.3 Design Issues
We make use of the connectivity clustering heuristic [7] to cluster the spatial instances of the SJALI and SJELI. CCAM (Clustered Connectivity Access Method) [7] makes use of separate lists for successors and predecessors and does not exploit the concept of a spatial neighborhood. The self-join indices, SJALI and SJELI, are primarily neighborhood graphs that are represented as adjacency lists and edge lists.
We apply the connectivity clustering heuristic to the two neighborhood graphs to store them in disk pages. In the design of the SJALI, we maintain only one list of adjacent neighbors of a particular spatial instance. The proposed set of W-Query operations, for example get-neighbors-in-relationship(S_i, R), makes use of a coloring heuristic to retrieve the successors and the predecessors of a particular spatial instance. To allocate these spatial instances to disk pages, we make use of the same connectivity clustering heuristic on the neighborhood graph. For example, in Figure 4(d), a typical page allocation would involve storing three neighboring instances in the same page, three other instances in another page, and the remaining instance in a separate page. This allocation scheme changes with the maximum size of a page and the value of the Connectivity Residue Ratio (CRR) [7]. CRR is defined as the probability that two neighboring spatial instances are present in the same disk page.

Utilizing the same heuristic on the SJELI involves storing the edge lists of spatial instances in the same disk page such that the number of cut edges is minimized. This allocates the edge lists of spatial instances to pages, where each edge of the spatial instance corresponds to a page entry. In some cases, for large neighborhood sizes, it is possible that the edge list of one spatial instance by itself exceeds a single page. For example, in Figure 5(c), a typical page allocation would involve allocating the edge lists of three instances to the same page, the edge lists of three other instances to another page, and the edge list of the remaining instance to a separate page.
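The Connectivity Residue Ratio of a given page allocation can be estimated with a short sketch. It assumes the unweighted reading of the definition above, namely the fraction of edges whose two endpoints are stored in the same page. The graph, the two page assignments and the function name are hypothetical.

```python
def connectivity_residue_ratio(edges, page_of):
    """Fraction of edges whose two endpoints share a disk page."""
    if not edges:
        return 0.0
    same_page = sum(1 for u, v in edges if page_of[u] == page_of[v])
    return same_page / len(edges)


# Hypothetical neighborhood graph (assumption).
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"), ("D", "E")]

# Allocation that keeps connected instances together: one cut edge.
clustered = {"A": 0, "B": 0, "C": 0, "D": 1, "E": 1}
# Allocation that ignores connectivity: four cut edges.
scattered = {"A": 0, "B": 1, "C": 0, "D": 1, "E": 0}

crr_clustered = connectivity_residue_ratio(edges, clustered)
crr_scattered = connectivity_residue_ratio(edges, scattered)
```

A higher ratio means that following an edge is less likely to cost an extra page access, which is the quantity the connectivity clustering heuristic tries to increase.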

The key trade-off between the two representations is in the value of the connectivity residue ratio (CRR) they yield. The SJELI would yield a lower value of CRR for small page sizes, thus resulting in larger I/O costs. SJELI would also incur more I/O costs for larger neighborhood sizes than the other representation. This indicates that the value of the CRR for both the SJELI and the SJALI depends on the value of the neighborhood relation R. An in-depth evaluation of the variation in CRR for the two self-join indices is beyond the scope of this paper.

4. W-Query Processing Algorithms
In this section, we propose two query processing algorithms using the set of operations get-neighbors-in-relationship(), get-successors(), get-predecessors(), get-successor(), get-predecessor(), get-predecessor-of-successor(), get-predecessors-of-successor(), update-average-edge-weight(), and update-successors(). These operations are used to design the algorithms for W-Queries, namely Ripley's K-Function computation and identification of hotspots.

4.1 Ripley's K Function Computation
The Ripley K Function computation involves the use of two operations, get-neighbors-in-relationship(S_i, R) and get-successors(S_i). Algorithm 1 lists the computational process for the Ripley K Function. The trace of the algorithm is listed in Table 2.

Algorithm 1: CalcRipleyK: Computation process for computing Ripley's K Function
Inputs: Spatial dataset S_D, Query: Is data spatially clustered?, Total number of levels, Study Area
Output: K Function: Measure of spatial randomness.
Procedure: CalcRipleyK
do
begin
    for every spatial instance S in S_D
        get-neighbors-in-relationship(S, R[i])
        F[i] := F[i] + size(get-successors(S, R[i]))
        update-successors(S)
    endfor
    K[i] := calculate_ripley_k from F[i]
    i := i + 1
    R[i] := decrease_neighborhood(R[i-1])
end
while (i <= Total Number of Levels)

The trace of the Hotspot_JI Algorithm is listed in Table 3. The trace shows that the number of hotspots computed decreases as the size of the neighborhood increases.
Also, the effect of the set of operations is listed in the trace.

Table 2: Trace of the CalcRipleyK Algorithm. Columns: Neighbor Relation (R2, then R1), get-neighbors-in-relationship(S, R), get-successors(S), Frequency. The per-instance rows list instance identifiers that did not survive transcription; the legible frequency total for the first relation listed (R2) is 28.

4.2 Identification of Hot Spots
The identification of hotspots involves the use of the operations get-neighbors-in-relationship(S_i, R), get-successors(S_i, R), get-successor(S_i), update-successors(S_i), get-predecessors(S_i), and update-average-edge-weight(S_i). Algorithm 2, Hotspot_JI, lists the computational process for the identification of hotspots.

Algorithm 2: Hotspot_JI: Computation process for extracting hotspots from a spatial dataset.
Inputs: Spatial Dataset S_D, Query: Are there concentrations of crime that warrant increased police targeting at the block, city and county level?, HotspotSizeThreshold, Set of Neighbor Relations
Output: Set of hotspots corresponding to each neighbor relation
Procedure: Hotspot_JI
while (Size(HotspotQueue) >= HotspotSizeThreshold)
begin
    while (Terminate when there are no more seeds)
        S := Retrieve New Seed
        get-neighbors-in-relationship(S, R)
        Successor_List := get-successors(S)
        while (R[i](predecessor-of-successor(S)) < R[i](get-successor(S)))
            upd_succ_list.enque(Successor_List.Deque())
        endwhile
        update-successors(S, upd_succ_list)
        HotspotQueue := Successor_List
        while (Successor_List != NULL)
            p := get-predecessor(Successor_List.Deque())
            update-average-edge-weight(p)
        endwhile
    i := i + 1
    R[i] := increase_neighborhood R[i-1]
end
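The lookup-based idea behind CalcRipleyK can be sketched as follows. This is not the paper's implementation: a sorted in-memory list of pairwise distances stands in for the materialized self-join index, each level is answered by a binary search instead of a new self-join, and the K value uses a common uncorrected estimator (area multiplied by the neighbor count and divided by n squared), which is an assumption here. Names and data are hypothetical.

```python
import math
from bisect import bisect_right
from itertools import combinations


def calc_ripley_k(points, radii, area):
    """Return {radius: (F, K)} where F is the cumulative neighbor count
    over all instances and K is an uncorrected Ripley's K estimate."""
    n = len(points)
    # One pass over all pairs, kept only up to the largest neighborhood.
    dists = sorted(math.dist(p, q) for p, q in combinations(points, 2))
    dists = dists[:bisect_right(dists, max(radii))]
    result = {}
    for r in radii:
        # Every unordered pair within r counts once for each endpoint.
        f = 2 * bisect_right(dists, r)
        result[r] = (f, area * f / (n * n))
    return result


# Hypothetical coordinates, radii and study area (assumption).
points = [(0, 0), (1, 0), (0, 1), (3, 3), (4, 3)]
k_levels = calc_ripley_k(points, radii=[1.5, 1.0], area=25.0)
```

The radii are listed from larger to smaller to mirror the decrease_neighborhood step of Algorithm 1, although the lookup itself does not depend on the order.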

Table 3: Trace of the Hotspot_JI Algorithm for identifying hotspots from the sample dataset. Columns: Neighbor Relation (R1, then R2), Hotspots, Seeds, get-successors(S), get-successor(S), get-predecessor-of-successor(S), update-successors, get-predecessors(S), update-average-edge-weight. The per-seed rows list instance identifiers that did not survive transcription.

4.3 Algebraic Cost Model
In this section, we provide algebraic cost models for the I/O costs of W-Query operations. We make use of the CRR to measure the worst case I/O costs of the operations. Table 4 lists the symbols used to develop the cost formulas.

Table 4: Symbols used in Cost Analysis.
Symbol | Meaning
|S| | Average number of successors of a particular node
|P| | Average number of predecessors of a particular node
CRR | Connectivity residue ratio: the probability that page(S_i) = page(S_j) for edge (S_i, S_j)
|S_R| | Average number of instances satisfying the neighbor relation R
|S_D| | Total size of the spatial dataset
ρ | Selectivity of a range query for a neighbor relation R, |S_R| / (|S_D| - 1)
Z_LI = Z | Cost of accessing a single spatial instance from the SJALI
Z_EL = Z | Cost of accessing a single spatial instance from the SJELI

For both self-join index variants, let the cost of retrieving one spatial instance be Z. The value of Z is equal to 1, which is the cost of a simple look-up from the join indices. As described earlier, the CRR of SJELI is expected to be lower than that of SJALI due to the presence of a large number of cut edges on a single page. Hence, the I/O costs of the W-Query operations are expected to be greater for SJELI.

The get-neighbors-in-relationship(S_i, R) operation retrieves all the instances that satisfy the neighborhood relationship R with S_i. The cost of one get-neighbors-in-relationship operation equals the cost of retrieving the neighbors of S_i multiplied by the probability that the neighbors are not in the same disk page. The get-successors(S_i) operation retrieves all the successors of S_i.
The cost of one get-successors() operation involves the cost of retrieving all the successors and the probability that the successors are not in the same page as S_i. The get-predecessors(S_i) operation retrieves all the predecessors of S_i. The cost of one get-predecessors() operation involves the cost of retrieving all the predecessors of S_i and the probability that they are not in the same page as S_i. The cost of one get-successor(S_i) operation is the probability that the successor of S_i is not in the same page as S_i. The cost of one get-predecessor(S_i) operation is the same. The cost of one get-predecessors-of-successor(S_i) operation involves the cost of extracting one successor and then the cost of extracting the predecessors of that successor, accounting for the probability that they are not in the same disk page. The cost of one update-successors(S_i) operation is the cost of un-coloring the successors of S_i, which is the cost of retrieving the successors multiplied by the probability that they are not in the same page. The cost of one update-average-edge-weight(S_i) operation is the cost of retrieving S_i and also moving S_i to an appropriate secondary memory bucket which maintains potential seeds for handling W-Queries such as identification of hotspots. These costs are summarized in Table 5.

Table 5: Worst case I/O cost analysis of W-Query operations.
Operation | Data Page Accesses
get-neighbors-in-relationship(S_i, R) | {|S_R| / (|S_D| - 1)} |S_D| Z (1 - CRR) = ρ Z |S_D| (1 - CRR)
get-successors(S_i) | |S| Z (1 - CRR)
get-successor(S_i) | Z (1 - CRR)
get-predecessor-of-successor(S_i) | 2 Z (1 - CRR)
update-successors(S_i) | Z (1 - CRR) × |S|
get-predecessors(S_i) | |P| Z (1 - CRR)
get-predecessors-of-successor(S_i) | (|P| Z + 1) (1 - CRR)
get-predecessor(S_i) | Z (1 - CRR)
update-average-edge-weight(S_i) | 2 Z

5. Experimental Evaluation
The self-join indices were evaluated using a set of experiments that measure the response time of the two queries, namely Ripley's K Function and hotspots. The experiments were implemented in C++/CLI and conducted on a Pentium Xeon 3.2 GHz machine with 4GB of main memory.
We make use of real crime datasets to demonstrate the utility of the self-join index variants to process W-Queries and their set of operations efficiently. We measured the user response time for the queries.
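
The worst-case cost formulas of Table 5 can be evaluated directly once the symbols are known. The sketch below is one possible encoding, assuming $\rho = |S_R| / (|S_D| - 1)$ and Z = 1 by default; the class name `CostParams`, the function name `worst_case_page_accesses`, and all numeric values are hypothetical and serve only to illustrate how a lower CRR raises the estimated number of page accesses.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class CostParams:
    avg_successors: float    # |S|: average number of successors per node
    avg_predecessors: float  # |P|: average number of predecessors per node
    crr: float               # CRR: probability that both endpoints share a page
    s_r: float               # |S_R|: average number of instances satisfying R
    s_d: int                 # |S_D|: total size of the spatial dataset
    z: float = 1.0           # Z: cost of one look-up in the join index


def worst_case_page_accesses(p: CostParams) -> Dict[str, float]:
    """Evaluate the data-page-access formulas for each W-Query operation."""
    miss = 1.0 - p.crr            # probability that a neighbor is on another page
    rho = p.s_r / (p.s_d - 1)     # selectivity of the range query (assumed form)
    return {
        "get-neighbors-in-relationship": rho * p.z * p.s_d * miss,
        "get-successors": p.avg_successors * p.z * miss,
        "get-successor": p.z * miss,
        "get-predecessor-of-successor": 2 * p.z * miss,
        "update-successors": p.z * miss * p.avg_successors,
        "get-predecessors": p.avg_predecessors * p.z * miss,
        "get-predecessors-of-successor": (p.avg_predecessors * p.z + 1) * miss,
        "get-predecessor": p.z * miss,
        "update-average-edge-weight": 2 * p.z,
    }


# Hypothetical comparison of two layouts that differ only in CRR.
high_crr = CostParams(avg_successors=4, avg_predecessors=4,
                      crr=0.75, s_r=10, s_d=101)
low_crr = CostParams(avg_successors=4, avg_predecessors=4,
                     crr=0.50, s_r=10, s_d=101)

for name, cost in worst_case_page_accesses(high_crr).items():
    other = worst_case_page_accesses(low_crr)[name]
    print(f"{name}: {cost:.3f} (CRR=0.75) vs {other:.3f} (CRR=0.50)")
```

Every operation except update-average-edge-weight scales with (1 - CRR), which is why the variant with the lower CRR is expected to incur the higher I/O cost.
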

We compared our proposed self-join index-based direct join computation method with an R-Tree-based tree-matching self-join computation method that computes the W-Matrix for every new neighborhood relationship. We performed experiments for dataset sizes ranging from 82 to 4852 spatial instances. We also compared the response time of the self-join index-based algorithms with that of the ones implemented in a modularized, single-threaded version of CrimeStat. The experimental evaluation addresses the following questions:

Question 1: What is the user response time of the Ripley's K function query?

We implemented the W-Query processing algorithm CalcRipleyK, proposed in Section 4, on a self-join adjacency list index (SJALI). We also implemented the same query by repeated computation of self-joins on the R-Tree index. Figure 6 compares the R-Tree-based on-the-fly join computation method with the method using the self-join index. The total response time includes the time for performing I/O. Figure 6 shows that the self-join index-based implementation performs better than the R-Tree-based on-the-fly join computation. We have omitted the details of the algorithm for space considerations. This algorithm involves a repeated computation of only the self-join operation. The algorithm was executed for neighborhood relationships.

Figure 6. User response time comparison for Ripley's K computation.

Table 6 shows the comparison with a single-threaded version of CrimeStat, where the self-join index speeds up the query processing time by a factor of 4 for the computation of Ripley's K function.

Table 6. User response time comparison with CrimeStat.
Columns: Dataset Size | User response time for CrimeStat (seconds) | User response time for self-join index (seconds)

Question 2: What is the user response time of the hotspot identification query?

We implemented the W-Query processing algorithm for hotspot identification, Hotspot_JI, on the SJALI. The user response time of the hotspot identification process was compared with the tree-matching self-join algorithm using the R-Tree. Figure 7 compares the self-join index-based method with the R-Tree-based method. The total response time includes the time taken for performing I/O. It was observed that the self-join index-based hotspot identification method takes more response time because of the seed selection process, which incurs more updates on the average edge weight of the spatial instances. However, the self-join index outperforms the R-Tree-based on-the-fly join computation, which has processing overheads for removing false positives from identified hotspots.

Figure 7. User response time comparison for hotspot identification.

Table 7 compares the user response time of the self-join index-based algorithms with a single-threaded CrimeStat. As can be seen, the self-join index improves the user response time by a factor of 5 for the identification of hotspots.

Table 7. User response time comparison with CrimeStat.
Columns: Dataset Size | User response time for CrimeStat (seconds) | User response time for self-join index (seconds)

6. Conclusions and Future Work

We characterized the computational structure of a class of spatial statistical queries called W-Queries. We defined a set of operations that can be used to process these queries. These operations have been identified as a basic set required to process two simple W-Queries, namely Ripley's K and hotspots. Table 1 lists other types of W-Queries that are frequently observed in spatial analysis and identifies the two simple W-Queries as the most representative queries. This paper does not claim that the set of operations is complete. We defined the spatial index type selection problem for selecting a suitable spatial index type for handling these operations efficiently. We proposed two variants of the self-join index and presented our design decisions. We proposed algorithms for two simple W-Queries. We presented an algebraic cost model for the proposed set of operations. We performed an experimental evaluation on real crime datasets to demonstrate that the self-join index yields better user response time than an R-Tree-based on-the-fly self-join computation and a repetitive W-Matrix computation-based CrimeStat. These observations establish the utility of the join index to process W-Queries efficiently, and we have identified a suitable representation of the join index to achieve this objective. This result validates our claim that the self-join index should be supported by SDBMS for processing such queries.

In future work, we plan to evaluate the detailed I/O costs of the W-Query processing algorithms for the proposed variants of the self-join index. We also plan to address critical issues such as concurrency control and recovery, optimal query processing strategies, and extraction of optimal page access sequences for the proposed self-join index variants. We also want to consider more spatial statistical queries such as the Local Moran Index, Moran's I, and Geary's C, as well as other hotspot algorithms.

Acknowledgments

The authors would like to thank the members of the spatial database research group at the University of Minnesota for helpful discussions and comments. We would like to thank Kim Koffolt for her comments to improve the readability of the paper. This work was supported by grants from NSF (CN-S-7864, IIS-7324), USDOD, and NIJ, as well as an unrestricted gift from Ned Levine and Associates.

A Binarization Algorithm specialized on Document Images and Photos

A Binarization Algorithm specialized on Document Images and Photos A Bnarzaton Algorthm specalzed on Document mages and Photos Ergna Kavalleratou Dept. of nformaton and Communcaton Systems Engneerng Unversty of the Aegean kavalleratou@aegean.gr Abstract n ths paper, a

More information

A Fast Content-Based Multimedia Retrieval Technique Using Compressed Data

A Fast Content-Based Multimedia Retrieval Technique Using Compressed Data A Fast Content-Based Multmeda Retreval Technque Usng Compressed Data Borko Furht and Pornvt Saksobhavvat NSF Multmeda Laboratory Florda Atlantc Unversty, Boca Raton, Florda 3343 ABSTRACT In ths paper,

More information

6.854 Advanced Algorithms Petar Maymounkov Problem Set 11 (November 23, 2005) With: Benjamin Rossman, Oren Weimann, and Pouya Kheradpour

6.854 Advanced Algorithms Petar Maymounkov Problem Set 11 (November 23, 2005) With: Benjamin Rossman, Oren Weimann, and Pouya Kheradpour 6.854 Advanced Algorthms Petar Maymounkov Problem Set 11 (November 23, 2005) Wth: Benjamn Rossman, Oren Wemann, and Pouya Kheradpour Problem 1. We reduce vertex cover to MAX-SAT wth weghts, such that the

More information

Content Based Image Retrieval Using 2-D Discrete Wavelet with Texture Feature with Different Classifiers

Content Based Image Retrieval Using 2-D Discrete Wavelet with Texture Feature with Different Classifiers IOSR Journal of Electroncs and Communcaton Engneerng (IOSR-JECE) e-issn: 78-834,p- ISSN: 78-8735.Volume 9, Issue, Ver. IV (Mar - Apr. 04), PP 0-07 Content Based Image Retreval Usng -D Dscrete Wavelet wth

More information

An Entropy-Based Approach to Integrated Information Needs Assessment

An Entropy-Based Approach to Integrated Information Needs Assessment Dstrbuton Statement A: Approved for publc release; dstrbuton s unlmted. An Entropy-Based Approach to ntegrated nformaton Needs Assessment June 8, 2004 Wllam J. Farrell Lockheed Martn Advanced Technology

More information

Compiler Design. Spring Register Allocation. Sample Exercises and Solutions. Prof. Pedro C. Diniz

Compiler Design. Spring Register Allocation. Sample Exercises and Solutions. Prof. Pedro C. Diniz Compler Desgn Sprng 2014 Regster Allocaton Sample Exercses and Solutons Prof. Pedro C. Dnz USC / Informaton Scences Insttute 4676 Admralty Way, Sute 1001 Marna del Rey, Calforna 90292 pedro@s.edu Regster

More information

Learning the Kernel Parameters in Kernel Minimum Distance Classifier

Learning the Kernel Parameters in Kernel Minimum Distance Classifier Learnng the Kernel Parameters n Kernel Mnmum Dstance Classfer Daoqang Zhang 1,, Songcan Chen and Zh-Hua Zhou 1* 1 Natonal Laboratory for Novel Software Technology Nanjng Unversty, Nanjng 193, Chna Department

More information

Cluster Analysis of Electrical Behavior

Cluster Analysis of Electrical Behavior Journal of Computer and Communcatons, 205, 3, 88-93 Publshed Onlne May 205 n ScRes. http://www.scrp.org/ournal/cc http://dx.do.org/0.4236/cc.205.350 Cluster Analyss of Electrcal Behavor Ln Lu Ln Lu, School

More information

A mathematical programming approach to the analysis, design and scheduling of offshore oilfields

A mathematical programming approach to the analysis, design and scheduling of offshore oilfields 17 th European Symposum on Computer Aded Process Engneerng ESCAPE17 V. Plesu and P.S. Agach (Edtors) 2007 Elsever B.V. All rghts reserved. 1 A mathematcal programmng approach to the analyss, desgn and

More information

FEATURE EXTRACTION. Dr. K.Vijayarekha. Associate Dean School of Electrical and Electronics Engineering SASTRA University, Thanjavur

FEATURE EXTRACTION. Dr. K.Vijayarekha. Associate Dean School of Electrical and Electronics Engineering SASTRA University, Thanjavur FEATURE EXTRACTION Dr. K.Vjayarekha Assocate Dean School of Electrcal and Electroncs Engneerng SASTRA Unversty, Thanjavur613 41 Jont Intatve of IITs and IISc Funded by MHRD Page 1 of 8 Table of Contents

More information

An Optimal Algorithm for Prufer Codes *

An Optimal Algorithm for Prufer Codes * J. Software Engneerng & Applcatons, 2009, 2: 111-115 do:10.4236/jsea.2009.22016 Publshed Onlne July 2009 (www.scrp.org/journal/jsea) An Optmal Algorthm for Prufer Codes * Xaodong Wang 1, 2, Le Wang 3,

More information

Steps for Computing the Dissimilarity, Entropy, Herfindahl-Hirschman and. Accessibility (Gravity with Competition) Indices

Steps for Computing the Dissimilarity, Entropy, Herfindahl-Hirschman and. Accessibility (Gravity with Competition) Indices Steps for Computng the Dssmlarty, Entropy, Herfndahl-Hrschman and Accessblty (Gravty wth Competton) Indces I. Dssmlarty Index Measurement: The followng formula can be used to measure the evenness between

More information

Machine Learning: Algorithms and Applications

Machine Learning: Algorithms and Applications 14/05/1 Machne Learnng: Algorthms and Applcatons Florano Zn Free Unversty of Bozen-Bolzano Faculty of Computer Scence Academc Year 011-01 Lecture 10: 14 May 01 Unsupervsed Learnng cont Sldes courtesy of

More information

The Greedy Method. Outline and Reading. Change Money Problem. Greedy Algorithms. Applications of the Greedy Strategy. The Greedy Method Technique

The Greedy Method. Outline and Reading. Change Money Problem. Greedy Algorithms. Applications of the Greedy Strategy. The Greedy Method Technique //00 :0 AM Outlne and Readng The Greedy Method The Greedy Method Technque (secton.) Fractonal Knapsack Problem (secton..) Task Schedulng (secton..) Mnmum Spannng Trees (secton.) Change Money Problem Greedy

More information

Hierarchical clustering for gene expression data analysis

Hierarchical clustering for gene expression data analysis Herarchcal clusterng for gene expresson data analyss Gorgo Valentn e-mal: valentn@ds.unm.t Clusterng of Mcroarray Data. Clusterng of gene expresson profles (rows) => dscovery of co-regulated and functonally

More information

Problem Definitions and Evaluation Criteria for Computational Expensive Optimization

Problem Definitions and Evaluation Criteria for Computational Expensive Optimization Problem efntons and Evaluaton Crtera for Computatonal Expensve Optmzaton B. Lu 1, Q. Chen and Q. Zhang 3, J. J. Lang 4, P. N. Suganthan, B. Y. Qu 6 1 epartment of Computng, Glyndwr Unversty, UK Faclty

More information

A Unified Framework for Semantics and Feature Based Relevance Feedback in Image Retrieval Systems

A Unified Framework for Semantics and Feature Based Relevance Feedback in Image Retrieval Systems A Unfed Framework for Semantcs and Feature Based Relevance Feedback n Image Retreval Systems Ye Lu *, Chunhu Hu 2, Xngquan Zhu 3*, HongJang Zhang 2, Qang Yang * School of Computng Scence Smon Fraser Unversty

More information

TN348: Openlab Module - Colocalization

TN348: Openlab Module - Colocalization TN348: Openlab Module - Colocalzaton Topc The Colocalzaton module provdes the faclty to vsualze and quantfy colocalzaton between pars of mages. The Colocalzaton wndow contans a prevew of the two mages

More information

Performance Evaluation of Information Retrieval Systems

Performance Evaluation of Information Retrieval Systems Why System Evaluaton? Performance Evaluaton of Informaton Retreval Systems Many sldes n ths secton are adapted from Prof. Joydeep Ghosh (UT ECE) who n turn adapted them from Prof. Dk Lee (Unv. of Scence

More information

Outline. Discriminative classifiers for image recognition. Where in the World? A nearest neighbor recognition example 4/14/2011. CS 376 Lecture 22 1

Outline. Discriminative classifiers for image recognition. Where in the World? A nearest neighbor recognition example 4/14/2011. CS 376 Lecture 22 1 4/14/011 Outlne Dscrmnatve classfers for mage recognton Wednesday, Aprl 13 Krsten Grauman UT-Austn Last tme: wndow-based generc obect detecton basc ppelne face detecton wth boostng as case study Today:

More information

Support Vector Machines

Support Vector Machines /9/207 MIST.6060 Busness Intellgence and Data Mnng What are Support Vector Machnes? Support Vector Machnes Support Vector Machnes (SVMs) are supervsed learnng technques that analyze data and recognze patterns.

More information

Term Weighting Classification System Using the Chi-square Statistic for the Classification Subtask at NTCIR-6 Patent Retrieval Task

Term Weighting Classification System Using the Chi-square Statistic for the Classification Subtask at NTCIR-6 Patent Retrieval Task Proceedngs of NTCIR-6 Workshop Meetng, May 15-18, 2007, Tokyo, Japan Term Weghtng Classfcaton System Usng the Ch-square Statstc for the Classfcaton Subtask at NTCIR-6 Patent Retreval Task Kotaro Hashmoto

More information

Parallelism for Nested Loops with Non-uniform and Flow Dependences

Parallelism for Nested Loops with Non-uniform and Flow Dependences Parallelsm for Nested Loops wth Non-unform and Flow Dependences Sam-Jn Jeong Dept. of Informaton & Communcaton Engneerng, Cheonan Unversty, 5, Anseo-dong, Cheonan, Chungnam, 330-80, Korea. seong@cheonan.ac.kr

More information

Tsinghua University at TAC 2009: Summarizing Multi-documents by Information Distance

Tsinghua University at TAC 2009: Summarizing Multi-documents by Information Distance Tsnghua Unversty at TAC 2009: Summarzng Mult-documents by Informaton Dstance Chong Long, Mnle Huang, Xaoyan Zhu State Key Laboratory of Intellgent Technology and Systems, Tsnghua Natonal Laboratory for

More information

Efficient Distributed File System (EDFS)

Efficient Distributed File System (EDFS) Effcent Dstrbuted Fle System (EDFS) (Sem-Centralzed) Debessay(Debsh) Fesehaye, Rahul Malk & Klara Naherstedt Unversty of Illnos-Urbana Champagn Contents Problem Statement, Related Work, EDFS Desgn Rate

More information

Subspace clustering. Clustering. Fundamental to all clustering techniques is the choice of distance measure between data points;

Subspace clustering. Clustering. Fundamental to all clustering techniques is the choice of distance measure between data points; Subspace clusterng Clusterng Fundamental to all clusterng technques s the choce of dstance measure between data ponts; D q ( ) ( ) 2 x x = x x, j k = 1 k jk Squared Eucldean dstance Assumpton: All features

More information

SHAPE RECOGNITION METHOD BASED ON THE k-nearest NEIGHBOR RULE

SHAPE RECOGNITION METHOD BASED ON THE k-nearest NEIGHBOR RULE SHAPE RECOGNITION METHOD BASED ON THE k-nearest NEIGHBOR RULE Dorna Purcaru Faculty of Automaton, Computers and Electroncs Unersty of Craoa 13 Al. I. Cuza Street, Craoa RO-1100 ROMANIA E-mal: dpurcaru@electroncs.uc.ro

More information

Collaboratively Regularized Nearest Points for Set Based Recognition

Collaboratively Regularized Nearest Points for Set Based Recognition Academc Center for Computng and Meda Studes, Kyoto Unversty Collaboratvely Regularzed Nearest Ponts for Set Based Recognton Yang Wu, Mchhko Mnoh, Masayuk Mukunok Kyoto Unversty 9/1/013 BMVC 013 @ Brstol,

More information

Private Information Retrieval (PIR)

Private Information Retrieval (PIR) 2 Levente Buttyán Problem formulaton Alce wants to obtan nformaton from a database, but she does not want the database to learn whch nformaton she wanted e.g., Alce s an nvestor queryng a stock-market

More information

Load-Balanced Anycast Routing

Load-Balanced Anycast Routing Load-Balanced Anycast Routng Chng-Yu Ln, Jung-Hua Lo, and Sy-Yen Kuo Department of Electrcal Engneerng atonal Tawan Unversty, Tape, Tawan sykuo@cc.ee.ntu.edu.tw Abstract For fault-tolerance and load-balance

More information

UB at GeoCLEF Department of Geography Abstract

UB at GeoCLEF Department of Geography   Abstract UB at GeoCLEF 2006 Mguel E. Ruz (1), Stuart Shapro (2), June Abbas (1), Slva B. Southwck (1) and Davd Mark (3) State Unversty of New York at Buffalo (1) Department of Lbrary and Informaton Studes (2) Department

More information

Efficient Content Distribution in Wireless P2P Networks

Efficient Content Distribution in Wireless P2P Networks Effcent Content Dstrbuton n Wreless P2P Networs Qong Sun, Vctor O. K. L, and Ka-Cheong Leung Department of Electrcal and Electronc Engneerng The Unversty of Hong Kong Pofulam Road, Hong Kong, Chna {oansun,

More information

Classifier Selection Based on Data Complexity Measures *

Classifier Selection Based on Data Complexity Measures * Classfer Selecton Based on Data Complexty Measures * Edth Hernández-Reyes, J.A. Carrasco-Ochoa, and J.Fco. Martínez-Trndad Natonal Insttute for Astrophyscs, Optcs and Electroncs, Lus Enrque Erro No.1 Sta.

More information

SLAM Summer School 2006 Practical 2: SLAM using Monocular Vision

SLAM Summer School 2006 Practical 2: SLAM using Monocular Vision SLAM Summer School 2006 Practcal 2: SLAM usng Monocular Vson Javer Cvera, Unversty of Zaragoza Andrew J. Davson, Imperal College London J.M.M Montel, Unversty of Zaragoza. josemar@unzar.es, jcvera@unzar.es,

More information

An Application of the Dulmage-Mendelsohn Decomposition to Sparse Null Space Bases of Full Row Rank Matrices

An Application of the Dulmage-Mendelsohn Decomposition to Sparse Null Space Bases of Full Row Rank Matrices Internatonal Mathematcal Forum, Vol 7, 2012, no 52, 2549-2554 An Applcaton of the Dulmage-Mendelsohn Decomposton to Sparse Null Space Bases of Full Row Rank Matrces Mostafa Khorramzadeh Department of Mathematcal

More information

A MOVING MESH APPROACH FOR SIMULATION BUDGET ALLOCATION ON CONTINUOUS DOMAINS

A MOVING MESH APPROACH FOR SIMULATION BUDGET ALLOCATION ON CONTINUOUS DOMAINS Proceedngs of the Wnter Smulaton Conference M E Kuhl, N M Steger, F B Armstrong, and J A Jones, eds A MOVING MESH APPROACH FOR SIMULATION BUDGET ALLOCATION ON CONTINUOUS DOMAINS Mark W Brantley Chun-Hung

More information

For instance, ; the five basic number-sets are increasingly more n A B & B A A = B (1)

For instance, ; the five basic number-sets are increasingly more n A B & B A A = B (1) Secton 1.2 Subsets and the Boolean operatons on sets If every element of the set A s an element of the set B, we say that A s a subset of B, or that A s contaned n B, or that B contans A, and we wrte A

More information

Virtual Memory. Background. No. 10. Virtual Memory: concept. Logical Memory Space (review) Demand Paging(1) Virtual Memory

Virtual Memory. Background. No. 10. Virtual Memory: concept. Logical Memory Space (review) Demand Paging(1) Virtual Memory Background EECS. Operatng System Fundamentals No. Vrtual Memory Prof. Hu Jang Department of Electrcal Engneerng and Computer Scence, York Unversty Memory-management methods normally requres the entre process

More information

Constructing Minimum Connected Dominating Set: Algorithmic approach

Constructing Minimum Connected Dominating Set: Algorithmic approach Constructng Mnmum Connected Domnatng Set: Algorthmc approach G.N. Puroht and Usha Sharma Centre for Mathematcal Scences, Banasthal Unversty, Rajasthan 304022 usha.sharma94@yahoo.com Abstract: Connected

More information

Meta-heuristics for Multidimensional Knapsack Problems

Meta-heuristics for Multidimensional Knapsack Problems 2012 4th Internatonal Conference on Computer Research and Development IPCSIT vol.39 (2012) (2012) IACSIT Press, Sngapore Meta-heurstcs for Multdmensonal Knapsack Problems Zhbao Man + Computer Scence Department,

More information

NAG Fortran Library Chapter Introduction. G10 Smoothing in Statistics

NAG Fortran Library Chapter Introduction. G10 Smoothing in Statistics Introducton G10 NAG Fortran Lbrary Chapter Introducton G10 Smoothng n Statstcs Contents 1 Scope of the Chapter... 2 2 Background to the Problems... 2 2.1 Smoothng Methods... 2 2.2 Smoothng Splnes and Regresson

More information

Professional competences training path for an e-commerce major, based on the ISM method

Professional competences training path for an e-commerce major, based on the ISM method World Transactons on Engneerng and Technology Educaton Vol.14, No.4, 2016 2016 WIETE Professonal competences tranng path for an e-commerce maor, based on the ISM method Ru Wang, Pn Peng, L-gang Lu & Lng

More information

CMPS 10 Introduction to Computer Science Lecture Notes

CMPS 10 Introduction to Computer Science Lecture Notes CPS 0 Introducton to Computer Scence Lecture Notes Chapter : Algorthm Desgn How should we present algorthms? Natural languages lke Englsh, Spansh, or French whch are rch n nterpretaton and meanng are not

More information

Classifying Acoustic Transient Signals Using Artificial Intelligence

Classifying Acoustic Transient Signals Using Artificial Intelligence Classfyng Acoustc Transent Sgnals Usng Artfcal Intellgence Steve Sutton, Unversty of North Carolna At Wlmngton (suttons@charter.net) Greg Huff, Unversty of North Carolna At Wlmngton (jgh7476@uncwl.edu)

More information

An Iterative Solution Approach to Process Plant Layout using Mixed Integer Optimisation

An Iterative Solution Approach to Process Plant Layout using Mixed Integer Optimisation 17 th European Symposum on Computer Aded Process Engneerng ESCAPE17 V. Plesu and P.S. Agach (Edtors) 2007 Elsever B.V. All rghts reserved. 1 An Iteratve Soluton Approach to Process Plant Layout usng Mxed

More information

ELEC 377 Operating Systems. Week 6 Class 3

ELEC 377 Operating Systems. Week 6 Class 3 ELEC 377 Operatng Systems Week 6 Class 3 Last Class Memory Management Memory Pagng Pagng Structure ELEC 377 Operatng Systems Today Pagng Szes Vrtual Memory Concept Demand Pagng ELEC 377 Operatng Systems

More information

Kent State University CS 4/ Design and Analysis of Algorithms. Dept. of Math & Computer Science LECT-16. Dynamic Programming

Kent State University CS 4/ Design and Analysis of Algorithms. Dept. of Math & Computer Science LECT-16. Dynamic Programming CS 4/560 Desgn and Analyss of Algorthms Kent State Unversty Dept. of Math & Computer Scence LECT-6 Dynamc Programmng 2 Dynamc Programmng Dynamc Programmng, lke the dvde-and-conquer method, solves problems

More information

Simulation Based Analysis of FAST TCP using OMNET++

Simulation Based Analysis of FAST TCP using OMNET++ Smulaton Based Analyss of FAST TCP usng OMNET++ Umar ul Hassan 04030038@lums.edu.pk Md Term Report CS678 Topcs n Internet Research Sprng, 2006 Introducton Internet traffc s doublng roughly every 3 months

More information

Learning-Based Top-N Selection Query Evaluation over Relational Databases

Learning-Based Top-N Selection Query Evaluation over Relational Databases Learnng-Based Top-N Selecton Query Evaluaton over Relatonal Databases Lang Zhu *, Wey Meng ** * School of Mathematcs and Computer Scence, Hebe Unversty, Baodng, Hebe 071002, Chna, zhu@mal.hbu.edu.cn **

More information

Module Management Tool in Software Development Organizations

Module Management Tool in Software Development Organizations Journal of Computer Scence (5): 8-, 7 ISSN 59-66 7 Scence Publcatons Management Tool n Software Development Organzatons Ahmad A. Al-Rababah and Mohammad A. Al-Rababah Faculty of IT, Al-Ahlyyah Amman Unversty,

More information

NUMERICAL SOLVING OPTIMAL CONTROL PROBLEMS BY THE METHOD OF VARIATIONS

NUMERICAL SOLVING OPTIMAL CONTROL PROBLEMS BY THE METHOD OF VARIATIONS ARPN Journal of Engneerng and Appled Scences 006-017 Asan Research Publshng Network (ARPN). All rghts reserved. NUMERICAL SOLVING OPTIMAL CONTROL PROBLEMS BY THE METHOD OF VARIATIONS Igor Grgoryev, Svetlana

More information

Programming in Fortran 90 : 2017/2018

Programming in Fortran 90 : 2017/2018 Programmng n Fortran 90 : 2017/2018 Programmng n Fortran 90 : 2017/2018 Exercse 1 : Evaluaton of functon dependng on nput Wrte a program who evaluate the functon f (x,y) for any two user specfed values

More information

Querying by sketch geographical databases. Yu Han 1, a *

Querying by sketch geographical databases. Yu Han 1, a * 4th Internatonal Conference on Sensors, Measurement and Intellgent Materals (ICSMIM 2015) Queryng by sketch geographcal databases Yu Han 1, a * 1 Department of Basc Courses, Shenyang Insttute of Artllery,

More information

Skew Angle Estimation and Correction of Hand Written, Textual and Large areas of Non-Textual Document Images: A Novel Approach

Skew Angle Estimation and Correction of Hand Written, Textual and Large areas of Non-Textual Document Images: A Novel Approach Angle Estmaton and Correcton of Hand Wrtten, Textual and Large areas of Non-Textual Document Images: A Novel Approach D.R.Ramesh Babu Pyush M Kumat Mahesh D Dhannawat PES Insttute of Technology Research

More information

Assignment # 2. Farrukh Jabeen Algorithms 510 Assignment #2 Due Date: June 15, 2009.

Assignment # 2. Farrukh Jabeen Algorithms 510 Assignment #2 Due Date: June 15, 2009. Farrukh Jabeen Algorthms 51 Assgnment #2 Due Date: June 15, 29. Assgnment # 2 Chapter 3 Dscrete Fourer Transforms Implement the FFT for the DFT. Descrbed n sectons 3.1 and 3.2. Delverables: 1. Concse descrpton

More information

A Fast Visual Tracking Algorithm Based on Circle Pixels Matching

A Fast Visual Tracking Algorithm Based on Circle Pixels Matching A Fast Vsual Trackng Algorthm Based on Crcle Pxels Matchng Zhqang Hou hou_zhq@sohu.com Chongzhao Han czhan@mal.xjtu.edu.cn Ln Zheng Abstract: A fast vsual trackng algorthm based on crcle pxels matchng

More information

CS 534: Computer Vision Model Fitting

CS 534: Computer Vision Model Fitting CS 534: Computer Vson Model Fttng Sprng 004 Ahmed Elgammal Dept of Computer Scence CS 534 Model Fttng - 1 Outlnes Model fttng s mportant Least-squares fttng Maxmum lkelhood estmaton MAP estmaton Robust

More information

BIN XIA et al: AN IMPROVED K-MEANS ALGORITHM BASED ON CLOUD PLATFORM FOR DATA MINING

BIN XIA et al: AN IMPROVED K-MEANS ALGORITHM BASED ON CLOUD PLATFORM FOR DATA MINING An Improved K-means Algorthm based on Cloud Platform for Data Mnng Bn Xa *, Yan Lu 2. School of nformaton and management scence, Henan Agrcultural Unversty, Zhengzhou, Henan 450002, P.R. Chna 2. College

More information

FINDING IMPORTANT NODES IN SOCIAL NETWORKS BASED ON MODIFIED PAGERANK
