A Cost Model for Nearest Neighbor Search in High-Dimensional Data Space


Stefan Berchtold, University of Munich, Germany
Daniel A. Keim, University of Munich, Germany, keim@informatik.uni-muenchen.de
Christian Böhm, University of Munich, Germany, boehm@informatik.uni-muenchen.de
Hans-Peter Kriegel, University of Munich, Germany, kriegel@informatik.uni-muenchen.de

Abstract

In this paper, we present a new cost model for nearest neighbor search in high-dimensional data space. We first analyze different nearest neighbor algorithms, present a generalization of an algorithm which was originally proposed for Quadtrees [13], and show that this algorithm is optimal. Then, we develop a cost model which, in contrast to previous models, takes boundary effects into account and therefore also works in high dimensions. The advantages of our model are in particular: it works for data sets with an arbitrary number of dimensions and an arbitrary number of data points, is applicable to different data distributions and index structures, and provides accurate estimates of the expected query execution time. To show the practical relevance and accuracy of our model, we perform a detailed analysis using synthetic and real data. The results of applying our model to Hilbert and X-tree indices show that it provides a good estimation of the query performance, which is considerably better than the estimates by previous models, especially for high-dimensional data.

Key Words: Nearest Neighbor Search, Cost Model, Multidimensional Searching, Multidimensional Index Structures, High-Dimensional Data Space

1 Introduction

In this paper, we describe a cost model for nearest neighbor queries in high-dimensional space. Nearest neighbor queries are very important for many applications. Examples include multimedia indexing [9], CAD [17], molecular biology (for the docking of molecules) [24], string matching [1], etc. Most applications use some kind of feature vector for an efficient access to the complex original data. Examples of feature vectors are color histograms [23], shape
descriptors [16, 18], Fourier vectors [26], text descriptors [15], etc.

Nearest neighbor search on the high-dimensional feature vectors may be defined as follows: Given a data set DB of d-dimensional points, find the data point NN from the data set which is closer to the given query point Q than any other point in the data set. More formally:

$$NN(Q) = \{\, e \in DB \mid \forall \bar{e} \in DB : \lVert e - Q \rVert \le \lVert \bar{e} - Q \rVert \,\}$$

Usually, nearest neighbor queries are executed using some kind of multidimensional index structure such as k-d-trees, R-trees, Quadtrees, etc. In section 2, we discuss the different nearest neighbor algorithms proposed in the literature. We present a generalization of an algorithm which was originally proposed for Quadtrees [13] and show that this algorithm is optimal.

A problem of index-based nearest neighbor search is that it is difficult to estimate the time which is needed for executing the nearest neighbor query. The estimation of the time, however, is important not only for a theoretical complexity analysis of the average query execution time; it is also crucial for optimizing the parameters of the index structures (e.g., the block size) and for query optimization. An adequate cost model should work for data sets with an arbitrary number of dimensions and an arbitrary number of data points, it should be applicable to different data distributions and index structures, and most importantly, it should provide accurate estimates of the expected query execution time. Unfortunately, existing models fail to fulfill these requirements. In particular, none of the models provides accurate estimates for nearest neighbor queries in high-dimensional space, and most models pose awkward and unrealistic requirements on the number of necessary data points, preventing the models from being practically applicable. One of the reasons for the problems of existing models is that basically none of them accounts for boundary effects, i.e. effects that occur if the query processing reaches the border of the data space. As we will show later, boundary effects play an important role in processing nearest neighbor queries in high-dimensional space.

Our model determines the expected number of page accesses when performing a nearest neighbor query by intersecting all pages with the minimal sphere around the query point containing the nearest neighbor. In contrast to previous approaches, our cost model considers boundary effects and therefore also provides accurate estimates for the high-dimensional case. Furthermore, our model works for an arbitrary number of data points and is applicable to a wide range of index structures such as k-d-trees, R-trees, quadtrees, etc. Besides describing our cost model, we provide a detailed experimental evaluation showing the accuracy and practical relevance of our model. In our experiments, we use artificial as well as real data and compare the model estimates with the actually measured page counts obtained from two different index structures: the Hilbert-index and the X-tree.

2 Algorithms for Nearest Neighbor Search

In the last decade, a large number of algorithms and index structures have been proposed for nearest neighbor search. In the following, we give an overview of these algorithms.

2.1 Known Algorithms

Algorithms for nearest neighbor search may be divided into two major groups: partitioning algorithms and graph-based algorithms. Partitioning algorithms partition the data space (or the actual data set) recursively and store information about the partitions in the nodes. Graph-based algorithms precalculate some nearest neighbors of points, store the distances in a graph and use the precalculated information for a more efficient search. Examples for such algorithms are the RNG* algorithm of Arya [2] and algorithms
using Voronoi diagrams [20]. Although in this paper we concentrate our discussion on partitioning algorithms, we believe that our results are applicable to graph-based algorithms as well.

    initialize PartitionList with the subpartitions of the root-partition;
    sort PartitionList by MINDIST;
    while (PartitionList is not empty)
        if (top of PartitionList is a leaf)
            find nearest point NNC in leaf;
            remove the leaf from PartitionList;
            if (NNC closer than NN)
                prune PartitionList with NNC;
                let NNC be the new NN;
        else
            replace top of PartitionList with its son nodes;
        endif
        resort PartitionList by MINDIST;
    endwhile
    output NN;

Figure 1: Algorithm NN-opt

A rather simple partitioning algorithm is the bucketing algorithm of Welch [27]. The algorithm divides the data space into identical cells and stores the data objects inside a cell in a list which is attached to the cell. During nearest neighbor search, the cells are visited in order of their distance to the query point. The search terminates if the nearest point which has been determined so far is nearer than any cell not visited yet. Unfortunately, the algorithm is not efficient for high-dimensional or real data.

A more practical approach is the k-d-tree algorithm of Friedman, Bentley and Finkel [12]. In contrast to Welch's algorithm, the order in which the k-d-tree algorithm visits the partitions of the data space is determined by the structure of the k-d-tree. Ramasubramanian and Paliwal [21] propose an improvement of the algorithm by optimizing the structure of the k-d-tree.

Roussopoulos et al. [22] propose a different approach using the R*-tree [4] for nearest neighbor search. The algorithm traverses the R*-tree and stores for every visited partition a list of subpartitions ordered by their minmaxdist. The minmaxdist of a partition is the maximal possible distance from the query point to the nearest data point inside the partition. If a point is found having a distance smaller than the nearest point determined so far, all partition lists can be pruned because all nodes with a larger minmaxdist cannot contain the nearest neighbor. A
problem of the R*-tree algorithm is that it traverses the index in a depth-first fashion. Subnodes are sorted before descent, but once a branch has been chosen, its processing has to be completed, even if sibling branches appear more likely to contain the NN. The algorithm therefore accesses more partitions than actually necessary.

In [13], Hjaltason and Samet propose an algorithm using PMR-Quadtrees. In contrast to the algorithm of Roussopoulos et al., partitions are visited ordered by their mindist. The mindist of a partition is the minimal distance from the query point Q to any point p inside the partition P. More formally:

$$MINDIST(P, Q) = \min_{p \in P} \lVert p - Q \rVert$$

The algorithmic principle of the method of Hjaltason and Samet can be applied to any hierarchical index structure which uses recursive and conservative partitioning. In Figure 1, we present a generalization of the algorithm which works for any hierarchical index structure. Pruning the partition list with a point NNC means that all partitions in the list which have a mindist larger than the distance of NNC to the query point are removed from the list.

2.2 Optimality of Algorithm NN-opt

In this section, we show that the algorithm NN-opt (cf. Figure 1) is optimal. For this purpose, we need to define the minimal sphere around the query point containing the nearest neighbor:

Definition 1: (NN-sphere) Let Q be a query point and NN be the nearest neighbor of Q. Then NN-dist $= \lVert Q - NN \rVert$ is the distance between the nearest neighbor and the query point. The NN-sphere $SP^d(Q, r)$ of a query point Q is defined as the sphere with center Q and radius r = NN-dist.

Definition 2: (Optimality) An algorithm for nearest neighbor search is optimal if the pages accessed by the algorithm during the nearest neighbor search are exactly the pages that intersect the NN-sphere.

Note that we use the term optimality relative to an underlying index structure and not relative to the nearest neighbor problem itself.

Lemma 1: Algorithm NN-opt is an optimal algorithm according to Definition 2, i.e. algorithm NN-opt accesses exactly the partitions which intersect the NN-sphere but no other partitions.

Proof: From the correctness of algorithm NN-opt as provided in [13] it follows that any partition intersecting the NN-sphere is accessed during the search process. To show the minimality of the accessed partitions, let us assume that algorithm NN-opt accesses a partition NA which does not intersect the NN-sphere, i.e. mindist(NA) > r. Let $NP_0$ be the partition (data page) containing the nearest neighbor, $NP_1$ be the partition containing $NP_0$, ..., and $NP_k$ be the partition in the root page containing $NP_0, \dots, NP_{k-1}$. Thus, $r \ge mindist(NP_0) \ge \dots \ge mindist(NP_k)$. Consequently, $mindist(NA) > r \ge mindist(NP_0) \ge \dots \ge mindist(NP_k)$. Since $NP_k$ is in the root page, $NP_k$ is replaced during the search process by $NP_{k-1}$ and so on, until $NP_0$ is loaded. If, as assumed, the algorithm accesses NA, NA has to be on top of the partition list at some point during the search. Since mindist(NA) is larger than the mindist of any partition containing the nearest
neighbor, NA cannot be loaded until $NP_0$ has been loaded. If $NP_0$ is loaded, however, the algorithm prunes all partitions which have a mindist larger than r. Therefore, NA is pruned and not accessed, which contradicts the assumption.

3 The Cost Model

Symbol                        Meaning
d                             number of dimensions
N                             number of data points
$C_{eff}$                     average number of data points per index page
a                             edge length of a data page
r                             radius of a sphere
$NP_i$                        partition of the index structure containing partitions $NP_0, \dots, NP_{i-1}$
Q                             query point
DS                            data space
$SP^d(E, r)$                  d-dimensional hypersphere with center E and radius r
$Vol_{Sp}(r)$                 volume of a d-dimensional hypersphere
$Vol_{avg}(r)$                average volume of a d-dimensional hypersphere, boundary effects considered
$Vol_{Mink}(r)$               Minkowski sum of an index page and a query sphere with radius r
P(r), p(r)                    distribution function of the radius, density function of the radius
NN-dist, E(NN-dist)           nearest neighbor distance, expected nearest neighbor distance
#pages, E(#pages)             number of page accesses, expected number of page accesses

The objective of our cost model is to provide accurate estimates of the execution time of nearest neighbor queries, including queries on high-dimensional data. It is a well-known fact that simple queries, including nearest neighbor queries, are I/O-bound and only complex queries such as the spatial join may be CPU-bound. Therefore, it is justified to take the number of page accesses as a measure for the query performance. Our cost model may be used for optimizing the parameters of the index structures such as the block size as well as for query optimization.

3.1 Previous Approaches and their Problems

Due to the high practical relevance of nearest neighbor queries, cost models for estimating the number of necessary page accesses were proposed several years ago. The first approach is the well-known cost model proposed by Friedman, Bentley and Finkel [12]. The assumptions of the model, however, are unrealistic for nearest neighbor queries on high-dimensional data, since N is assumed to converge to infinity and boundary effects are not considered. The model by Cleary [7] extends the Friedman, Bentley and Finkel model by allowing non-rectangular-bounded pages, but still does not account for boundary effects. Sproull [25] uses the existing models for optimizing the nearest neighbor search in high dimensions and shows that the number of data points must be exponential in the number of dimensions for the models to provide accurate estimates. According to [25], boundary effects significantly contribute to the costs unless the following condition holds:

$$N \gg C_{eff} \cdot \left( \sqrt[d]{\frac{1}{C_{eff} \cdot Vol_{Sp}(\tfrac{1}{2})}} + 1 \right)^{d}$$

where $Vol_{Sp}(r)$ is the volume of a hypersphere with radius r, which can be computed as

$$Vol_{Sp}(r) = \frac{\sqrt{\pi^d}}{\Gamma(d/2 + 1)} \cdot r^d$$

with $\Gamma(x + 1) = x \cdot \Gamma(x)$, $\Gamma(1) = 1$ and $\Gamma(\tfrac{1}{2}) = \sqrt{\pi}$.

Unfortunately, the assumptions made in the existing models do not hold in the high-dimensional case. The main reason for the problems of the existing models is that they do not account for boundary effects. The term boundary effects refers to the exceptional performance behavior that occurs when the query reaches the boundary of the data space. As we show later, boundary effects occur frequently in high-dimensional data spaces and lead to a pruning of major amounts of empty search space, which is not considered by the existing models. To examine these effects, we performed experiments to compare the necessary page accesses with the model estimates. Figure 2 shows the real page counts versus the estimates of the Friedman, Bentley and Finkel model. For high-dimensional data, the model completely fails to estimate the number of page accesses.

In a very recent work [19], Papadopoulos and Manolopoulos present an analysis of nearest neighbor queries using R-trees. In a recent paper [3], Arya, Mount, and Narayan develop a model that is capable of accounting for boundary effects. The problem of the Arya approach, however, is that the model still assumes N to be growing exponentially with the dimension and it also
uses the $L_\infty$ metric, which is not suitable for most database applications. Note that our model also confirms the earlier results of Yao and Yao [28].

3.2 Overview of our Cost Model

The main objective of this paper is to present a new cost model for nearest neighbor queries in high dimensions. In contrast to existing models, our cost model provides accurate estimates of the number of page accesses in the high-dimensional case since it accounts for boundary effects. Furthermore, our model is based on the optimal algorithm for nearest neighbor search (cf. subsection 2.2) and works for an arbitrary number of data points. For the presentation of our cost model, we first assume that the data is uniformly distributed and that the split is performed in a k-d-tree fashion. We will show later that our model is also applicable to arbitrary data distributions and a wide range of index structures such as k-d-trees, R-trees, quadtrees, Z-indices, etc.

The goal of our model is to determine the expected number of pages which have to be accessed in performing a nearest neighbor query. The number of data pages which have to be accessed can be determined by intersecting all pages with the minimal sphere around the query point containing the nearest neighbor. The first step in developing the cost function is to determine the average portion of a query sphere with a given query radius which is inside the data space. Note that the data space is assumed to be normalized to the unit hypercube $[0,1]^d$. Then, we determine the expected radius of the sphere, which can be described as a stochastic variable. Taking boundary effects into account, we derive the distribution function, probability density, and expected value of the nearest neighbor distance (cf. subsection 3.3). In the next step, we have to determine the number of pages intersected by the query sphere. For this purpose, we require the Minkowski sum of the query sphere and the shape of an index page (e.g., the bounding box of the page in case of the R-tree). Due to boundary effects, portions of the volume of the Minkowski
sum are outside of the data space, and therefore we have to introduce some modifications to the standard Minkowski sum (cf. subsection 3.4). The last step is the integration of the separate steps into the cost function. For determining the expected number of page accesses, we have to form the weighted average of the costs associated with the nearest neighbor distances, weighted by the probability of their occurrence. The details are provided in subsection 3.5.

Figure 2: Real Page Counts versus Estimates by Model [12] (page counts over dimension; curves: measured page accesses, Bentley model)

3.3 Expected Nearest Neighbor Distance

The goal of this subsection is to determine the expected distance between a query point and its nearest neighbor in a database of N points. Before we are able to solve this problem, however, we first consider a simpler problem, namely the expected distance of two uniformly distributed points (one query point and one data point) in the data space.

Let us first assume that the data point (data entry E) has a fixed position $E = [e_1, e_2, \dots, e_d]$. Then, the probability that the distance from the query point $Q = [q_1, q_2, \dots, q_d]$ is less than r can be modeled as the volume of the hypersphere around E with radius r. If point E is close to the border of the data space [$\exists i \in \{1, \dots, d\}: (r > e_i) \lor (e_i > 1 - r)$], we have to consider that part of the hypersphere volume is outside of the data space and does not contribute to the probability. The volume of the intersection of the data space and the hypersphere can be expressed as the integral of a piecewise defined function integrated over all possible positions of Q:

$$Vol(SP^d(E, r) \cap DS) = \int_{DS} \begin{cases} 1 & \text{if } \lVert E - Q \rVert \le r \\ 0 & \text{otherwise} \end{cases} \; dQ$$

where

$$\lVert E - Q \rVert \le r \iff \sum_{i=1}^{d} (e_i - q_i)^2 \le r^2 \quad \text{and} \quad \int_{DS} f(X)\, dX = \int_0^1 \cdots \int_0^1 f(x_1, \dots, x_d)\, dx_1 \cdots dx_d$$

If we assume that the data point is also randomly taken from the data space, the above formula has to be averaged over all possible positions of E (1):

$$Vol_{avg}(r) = \int_{DS} Vol(SP^d(E, r) \cap DS)\, dE$$

(1) As DS is $[0,1]^d$, the denominator of the average is 1.

Note that $Vol_{avg}(r)$ corresponds to the probability $P(\lVert E - Q \rVert \le r)$.

To determine the expected distance between a query point and its nearest neighbor in a database of N points, we have to determine the probability distribution of the minimum distance between query and data points. The probability that the nearest neighbor distance is at most r is the complement of the probability that none of the N data points lies in the intersection of DS and the sphere with radius r around the query point. The corresponding distribution function P(r) is therefore:

$$P(r) = 1 - (1 - Vol_{avg}(r))^N$$

The density function p(r) of P(r) can be derived by determining the derivative of this function:

$$p(r) = \frac{\partial P(r)}{\partial r} = \frac{\partial Vol_{avg}(r)}{\partial r} \cdot N \cdot (1 - Vol_{avg}(r))^{N-1}$$

From this, we obtain the expected nearest neighbor distance by the integral

$$E(\text{NN-dist}) = \int_0^\infty r \cdot p(r)\, dr = N \int_0^\infty r \cdot \frac{\partial Vol_{avg}(r)}{\partial r} \cdot (1 - Vol_{avg}(r))^{N-1}\, dr$$

In section 4, we will show that this formula may be used to accurately predict the expected nearest neighbor distance.

3.4 Number of Pages Intersected by the Query Sphere

In this subsection, we determine the number of pages intersected by a query sphere with a given radius r. For this purpose, we have to determine the Minkowski sum of the query sphere and the index pages. As can be seen in Figure 3, the concept of the Minkowski sum transforms a spherical query on a set of boxes into an equivalent point query on a set of enlarged objects. The Minkowski sum directly corresponds to the volume of the intersected pages. Graphically presented, the Minkowski sum describes the volume which results from moving the center of the query sphere over the surface of the bounding box of the index page (cf. Figure 4 for an example of the two-dimensional Minkowski sum).

Figure 3: Transforming a Spherical Query into a Point Query by the Concept of Minkowski Sum

Figure 4: Example of the Minkowski Sum in Two Dimensions (a square page with edge length a enlarged by radius r; each edge strip has area $a \cdot \tfrac{1}{2} Vol^1_{Sp}(r)$ and each corner area $\tfrac{1}{2^2} Vol^2_{Sp}(r)$)

For calculating the Minkowski sum, we have to consider volumes of each dimension between 1 and d which result from the different faces of the bounding box. If the index page is a bounding box with an extension a in all dimensions, the Minkowski sum may be calculated as

$$Vol_{Mink}(r) = \sum_{i=0}^{d} \binom{d}{i} \cdot a^{d-i} \cdot Vol^i_{Sp}(r)$$

where $Vol^i_{Sp}(r)$ denotes the volume of an i-dimensional hypersphere with radius r. The Minkowski sum is the expected value of the hyper-volume of the bounding boxes of the data pages which are intersected by the NN-sphere. The expected value of the number of data pages can easily be determined by normalizing the Minkowski sum using the volume of the bounding box:

$$\#Pages(r) = \frac{Vol_{Mink}(r)}{a^d}$$

The Minkowski sum, however, does not consider boundary effects, which occur in high-dimensional space because r becomes large and portions of the volume of the Minkowski sum are outside of the data space. To obtain a more realistic model for the high-dimensional case, we have to introduce some modifications to the Minkowski sum. Similar to the case described in the previous subsection, we integrate over the data space and determine the intersection of partition B with the query sphere around Q:

$$Vol_{Mink}(r) = \int_{DS} \begin{cases} 1 & \text{if } MINDIST(B, Q) \le r \\ 0 & \text{otherwise} \end{cases} \; dQ$$

If B is a rectilinear bounding box with a lower corner $[b^l_1, \dots, b^l_d]$ and an upper corner $[b^u_1, \dots, b^u_d]$, MINDIST may be computed as

$$MINDIST(B, Q) = \sqrt{\sum_{i=1}^{d} \begin{cases} 0 & \text{if } b^l_i \le q_i \le b^u_i \\ (b^l_i - q_i)^2 & \text{if } q_i < b^l_i \\ (b^u_i - q_i)^2 & \text{otherwise} \end{cases}}$$

To determine the Minkowski sum according to this formula, we would need a stochastic model for the parameters $b^l_i$ and $b^u_i$ of the index pages. In practical experiments, we observed that in high-dimensional space usually one of the two parameters, $b^l_i$ or $b^u_i$, coincides with one of the borders of the data space, which results from the fact that each dimension has been split at most once. If all dimensions are of about the same significance, the split algorithm has to use all dimensions
as split axes in order to obtain a high selectivity. In this case it is practically impossible in a high-dimensional space to obtain more than one split per dimension since the number of data points does not increase exponentially with the dimension. In general, the number of data points is not even high enough for all dimensions to be split once. Therefore, without loss of generality, we may assume that only the first d' dimensions have been split, at position $s_i$ in dimension i ($1 \le i \le d'$). d' may be determined as

$$d' = \log_2 \frac{N}{C_{eff}}$$

The Minkowski sum over all index pages, which directly corresponds to the average number of pages intersected by the query sphere, can be determined as

$$\#Pages(r) = \sum_{k=0}^{d'} \; \sum_{\{i_1, \dots, i_k\} \in P(\{1, \dots, d'\})} Vol(SP^k([s_{i_1}, \dots, s_{i_k}], r) \cap DS)$$

For each k, the partitions have some (d-k)-dimensional faces inside DS. At these faces, a hyper-cylinder arises which is spherical in k dimensions (with radius r) and cubical in the remaining dimensions (with side length 1). The spherical part may be intersected with DS, and only this intersection is relevant. The second sum iterates over all elements of the power set of {1, ..., d'} and thus selects exactly all possible k-dimensional projections of the split dimensions, encountering all possible cylinders. For uniformly distributed data, the $s_i$ are all at the same position ($s_i = s_j = \tfrac{1}{2}$). In this case, the formula becomes

$$\#Pages(r) = \sum_{k=0}^{d'} \; \sum_{\{i_1, \dots, i_k\} \in P(\{1, \dots, d'\})} Vol(SP^k([\tfrac{1}{2}, \dots, \tfrac{1}{2}], r) \cap DS)$$

As the volume of all k-dimensional cylinders is identical now, we may simplify the formula to:

$$\#Pages(r) = \sum_{k=0}^{d'} \binom{d'}{k} \cdot Vol(SP^k([\tfrac{1}{2}, \dots, \tfrac{1}{2}], r) \cap DS)$$
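As a rough numerical illustration of the simplified formula for uniformly distributed data, the following Python sketch estimates #Pages(r). It is not the authors' implementation: the function names, the rounding of d' to an integer, the sample count, and the fixed random seed are assumptions made for this example, and the volume of the sphere clipped to the unit cube is approximated by plain Monte Carlo sampling.

```python
import math
import random


def split_dimensions(n_points: int, c_eff: float) -> int:
    """d' = log2(N / C_eff); rounding to an integer is an assumption of this sketch."""
    return max(0, round(math.log2(n_points / c_eff)))


def centered_sphere_volume(k: int, r: float, samples: int = 20000, seed: int = 0) -> float:
    """Monte Carlo estimate of Vol(SP^k([1/2, ..., 1/2], r) intersected with [0,1]^k)."""
    if k == 0:
        return 1.0  # the k = 0 term counts the page containing the query point
    if r >= math.sqrt(k) / 2:
        return 1.0  # the sphere covers the whole k-dimensional unit cube
    rng = random.Random(seed)
    hits = 0
    for _ in range(samples):
        # Squared distance of a uniform random point from the cube center.
        dist_sq = sum((rng.random() - 0.5) ** 2 for _ in range(k))
        if dist_sq <= r * r:
            hits += 1
    return hits / samples


def expected_pages(r: float, d_split: int, samples: int = 20000) -> float:
    """#Pages(r) = sum over k of C(d', k) * Vol(SP^k([1/2, ..., 1/2], r) clipped to DS)."""
    return sum(
        math.comb(d_split, k) * centered_sphere_volume(k, r, samples)
        for k in range(d_split + 1)
    )


if __name__ == "__main__":
    d_split = split_dimensions(2_976_000, 360)
    for radius in (0.0, 0.25, 0.5, 1.0):
        print(radius, expected_pages(radius, d_split))
```

For r = 0 only the k = 0 term remains, so the estimate is a single page. Once r reaches sqrt(d')/2, every clipped volume equals 1 and the sum becomes 2^d', which is the total number of pages under this split model.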

$$E(\#Pages) = N \int_0^\infty \frac{\partial Vol_{avg}(r)}{\partial r} \cdot (1 - Vol_{avg}(r))^{N-1} \cdot \sum_{k=0}^{d'} \; \sum_{\{i_1, \dots, i_k\} \in P(\{1, \dots, d'\})} Vol(SP^k([s_{i_1}, \dots, s_{i_k}], r) \cap DS) \; dr$$

Figure 5: Cost Formula for the Expected Number of Page Accesses

3.5 Expected Number of Page Accesses

In the previous subsection, we developed a model to determine the number of page accesses for a query sphere with a given radius. The goal of this subsection is to determine the expected number of page accesses for a nearest neighbor query. To this end, we have to integrate the number of page accesses for a given radius over all radii, weighted by the probability density with which each radius occurs. More formally, the expected number of page accesses for a nearest neighbor query E(#Pages) may be determined as

$$E(\#Pages) = \int_0^\infty \#Pages(r) \cdot p(r)\, dr$$

If we integrate the partial results from subsections 3.3 and 3.4, we obtain the formula presented in Figure 5.

4 Experimental Evaluation

In this section, we first describe the implementation of our cost model presented in section 3. Then, we describe the experiments conducted to show the practical applicability of our cost model and provide a short interpretation of the experimental results.

4.1 Implementation of the Cost Model

In subsection 3.3, we presented an integral formula to determine the volume of the intersection between the data space and a query sphere with radius r. This volume integral can be evaluated easily using numerical integration. Among the various methods, the so-called Monte Carlo integration is best suited to the high-dimensional case. Monte Carlo integration [14] is based on the principle of randomization and can be concisely described as follows: The volume of a complex object corresponds directly to the probability that a point, randomly selected from the data space, is inside this object. Therefore, an approximation of the volume can be gained by selecting a number of points and measuring the fraction of points inside the object. Note that Monte Carlo integration may be used for arbitrary data distributions. We use a variation of this technique to
determine the volume functions $Vol_{avg}(r)$ and $Vol(SP^k([\tfrac{1}{2}, \dots, \tfrac{1}{2}], r) \cap DS)$ as well as the corresponding derivative with respect to r for the required ranges of d and r. These functions are independent of individual parameters such as the number of points in the database or the capacity or geometrical shape of the data pages and are thus universally applicable for all subsequent cost computations. The expected value of the NN-distance can then be efficiently integrated from the precomputed function $Vol_{avg}(r)$ by the extended trapezoidal rule. The same applies for the cost function.

4.2 Experiments

To show the accuracy of our model, we performed several experiments on both synthetic and real data. We integrated the algorithm NN-opt into an implementation of the well-known Hilbert-index [11] and into the original implementation of the X-tree [5]. The Hilbert-index maps d-dimensional points to a one-dimensional space which is then indexed by a B+-tree. According to subsection 2.1, the algorithm NN-opt first examines the partition (given by a range of Hilbert values) with the lowest MINDIST during the search process. The X-tree is an R-tree-like multidimensional index structure which has been especially designed for indexing high-dimensional data.

Our cost model is based on an estimation of the radius of the NN-sphere. To show the accuracy of our model, we compared the average nearest neighbor distance of a uniformly distributed data set with the radius estimated by our model. For the experiments, we varied d from 2 to 16 using up to 369,000 data points. We averaged the radius over 100 NN-queries and found that the measurements confirmed our expected nearest neighbor distance (cf. Figure 6).

To evaluate the accuracy of our cost function and its applicability to various index structures, we performed several experiments. In the first experiment, we fixed the dimension to 16 and varied the number of uniformly distributed data points from 93,000 to 2,976,000. In this experiment, we used the Hilbert index with a B+-tree page size of 32 KBytes, which implies an effective capacity of 360 data
objects per data page. The experiment confirmed our cost model up to a relative error of 5-8% (cf. Figure 7). This remaining error is due to the impact of the specific split behavior, which is difficult to include in any formal model.

Figure 6: Expected NN-distance Depending on the Dimension (NN-distance over dimension; curves: measured distance, our model)

Figure 7: Expected Number of Page Accesses and Hilbert Index Performance Depending on the Number of Data Points (page accesses over N in thousands; curves: Hilbert, our model)

In the experiment shown in Figure 8, we compared our cost model to the performance of the X-tree with a fixed number of data pages and varying dimensionality ($2 \le d \le 50$). The performance of the X-tree is slightly better than the estimate of our cost model. The reason for the better performance is that the X-tree ignores dead space, i.e. parts of the data space which are not covered by any partition. As the experiments show, however, the estimates of our model are sufficiently close to the real performance of the X-tree. Even for low and medium dimensions, the accuracy of our model is much better than that of the model of Friedman, Bentley and Finkel. Note that in general our model is also applicable to R-tree-like index structures, especially in higher dimensions.

Figure 8: Expected Number of Page Accesses and Measured X-tree Performance Depending on the Dimension (page accesses over dimension; curves: X-tree, FBF-model, our model)

To show the practical relevance of our approach, we also performed experiments using real data. The test data used for the experiments originated from a real database consisting of high-dimensional Fourier points. Each 16-dimensional Fourier point corresponds to a region of a CAD model describing an industrial part. We stored the Fourier points in the Hilbert-index and performed 100 random nearest neighbor queries. Since in general the actual dimensionality of a real data set is lower than the formal dimensionality [10], we have to use the fractal dimension of the Fourier database for d in our model. We therefore determined the fractal dimension of the Fourier data set, which is 10.56.

Figure 9: Application of the Cost Model to Real Data

Using 10 as the dimension in
our model, we get an accurate estimation of the page accesses. Figure 9 shows the result of some experiments using different numbers N of data items.

5 Conclusion

In this paper, we presented a new cost model for nearest neighbor queries in high-dimensional data space using conservative recursive index structures such as the R-tree, k-d-B-tree or quadtree. Our cost model is accurate even in high dimensions, where other models completely fail, because our model considers boundary effects. As a further advantage, our model uses the Euclidean metric, which is relevant to many database applications. We showed the applicability and accuracy of our model by presenting the results of various experiments on both synthetic and real data sets, comparing our predictions with the performance of X-tree and Hilbert-based indices. Whereas previous models such as the model by Friedman, Bentley and Finkel overestimate the cost by orders of magnitude in high dimensions, our model is accurate up to a moderate relative error.

Our further research will focus on the extension of our model to k-nearest neighbor queries. In addition, we plan to perform a theoretically well-founded analysis of various index structures for high-dimensional data.

References

[1] Altschul S. F., Gish W., Miller W., Myers E. W., Lipman D. J.: A Basic Local Alignment Search Tool, Journal of Molecular Biology, Vol. 215, No. 3, 1990.
[2] Arya S.: Nearest Neighbor Searching and Applications, PhD thesis, University of Maryland, College Park, MD, 1995.
[3] Arya S., Mount D. M., Narayan O.: Accounting for Boundary Effects in Nearest Neighbor Searching, Proc. 11th Annual Symposium on Computational Geometry, Vancouver, Canada, 1995.
[4] Beckmann N., Kriegel H.-P., Schneider R., Seeger B.: The R*-tree: An Efficient and Robust Access Method for Points and Rectangles, Proc. ACM SIGMOD Int. Conf. on Management of Data, Atlantic City, NJ, 1990.
[5] Berchtold S., Keim D., Kriegel H.-P.: The X-tree: An Index Structure for High-Dimensional Data, 22nd Conf. on Very Large Databases, 1996, Bombay, India.
[6] Berchtold S., Keim D., Kriegel H.-P.: Fast Searching for Partial Similarity in Polygon Databases, accepted for publication: VLDB Journal, 1996.
[7] Cleary J. G.: Analysis of an Algorithm for Finding Nearest Neighbors in Euclidean Space, ACM Transactions on Mathematical Software, Vol. 5, No. 2, June 1979.
[8] Eastman C. M.: Optimal Bucket Size for Nearest Neighbor Searching in k-d Trees, Information Processing Letters, Vol. 12, No. 4, 1981.
[9] Faloutsos C., Barber R., Flickner M., Hafner J., et al.: Efficient and Effective Querying by Image Content, Journal of Intelligent Information Systems, 1994, Vol. 3.
[10] Faloutsos C., Gaede V.: Analysis of n-dimensional Quadtrees Using the Hausdorff Fractal Dimension, Proc. ACM SIGMOD Int. Conf. on Management of Data, 1996.
[11] Faloutsos C., Roseman S.: Fractals for Secondary Key Retrieval, Proc. 8th ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, 1989.
[12] Friedman J. H., Bentley J. L., Finkel R. A.: An Algorithm for Finding Best Matches in Logarithmic Expected Time, ACM Transactions on Mathematical Software, Vol. 3, No. 3, September 1977.
[13] Hjaltason G. R., Samet H.: Ranking in Spatial Databases, Proc. 4th Int. Symp. on Large Spatial Databases, Portland, ME, 1995.
[14] Kalos M. H., Whitlock P. A.: Monte Carlo Methods, Wiley, New York, 1986.
[15] Kukich K.: Techniques for Automatically Correcting Words in Text, ACM Computing Surveys, Vol. 24, No. 4, 1992.
[16] Jagadish H. V.: A Retrieval Technique for Similar Shapes, Proc. ACM SIGMOD Int. Conf. on Management of Data, 1991.
[17] Mehrotra R., Gary J. E.: Feature-Based Retrieval of Similar Shapes, Proc. 9th Int. Conf. on Data Engineering, Vienna, Austria, 1993.
[18] Mehrotra R., Gary J. E.: Feature-Index-Based Similar Shape Retrieval, Proc. of the 3rd Working Conf. on Visual Database Systems, March 1995.
[19] Papadopoulos A., Manolopoulos Y.: Performance of Nearest Neighbor Queries in R-Trees, Proc. of the 6th International Conference on Database Theory, Delphi, Greece, 1997, LNCS 1186.
[20] Preparata F. P., Shamos M. I.: Computational Geometry, Chapter 5 (Proximity: Fundamental Algorithms), Springer Verlag New York, 1985.
[21] Ramasubramanian V., Paliwal K. K.: Fast k-Dimensional Tree Algorithms for Nearest Neighbor Search with Application to Vector Quantization Encoding, IEEE Transactions on Signal Processing, Vol. 40, No. 3, March 1992.
[22] Roussopoulos N., Kelley S., Vincent F.: Nearest Neighbor Queries, Proc. ACM SIGMOD Int. Conf. on Management of Data, 1995.
[23] Shawney H., Hafner J.: Efficient Color Histogram Indexing, Proc. Int. Conf. on Image Processing, 1994.
[24] Shoichet B. K., Bodian D. L., Kuntz I. D.: Molecular Docking Using Shape Descriptors, Journal of Computational Chemistry, Vol. 13, No. 3, 1992.
[25] Sproull R. F.: Refinements to Nearest Neighbor Searching in k-Dimensional Trees, Algorithmica, 1991.
[26] Wallace T., Wintz P.: An Efficient Three-Dimensional Aircraft Recognition Algorithm Using Normalized Fourier Descriptors, Computer Graphics and Image Processing, Vol. 13, 1980.
[27] Welch T.: Bounds on the Information Retrieval Efficiency of Static File Structures, Technical Report 88, MIT, June 1971.
[28] Yao A. C., Yao F. F.: A General Approach to D-Dimensional Geometric Queries, Proc. of the ACM Symposium on Theory of Computing, 1985.

A Cost Model for Query Processing in High-Dimensional Data Spaces

A Cost Model for Query Processing in High-Dimensional Data Spaces A Cost Moel for Query Processing in High-Dimensional Data Spaces Christian Böhm Luwig Maximilians Universität München This is a preliminary release of an article accepte by ACM Transactions on Database

More information

X-tree. Daniel Keim a, Benjamin Bustos b, Stefan Berchtold c, and Hans-Peter Kriegel d. SYNONYMS Extended node tree

X-tree. Daniel Keim a, Benjamin Bustos b, Stefan Berchtold c, and Hans-Peter Kriegel d. SYNONYMS Extended node tree X-tree Daniel Keim a, Benjamin Bustos b, Stefan Berchtold c, and Hans-Peter Kriegel d a Department of Computer and Information Science, University of Konstanz b Department of Computer Science, University

More information

1 Surprises in high dimensions

1 Surprises in high dimensions 1 Surprises in high imensions Our intuition about space is base on two an three imensions an can often be misleaing in high imensions. It is instructive to analyze the shape an properties of some basic

More information

Fast Parallel Similarity Search in Multimedia Databases

Fast Parallel Similarity Search in Multimedia Databases Proc. Int. Conf. on Management of Data, 997 ACM SIGMOD, Tuscon, AZ, 997 SIGMOD Best Paper Award Fast Parallel Similarity Search in Multimedia Databases Stefan Berchtold Christian Böhm Bernhard Braunmüller

More information

Analysis of half-space range search using the k-d search skip list. Here we analyse the expected time for half-space

Analysis of half-space range search using the k-d search skip list. Here we analyse the expected time for half-space Analysis of half-space range search using the k- search skip list Mario A. Lopez Brafor G. Nickerson y 1 Abstract We analyse the average cost of half-space range reporting for the k- search skip list.

More information

Fast Nearest Neighbor Search in High-dimensional Space

Fast Nearest Neighbor Search in High-dimensional Space 1th International Conference on Data Engineering (ICDE 98), February 23-27, 1998, Orlando, Florida Fast Nearest Neighbor Search in High-dimensional Space Stefan Berchtold, Bernhard Ertl, Daniel A Keim,

More information

Skyline Community Search in Multi-valued Networks

Skyline Community Search in Multi-valued Networks Syline Community Search in Multi-value Networs Rong-Hua Li Beijing Institute of Technology Beijing, China lironghuascut@gmail.com Jeffrey Xu Yu Chinese University of Hong Kong Hong Kong, China yu@se.cuh.eu.h

More information

Particle Swarm Optimization Based on Smoothing Approach for Solving a Class of Bi-Level Multiobjective Programming Problem

Particle Swarm Optimization Based on Smoothing Approach for Solving a Class of Bi-Level Multiobjective Programming Problem BULGARIAN ACADEMY OF SCIENCES CYBERNETICS AND INFORMATION TECHNOLOGIES Volume 17, No 3 Sofia 017 Print ISSN: 1311-970; Online ISSN: 1314-4081 DOI: 10.1515/cait-017-0030 Particle Swarm Optimization Base

More information

Transient analysis of wave propagation in 3D soil by using the scaled boundary finite element method

Transient analysis of wave propagation in 3D soil by using the scaled boundary finite element method Southern Cross University epublications@scu 23r Australasian Conference on the Mechanics of Structures an Materials 214 Transient analysis of wave propagation in 3D soil by using the scale bounary finite

More information

Generalized Edge Coloring for Channel Assignment in Wireless Networks

Generalized Edge Coloring for Channel Assignment in Wireless Networks TR-IIS-05-021 Generalize Ege Coloring for Channel Assignment in Wireless Networks Chun-Chen Hsu, Pangfeng Liu, Da-Wei Wang, Jan-Jan Wu December 2005 Technical Report No. TR-IIS-05-021 http://www.iis.sinica.eu.tw/lib/techreport/tr2005/tr05.html

More information

Classifying Facial Expression with Radial Basis Function Networks, using Gradient Descent and K-means

Classifying Facial Expression with Radial Basis Function Networks, using Gradient Descent and K-means Classifying Facial Expression with Raial Basis Function Networks, using Graient Descent an K-means Neil Allrin Department of Computer Science University of California, San Diego La Jolla, CA 9237 nallrin@cs.ucs.eu

More information

Shift-map Image Registration

Shift-map Image Registration Shift-map Image Registration Linus Svärm Petter Stranmark Centre for Mathematical Sciences, Lun University {linus,petter}@maths.lth.se Abstract Shift-map image processing is a new framework base on energy

More information

Generalized Edge Coloring for Channel Assignment in Wireless Networks

Generalized Edge Coloring for Channel Assignment in Wireless Networks Generalize Ege Coloring for Channel Assignment in Wireless Networks Chun-Chen Hsu Institute of Information Science Acaemia Sinica Taipei, Taiwan Da-wei Wang Jan-Jan Wu Institute of Information Science

More information

Random Clustering for Multiple Sampling Units to Speed Up Run-time Sample Generation

Random Clustering for Multiple Sampling Units to Speed Up Run-time Sample Generation DEIM Forum 2018 I4-4 Abstract Ranom Clustering for Multiple Sampling Units to Spee Up Run-time Sample Generation uzuru OKAJIMA an Koichi MARUAMA NEC Solution Innovators, Lt. 1-18-7 Shinkiba, Koto-ku, Tokyo,

More information

6 Gradient Descent. 6.1 Functions

6 Gradient Descent. 6.1 Functions 6 Graient Descent In this topic we will iscuss optimizing over general functions f. Typically the function is efine f : R! R; that is its omain is multi-imensional (in this case -imensional) an output

More information

Indexing the Edges A simple and yet efficient approach to high-dimensional indexing

Indexing the Edges A simple and yet efficient approach to high-dimensional indexing Inexing the Eges A simple an yet efficient approach to high-imensional inexing Beng Chin Ooi Kian-Lee Tan Cui Yu Stephane Bressan Department of Computer Science National University of Singapore 3 Science

More information

New Version of Davies-Bouldin Index for Clustering Validation Based on Cylindrical Distance

New Version of Davies-Bouldin Index for Clustering Validation Based on Cylindrical Distance New Version of Davies-Boulin Inex for lustering Valiation Base on ylinrical Distance Juan arlos Roas Thomas Faculta e Informática Universia omplutense e Mari Mari, España correoroas@gmail.com Abstract

More information

Optimal Oblivious Path Selection on the Mesh

Optimal Oblivious Path Selection on the Mesh Optimal Oblivious Path Selection on the Mesh Costas Busch Malik Magon-Ismail Jing Xi Department of Computer Science Rensselaer Polytechnic Institute Troy, NY 280, USA {buschc,magon,xij2}@cs.rpi.eu Abstract

More information

Kinematic Analysis of a Family of 3R Manipulators

Kinematic Analysis of a Family of 3R Manipulators Kinematic Analysis of a Family of R Manipulators Maher Baili, Philippe Wenger an Damien Chablat Institut e Recherche en Communications et Cybernétique e Nantes, UMR C.N.R.S. 6597 1, rue e la Noë, BP 92101,

More information

CONSTRUCTION AND ANALYSIS OF INVERSIONS IN S 2 AND H 2. Arunima Ray. Final Paper, MATH 399. Spring 2008 ABSTRACT

CONSTRUCTION AND ANALYSIS OF INVERSIONS IN S 2 AND H 2. Arunima Ray. Final Paper, MATH 399. Spring 2008 ABSTRACT CONSTUCTION AN ANALYSIS OF INVESIONS IN S AN H Arunima ay Final Paper, MATH 399 Spring 008 ASTACT The construction use to otain inversions in two-imensional Eucliean space was moifie an applie to otain

More information

Blind Data Classification using Hyper-Dimensional Convex Polytopes

Blind Data Classification using Hyper-Dimensional Convex Polytopes Blin Data Classification using Hyper-Dimensional Convex Polytopes Brent T. McBrie an Gilbert L. Peterson Department of Electrical an Computer Engineering Air Force Institute of Technology 9 Hobson Way

More information

Online Appendix to: Generalizing Database Forensics

Online Appendix to: Generalizing Database Forensics Online Appenix to: Generalizing Database Forensics KYRIACOS E. PAVLOU an RICHARD T. SNODGRASS, University of Arizona This appenix presents a step-by-step iscussion of the forensic analysis protocol that

More information

Divide-and-Conquer Algorithms

Divide-and-Conquer Algorithms Supplment to A Practical Guie to Data Structures an Algorithms Using Java Divie-an-Conquer Algorithms Sally A Golman an Kenneth J Golman Hanout Divie-an-conquer algorithms use the following three phases:

More information

Research Article Inviscid Uniform Shear Flow past a Smooth Concave Body

Research Article Inviscid Uniform Shear Flow past a Smooth Concave Body International Engineering Mathematics Volume 04, Article ID 46593, 7 pages http://x.oi.org/0.55/04/46593 Research Article Invisci Uniform Shear Flow past a Smooth Concave Boy Abullah Mura Department of

More information

Non-homogeneous Generalization in Privacy Preserving Data Publishing

Non-homogeneous Generalization in Privacy Preserving Data Publishing Non-homogeneous Generalization in Privacy Preserving Data Publishing W. K. Wong, Nios Mamoulis an Davi W. Cheung Department of Computer Science, The University of Hong Kong Pofulam Roa, Hong Kong {wwong2,nios,cheung}@cs.hu.h

More information

The digital copy of this thesis is protected by the Copyright Act 1994 (New Zealand).

The digital copy of this thesis is protected by the Copyright Act 1994 (New Zealand). http://waikato.researchgateway.ac.nz/ Research Commons at the University of Waikato Copyright Statement: The igital copy of this thesis is protecte by the Copyright Act 99 (New Zealan). The thesis may

More information

WLAN Indoor Positioning Based on Euclidean Distances and Fuzzy Logic

WLAN Indoor Positioning Based on Euclidean Distances and Fuzzy Logic WLAN Inoor Positioning Base on Eucliean Distances an Fuzzy Logic Anreas TEUBER, Bern EISSFELLER Institute of Geoesy an Navigation, University FAF, Munich, Germany, e-mail: (anreas.teuber, bern.eissfeller)@unibw.e

More information

THE BAYESIAN RECEIVER OPERATING CHARACTERISTIC CURVE AN EFFECTIVE APPROACH TO EVALUATE THE IDS PERFORMANCE

THE BAYESIAN RECEIVER OPERATING CHARACTERISTIC CURVE AN EFFECTIVE APPROACH TO EVALUATE THE IDS PERFORMANCE БСУ Международна конференция - 2 THE BAYESIAN RECEIVER OPERATING CHARACTERISTIC CURVE AN EFFECTIVE APPROACH TO EVALUATE THE IDS PERFORMANCE Evgeniya Nikolova, Veselina Jecheva Burgas Free University Abstract:

More information

BIJECTIONS FOR PLANAR MAPS WITH BOUNDARIES

BIJECTIONS FOR PLANAR MAPS WITH BOUNDARIES BIJECTIONS FOR PLANAR MAPS WITH BOUNDARIES OLIVIER BERNARDI AND ÉRIC FUSY Abstract. We present bijections for planar maps with bounaries. In particular, we obtain bijections for triangulations an quarangulations

More information

A Classification of 3R Orthogonal Manipulators by the Topology of their Workspace

A Classification of 3R Orthogonal Manipulators by the Topology of their Workspace A Classification of R Orthogonal Manipulators by the Topology of their Workspace Maher aili, Philippe Wenger an Damien Chablat Institut e Recherche en Communications et Cybernétique e Nantes, UMR C.N.R.S.

More information

NEW METHOD FOR FINDING A REFERENCE POINT IN FINGERPRINT IMAGES WITH THE USE OF THE IPAN99 ALGORITHM 1. INTRODUCTION 2.

NEW METHOD FOR FINDING A REFERENCE POINT IN FINGERPRINT IMAGES WITH THE USE OF THE IPAN99 ALGORITHM 1. INTRODUCTION 2. JOURNAL OF MEDICAL INFORMATICS & TECHNOLOGIES Vol. 13/009, ISSN 164-6037 Krzysztof WRÓBEL, Rafał DOROZ * fingerprint, reference point, IPAN99 NEW METHOD FOR FINDING A REFERENCE POINT IN FINGERPRINT IMAGES

More information

Shift-map Image Registration

Shift-map Image Registration Shift-map Image Registration Svärm, Linus; Stranmark, Petter Unpublishe: 2010-01-01 Link to publication Citation for publishe version (APA): Svärm, L., & Stranmark, P. (2010). Shift-map Image Registration.

More information

Design of Policy-Aware Differentially Private Algorithms

Design of Policy-Aware Differentially Private Algorithms Design of Policy-Aware Differentially Private Algorithms Samuel Haney Due University Durham, NC, USA shaney@cs.ue.eu Ashwin Machanavajjhala Due University Durham, NC, USA ashwin@cs.ue.eu Bolin Ding Microsoft

More information

Distributed Decomposition Over Hyperspherical Domains

Distributed Decomposition Over Hyperspherical Domains Distribute Decomposition Over Hyperspherical Domains Aron Ahmaia 1, Davi Keyes 1, Davi Melville 2, Alan Rosenbluth 2, Kehan Tian 2 1 Department of Applie Physics an Applie Mathematics, Columbia University,

More information

Learning convex bodies is hard

Learning convex bodies is hard Learning convex boies is har Navin Goyal Microsoft Research Inia navingo@microsoftcom Luis Raemacher Georgia Tech lraemac@ccgatecheu Abstract We show that learning a convex boy in R, given ranom samples

More information

Loop Scheduling and Partitions for Hiding Memory Latencies

Loop Scheduling and Partitions for Hiding Memory Latencies Loop Scheuling an Partitions for Hiing Memory Latencies Fei Chen Ewin Hsing-Mean Sha Dept. of Computer Science an Engineering University of Notre Dame Notre Dame, IN 46556 Email: fchen,esha @cse.n.eu Tel:

More information

A Versatile Model-Based Visibility Measure for Geometric Primitives

A Versatile Model-Based Visibility Measure for Geometric Primitives A Versatile Moel-Base Visibility Measure for Geometric Primitives Marc M. Ellenrieer 1,LarsKrüger 1, Dirk Stößel 2, an Marc Hanheie 2 1 DaimlerChrysler AG, Research & Technology, 89013 Ulm, Germany 2 Faculty

More information

Using Vector and Raster-Based Techniques in Categorical Map Generalization

Using Vector and Raster-Based Techniques in Categorical Map Generalization Thir ICA Workshop on Progress in Automate Map Generalization, Ottawa, 12-14 August 1999 1 Using Vector an Raster-Base Techniques in Categorical Map Generalization Beat Peter an Robert Weibel Department

More information

Ad-Hoc Networks Beyond Unit Disk Graphs

Ad-Hoc Networks Beyond Unit Disk Graphs A-Hoc Networks Beyon Unit Disk Graphs Fabian Kuhn, Roger Wattenhofer, Aaron Zollinger Department of Computer Science ETH Zurich 8092 Zurich, Switzerlan {kuhn, wattenhofer, zollinger}@inf.ethz.ch ABSTRACT

More information

Learning Subproblem Complexities in Distributed Branch and Bound

Learning Subproblem Complexities in Distributed Branch and Bound Learning Subproblem Complexities in Distribute Branch an Boun Lars Otten Department of Computer Science University of California, Irvine lotten@ics.uci.eu Rina Dechter Department of Computer Science University

More information

Indexing High-Dimensional Space:

Indexing High-Dimensional Space: Indexing High-Dimensional Space: Database Support for Next Decade s Applications Stefan Berchtold AT&T Research berchtol@research.att.com Daniel A. Keim University of Halle-Wittenberg keim@informatik.uni-halle.de

More information

Overlap Interval Partition Join

Overlap Interval Partition Join Overlap Interval Partition Join Anton Dignös Department of Computer Science University of Zürich, Switzerlan aignoes@ifi.uzh.ch Michael H. Böhlen Department of Computer Science University of Zürich, Switzerlan

More information

Queueing Model and Optimization of Packet Dropping in Real-Time Wireless Sensor Networks

Queueing Model and Optimization of Packet Dropping in Real-Time Wireless Sensor Networks Queueing Moel an Optimization of Packet Dropping in Real-Time Wireless Sensor Networks Marc Aoun, Antonios Argyriou, Philips Research, Einhoven, 66AE, The Netherlans Department of Computer an Communication

More information

Intensive Hypercube Communication: Prearranged Communication in Link-Bound Machines 1 2

Intensive Hypercube Communication: Prearranged Communication in Link-Bound Machines 1 2 This paper appears in J. of Parallel an Distribute Computing 10 (1990), pp. 167 181. Intensive Hypercube Communication: Prearrange Communication in Link-Boun Machines 1 2 Quentin F. Stout an Bruce Wagar

More information

Characterizing Decoding Robustness under Parametric Channel Uncertainty

Characterizing Decoding Robustness under Parametric Channel Uncertainty Characterizing Decoing Robustness uner Parametric Channel Uncertainty Jay D. Wierer, Wahee U. Bajwa, Nigel Boston, an Robert D. Nowak Abstract This paper characterizes the robustness of ecoing uner parametric

More information

Here are a couple of warnings to my students who may be here to get a copy of what happened on a day that you missed.

Here are a couple of warnings to my students who may be here to get a copy of what happened on a day that you missed. Preface Here are my online notes for my Calculus I course that I teach here at Lamar University. Despite the fact that these are my class notes, they shoul be accessible to anyone wanting to learn Calculus

More information

EXACT SIMULATION OF A BOOLEAN MODEL

EXACT SIMULATION OF A BOOLEAN MODEL Original Research Paper oi:10.5566/ias.v32.p101-105 EXACT SIMULATION OF A BOOLEAN MODEL CHRISTIAN LANTUÉJOULB MinesParisTech 35 rue Saint-Honoré 77305 Fontainebleau France e-mail: christian.lantuejoul@mines-paristech.fr

More information

Coupling the User Interfaces of a Multiuser Program

Coupling the User Interfaces of a Multiuser Program Coupling the User Interfaces of a Multiuser Program PRASUN DEWAN University of North Carolina at Chapel Hill RAJIV CHOUDHARY Intel Corporation We have evelope a new moel for coupling the user-interfaces

More information

Improving Spatial Reuse of IEEE Based Ad Hoc Networks

Improving Spatial Reuse of IEEE Based Ad Hoc Networks mproving Spatial Reuse of EEE 82.11 Base A Hoc Networks Fengji Ye, Su Yi an Biplab Sikar ECSE Department, Rensselaer Polytechnic nstitute Troy, NY 1218 Abstract n this paper, we evaluate an suggest methos

More information

filtering LETTER An Improved Neighbor Selection Algorithm in Collaborative Taek-Hun KIM a), Student Member and Sung-Bong YANG b), Nonmember

filtering LETTER An Improved Neighbor Selection Algorithm in Collaborative Taek-Hun KIM a), Student Member and Sung-Bong YANG b), Nonmember 107 IEICE TRANS INF & SYST, VOLE88 D, NO5 MAY 005 LETTER An Improve Neighbor Selection Algorithm in Collaborative Filtering Taek-Hun KIM a), Stuent Member an Sung-Bong YANG b), Nonmember SUMMARY Nowaays,

More information

Optimal path planning in a constant wind with a bounded turning rate

Optimal path planning in a constant wind with a bounded turning rate Optimal path planning in a constant win with a boune turning rate Timothy G. McGee, Stephen Spry an J. Karl Herick Center for Collaborative Control of Unmanne Vehicles, University of California, Berkeley,

More information

Solution Representation for Job Shop Scheduling Problems in Ant Colony Optimisation

Solution Representation for Job Shop Scheduling Problems in Ant Colony Optimisation Solution Representation for Job Shop Scheuling Problems in Ant Colony Optimisation James Montgomery, Carole Faya 2, an Sana Petrovic 2 Faculty of Information & Communication Technologies, Swinburne University

More information

Feature Extraction and Rule Classification Algorithm of Digital Mammography based on Rough Set Theory

Feature Extraction and Rule Classification Algorithm of Digital Mammography based on Rough Set Theory Feature Extraction an Rule Classification Algorithm of Digital Mammography base on Rough Set Theory Aboul Ella Hassanien Jafar M. H. Ali. Kuwait University, Faculty of Aministrative Science, Quantitative

More information

CS 106 Winter 2016 Craig S. Kaplan. Module 01 Processing Recap. Topics

CS 106 Winter 2016 Craig S. Kaplan. Module 01 Processing Recap. Topics CS 106 Winter 2016 Craig S. Kaplan Moule 01 Processing Recap Topics The basic parts of speech in a Processing program Scope Review of syntax for classes an objects Reaings Your CS 105 notes Learning Processing,

More information

Lecture 1 September 4, 2013

Lecture 1 September 4, 2013 CS 84r: Incentives an Information in Networks Fall 013 Prof. Yaron Singer Lecture 1 September 4, 013 Scribe: Bo Waggoner 1 Overview In this course we will try to evelop a mathematical unerstaning for the

More information

Cluster Center Initialization Method for K-means Algorithm Over Data Sets with Two Clusters

Cluster Center Initialization Method for K-means Algorithm Over Data Sets with Two Clusters Available online at www.scienceirect.com Proceia Engineering 4 (011 ) 34 38 011 International Conference on Avances in Engineering Cluster Center Initialization Metho for K-means Algorithm Over Data Sets

More information

State Indexed Policy Search by Dynamic Programming. Abstract. 1. Introduction. 2. System parameterization. Charles DuHadway

State Indexed Policy Search by Dynamic Programming. Abstract. 1. Introduction. 2. System parameterization. Charles DuHadway State Inexe Policy Search by Dynamic Programming Charles DuHaway Yi Gu 5435537 503372 December 4, 2007 Abstract We consier the reinforcement learning problem of simultaneous trajectory-following an obstacle

More information

Polygon Simplification by Minimizing Convex Corners

Polygon Simplification by Minimizing Convex Corners Polygon Simplification by Minimizing Convex Corners Yeganeh Bahoo 1, Stephane Durocher 1, J. Mark Keil 2, Saee Mehrabi 3, Sahar Mehrpour 1, an Debajyoti Monal 1 1 Department of Computer Science, University

More information

Almost Disjunct Codes in Large Scale Multihop Wireless Network Media Access Control

Almost Disjunct Codes in Large Scale Multihop Wireless Network Media Access Control Almost Disjunct Coes in Large Scale Multihop Wireless Network Meia Access Control D. Charles Engelhart Anan Sivasubramaniam Penn. State University University Park PA 682 engelhar,anan @cse.psu.eu Abstract

More information

A Stochastic Process on the Hypercube with Applications to Peer to Peer Networks

A Stochastic Process on the Hypercube with Applications to Peer to Peer Networks A Stochastic Process on the Hypercube with Applications to Peer to Peer Networs [Extene Abstract] Micah Aler Department of Computer Science, University of Massachusetts, Amherst, MA 0003 460, USA micah@cs.umass.eu

More information

Rough Set Approach for Classification of Breast Cancer Mammogram Images

Rough Set Approach for Classification of Breast Cancer Mammogram Images Rough Set Approach for Classification of Breast Cancer Mammogram Images Aboul Ella Hassanien Jafar M. H. Ali. Kuwait University, Faculty of Aministrative Science, Quantitative Methos an Information Systems

More information

Modifying ROC Curves to Incorporate Predicted Probabilities

Modifying ROC Curves to Incorporate Predicted Probabilities Moifying ROC Curves to Incorporate Preicte Probabilities Cèsar Ferri DSIC, Universitat Politècnica e València Peter Flach Department of Computer Science, University of Bristol José Hernánez-Orallo DSIC,

More information

Holy Halved Heaquarters Riddler

Holy Halved Heaquarters Riddler Holy Halve Heaquarters Riler Anonymous Philosopher June 206 Laser Larry threatens to imminently zap Riler Heaquarters (which is of regular pentagonal shape with no courtyar or other funny business) with

More information

Image compression predicated on recurrent iterated function systems

Image compression predicated on recurrent iterated function systems 2n International Conference on Mathematics & Statistics 16-19 June, 2008, Athens, Greece Image compression preicate on recurrent iterate function systems Chol-Hui Yun *, Metzler W. a an Barski M. a * Faculty

More information

APPLYING GENETIC ALGORITHM IN QUERY IMPROVEMENT PROBLEM. Abdelmgeid A. Aly

APPLYING GENETIC ALGORITHM IN QUERY IMPROVEMENT PROBLEM. Abdelmgeid A. Aly International Journal "Information Technologies an Knowlege" Vol. / 2007 309 [Project MINERVAEUROPE] Project MINERVAEUROPE: Ministerial Network for Valorising Activities in igitalisation -

More information

Fuzzy Clustering in Parallel Universes

Fuzzy Clustering in Parallel Universes Fuzzy Clustering in Parallel Universes Bern Wisweel an Michael R. Berthol ALTANA-Chair for Bioinformatics an Information Mining Department of Computer an Information Science, University of Konstanz 78457

More information

An Algorithm for Building an Enterprise Network Topology Using Widespread Data Sources

An Algorithm for Building an Enterprise Network Topology Using Widespread Data Sources An Algorithm for Builing an Enterprise Network Topology Using Wiesprea Data Sources Anton Anreev, Iurii Bogoiavlenskii Petrozavosk State University Petrozavosk, Russia {anreev, ybgv}@cs.petrsu.ru Abstract

More information

2-connected graphs with small 2-connected dominating sets

2-connected graphs with small 2-connected dominating sets 2-connecte graphs with small 2-connecte ominating sets Yair Caro, Raphael Yuster 1 Department of Mathematics, University of Haifa at Oranim, Tivon 36006, Israel Abstract Let G be a 2-connecte graph. A

More information

Architecture Design of Mobile Access Coordinated Wireless Sensor Networks

Architecture Design of Mobile Access Coordinated Wireless Sensor Networks Architecture Design of Mobile Access Coorinate Wireless Sensor Networks Mai Abelhakim 1 Leonar E. Lightfoot Jian Ren 1 Tongtong Li 1 1 Department of Electrical & Computer Engineering, Michigan State University,

More information

FINDING OPTICAL DISPERSION OF A PRISM WITH APPLICATION OF MINIMUM DEVIATION ANGLE MEASUREMENT METHOD

FINDING OPTICAL DISPERSION OF A PRISM WITH APPLICATION OF MINIMUM DEVIATION ANGLE MEASUREMENT METHOD Warsaw University of Technology Faculty of Physics Physics Laboratory I P Joanna Konwerska-Hrabowska 6 FINDING OPTICAL DISPERSION OF A PRISM WITH APPLICATION OF MINIMUM DEVIATION ANGLE MEASUREMENT METHOD.

More information

d 3 d 4 d d d d d d d d d d d 1 d d d d d d

d 3 d 4 d d d d d d d d d d d 1 d d d d d d Proceeings of the IASTED International Conference Software Engineering an Applications (SEA') October 6-, 1, Scottsale, Arizona, USA AN OBJECT-ORIENTED APPROACH FOR MANAGING A NETWORK OF DATABASES Shu-Ching

More information

Visualizing and Animating Search Operations on Quadtrees on the Worldwide Web

Visualizing and Animating Search Operations on Quadtrees on the Worldwide Web Visualizing and Animating Search Operations on Quadtrees on the Worldwide Web František Brabec Computer Science Department University of Maryland College Park, Maryland 20742 brabec@umiacs.umd.edu Hanan

More information

Animated Surface Pasting

Animated Surface Pasting Animate Surface Pasting Clara Tsang an Stephen Mann Computing Science Department University of Waterloo 200 University Ave W. Waterloo, Ontario Canaa N2L 3G1 e-mail: clftsang@cgl.uwaterloo.ca, smann@cgl.uwaterloo.ca

More information

Improving the Query Performance of High-Dimensional Index Structures by Bulk Load Operations

Improving the Query Performance of High-Dimensional Index Structures by Bulk Load Operations Improving the Query Performance of High-Dimensional Index Structures by Bulk Load Operations Stefan Berchtold, Christian Böhm 2, and Hans-Peter Kriegel 2 AT&T Labs Research, 8 Park Avenue, Florham Park,

More information

Object Recognition Using Colour, Shape and Affine Invariant Ratios

Object Recognition Using Colour, Shape and Affine Invariant Ratios Object Recognition Using Colour, Shape an Affine Invariant Ratios Paul A. Walcott Centre for Information Engineering City University, Lonon EC1V 0HB, Englan P.A.Walcott@city.ac.uk Abstract This paper escribes

More information

PART 2. Organization Of An Operating System

PART 2. Organization Of An Operating System PART 2 Organization Of An Operating System CS 503 - PART 2 1 2010 Services An OS Supplies Support for concurrent execution Facilities for process synchronization Inter-process communication mechanisms

More information

A New Search Algorithm for Solving Symmetric Traveling Salesman Problem Based on Gravity

A New Search Algorithm for Solving Symmetric Traveling Salesman Problem Based on Gravity Worl Applie Sciences Journal 16 (10): 1387-1392, 2012 ISSN 1818-4952 IDOSI Publications, 2012 A New Search Algorithm for Solving Symmetric Traveling Salesman Problem Base on Gravity Aliasghar Rahmani Hosseinabai,

More information

Investigation into a new incremental forming process using an adjustable punch set for the manufacture of a doubly curved sheet metal

Investigation into a new incremental forming process using an adjustable punch set for the manufacture of a doubly curved sheet metal 991 Investigation into a new incremental forming process using an ajustable punch set for the manufacture of a oubly curve sheet metal S J Yoon an D Y Yang* Department of Mechanical Engineering, Korea

More information

SURVIVABLE IP OVER WDM: GUARANTEEEING MINIMUM NETWORK BANDWIDTH

SURVIVABLE IP OVER WDM: GUARANTEEEING MINIMUM NETWORK BANDWIDTH SURVIVABLE IP OVER WDM: GUARANTEEEING MINIMUM NETWORK BANDWIDTH Galen H Sasaki Dept Elec Engg, U Hawaii 2540 Dole Street Honolul HI 96822 USA Ching-Fong Su Fuitsu Laboratories of America 595 Lawrence Expressway

More information

Nearest Neighbor Search using Additive Binary Tree

Nearest Neighbor Search using Additive Binary Tree Nearest Neighbor Search using Aitive Binary Tree Sung-Hyuk Cha an Sargur N. Srihari Center of Excellence for Document Analysis an Recognition State University of New York at Buffalo, U. S. A. E-mail: fscha,sriharig@cear.buffalo.eu

More information
