Efficient Evaluation of Probabilistic Advanced Spatial Queries on Existentially Uncertain Data


Man Lung Yiu, Nikos Mamoulis, Xiangyuan Dai, Yufei Tao, Michail Vaitis

Abstract — We study the problem of answering spatial queries in databases where objects exist with some uncertainty and they are associated with an existential probability. The goal of a thresholding probabilistic spatial query is to retrieve the objects that qualify the spatial predicates with probability that exceeds a threshold. Accordingly, a ranking probabilistic spatial query selects the objects with the highest probabilities to qualify the spatial predicates. We propose adaptations of spatial access methods and search algorithms for probabilistic versions of range queries, nearest neighbors, spatial skylines, and reverse nearest neighbors, and conduct an extensive experimental study which evaluates the effectiveness of the proposed solutions.

Index Terms — H.2.4.h Query processing, H.2.4.k Spatial databases

1 INTRODUCTION

Conventional spatial databases manage objects located on a thematic map with 100% certainty. In real-life cases, however, there may be uncertainty about the existence of spatial objects or events. As an example, consider a satellite image, where interesting objects (e.g., vessels) have been extracted (e.g., by a human expert or an image segmentation tool). Due to low image resolution and/or color definitions, the data extractor may not be 100% certain about whether a pixel formation corresponds to an actual object o; a probability E_o could be assigned to o, reflecting the confidence of o's existence. We call such objects existentially uncertain, since their exact locations are known and the uncertainty refers only to their existence. As another example of existentially uncertain data, consider emergency calls to a police calling center, which are dialed from various map locations. Depending on the caller's voice, for each call we can generate a spatial event associated with a potential emergency and a probability that the emergency is actual. Events generated from sensors (e.g., smoke detection) can also be regarded as existentially uncertain data, because each sensor is associated with a certain location and the existence of each detected event depends on the sensor's sensitivity and the background noise. Existential probabilities are also a natural way to model fuzzy classification [1]. In this case, the class label of a particular object is uncertain; each label has an existential probability and the sum of all probabilities is 1.

M. L. Yiu is with the Department of Computer Science, Aalborg University, DK-9220 Aalborg, Denmark. mly@cs.aau.dk
N. Mamoulis and X. Dai are with the Department of Computer Science, University of Hong Kong, Pokfulam Road, Hong Kong. {nikos,dai}@cs.hku.hk
Y. Tao is with the Department of Computer Science and Engineering, Chinese University of Hong Kong, Sha Tin, New Territories, Hong Kong. taoyf@cse.cuhk.edu.hk
M. Vaitis is with the Department of Geography, University of the Aegean, Mytilene, Greece. vaitis@aegean.gr
Manuscript received XXXX XX, 200X; revised XXXX XX, 200X.

We can naturally define probabilistic versions of spatial queries that apply on collections of existentially uncertain objects. We identify two types of such probabilistic spatial queries. Given a confidence threshold t, a thresholding query returns the objects (or object pairs, in case of a join) which qualify some spatial predicates with probability at least t.
E.g., given a segmented satellite image with uncertain objects, consider a port officer who wishes to find a set of vessels S such that every o in S is the nearest ship to the port with confidence at least 30%. Another example is a police station asking for the emergencies in its vicinity which have high confidence. A ranking spatial query returns the objects which qualify the spatial predicates of the query, in order of their confidence. Ranking queries can also be thresholded (in analogy to nearest neighbor queries) by a parameter m. For instance, the port officer may want to retrieve the m ships with the highest probability to be the nearest neighbor of the port. Previous work on managing spatial data with uncertainty [2], [3], [4], [5], [6], [7] focuses on locationally uncertain objects; i.e., objects which are known to exist, but whose (uncertain) location is described by a probability density function. The rationale is that the managed objects are actual moving objects with unknown exact locations due to GPS errors or transmission delays. In Section 2, we elaborate on the fundamental differences (e.g., location, existence, storage, and probability computation) between existentially uncertain objects and locationally uncertain objects, and explain why existing solutions for managing and searching locationally uncertain data are inappropriate for existentially uncertain data. To our knowledge, there is no prior work on existentially uncertain spatial data. Our contributions are summarized as follows: We identify the class of existentially uncertain spatial data and define two intuitive probabilistic query types on them: thresholding and ranking queries. Assuming that the spatial attributes of the objects are indexed by 2-dimensional R-trees, we propose search

algorithms for probabilistic variants of spatial range queries, nearest neighbor (NN) search, spatial skyline (SS) queries, and reverse nearest neighbor (RNN) queries. Regarding different variants of R-trees, we derive appropriate lower/upper probabilistic bounds for effectively reducing the search cost. Our search algorithms for NN, SS, and RNN queries are carefully designed to handle disqualified entries in such a way that their removal is guaranteed not to influence the probabilistic bounds of any potential result object. The rest of the paper is organized as follows. Section 2 provides background on querying spatial objects with uncertain locations and extents. Section 3 defines existentially uncertain data and query types on them. In Section 4 we study the evaluation of probabilistic spatial queries, when they are primarily indexed on their spatial attributes, or when considering existential probability as an additional dimension. Section 5 addresses probabilistic variants of interesting advanced spatial queries. Section 6 is a comprehensive experimental study of the performance of the proposed methods. Section 7 discusses the case where the existential probabilities of objects are correlated. Finally, Section 8 concludes the paper with a discussion about future work.

2 BACKGROUND AND RELATED WORK

In this section, we review popular spatial query types and show how they can be processed when the spatial objects are indexed by R-trees. In addition, we provide related work on modeling and querying spatial objects of uncertain location and/or extent.

2.1 Spatial Query Processing

The most popular spatial access method is the R-tree [8], which indexes minimum bounding rectangles (MBRs) of objects. R-trees can efficiently process the main spatial query types, including spatial range queries, nearest neighbor queries, and spatial joins. Figure 1 shows a collection R = {p1, ..., p8} of spatial objects (e.g., points) and an R-tree structure that indexes them. Given a spatial region W, a spatial range query retrieves from R the objects that intersect W. For instance, consider a range query that asks for all objects within distance 3 from q, corresponding to the shaded area in Figure 1. Starting from the root of the tree, the query is processed by recursively following entries having MBRs that intersect the query region.

Fig. 1. Spatial queries on R-trees

A nearest neighbor (NN) query takes as input a query object q and returns the closest object in R to q. For instance, the NN of q in Figure 1 is p7. If R is indexed by an R-tree, then the best-first (BF) algorithm of [9] is the most efficient solution for processing NN queries. A best-first priority queue PQ, which organizes R-tree entries based on the (minimum) distance of their MBRs to q, is initialized with the root entries. The top entry e of the queue is then retrieved; if e is a leaf node entry, the corresponding object is returned as the NN (assuming point objects). Otherwise, the node pointed to by e is accessed and all its entries are inserted into PQ. In order to find the NN of q in Figure 1, BF first inserts into PQ the entries e1, e2, e3, together with their distances to q. Then the nearest entry e2 is retrieved from PQ and the objects p1, p7, p8 are inserted into PQ. The next nearest entry in PQ is p7, which is the NN of q. In Section 4, we will extend BF for processing probabilistic versions of NN search on existentially uncertain data.
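For concreteness, the following is a minimal sketch of best-first NN search over a simplified in-memory R-tree. The Entry/Node classes and the mindist function are illustrative assumptions, not the structures used in the paper.

```python
import heapq
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Point = Tuple[float, float]
MBR = Tuple[Point, Point]  # ((x1, y1), (x2, y2)): lower-left and upper-right corners

@dataclass
class Entry:
    mbr: MBR
    obj: Optional[Point] = None        # set for leaf entries (point objects)
    child: Optional["Node"] = None     # set for directory entries

@dataclass
class Node:
    entries: List[Entry] = field(default_factory=list)

def mindist(q: Point, mbr: MBR) -> float:
    # Minimum Euclidean distance from query point q to a rectangle.
    (x1, y1), (x2, y2) = mbr
    dx = max(x1 - q[0], 0.0, q[0] - x2)
    dy = max(y1 - q[1], 0.0, q[1] - y2)
    return (dx * dx + dy * dy) ** 0.5

def best_first_nn(q: Point, root: Node) -> Optional[Point]:
    # Best-first search [9]: pop the entry with the smallest mindist;
    # the first popped *object* is the nearest neighbor of q.
    pq, tie = [], 0
    for e in root.entries:
        heapq.heappush(pq, (mindist(q, e.mbr), tie, e)); tie += 1
    while pq:
        _, _, e = heapq.heappop(pq)
        if e.obj is not None:
            return e.obj
        for c in e.child.entries:
            heapq.heappush(pq, (mindist(q, c.mbr), tie, c)); tie += 1
    return None
```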
2.2 Locationally Uncertain Spatial Data

Recently, there is increasing interest in the modeling, indexing, and querying of objects with uncertain location and/or extent. For instance, consider a collection of moving objects whose positions are tracked by GPS devices. Exact locations are unknown due to GPS errors and transmission delays; e.g., if the object is in motion, its location might be outdated when reaching the listening server. As a result, the set of possible locations of an object is captured by a probability density function (PDF), which combines the GPS measurement error, the last reported object location, and the object velocity [2]. Figure 2a exemplifies a locationally uncertain object o, modeled by a Gaussian PDF, with the regions of higher probability marked in darker color. According to [], [7], an arbitrary PDF can be approximated by a spatial histogram (e.g., 3x3 bins in Figure 2a), where each bin stores the probability to include the object, and their sum equals 1.

Fig. 2. Locationally and Existentially Uncertain Objects: (a) a locationally uncertain object o modeled by a PDF; (b) the PCR of o at g = 0.2; (c) NN search over locationally uncertain objects o1, ..., o4; (d) existentially uncertain objects o1(0.1), o2(0.2), o3(0.3), o4(0.9) and a range W.

TABLE 1
Fundamental Differences between the Two Notions of Uncertainty

Property                  | Locationally Uncertain Object     | Existentially Uncertain Object
Location                  | uncertain                         | certain
Existence                 | certain                           | uncertain
Storage                   | a spatial histogram               | a point o with existence probability E_o
Probability computation   | expensive numerical integration   | constant time

Given a locationally uncertain object o and a query range W (see Figure 2b), the probability that o intersects W is formally defined by:

P_rng(o, W) = ∫_{o ∩ W} o.pdf(v) dv

where o.pdf(v) denotes the probability that o coincides with point v. Probabilistic threshold range queries [], [7] retrieve result pairs (o, P_rng(o, W)) such that P_rng(o, W) ≥ t, where t is a user-specified threshold. The filter-refinement framework is adopted to accelerate their evaluation. An inexpensive filter step is applied to determine fast whether an object o can belong to the result. Only when o may potentially become a result, the refinement step is executed to compute the P_rng(o, W) value. In the state-of-the-art method of [7], the probabilistically constrained rectangle (PCR) is used for the filter step of the queries. Given a system parameter g, modeling a minimum value for t, the PCR of an object o is pre-computed by sliding each axis-parallel line inwards until the swept area over the PDF of o equals g. Figure 2b illustrates the PCR of an object o, for g = 0.2; o appears in the region on the left of line l_x^- with probability 0.2. Similarly, o appears in the regions on the right/bottom/top of lines l_x^+/l_y^-/l_y^+, respectively, each with probability 0.2. To answer the threshold range query W (with t = 0.5), we first compare W with the lines l_x^-/l_x^+/l_y^-/l_y^+. Since W does not intersect the PCR of o (i.e., it is above line l_y^+), we can immediately infer that P_rng(o, W) ≤ 0.2 < t. Thus, o is discarded during the filter step of query W, saving the expensive computation of the exact probability P_rng(o, W).

Table 1 summarizes the fundamental differences between locationally uncertain objects and existentially uncertain objects. As depicted in Figure 2d, an existentially uncertain object o3 has a certain location (i.e., a point) but its existence is associated with a probability E_{o3} = 0.3. The probability of o3 satisfying a range query W is P_rng(o3, W) = E_{o3} if o3 intersects W, or 0 otherwise. Thus, P_rng(o3, W) can be computed in constant time. One may argue that an existentially uncertain point o with existence probability E_o could be modeled as a locationally uncertain object with a PDF consisting of exactly 2 locations: the point o with probability E_o, and a point at infinity with probability (1 − E_o). This model encumbers the application of existing techniques for locationally uncertain data [], [7], because they assume multiple locations with probabilities and the continuity of the PDF in the space. Consider, for instance, the probabilistic NN search algorithm for locationally uncertain data proposed in [6]. Given a query point q and a set R of locationally uncertain objects, we can derive λ = min_{o∈R} maxd(q, o), i.e., the minimum over all objects of the maximum possible distance of o from q. For instance, in Figure 2c, the object o1 leads to the minimum λ. Since the PDF of o1 sums to 1 within the circle (centered at q with radius λ), it is clear that any object o' (e.g., o4) with mind(q, o') > λ has no chance of being the NN of q. For any remaining object o (e.g., o1, o2, o3), its probability of being the NN of q is denoted by P_nn(o, q).
Assuming independent PDFs between different objects, [6] defines P_nn(o, q) as follows:

P_nn(o, q) = ∫_{mind(q,o)}^{λ} o.pdf(○_q(r)) · ∏_{o'∈R, o'≠o, mind(q,o')≤λ} (1 − o'.pdf(●_q(r))) dr

where ○_q(r) and ●_q(r) represent the hollow ring and the solid circle, respectively, centered at q with radius r. The evaluation of the above probability is expensive for arbitrary PDFs, so [6] focuses on basic PDFs and develops efficient computation techniques for P_nn(o, q). Note that the above probabilistic NN search technique is inapplicable to existentially uncertain data. Figure 2d depicts a set of existentially uncertain objects, with a similar spatial configuration as in Figure 2c. In this case, o1 is still the object causing the minimum λ value. However, since its existence probability is not 1, it cannot be used to bound the search space. For instance, the object o4 now has non-zero probability of being the NN of q; this happens with probability (1 − 0.1)(1 − 0.2)(1 − 0.3) · 0.9 ≈ 0.454, when o4 exists but o1, o2, o3 do not exist. Other work on locationally uncertain data includes indexing the trajectory of an object (e.g., tracked by a GPS) as a cylindrical volume around the recorded polyline, capturing uncertainty up to a certain distance from the polyline []. A similar approach is followed in [3], where recorded trajectories are converted to sequences of locations connected by elliptical volumes. [5] also models the uncertain locations of spatial objects by (circular) uncertainty regions and discusses how to process simple and aggregate spatial range queries using the fuzzy representations. [4] studies the evaluation of spatial joins between two sets of objects, for the case where the object extents are floating according to uncertainty distance bounds. An extension of the R-tree that captures uncertainty in directory node entries is proposed, and R-tree join techniques are adapted to process the join efficiently. Cheng et al. [2], [] study a problem related to probabilistic spatial range queries. The uncertain data are not spatial, but ordinal 1D values (e.g., temperature values recorded from sensors). [] indexes such uncertain data for efficient evaluation of probabilistic range queries. [2] classifies queries on such data into entity-based queries, asking for the set of objects satisfying a query predicate, and value-based queries, asking for a PDF describing the distribution of a query result when it is a single aggregate value (e.g., the sum of values, the maximum value, etc.). Finally, [3] studies the evaluation of queries over uncertain or summarized data, where the user specifies thresholds (precision, recall, laxity) regarding the quality (i.e., accuracy) of the desired result.

3 EXISTENTIALLY UNCERTAIN SPATIAL DATA

An object x is existentially uncertain if its existence is described by a probability E_x, 0 < E_x ≤ 1. We refer to E_x as the existential probability or confidence of x. Note that since we

can have E_x = 1, we (trivially) regard a 100% known object as existentially uncertain. This allows us to model object collections which are mixtures of uncertain and certain data. On the other hand, E_x = 0 corresponds to an object that definitely does not exist, so there is no need to store it in a database. We take the existential independence assumption that the confidence values of two different objects are independent of each other. This assumption is reasonable for the applications mentioned in the Introduction (e.g., satellite image extraction, emergency calls). We will relax this assumption in Section 7 and handle existentially uncertain objects whose confidence values are correlated.

Fig. 3. NN search example (point confidences: p1 = 0.2, p2 = 0.5, p3 = 0.3, p4 = 0.5, p5 = 0.5, p6 = 0.1, p7 = 0.1, p8 = 0.2)

Figure 3 shows a collection R = {p1, p2, ..., p8} of existentially uncertain points. Next to each point label p_i is its existential probability E_{p_i}, enclosed in parentheses (e.g., E_{p1} = 0.2). We are interested in answering spatial queries that take uncertainty into account. Let R be a collection of existentially uncertain objects. We then define probabilistic versions of basic spatial query types:

Definition 1: A probabilistic spatial range query takes as input a spatial region W and returns all (x, P_x) pairs, such that x ∈ R and x intersects W with probability P_x = E_x, where P_x > 0.

Definition 2: A probabilistic nearest neighbor query takes as input an object q and returns all (x, P_x) pairs, such that x ∈ R and x is the nearest neighbor of q, with probability P_x = E_x · ∏_{y∈R, y≠x, d(q,y)<d(q,x)} (1 − E_y), where P_x > 0 and d(q, x) denotes the distance between q and x.

The output of a probabilistic query is a conventional query result coupled with a positive probability that the item satisfies the query. The case of probabilistic range queries is simple; P_x = E_x for each object x that qualifies the spatial predicate. Consider, for instance, the shaded window W shown in Figure 3. Two objects p1 and p2 intersect W, with confidences E_{p1} = 0.2 and E_{p2} = 0.5, respectively. Similar to locationally uncertain data, the probability of an object to qualify a spatial range query does not depend on the locations and confidences of other objects. On the other hand, the probability of an object to be the nearest neighbor depends on the locations and probabilities of other objects. Consider again Figure 3 and assume that we want to find the potential nearest neighbors of q. The nearest point to q (i.e., p7) is the actual NN iff p7 exists. Thus, (p7, E_{p7}) is a query result. In order for the second nearest point p6 to be the NN of q, (i) p7 must not exist and (ii) p6 must exist. Thus, (p6, (1 − E_{p7}) · E_{p6}) is another result. By continuing this way, we can explore the whole set of points in R and assign to each of them a probability to be the NN of q. This nearest neighbor query example not only shows the search complexity in uncertain data, but also unveils that the result of probabilistic queries may be arbitrarily large. For instance, the result of an NN query is as large as |R|, if E_x < 1 for all x ∈ R. We can define practical versions of probabilistic queries with controlled output by either thresholding out the results with low probability to occur, or ranking them and selecting the most probable ones:

Definition 3: Let (x, P_x) be an output item of a probabilistic spatial query Q. The thresholding version of Q takes as additional input a threshold t, 0 < t ≤ 1, and returns the results for which P_x ≥ t.
The ranking version of Q takes as additional input a positive integer m and returns the m results with the highest P_x. For example, a thresholding range (window) query W with t = 0.6 on the objects of Figure 3 returns the empty result, whereas a ranking range query W with m = 1 returns (p2, 0.5).

4 EVALUATION OF BASIC PROBABILISTIC QUERIES

Like spatial queries on exact data, probabilistic spatial queries can be efficiently processed with the use of appropriate access methods. In this section, we explore alternative indexing schemes and propose algorithms for probabilistic queries on them. We focus on the most important spatial query types; namely, range queries and nearest neighbor queries.

4.1 Algorithms for R-trees

The most straightforward way to index a set R of existentially uncertain spatial data is to create a 2-dimensional R-tree on their spatial attribute. The confidences of the spatial objects are stored together with their geometric representation or approximation (for complex objects) at the leaves of the tree. We now study the evaluation of probabilistic queries on top of this indexing scheme.

4.1.1 Range Queries

Probabilistic range queries can be easily processed in two steps: a standard depth-first search algorithm is applied on the R-tree to retrieve the objects that qualify the spatial predicate of the query. For each retrieved object x, P_x = E_x. If the query Q is a thresholding query, the threshold t is used to filter out objects with P_x < t. If Q is a ranking query, a priority queue maintains the m results with the highest P_x during search, and outputs them at the end of query processing.

1. Especially for thresholding range queries with very large thresholds t, a viable alternative could be to use a B+-tree that indexes objects based on their probability, in order to efficiently access the objects with E_x ≥ t and then filter them using the spatial query predicate.
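Before turning to the index-based algorithms, the following brute-force sketch evaluates Definition 2 directly on a small in-memory point set (assuming independent confidences and objects given as (point, E) pairs); it is a reference implementation, not the R-tree method developed next.

```python
from math import dist, prod

def probabilistic_nn(q, objects, threshold=0.0):
    # objects: list of ((x, y), E) pairs with existential probabilities E.
    # Returns (point, P) pairs with P > threshold, per Definition 2:
    # P_x = E_x * product over objects y closer to q than x of (1 - E_y).
    results = []
    for x, e_x in objects:
        p = e_x * prod(1.0 - e_y for y, e_y in objects
                       if y != x and dist(q, y) < dist(q, x))
        if p > threshold:
            results.append((x, p))
    return sorted(results, key=lambda r: r[1], reverse=True)

# Hypothetical coordinates; only the confidences follow Figure 3.
# pts = [((1, 1), 0.2), ((2, 5), 0.5), ((4, 3), 0.3)]
# print(probabilistic_nn((3, 3), pts))
```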

4.1.2 Nearest neighbor search

NN search is more complex than range queries, because the probability of an object to qualify the query depends on the locations and confidences of other objects. Algorithm 1 elegantly and efficiently computes the probability P_x of x being the nearest neighbor of q, for all x having P_x > 0.

Algorithm 1 Probabilistic NN on an R-tree
Algorithm PNN(Query point q, R-tree on R)
1: P_first := 1;  ▷ prob. that no object before x exists
2: while P_first > 0 and more objects in R do
3:   x := next NN of q in R;
4:   P_x := P_first · E_x;
5:   output (x, P_x);
6:   P_first := P_first · (1 − E_x);

Algorithm PNN applies best-first NN search [9] on the R-tree to incrementally retrieve the nearest neighbors of q, without considering confidences. It also incrementally maintains a variable P_first, which captures the probability that no object retrieved before the current object x is the actual NN. P_first equals ∏(1 − E_y), over all objects y seen before x. Thus the probability of x being the nearest neighbor of q is P_first · E_x. In the example of Figure 3, PNN gradually computes P_{p7} = 0.1, P_{p6} = (1−0.1)·0.1 = 0.09, P_{p8} = (1−0.1)(1−0.1)·0.2 = 0.162, P_{p4} = (1−0.1)(1−0.1)(1−0.2)·0.5 = 0.324, etc. Note that all objects of R in this example are retrieved and inserted into the response set. In other words, PNN does not terminate until an object with E_x = 1 is found; if no such object exists, all objects have a positive probability to be the nearest neighbor.

Thresholding and ranking: As discussed in Section 3, the user may want to restrict the response set by thresholding or ranking. Algorithm 2 is the thresholding version of PNN, which returns only the objects with P_x ≥ t. The only differences from the non-thresholding version are the termination condition at Line 2 and the filtering of results having P_x < t (Line 5). As soon as P_first < t, we know that the next objects, even with 100% confidence, cannot be the NN of q with probability at least t, so we can safely terminate. For example, assume that we wish to retrieve the points in Figure 3 which are the NN of q with probability at least t = 0.23. First, p7 with P_{p7} = E_{p7} = 0.1 is retrieved, which is filtered out at Line 5, and P_first is set to 0.9 ≥ t. Then we retrieve p6 with P_{p6} = P_first · E_{p6} = 0.09 (also disqualified) and set P_first = 0.81 ≥ t. Next, p8 is retrieved with P_{p8} = 0.162 (also disqualified) and P_first = 0.648 ≥ t. The next object p4 satisfies P_{p4} = 0.324 ≥ t, thus (p4, 0.324) is output. Then P_first = 0.324 ≥ t and we retrieve p3 with P_{p3} = 0.0972 (disqualified). Finally, P_first = 0.2268 < t and the algorithm terminates, having produced only (p4, 0.324).

Algorithm 2 Probabilistic NN on an R-tree with thresholding
Algorithm PTNN(Query point q, R-tree on R, Threshold t)
1: P_first := 1;  ▷ prob. that no object before x exists
2: while P_first ≥ t and more objects in R do
3:   x := next NN of q in R;
4:   P_x := P_first · E_x;
5:   if P_x ≥ t then
6:     output (x, P_x);
7:   P_first := P_first · (1 − E_x);

PRNN (Algorithm 3), the ranking version of PNN, maintains a heap H of the m objects with the largest P_x found so far. Let P_m be the m-th largest P_x in H; as soon as P_first ≤ P_m, we know that the next objects, even with 100% confidence, cannot enter the set of the m most probable NNs of q, so we can safely terminate. For example, assume that we wish to retrieve the point with the highest probability (m = 1) of being the NN of q in Figure 3. PRNN progressively
maintains the object with the highest P_x. After each of the first 4 object accesses, P_m becomes 0.1, 0.1, 0.162, and 0.324. The algorithm terminates after the 4th loop, when P_first = 0.324 and P_m = P_{p4} = 0.324; this indicates that the next object can have P_x at most P_{p4}, thus p4 has the highest chance among all objects to be the NN of q.

Algorithm 3 Probabilistic NN on an R-tree with ranking
Algorithm PRNN(Query point q, R-tree on R, Integer m)
1: P_first := 1;  ▷ prob. that no object before x exists
2: H := ∅;  ▷ heap of the m objects with the highest P_x
3: P_m := 0;  ▷ P_x of the m-th object in H
4: while P_first > P_m and more objects in R do
5:   x := next NN of q in R;
6:   P_x := P_first · E_x;
7:   if P_x > P_m then
8:     update H to include x;
9:     P_m := m-th probability in H;
10:  P_first := P_first · (1 − E_x);

4.2 Query Evaluation using Augmented R-trees

We can enhance the efficiency of the probabilistic search algorithms by augmenting some statistical information to the R-tree directory node entries. A simple and intuitive method is to store with each directory node entry e a value e^maxE: the maximum E_x over all objects indexed under e. This value can be used to prune R-tree nodes while processing thresholding or ranking queries. Similar augmentation techniques are proposed in [4], [] for locationally uncertain data.

TABLE 2
Checking disqualified entries in augmented R-trees

query type    | range search    | NN search
thresholding  | e^maxE < t      | P_first · e^maxE < t
ranking       | e^maxE ≤ P_m    | P_first · e^maxE ≤ P_m

Table 2 summarizes the conditions for pruning R-tree entries (and the corresponding subtrees) which do not point to any results, during range or NN thresholding and ranking queries. For range queries, we can directly prune an entry e when (i) e.MBR does not intersect the query range, or (ii) its e^maxE satisfies the condition in the table. On the other hand, for NN search, a disqualified entry cannot be directly pruned, because the confidences of objects in the pointed subtree may be needed for computing the probabilities of objects with greater distances to q but high enough probabilities to be included in the result.
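A minimal sketch of the entry-checking tests of Table 2 is given below; e_max_e, p_first, t, and p_m are the quantities defined above, and the non-strict comparisons in the ranking tests are an assumption.

```python
def prune_range_thresholding(e_max_e: float, t: float) -> bool:
    # A range-query subtree whose best confidence is below t cannot yield results.
    return e_max_e < t

def prune_range_ranking(e_max_e: float, p_m: float) -> bool:
    # The subtree cannot beat the current m-th best probability.
    return e_max_e <= p_m

def disqualify_nn_thresholding(p_first: float, e_max_e: float, t: float) -> bool:
    # True means the subtree cannot contain an answer; for NN search the entry
    # is moved to the list L instead of being discarded, since its objects may
    # still lower the probabilities of farther objects.
    return p_first * e_max_e < t

def disqualify_nn_ranking(p_first: float, e_max_e: float, p_m: float) -> bool:
    return p_first * e_max_e <= p_m
```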

6 IEEE TRANSACTIONS OF KNOWLEDGE AND DATA ENGINEERING, VOL. X, NO. X, XXXXX 2X 6 Let us assume for the moment that for each non-leaf entr e we know the eact number of objects e num in its subtree. Algorithm 4 is the thresholding NN procedure for the augmented R tree. BF is etended as follows: If a non-leaf entr e is de-heaped for which P first e mae < t, the node where e points is not immediatel loaded (as in PTNN) but e is inserted into a set L of deleted entries. For objects retrieved later from the Best-First heap, we use entries in L to compute P min and P ma ; lower and upper bounds for P. If P min t, we know that is definitel a result. If P ma < t, we know that is definitel not a result. On the other hand, if P min < t P ma (Lines 6 2), we must refine the probabilit range for. For this purpose, we pick the entr e with the minimum mind(q, e) in L. 2 Observe that an entries with mind(q, e) > d(q, ) cannot contribute to the probabilit of. As P min < P ma (at Line 6), the entr e selected at Line 7 must satisf mind(q, e) d(q, ). If e is an object, then q must be nearer to e than and we update P first with the confidence of e. Otherwise, its confidence does not affect P first, we access its child node n e and insert all entries of n e into L. In either case, the probabilit range of shrinks. The process is repeated while the range covers t. Algorithm 4 Probabilistic NN on an augmented R tree with thresholding Algorithm PTNNaug(Quer point q, Augmented R tree on R, Threshold t) : P first := ; Prob. of no object before 2: L := ; list of disqualified entries 3: while P first t and more objects in R do 4: := net NN of q in R; during BF-search, each non-leaf entr with P first e mae < t is removed from Best-First heap and inserted into L 5: compute P min and P ma b using P first, L and E ; 6: while P min < t P ma do 7: pick the entr e with the smallest mind(q, e) in L; remove e from L; 8: if e is an object then e is an object closer to q than is 9: P first := P first ( E e); : else : read node n e pointed b e and insert all entries of n e into L; 2: compute P min 3: if P min t then 4: output (,P min,p ma ); 5: P first := P first ( E ); and P ma b using P first, L and E ; It remains to clarif how P min and P ma for an object are computed. Note that L onl contains entries whose minimum distance to q are smaller than d(q, ). For an entr e in the list L, the confidence of each object in its subtree is in the range (, e mae ]. In addition, there eists at least one object in e whose confidence is eactl e mae. Thus, P min corresponds to the case where for all objects under all entries in L are closer to q than is and the all have the maimum possible confidences. P ma corresponds to the case, where for all e L, with maimum distance from q greater than d(q, ), there is onl one object with e mae confidence (for all other objects 2. Throughout the paper, we use d(q, ) to denote the distance between two points q and ; and use mind(q, e) (mad(q, e)) to denote the minimum (maimum) possible distance between q and an data point indeed b the sub-tree pointed b e. under e the confidence converges to ): P min = P first E P ma = P first E e L mind(q,e) d(q,) e L mad(q,e) d(q,) ( e mae ) enum () ( e mae ) (2) So far, we have assumed that for each non-leaf entr e the number of objects e num in its subtree is known (e.g., this information is augmented, or the tree is packed). 
We can still appl the algorithm for the case where this information is not known, b using an upper bound for e num : f level(e), where level(e) is the level of the entr e (leaves are at level ) and f is the maimum R tree node fanout. This upper bound replaces e num in Equation. 5 5 p 2 (.5) p (.2) e.mbr p 8 (.2) e.mbr p3 (.3) p (.) 7 q e.mbr 3 p (.5) 4 p (.) p (.5) 5 e e 2 e 3 (.5)(.2)(.5) p 2 p 3 p p 8 p 7 p 4 p 5 p 6 (.5)(.3) (.2) (.2)(.) (.5)(.5)(.) Fig. 4. Eample of augmented R tree Let us now show the functionalit of the PTNNaug algorithm b an eample. Consider the augmented R tree of Figure 4 that indees the pointset of Figure 3 and assume that we want to find the points that are the NN of q with probabilit at least t =.23. First, the entries in the root are enheaped in the Best-First heap. Net, the entr e 2 is dequeued. Since it disqualifies the quer (P first e mae 2 =.2 < t), it is inserted into the list L. Then, the entr e 3 is dequeued. Its objects p 4, p 5, p 6 are enheaped in the Best-First Queue. The nearest object p 6 is dequeued. From Equations and 2, we derive a probabilit range for P p6 b using P first and L. p 6 is disqualified as Pp ma 6 = E p6 =. < t. Then, P first =.9 t and we retrieve p 4. Since Pp min 4 =.9.5 (.2) 3 =.234 t, p 4 is a result. Net, P first =.45 t and the net entr retrieved from the priorit queue of the BF algorithm is e. We do not access the node pointed b e, since we know that for each object indeed under e, P e mae P first =.225 < t. Thus, e is inserted into L. Net, p 5 is dequeued and discarded as Pp ma 5 =.45.5 (.2) (.5) < t. Now, the Best- First heap becomes empt and the algorithm terminates. Note that PTNN accesses all nodes of the tree in this eample, whereas PTNNaug saves two leaf node accesses. Ranking NN retrieval on the augmented R tree is performed b Algorithm 5. PRNNaug has several differences from the thresholding NN algorithm. A heap H is emploed to organize objects o b their Po min. P m denotes the m-th highest Po min in the heap. Observe that more complicated techniques are used for updating H, as the accesses to L ma affect the order of objects in H. Each object o in H maintains Po first, which is the value of P first when o is enheaped (Line 8). At Lines 2

7 IEEE TRANSACTIONS OF KNOWLEDGE AND DATA ENGINEERING, VOL. X, NO. X, XXXXX 2X 7 Algorithm 5 Probabilistic NN on an augmented R tree with ranking Algorithm PRNNaug(Quer point q, Augmented R tree on R, Integer m) : P first := ; Prob. of no object before 2: L := ; list of disqualified entries 3: H := ; heap of objects, organized b Po min 4: P m := ; P min of m-th object in H 5: while P first > P m and more objects in R do 6: := net NN of q in R; during BF-search, each non-leaf entr with P first e mae < t is removed from Best-First heap and inserted into L 7: compute P min and P ma b using P first, L and E ; 8: while P min < P m P ma do 9: pick the entr e with the smallest mind(q, e) in L; remove e from L; : if e is an object then e is an object closer to q than is : P first := P first ( E e); 2: for all entr o H such that d(q, e) d(q, o) do 3: Po first 4: else := P first o ( E e); 5: read node n e pointed b e and insert all entries of n e into L; 6: compute P min and P ma b using P first, L and E ; 7: if P min > P m then 8: enheap(h,(,p first :=P first,p min,p ma )); 9: if H is changed or L is changed then 2: recompute, for each o H, Po min and Po ma b using Po first, L and E o; 2: P m := m-th P min in H; 22: remove entries o from H with P ma o < P m ; 23: P first := P first ( E ); 24: while H > m and L > do 25: appl Lines 9 6; 26: appl Lines 2 22; 27: remove e from L with mind(q, e) > ma{d(q, o) : o H}; 3, Po first (for some entries in H) is updated for each object e found no further than o from q. The new Po first value is used to update Po min and potentiall the order of objects in H at Lines 2 2. Note that H ma store more than m entries, since there ma be objects o in it satisfing Po ma P m. However, entries o are removed from H once P ma P min o o < P m. The algorithm does not need to access an more objects from the Best-First heap as soon as P first < P m. In case H has more than m objects at that point, we need to refine the probabilit ranges of the objects in H (b processing entries in L) until we have the best m objects. In this case, entries e are removed from L once mind(q, e) > ma{d(q, o) : o H} because such entries cannot be used to refine the probabilit ranges of the objects in H. 4.3 Quer Evaluation using R trees An alternative method for indeing eistentiall uncertain data is to model the confidences E of objects as an additional dimension and use a R tree to inde the objects. Now, each non-leaf entr e in the tree, apart from the spatial dimensions, has a range [e mine, e mae ] within which the eistential probabilities of all objects in its subtree fall. Figure 5 illustrates the differences between the augmented R tree and the R tree. Figure 5a depicts the structure of the augmented R tree for the points p, p 2,, p 6. The R* tree insertion algorithm [4] aims at grouping the points into leaf nodes such that the their MBR areas are minimized. As such, the (non-leaf) entr e points to a leaf node containing the points p, p 2, p 3 ; whereas the entr e 2 points to a leaf node containing the points p 4, p 5, p 6. The spatial X, ranges, and the augmented probabilit, for these two entries in the augmented R tree are listed in Figure 5c. Note that each entr consists of 6 values (including its child node pointer). Figure 5b shows the structure of the R tree, for the same set of points. The R* tree insertion optimizes the bounding rectangles of nodes defined b three dimensions: spatial dimensions X and, as well as the probabilit dimension E. 
Hence, the entr e points to a leaf node containing the points p, p 2, p 5 ; whereas the entr e 2 points to a leaf node containing the points p 3, p 4, p 6. The values stored in these entries in the R tree are also listed in Figure 5c. Now, each entr consists of 7 values (including its child node pointer), impling that the fanout of the R tree is slightl smaller than the augmented R tree. The methods for processing the probabilistic range and NN queries over the augmented R tree (in Section 4.2) are applicable for the R tree, since each tree entr still stores an e mae value. In particular, for the NN quer, we utilize e mine to derive tighter probabilit ranges: P min = P first E (3) e L mind(q,e) d(q,) ( e mine )( e mae ) (enum ) P ma = P first E (4) e L mad(q,e) d(q,) ( e mine ) (enum ) ( e mae ) If the eact number e num of objects in the subtree pointed b e is not known, we can use the fanout f and the minimum node utilization (.4 for R* trees) and replace e num b f level(e) in Equation 3 and b (.4 f) level(e) in Equation 4. p 4 (.7) p (.3) p6 (.9) e (.8) p 3 e 2 p 2 (.6) p 5 (.5) (a) Augmented R tree p (.3) p 4 (.7) e 2 p6 (.9) (.8) p e 3 p 5 (.5) p 2 (.6) (b) R tree Name Entr e Entr e 2 Grouping Fields Aug. X=[.5,.4] X=[.55,.85] Spatial 6 R tree =[.2,.6] =[.3,.8] onl (+4+) mae=.8 mae=.9 R tree X=[.5,.7] X=[.4,.85] Both 7 =[.2,.6] =[.4,.8] spatial and (+4+2) E=[.3,.6] E=[.7,.9] probabilities (c) Comparison between the two trees Fig. 5. Structures of different R tree variants Interestingl, the quer performance of the R tree is not necessaril better than the augmented R tree. A careful eamination of Equations 3,4 reveals that these probabilit bounds are determined b both the spatial and probabilistic intervals of the entries. Even the e mine values in the R tree are helpful for tightening the bounds, this effect is

8 IEEE TRANSACTIONS OF KNOWLEDGE AND DATA ENGINEERING, VOL. X, NO. X, XXXXX 2X 8 counteracted b the large spatial bounding rectangles in the tree. Thus, more (disqualified) entries e L satisf the mind(q, e) d(q, ) condition in Equation 3 and fewer entries e L satisf the mad(q, e) d(q, ) condition in Equation 4. Hence, the final probabilit bounds for the R tree ma indeed become looser. Besides, the R tree has a slightl smaller fanout, which ma lead to more page accesses. 5 ADVANCED SPATIAL QUERIES In this section, we discuss probabilistic variants of spatial skline queries [5] and reverse nearest neighbor queries [6], due to their applications in spatial decision support sstems. For each quer tpe, we first present its background, then define its probabilistic variant, and finall develop corresponding quer algorithms for the thresholding and ranking versions. 5. Spatial Skline Queries Given a set Q of quer points (e.g., user locations) and two points p and p (e.g., two facilities), p spatiall dominates [5] p when all quer points in Q are closer to p than to p : q Q, d(q, p) d(q, p ) (5) Given a point dataset R, its spatial skline [5] (with respect to Q) contains the objects p R that are not spatiall dominated b an other object in R. As an eample, consider the distances of the stations p i R from a group of 2 users Q = {q, q 2 } in Figure 6a. The spatial skline contains p, p 2, and p 3. The main application of spatial skline queries is to discover facilities that are not farther than other facilities, for all users. distance to q 2 p (.3) p 7 (.7) p6 (.9) p 4 (.8) p 5 (.5) p 2 (.6) p 3 (.4) (a) a point set distance to q distance to q 2 ψ (e ) ψ (e ) 2 ψ (e ) ψ (e ) 3 ψ + (e ) distance to q (b) dominance relationship Fig. 6. Feature space defined b the distances from quer points To ease our discussion, we first introduce some notation. The spatial skline quer is formulated in a feature space in which each dimension captures the distance to a quer point. Given a set Q = {q, q 2,, q z } of quer points, a spatial location (i.e., data point) p (or a MBR e) can be mapped to a point ψ(p) (or a MBR ψ(e)) in a z-dimensional feature space, where the i-th dimension captures the distances of the points to q i (for i [, z]). Table 3 illustrates the mapping of a data point or an MBR (corresponding to a non-leaf R tree entr, assuming that the data points are indeed b an R tree) to this feature space. As a shorthand notation, we use ψ(p) ψ(p ) to mean that p spatiall dominates p. Let ψ (e) and ψ + (e) be the lower TABLE 3 Mapping from the original space to the feature space Original space Feature space ψ( ), i-th dimension Point p d(q i, p) MBR e [mind(q i, e), mad(q i, e)] and upper bound corners of the MBR ψ(e) respectivel (see Figure 6b). Since ψ + (e 2 ) ψ (e ), each point in e 2 must spatiall dominate all points in e. On the other hand, onl some point in e 3 ma spatiall dominate some points in e as ψ (e 3 ) ψ + (e ) and ψ + (e 3 ) ψ (e ). With the above mapping technique, [7] propose an R tree based algorithm for computing the dnamic skline in the feature space. The idea is to appl the best-first search algorithm [9] on the R tree to visit the entries e, from the origin z in the feature space, in ascending order of the value: ω Q (e) = q Q mind(q, e) (6) [7] proved that a point must be discovered earlier than the points it dominates (if an). Hence, a point p is reported as a result if it cannot be dominated b an eamined points. 
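As a small illustration of the mapping in Table 3 and the dominance test of Equation 5, the following sketch handles point data; Q is a list of query points, and the function names are assumptions.

```python
from math import dist

def psi(p, Q):
    # Table 3: map a data point to the feature space of distances to the
    # query points in Q (one dimension per query point).
    return tuple(dist(q, p) for q in Q)

def spatially_dominates(p1, p2, Q):
    # Equation 5: p1 spatially dominates p2 if no query point is closer to p2
    # than to p1.
    return all(dist(q, p1) <= dist(q, p2) for q in Q)

def omega(p, Q):
    # Equation 6 for a point (its mindist equals its distance): the best-first
    # key used to visit entries in ascending order from the origin of the
    # feature space.
    return sum(dist(q, p) for q in Q)
```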
We then adapt the above algorithm for the probabilistic spatial skline quer Probabilistic spatial skline quer and its properties: For eistentiall uncertain data, a point is a quer result with probabilit: P = E ( E ) (7) R ψ( ) ψ() which corresponds to the case that eists and the points dominating do not eist. A probabilistic spatial skline quer takes as input a set Q of quer points and returns all (, P ) pairs, such that R and belongs to the spatial skline of Q with probabilit P >. For instance, in Figure 6a, we have P p4 =.8 (.6) =.32 because p 2 dominates p 4. Since no points dominate p 2, we derive P p2 =.6 =.6. The probabilit of other points can be computed in a similar wa. In Section 4..2, we used a single variable P first to incrementall compute the upper bound probabilit for the remaining objects to be eamined. This technique is inapplicable to the spatial skline quer, since the points visited in decreasing order from the origin do not necessaril influence the points that will be visited net. For instance, the eistence of point p 2 in Figure 6a does not influence the probabilit that p (which is further than p 2 from the origin and will be visited net) is in the skline. However, it influences p 4, since p 4 is dominated b p 2. In general, given a set S R of alread eamined points, in order of their distance to the origin, an upper bound P first (S) of the probabilit that point is in the skline with respect to S can be computed b: P first (S) = ( E ) (8) S ψ( ) ψ() For a MBR e, the upper bound probabilit Pe first (S) of an

9 IEEE TRANSACTIONS OF KNOWLEDGE AND DATA ENGINEERING, VOL. X, NO. X, XXXXX 2X 9 point in e to be in the skline can be computed as follows: Pe first (S) = ( E ) (9) S ψ( ) ψ (e), since ψ (e) dominates an point in e. Net, we discuss how thresholding and ranking versions of the quer are evaluated on a R tree Thresholding and ranking: Assume that we want to find the points with probabilit at least t to be in the skline. Algorithm 6 describes the procedure to retrieve these points from a R tree. At Line 3, objects are incrementall retrieved from the tree in increasing order of their ω Q () value, which is defined in Equation 6. Set S is used for storing objects eamined so far, in order to derive the probabilit P first (S) of remaining objects (using Equation 8). The probabilit derivation of P as P first (S) E is correct because [7] proved that all the points dominating must have been eamined before (and stored into S). When P t, is reported as a result. In case P first (S) < t, an (remaining) point dominated b must be at least dominated b the same subset of points in S such that P first (S) < t. Thus, is inserted into S onl when P first (S) t. Following the above logic, we can optimize the algorithm at Line 3 b removing non-leaf entries with Pe first (S) < t from the Best-First heap. Algorithm 6 Probabilistic spatial skline on a R tree with thresholding Algorithm PTSK(Quer set Q, R tree on R, Threshold t) : S := ; set of eamined objects 2: while more objects in R do 3: := net point in R with minimum ω Q (); during BF-search, non-leaf entries e with Pe first (S) < t are removed from Best-First heap 4: P := P first (S) E ; 5: if P t then 6: output (,P ); 7: if P first (S) t then 8: insert into S; Threshold-based retrieval (of Algorithm 6) can be etended to retrieve the m points with the highest probabilit to be in the skline (i.e., the ranking probabilistic variant of the quer). The general idea is to maintain a heap H of m objects with the highest P found so far. In addition, we replace the fied threshold t b a floating bound P m, which indicates the m- th highest P in H. If P is found to be greater than P m, then the result heap H and the bound P m are updated. As P m increases, (unnecessar) objects with Po first (S) < P m are removed from S in order to save space Etensions for augmented R trees: As discussed in Section 4.2, augmented R trees can be used to improve the quer efficienc. Algorithm 7 generalizes Algorithm 4 to utilize information from an augmented R tree. During the BF-search at Line 4, each non-leaf entr e with Pe first (S) e mae < t is removed from Best-First heap because the cannot contain an results. If the removed entr has Pe first (S) at least t, then it ma influence the remaining points and e is inserted into the list L for further processing. At Line 5, the lower bound P min and upper bound P ma probabilities of a point are computed from S, L and E using Equations and respectivel. P min = P first (S) E P ma = P first (S) E e L ψ (e) ψ() e L ψ + (e) ψ() ( e mae ) enum () ( e mae ) () When P min < t P ma, we need to refine the probabilit range for (Lines 6-3). After that, is reported as a result if P min t. In case P first (S) t, is inserted into S because it influences the probabilit of other points that ma end up in the result. 
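For reference, Equation 7 can also be checked with a direct quadratic scan; the sketch below reuses spatially_dominates from the earlier sketch and assumes independent confidences, whereas PTSK/PTSKaug avoid examining all pairs.

```python
from math import prod

def skyline_probabilities(objects, Q, threshold=0.0):
    # objects: list of ((x, y), E) pairs; returns (point, P) pairs with
    # P >= threshold, where P = E_x * product over dominators x' of (1 - E_x')
    # as in Equation 7.
    results = []
    for x, e_x in objects:
        p = e_x * prod(1.0 - e_y for y, e_y in objects
                       if y != x and spatially_dominates(y, x, Q))
        if p >= threshold:
            results.append((x, p))
    return results
```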
Algorithm 7 Probabilistic spatial skline on an augmented R tree with thresholding Algorithm PTSKaug(Quer point q, Augmented R tree on R, Threshold t) : S := ; set of eamined objects 2: L := ; list of disqualified entries 3: while more objects in R do 4: := net point in R with minimum ω Q (); during BF-search, each non-leaf entr e with Pe first (S) e mae < t is removed from Best-First heap, and inserted into L if Pe first (S) t 5: compute P min and P ma b using P first (S), L and E ; 6: while P min < t P ma do 7: pick the entr e with the smallest ω Q (e) in L; remove e from L; 8: if Pe first (S) t then e ma influence the probabilit of potential results 9: if e is an object then : insert e into S; : else 2: read node n e pointed b e and insert all entries of n e into L; 3: compute P min 4: if P min t then 5: output (,P min 6: if P first (S) t then 7: insert into S; and P ma,p ma ); b using P first (S), L and E ; Similarl, we can generalize the algorithm for evaluating ranking spatial skline queries on an augmented R tree. Regarding R trees, Equations 2 and 3 are applied to compute the P min and P ma values of a point respectivel. In case the eact number e num of objects in the subtree pointed b e is not known, we can use the fanout f and the minimum node utilization (.4) and replace e num b f level(e) in Equations,2, and b (.4 f) level(e) in Equation 3. P min = P first (S) E (2) P ma e L ψ (e) ψ() ( e mine )( e mae ) (enum ) = P first (S) E (3) e L ψ + (e) ψ() ( e mine ) (enum ) ( e mae ) 5.2 Reverse Nearest Neighbor Queries Given a point dataset R and a quer point q, a reverse nearest neighbor (RNN) quer [6] retrieves the objects p R having q as their nearest neighbor. This quer has applications in decision support and resource allocation. [6], [8] develop R tree based algorithms for reverse nearest neighbor queries. In this section, we etend the geometric partitioning method of [6] to solve probabilistic versions of this problem. According

10 IEEE TRANSACTIONS OF KNOWLEDGE AND DATA ENGINEERING, VOL. X, NO. X, XXXXX 2X to [6], an RNN quer can be answered in two steps. In the filter step, the data space (shown in Figure 7a) is divided into si equal sectors A, A 2,, A 6 around the quer point q. The NN of q in each sector (if an) is included into the candidate set. In the eample, the candidates are the points p (in A 2 ), p 2 (in A 5 ), and p 4 (in A 3 ). [6] proved that the candidate set is a superset of the result set. During the refinement step, each candidate is verified b retrieving its NN. A candidate (e.g., p 2 ) is reported as a result if its NN is q. Otherwise, the candidate (e.g., p 3 ) is a false hit and it is discarded. A 3 p 4 (.5) A 4 A 2 p 3 (.7) p (.6) q p 2 (.8) A 5 A A 6 (a) 6-sector partitioning p 3 A 5 (.7) A 2 p p 4 A 6 (.5) (.6) A A 7 A8 A 4 A 3 q p2 (.8) A 9 A A A 2 (b) 2-sector partitioning Fig. 7. Reverse nearest neighbor quer eample Probabilistic reverse nearest neighbor quer and its properties: For eistentiall uncertain data, a point belongs to the RNN set of q with probabilit: P = E ( E ) (4) R d(, )<d(q,) which corresponds to the case that eists and the points that are closer to than to q do not eist. For instance, in Figure 7a, we have P p3 =.7 (.6) (.5) =.4 because p 3 is closer to p and p 4 than to q. Since p 2 is closer to q than to other points, we derive P p2 =.8 =.8. The probabilit of other points can be computed in a similar wa. Similarl to the skline quer and unlike the NN quer of Section 4..2, we cannot define an order of visiting the points around q, such that the upper bound probabilit of remaining points to be in the RNN result, can be maintained b incrementall updating a single P first value. To elaborate this, suppose that we first eamined the point p in Figure 7a. Note that p onl influences the probabilities of p 3, p 4 but not that of p 2. The eample also demonstrates that the upper bound probabilit of a point can be computed b using eamined points. Given a set S R of (eamined) points, the upper bound probabilit P first of a point with respect to q and S is defined as: P first = S d(, )<d(q,) ( E ) (5) Geometric properties of reverse nearest neighbors can be eploited to derive the upper bound probabilit of remaining points in a specific sector (see Figure 7a). [6] proved that, if two points p and p are in the same sector and q is closer to p than to p, then p must be closer to p than to q. Based on this propert, a natural solution for the quer is to retrieve the points in ascending order of their distances from q. For each sector A j, its P first A j value is used as the upper bound probabilit of an remaining point in A j. P first A j is set to initiall and it is multiplied b the factor ( E ) when a new point is discovered in A j. We observe that introducing additional sectors ma help deriving tighter probabilit bounds for uneamined points. Consider the 2-sector partitioning shown in Figure 7b. When a point (sa, p ) is discovered in the sector A 4, it is used to update P first A j for the sectors (i.e., A 3, A 4, A 5 ) that are within (maimum) 6 angular range from A 4. Conversel, the probabilit bound of a sector is contributed b the points within (3 3) = 9 angular range. Recall that, for the original 6-sector partitioning in [6], a sector is onl affected b the points within 6 angular range. In general, given a positive integer V, in the (6V )-sector partitioning scheme, (2V ) sectors need to be eamined per visited point. 
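The two ingredients above, the probability of Equation 14 and the equal-angle sector assignment, can be sketched as follows; this is a brute-force reference with objects given as (point, E) pairs, not the index-based algorithm presented next.

```python
from math import atan2, pi, dist, prod

def rnn_probability(x, e_x, q, objects):
    # Equation 14: x is a reverse NN of q if x exists and every object that is
    # closer to x than q is to x does not exist (independence assumed).
    return e_x * prod(1.0 - e_y for y, e_y in objects
                      if y != x and dist(x, y) < dist(q, x))

def sector_of(p, q, kappa):
    # Index in [0, kappa) of the equal-angle sector around q containing p;
    # with kappa = 6V this corresponds to the partitioning discussed above.
    angle = atan2(p[1] - q[1], p[0] - q[0]) % (2 * pi)
    return int(angle // (2 * pi / kappa)) % kappa
```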
The more sectors we have, the tighter probabilit bounds are derived for (uneamined points in) the sectors, and the earlier unqualified sectors can be pruned. On the other hand, the computational overhead of updating probabilit bounds for the sectors is proportional to V. In Section 6, we will determine an appropriate number of sectors that achieves significant cost reduction and adds little computational overhead for updating probabilit bounds for the sectors. Net, we discuss how this partitioning scheme can be used to evaluate probabilistic reverse nearest neighbor queries Thresholding and ranking: Algorithm 8 shows how thresholding RNN queries are evaluated on a tree. The sstem parameter κ specifies the number of sectors to be used. First, the space is divided into κ sectors A i and their probabilit bounds P first A i are set to. The algorithm maintains candidate objects (i.e., potential results) in a set C and delas computing the actual probabilit of a candidate until all objects influencing it have been eamined. Eamined objects are stored in the set S and the are used to compute upper bound probabilities for candidate objects. Both C and S are initialized to empt sets. At Line 6, we appl Best-First search [9] to incrementall retrieve the net NN (i.e., the object ) of q from the tree R. Suppose that A() denotes the sector containing. If the upper bound probabilit E P first A() is greater than the threshold t, then a tighter upper bound probabilit E P first is computed, b eamining the objects in S (see Equation 5). When the above probabilit is at least t, the object is inserted into C. After that, is inserted into S and P first A i is updated for each sector A i within 6 angular range from A(). In turn, is used to update the Po first value for objects in C, and those satisfing E o Po first < t are removed from C. If the last deheaped distance d last (from the Best-First heap) is greater than 2 d(q, o) for a candidate object o C, then all entries in Best-First heap cannot affect the probabilit of o. At Line 7, we compute the actual probabilit P o as Po first E o and report o as a result when P o t. The loop (Lines 5-2) continues while some sector ma contain potential results to be discovered or C is not empt. The above algorithm can be etended to retrieve the topm ranked reverse nearest neighbors from the R tree. It

11 IEEE TRANSACTIONS OF KNOWLEDGE AND DATA ENGINEERING, VOL. X, NO. X, XXXXX 2X Algorithm 8 Probabilistic reverse nearest neighbor on a R tree with thresholding Algorithm PTRNN(Quer point q, R tree on R, Threshold t) κ: number of sectors (sstem parameter) : divide the space into κ equal sectors around q: A i, i [, κ]; 2: for all i [, κ] do 3: P first A :=; i Upper prob. bound of remaining objects in sector A i 4: C:= ; S:= ; 5: while i [, κ], P first A t or C > do i 6: := net NN of q in R; d last := d(q, ); 7: let A() be the sector of ; 8: if E P first first t and E P A() t then appl cheap filter first, and then epensive filter 9: C := C {}; : S := S {}; : for all i [, κ] such that A i is within (maimum) 6 angular range from A() do 2: P first A i := P first A i ( E ); 3: for all o C such that d(, o) < d(q, o) do update Po first 4: Po first :=Po first ( E ); 5: remove objects o from C with E o Po first < t; filter false hits 6: for all o C such that d last 2 d(q, o) do entries in Best-First heap cannot affect the probabilit of o 7: P o := Po first E o; 8: if P o t then 9: output (o,p o); 2: remove o from C; Algorithm 9 Probabilistic reverse nearest neighbor on an augmented R tree with thresholding Algorithm PTRNNaug(Quer point q, Augmented R tree on R, Threshold t) κ: number of sectors (sstem parameter) : divide the space into κ equal sectors around q: A i, i [, κ]; 2: for all i [, κ] do 3: P first A :=; i Upper prob. bound of remaining objects in sector A i 4: C:= ; S:= ; L:= ; 5: while more objects in R do 6: := net NN of q in R; d last := d(q, ); during BF-search, each non-leaf entr e intersecting onl sector(s) with P first A e mae < t is removed from Best-First heap and inserted i into L 7: appl Lines 7 5 of Algorithm 8; 8: for all o C such that d last 2 d(q, o) do entries in Best-First heap cannot affect the probabilit of o 9: while Po first E o t and e L, mind(e, o) < d(q, o) do refinement step : remove the entr e with the smallest mind(e, o) in L; : if e is an object then 2: appl Lines 5 of Algo. 8, but b replacing with e; 3: else 4: read the node n e pointed b e and insert all entries of n e into L; 5: P o := Po first E o; 6: if P o t then 7: output (o,p o); 8: remove o from C; 9: for all o C do verif remaining candidates in C 2: appl Lines 9 8 of this Algorithm; Besides, the above thresholding 3 algorithm can be adapted to Algorithm 9, for augmented R trees and R trees. At Line 6, each non-leaf entr e intersecting onl sector(s) with P first A i e mae < t is removed from Best-First heap because the cannot contain an results. However, such entries ma affect the probabilit of other points so the are inserted into L. Lines 9-4 compute the actual probabilit for such an object, b refining its Po first value with the entries in L. For this, we check whether the upper bound probabilit Po first E o is above t and q is closer to some entries in L than to q. If so, the entr closest to o is removed from L and its child node n e is accessed. In case n e points to a tree node, all its entries are inserted into L. Otherwise, entries in n e are used to update the set S, Po first values of candidate objects, and P first A i values of sectors. At Lines 9-2, the remaining candidates in C are verified b accessing entries in L that ma influence their probabilities. 6 EXPERIMENTAL EVALUATION In this section, we evaluate the efficienc of the proposed techniques. 
6 EXPERIMENTAL EVALUATION

In this section, we evaluate the efficiency of the proposed techniques. We compare the performance of five indexes and their corresponding algorithms for the thresholding and ranking versions of range queries, nearest neighbor search, skyline queries, and reverse nearest neighbor retrieval. The five indexes are: (i) a simple R-tree; (ii) an R-tree in which each non-leaf entry e is augmented with e.max, the maximum existential probability of the objects in its subtree; (iii) an R-tree in which each non-leaf entry e is augmented with both e.max and e.num, the number of objects in the subtree indexed by it; (iv) an R-tree that indexes the existential probability as an additional dimension; and (v) the index of (iv) with each non-leaf entry e augmented with e.num. For indexes (iv) and (v), all (spatial/probability) dimensions are normalized to the same domain interval. Note that index (i) captures minimum information in its non-leaf entries and occupies the least space, whereas index (v) is at the other end (its entries capture maximum information and it occupies the most space).

All algorithms were implemented in C++, and the experiments were run on a PC with a 2.8GHz Pentium D CPU. The page size of the indexes was set to 1Kb; the same relative performance of the above methods was observed for other page sizes (up to 8Kb). No memory buffers are used for caching disk pages between different queries, so the number of node accesses directly reflects the I/O cost. In each experiment, the measured cost is the average cost of queries with the same parameter values (but with different locations, randomly chosen from the dataset). For range queries, nearest neighbor search, and reverse nearest neighbor retrieval, the I/O time is over 90% of the total execution cost, so the CPU time is not reported.
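For illustration, the sketch below shows what the augmented directory entries of indexes (ii) and (iii) could look like and how they support pruning. The structure AugEntry and the functions may_contain_threshold_results and none_exists_lower_bound are our own; the fanout-based bound only mirrors the role of e.num discussed above and does not reproduce the paper's Equations 3 and 4.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

MBR = Tuple[float, float, float, float]          # (x1, y1, x2, y2)

@dataclass
class AugEntry:
    mbr: MBR
    child_page: int                              # page id of the child node
    max_E: float                                 # max existential probability in the subtree (index ii)
    num: Optional[int] = None                    # number of objects in the subtree (index iii)

def may_contain_threshold_results(e: AugEntry, t: float) -> bool:
    """A thresholding query can skip e whenever even the most probable object
    in its subtree cannot reach the threshold t."""
    return e.max_E >= t

def none_exists_lower_bound(e: AugEntry, fanout: int, height: int) -> float:
    """Lower bound on the probability that no object under e exists: each of
    at most n objects exists with probability at most e.max_E, so the product
    of (1 - existence probability) is at least (1 - e.max_E) ** n. Here n is
    e.num when stored (index iii) and a fanout-based upper bound otherwise
    (index ii)."""
    n = e.num if e.num is not None else fanout ** height
    return (1.0 - e.max_E) ** n
```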

6.1 Description of Data

For our experiments, we used several real datasets of different sizes and object distributions, described in Table 4. The datasets TG and SF are obtained from [19], while the other datasets are obtained from the R-tree Portal.

TABLE 4
I/O cost of thresholding/ranking NN queries on different datasets (t = 0.5 for thresholding)

Dataset                     Size    (i)    (ii)      (iii)     (iv)      (v)
San Joaquin roads (TG)              /      6.49/     9.75/     28.6/     3./
Greece roads (GR)                   /      7.23/     2.79/     24.58/    25.2/
Long Beach roads (LB)               /      9.7/      2.49/     24.7/     25.8/
LA streets (LA)                     /      9.9/      2.89/     32.83/    36.7/
San Francisco roads (SF)            /      2.39/     22.5/     32.7/     36.7/
Tiger streams (TS)                  /      8.36/     9.69/     27.37/    32.94/

Due to the lack of a real spatial dataset whose objects have existential probabilities, we generated probabilities for the objects using the following methodology. First, we generated a set of anchor points randomly on the map, following the data distribution. These points model locations around which there is high certainty about the existence of data (e.g., they could be antennas of receivers close to which information is accurate). For each point x of the dataset, we (i) find the closest anchor a and (ii) assign an existential probability proportional to 1/(c · dist(x, a))^θ. Thus, the distribution of probabilities around the anchors is Zipfian. The probabilities are normalized (using c) with respect to the maximum probability (1), which corresponds to the anchor point itself. The default skew value is θ = 1; experiments with different skew values can be found in our preliminary work [20].
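As a concrete illustration of this data-generation methodology, the following Python sketch (our own; the treatment of points that coincide with an anchor and the exact choice of the normalization constant c are assumptions) assigns Zipfian existential probabilities around anchors sampled from the data.

```python
import math
import random

def zipf_probabilities(points, num_anchors, theta=1.0, seed=0):
    """Sample anchors from the data, then give each point a probability
    proportional to 1 / (c * dist(p, a))**theta for its closest anchor a,
    with c chosen so that the largest probability equals 1. Points that
    coincide with an anchor simply receive probability 1 (our choice)."""
    rng = random.Random(seed)
    anchors = rng.sample(points, num_anchors)     # anchors follow the data distribution

    def d(p, a):
        return math.hypot(p[0] - a[0], p[1] - a[1])

    nearest = [min(d(p, a) for a in anchors) for p in points]
    positive = [v for v in nearest if v > 0]
    c = 1.0 / min(positive) if positive else 1.0  # normalization: max probability = 1
    probs = [1.0 if v == 0 else min(1.0, 1.0 / (c * v) ** theta) for v in nearest]
    return anchors, probs
```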
6.2 Experimental Results

Table 4 shows the performance of the five indexes for thresholding and ranking NN queries on the different datasets. We fix t = 0.5 for thresholding NN queries and use a fixed default m for ranking NN queries. (A small value of t is necessary in order to observe differences between the indexes; larger values of t are tested in a subsequent experiment.) Observe that the augmented indexes (ii)–(iii) and the probability-dimension indexes (iv)–(v) perform better than the simple R-tree, even though they are larger in size: Algorithms 4 and 5 manage to prune a large number of nodes that do not contain query results, which are otherwise visited in the simple R-tree. The cost of the augmented variants (ii) and (iii) does not change much with the database size, whereas the cost of indexes (iv) and (v) increases slowly as the database size grows. This is due to the fact that indexes (iv) and (v) group entries using both the spatial and the probability dimensions, while the query algorithms mainly search for objects based on the spatial dimensions. In the subsequent experiments, we compare the performance of the indexes on the SF dataset; unless otherwise stated, thresholding queries use t = 0.5 and ranking queries use a fixed default value of m.

Figure 8 shows the performance of the indexes for thresholding and ranking NN queries. The four advanced indexes (ii)–(v) perform much better than the simple R-tree for all tested values of t and m. For t ≥ 0.2, fewer than 5 accesses are required to find the query result when using the four advanced indexes with Algorithms 4 and 5. When comparing these indexes, we observe that augmenting e.num is not a good idea; using the fanout f already gives accurate enough estimations of P^min and P^max, so the extra space (translated into extra accesses) required for augmenting e.num does not pay off. In addition, the augmented R-tree (ii) performs better than index (iv). First, index (iv) occupies more space (the capacity of each non-leaf node is smaller) and thus incurs more accesses, since the extra space is not compensated by tighter P^min and P^max bounds (see Equations 3 and 4). Second, since index (iv) groups entries into nodes using the existential probabilities as well as the spatial dimensions, it does not achieve as good a spatial partitioning as one that uses the spatial dimensions only, even though search is performed primarily on the spatial dimensions.

Fig. 8. NN queries on the SF dataset, θ = 1: (a) thresholding queries vs. t; (b) ranking queries vs. m.

Next, we examine the performance of range queries on the indexes. The parameter len denotes the extent of the query window (in each dimension); its default value is 5% of the domain length. Figures 9a and 9b show the cost of thresholding and ranking queries as a function of t and m, respectively. Except for the simple R-tree, all indexes follow trends similar to those for probabilistic nearest neighbor queries. The cost of range queries on the simple R-tree is independent of t and m, as all points within the spatial range are retrieved. Observe that for very small t, the advanced indexes may perform worse than the simple R-tree because (i) they prune no or very few directory entries with e.max lower than t, and (ii) they are larger in size than the simple R-tree. Similarly, P_m decreases with m, affecting the costs of the advanced methods. Index (iv) also performs worse than the augmented R-tree for range queries. Figure 9c shows the cost of thresholding queries as a function of len; the costs of all methods increase with len.

Fig. 9. Range queries on the SF dataset: (a) thresholding queries vs. t; (b) ranking queries vs. m; (c) thresholding queries vs. len (t = 0.5).

We proceed to compare the performance of spatial skyline queries on the indexes. For each query, a set Q of query points is randomly generated in a query window with side length len, such that the window follows the data distribution. The default values of |Q| and len are 6 and 5% of the domain range, respectively. Figure 10 shows the I/O and CPU time breakdown of thresholding and ranking queries as a function of t and m, respectively; each page fault is charged a fixed number of milliseconds of I/O time. Observe that the augmented R-tree outperforms its competitors for a wide range of parameters.

Fig. 10. Spatial skyline queries on the SF dataset (|Q| = 6, len = 5%): I/O and CPU time breakdown for (a) thresholding queries vs. t; (b) ranking queries vs. m.
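The skyline query workload described above can be generated, for instance, as in the sketch below (our own reading of the setup; the function name, the clamping at the domain border, and the interpretation that the window center is drawn from the data are assumptions).

```python
import random

def skyline_query_workload(points, q_size, len_frac, domain=1.0, seed=0):
    """Generate one spatial skyline query: a window with side length
    len_frac * domain, centered at a location drawn from the data (so windows
    follow the data distribution), containing q_size random query points."""
    rng = random.Random(seed)
    side = len_frac * domain
    cx, cy = rng.choice(points)                   # window center follows the data
    x_lo = min(max(cx - side / 2.0, 0.0), domain - side)
    y_lo = min(max(cy - side / 2.0, 0.0), domain - side)
    return [(x_lo + rng.random() * side, y_lo + rng.random() * side)
            for _ in range(q_size)]
```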

In terms of I/O, the trends are similar to those in Figure 8; however, the CPU time of the four advanced indexes becomes high at low t values and high m values.

Figure 11 plots the cost of the indexes when varying the number |Q| of query points. In general, when |Q| increases, a point is spatially dominated by fewer points, and thus the probability that the point belongs to the skyline increases. Hence, more points need to be examined by a thresholding query and its cost increases rapidly. On the other hand, P_m increases with |Q|, strengthening the pruning power of the advanced indexes, so the cost of ranking queries increases at a slower rate.

Fig. 11. Spatial skyline queries on the SF dataset, varying |Q|: I/O and CPU time breakdown for (a) thresholding queries (t = 0.5); (b) ranking queries.

Finally, we study the performance of the indexes for reverse nearest neighbor queries. Figure 12 shows the effect of the number of sectors on performance. When more sectors are used, tighter probability bounds are derived for the sectors and hence the algorithm terminates earlier. In particular, the 96-sector partitioning achieves a substantial cost reduction over the basic 6-sector partitioning for both thresholding and ranking queries. Observe that the cost starts converging to its final value with as few as 24 partitions.

Fig. 12. Reverse nearest neighbor queries on the SF dataset, varying the number of sectors: (a) thresholding queries (t = 0.5); (b) ranking queries.

Figure 13 plots the cost of the methods as a function of t and m, respectively, when the 24-sector partitioning is used. For thresholding queries, the performance gap between the simple R-tree and the other indexes widens as t increases, because of the increased pruning power of the advanced indexes. On the other hand, the cost differences among the indexes are not sensitive to the value of m. As with the previous query types, the augmented R-tree prevails.

Fig. 13. Reverse nearest neighbor queries on the SF dataset, using the 24-sector partitioning: (a) thresholding queries vs. t; (b) ranking queries vs. m.

7 DISCUSSION: RELAXING THE INDEPENDENCE ASSUMPTION

Our analysis so far assumes that the existential probabilities of the objects are independent. This assumption is valid in a large number of applications (e.g., those mentioned in Section 1); hence, our solutions have significant value in practice. However, there are also applications where the existential probabilities of different objects are correlated. For example, consider a collection of sensors distributed in a forest for detecting wildfire: when a sensor detects smoke, sensors in its neighborhood are likely to sense it as well. A thorough solution for this scenario falls outside the scope of this paper. Nevertheless, in the sequel, we point out a direction towards extending the proposed algorithms and indexing schemes to support correlated existential probabilities.

We now elaborate on how to evaluate the thresholding probabilistic NN query with a simple R-tree under the correlated probability model. Instead of Definition 2, we define the probability of an object x to be the NN of q as

  JP_x = JPr( Γ(x) ∧ ⋀_{x' ∈ R, x' ≠ x, d(q,x') < d(q,x)} ¬Γ(x') ),

where JPr denotes the joint probability that x exists (i.e., the event Γ(x)) and all objects closer to q do not exist. This probability can be computed from a Bayesian network modeling the dependencies among the objects. The value JPr( ⋀_{x' ∈ R, x' ≠ x, d(q,x') < d(q,x)} ¬Γ(x') ) serves as an upper bound of JP_x, regardless of how the probabilities are correlated. Based on this property, we modify Algorithm 2 as follows: (i) we maintain the set S of visited objects; (ii) at Line 7, we insert the object x into S and compute P^first_x = JPr( ⋀_{x' ∈ S} ¬Γ(x') ); and (iii) at Line 4, we compute JP_x = JPr( Γ(x) ∧ ⋀_{x' ∈ S} ¬Γ(x') ). The same idea also works for spatial skylines (Equation 7, Algorithm 6) and reverse nearest neighbors (Equation 14, Algorithm 8), after replacing each multiplication by a conjunction, each E_x by Γ(x), each (1 − E_{x'}) by ¬Γ(x'), and the final probability by the corresponding joint probability JPr(·). Extending the other R-tree solutions (e.g., the augmented trees and the trees that index probability as an extra dimension) raises non-trivial research issues, because (i) the number of possible joint probabilities is enormous (exponential in the data cardinality), and (ii) it remains unclear how to augment a non-leaf entry so that it effectively captures the joint probabilities of the objects in its subtree. In the future, we will develop efficient solutions for augmented trees that are applicable under the correlated probability model.
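The following brute-force Python illustration (our own, not the paper's algorithm) makes the definition of JP_x concrete on a toy joint distribution; the explicit table of existence vectors stands in for whatever Bayesian network models the correlations, and real data would require proper inference instead of enumeration.

```python
import math

def joint_nn_probability(q, objects, joint):
    """objects: list of (id, point); joint: dict mapping a tuple of 0/1
    existence flags (one per object, in listed order) to its probability.
    Returns {id: JP}, the joint probability of being the NN of q."""
    def d(a, b):
        return math.hypot(a[0] - b[0], a[1] - b[1])
    result = {}
    for i, (oid, p) in enumerate(objects):
        closer = [j for j, (_, pj) in enumerate(objects)
                  if j != i and d(q, pj) < d(q, p)]
        jp = 0.0
        for world, prob in joint.items():
            if world[i] == 1 and all(world[j] == 0 for j in closer):
                jp += prob            # x exists and every closer object does not
        result[oid] = jp
    return result

# Tiny example with two positively correlated objects a and b:
objects = [("a", (1.0, 0.0)), ("b", (2.0, 0.0))]
joint = {(1, 1): 0.5, (1, 0): 0.1, (0, 1): 0.1, (0, 0): 0.3}
print(joint_nn_probability((0.0, 0.0), objects, joint))
# a is closer to q, so JP_a = P(a exists) = 0.6, and JP_b = P(b exists, a absent) = 0.1;
# note JP_b <= P(a absent) = 0.4, the upper bound used by the modified algorithm.
```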
8 CONCLUSIONS

In this paper, we presented the interesting problem of evaluating spatial queries on existentially uncertain data. Variants of common spatial queries, like range and nearest neighbor search, have probabilistic versions for this data model. We proposed algorithms for these probabilistic versions and several extensions of spatial access methods (i.e., R-trees) on which these algorithms are applied. In addition, we discussed how complex spatial queries, such as spatial skyline queries and reverse nearest neighbor queries, can be processed in our framework. Finally, we conducted extensive experiments to evaluate the search algorithms and the corresponding spatial indexes. In most of the tested cases, the data structure that performs best is an R-tree whose non-leaf entries are augmented with the maximum existential probabilities of the subtrees they point at. In the future, we plan to study more advanced query types in detail and to extend our methods to data that are both existentially and locationally uncertain, as well as to the results of fuzzy classifiers [1].

ACKNOWLEDGMENTS

This work was supported by grant HKU 749/7E from Hong Kong RGC. The work of Yufei Tao was supported by grants CUHK 22/6 and CUHK 46/7 from Hong Kong RGC. A preliminary version of this work appeared in [20].

REFERENCES

[1] P. M. Atkinson and N. J. Tate, Eds., Advances in Remote Sensing and GIS Analysis. John Wiley & Sons, 1999.
[2] O. Wolfson, A. P. Sistla, S. Chamberlain, and Y. Yesha, "Updating and Querying Databases that Track Mobile Units," Distributed and Parallel Databases, vol. 7, no. 3, 1999.
[3] D. Pfoser and C. S. Jensen, "Capturing the Uncertainty of Moving-Object Representations," in Proc. of SSD, 1999.
[4] J. Ni, C. V. Ravishankar, and B. Bhanu, "Probabilistic Spatial Database Operations," in Proc. of SSTD, 2003.
[5] X. Yu and S. Mehrotra, "Capturing Uncertainty in Spatial Queries over Imprecise Data," in Proc. of DEXA, 2003.
[6] R. Cheng, D. V. Kalashnikov, and S. Prabhakar, "Querying Imprecise Data in Moving Object Environments," IEEE TKDE, vol. 16, no. 9, 2004.
[7] Y. Tao, X. Xiao, and R. Cheng, "Range Search on Multidimensional Uncertain Data," ACM TODS, vol. 32, no. 3, 2007.
[8] A. Guttman, "R-trees: A Dynamic Index Structure for Spatial Searching," in Proc. of ACM SIGMOD, 1984.
[9] G. R. Hjaltason and H. Samet, "Distance Browsing in Spatial Databases," ACM TODS, vol. 24, no. 2, 1999.
[10] R. Cheng, Y. Xia, S. Prabhakar, R. Shah, and J. S. Vitter, "Efficient Indexing Methods for Probabilistic Threshold Queries over Uncertain Data," in Proc. of VLDB, 2004.
[11] G. Trajcevski, O. Wolfson, F. Zhang, and S. Chamberlain, "The Geometry of Uncertainty in Moving Objects Databases," in Proc. of EDBT, 2002.
[12] R. Cheng, D. V. Kalashnikov, and S. Prabhakar, "Evaluating Probabilistic Queries over Imprecise Data," in Proc. of ACM SIGMOD, 2003.
[13] I. Lazaridis and S. Mehrotra, "Approximate Selection Queries over Imprecise Data," in Proc. of ICDE, 2004.
[14] N. Beckmann, H.-P. Kriegel, R. Schneider, and B. Seeger, "The R*-tree: An Efficient and Robust Access Method for Points and Rectangles," in Proc. of ACM SIGMOD, 1990.
[15] M. Sharifzadeh and C. Shahabi, "The Spatial Skyline Queries," in Proc. of VLDB, 2006.
[16] I. Stanoi, D. Agrawal, and A. Abbadi, "Reverse Nearest Neighbor Queries for Dynamic Databases," in Proc. of the SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery, 2000.
[17] D. Papadias, Y. Tao, G. Fu, and B. Seeger, "Progressive Skyline Computation in Database Systems," ACM TODS, vol. 30, no. 1, 2005.
[18] Y. Tao, D. Papadias, and X. Lian, "Reverse kNN Search in Arbitrary Dimensionality," in Proc. of VLDB, 2004.
[19] T. Brinkhoff, "A Framework for Generating Network-Based Moving Objects," GeoInformatica, vol. 6, no. 2, 2002.
[20] X. Dai, M. L. Yiu, N. Mamoulis, Y. Tao, and M. Vaitis, "Probabilistic Spatial Queries on Existentially Uncertain Data," in Proc. of SSTD, 2005.

Man Lung Yiu received the Bachelor's degree in computer engineering and the PhD degree in computer science from the University of Hong Kong in 2002 and 2006, respectively. He is currently an assistant professor at the Department of Computer Science, Aalborg University. His research focuses on the management of complex data, in particular query processing on spatio-temporal and multidimensional data.

Nikos Mamoulis received a diploma in computer engineering and informatics in 1995 from the University of Patras, Greece, and a PhD in computer science from the Hong Kong University of Science and Technology. He is currently an associate professor at the Department of Computer Science, University of Hong Kong. In the past, he worked as a research and development engineer at the Computer Technology Institute, Patras, Greece, and as a post-doctoral researcher at the Centrum voor Wiskunde en Informatica (CWI), the Netherlands. His research focuses on the management and mining of complex data types. He has served on the program committees of numerous international conferences and workshops on data management and data mining. He was the general chair of SSDBM 2008 and a co-organizer of SSTDM 2006. He is an editorial board member of the GeoInformatica journal and a field editor of the Encyclopedia of Geographic Information Systems.

Xiangyuan Dai received the Bachelor of Engineering degree from the Department of Computer Science and Technology of the University of Science and Technology of China in 2004, and the MPhil degree in computer science from the University of Hong Kong in 2006. His research interests include query processing problems on spatial data.

Yufei Tao is engaged in research on database systems. He is particularly interested in index structures and query algorithms on multidimensional data, and has published primarily on temporal databases, spatial databases, and privacy preservation. He received the Hong Kong Young Scientist Award in 2002. He has served on the program committees of the most prestigious database conferences, such as SIGMOD, VLDB, and ICDE, and is currently an associate editor of ACM Transactions on Database Systems (TODS). He joined the Chinese University of Hong Kong in September 2006; before that, he held positions at Carnegie Mellon University and the City University of Hong Kong. He is a member of the ACM.

Michail Vaitis holds an engineering diploma (1992) and a PhD degree in computer engineering and informatics from the University of Patras, Greece. Since 2003 he has been a faculty member of the Department of Geography at the University of the Aegean, Greece, where he is now an assistant professor. In the past, he worked for five years at the Research Academic Computer Technology Institute (RA-CTI), Greece, on hypertext and database systems. His research interests include geographical databases, spatial data infrastructures, geographic hypermedia, and the geo-spatial semantic web. He is a member of the ACM and the Technical Chamber of Greece.
