Categorical Range queries on Spatial Networks

Rahul Saladi, Akash Agarwal and Xiaofei Zhao
December 21, 2011

1 Introduction

Rapid growth of smartphones, GPS technologies and online mapping services (e.g., Google Maps, Microsoft MapPoint) has made it possible to answer some fundamental location-based queries. A lot of work has been done on k-NN (k-nearest-neighbour) queries and shortest-path queries. Surprisingly, very little work has been done on range queries or on their generalized version, called categorical range queries. Categorical range queries are the focus of this work.

Consider a visitor who comes to Minneapolis on vacation. Being new to the city, he would want to ask queries on his smartphone such as "Which movie theaters are within 10 km of my current location?" or "Which restaurants are within 2 km of Keller Hall?" Note that when the visitor asks for theaters within 10 km, he does not mean Euclidean distance. In a realistic scenario, he can only move along the road network of Minneapolis and not in arbitrary directions. Therefore, by 10 km he means that the theaters of interest should be at most 10 km away from his current location when moving along the network. Refer to Figure 1 for a graphical description.

2 Problem Statement

Categorical range search query on a spatial network. We have a spatial network $G = (V, E)$, where $V$ is the set of nodes/vertices (i.e., intersection points of roads) and $E$ is the set of edges, each connecting two vertices. With each edge $e \in E$ we associate a weight $w(e)$, which is the length of the edge. A set $F$ of facilities lies along the edges of the network. The facilities in $F$ are categorized into $l$ disjoint groups such that $F = F_1 \cup F_2 \cup \dots \cup F_l$ and $F_i \cap F_j = \emptyset$ for all $i \neq j$ with $i, j \in [1, l]$. The user gives a two-dimensional query point $q$, a query distance $d_q$ and a facility type/category $F_i$. The objective is to report all the facilities in $F_i$ which are within a network distance of $d_q$ from $q$.

The network is not restricted to be planar.

Figure 1: An example of a road network. The black dots denote the vertices of the network (intersections of roads). All the edges have unit length. The red and the blue circles denote the theaters on the network. Suppose our visitor is at vertex $q$ and wants to know all the theaters within a distance of $3\sqrt{2}$ units (the diameter of the network is $3\sqrt{2}$). If the distance metric were Euclidean distance, then all the theaters would be reported. However, in a realistic scenario we can only move along the edges of the network. If we consider network distance, then all the theaters colored blue can be reached. All the red colored theaters have a network distance greater than $3\sqrt{2}$ from $q$.

3 Limitation of existing techniques

The work in [24] presents the first known index structure and algorithms for spatial queries such as range queries, nearest neighbours, spatial joins and closest pairs using the network distance instead of the Euclidean distance. For the range query they propose two algorithms: (a) Range Euclidean Restriction (RER), and (b) Range Network Expansion (RNE).

In the RER technique, a circular range query is first performed on the facility set $S$ to retrieve the subset $S' \subseteq S$ of facilities which lie within a Euclidean distance of $d_q$ from the query point $q$. Since the Euclidean distance never exceeds the network distance, this filtering step causes no false misses. From the query point $q$, the network is then expanded to find all the edges which are within a network distance of $d_q$ from $q$. The edges are visited in increasing order of their network distance (similar to Dijkstra's algorithm). For each edge visited, all the facilities in $S'$ which overlap the edge are reported.

Clearly we can construct inputs on which the RER technique performs very poorly. In an extreme case, all the facilities in $S$ fall within a Euclidean distance of $d_q$ from $q$, but none of them fall within a network distance of $d_q$ from $q$. In such a case no facility gets reported, yet we might explore $O(|E|)$ edges of the network in the process. The overall query time would be $O(|E||S|)$ (ignoring the cost of maintaining a queue) and we end up reporting none of the facilities! Refer to Figure 2.

In the RNE technique, they first compute a set $QS$ of all the edges in the network which are within a network distance of $d_q$ from $q$. The facilities are indexed in an R-tree. An intersection join is performed between $QS$ and the facilities R-tree, which is shown to be optimal. Alternate techniques for performing the join are also mentioned. Though the join operation is efficient, the first step of finding the set $QS$ is the major bottleneck of this process.
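The network-expansion idea behind RNE can be illustrated with a minimal Python sketch. It is not the implementation from [24]: the R-tree intersection join is replaced by a plain scan over the facilities, and the adjacency-list layout, the `(u, v, w, t)` facility encoding (a facility at offset `t` from `u` on an edge of length `w`) and all function names are assumptions made for illustration only.

```python
import heapq

INF = float("inf")

def bounded_dijkstra(adj, q, d_q):
    """Network distance from q to every vertex that is at most d_q away."""
    dist = {q: 0.0}
    heap = [(0.0, q)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale queue entry
        for v, w in adj[u]:
            nd = d + w
            # Never expand beyond the query distance.
            if nd <= d_q and nd < dist.get(v, INF):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def range_by_expansion(adj, facilities, q, d_q):
    """Report facilities within network distance d_q of vertex q.

    facilities maps a name to (u, v, w, t): the facility lies on edge
    (u, v) of length w, at offset t from u (hypothetical encoding).
    """
    dist = bounded_dijkstra(adj, q, d_q)
    result = {}
    for name, (u, v, w, t) in facilities.items():
        # A facility is reached through either endpoint of its edge.
        d = min(dist.get(u, INF) + t, dist.get(v, INF) + (w - t))
        if d <= d_q:
            result[name] = d
    return result

# Tiny example network with unit-length edges.
adj = {
    "q": [("a", 1.0), ("c", 1.0)],
    "a": [("q", 1.0), ("b", 1.0)],
    "b": [("a", 1.0)],
    "c": [("q", 1.0)],
}
facilities = {"t1": ("a", "b", 1.0, 0.5), "t2": ("q", "c", 1.0, 0.25)}
near = range_by_expansion(adj, facilities, "q", 1.0)
far = range_by_expansion(adj, facilities, "q", 2.0)
```

The expansion touches every vertex within distance `d_q` regardless of whether any facility lies nearby, which is the cost the text identifies as the bottleneck.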
In an extreme scenario, $O(|E|)$ edges in the network might be within a network distance of $d_q$ from $q$ while none of the facilities lie on them! Refer to Figure 2. There are a couple of other works [29] which claim to handle range queries based on network distance, but they do not provide any concrete theoretical bounds on the space and the query time requirements.

Figure 2: An example to show the limitations of the RER and RNE techniques proposed in [24] for answering range queries on spatial networks. Assume that all the edges have unit length. The facilities are shown in blue and the vertices in black. The query point is $q$ and the query distance is 2 units. In RER, the circular range query around $q$ with a radius of two units would report all the facilities as candidate points. From $q$ all the edges within a network distance of 2 units are explored, but unfortunately none of these edges have facilities on them. Similarly, in RNE the first step is to find all the edges in the network which are within a network distance of 2 units from $q$. As stated before, none of the edges explored have a facility lying on them!

4 Our Technique

In this section we present an initial solution for this problem. We start off with a few definitions.

Definition 1 (Neighbour of a vertex). Suppose we are interested in facility type $F_i \subseteq F$. A vertex $v \in V$ and a facility $f \in F_i$ are neighbours iff there is a path $P_{vf}$ from $v$ to $f$ which does not have any other facility on it. Then $f$ is a neighbour of $v$.

Definition 2 (Neighbourhood of a vertex w.r.t. $F_i$). For a given facility type $F_i$, the neighbourhood of a vertex $v$ (denoted by $n_i(v)$) is the set of facilities in $F_i$ which are its neighbours.

Definition 3 (Neighbourhood of a facility). For a given facility type $F_i$, the neighbourhood of a facility $f \in F_i$ (denoted by $n_i(f)$) is the set of facilities in $F_i \setminus \{f\}$ which are its neighbours.

Figure 3: This network is an example on which the proposed algorithm works well. The blue circles represent the facilities. The number of neighbours of each vertex and of each facility is exactly two.

4.1 Preprocessing steps

For each facility type $F_i \subseteq F$ the following steps are carried out:

1. For each vertex $v \in V$, identify its neighbours, i.e., $n_i(v)$. Maintain this list in increasing order of network distance from $v$.
2. Build a Facility-Neighbourhood graph for $F_i$ (say $FNG_i$) as follows. The vertices of $FNG_i$ are the facilities in the set $F_i$. An edge is made between vertices $u$ and $v$ if they are neighbours. The weight of the edge is the weight of the path $P_{uv} \subseteq G$ which does not have any other facility on it and has the minimum possible weight (the weight of a path is the sum of the weights of the edges on it). The edges incident on a vertex are stored in increasing order of their weights.

Space requirements. The input graph $G$ occupies $O(|V| + |E| + |F|)$ space. Let $N_i = \sum_{v \in V} |n_i(v)|$ and $N = \sum_{i=1}^{l} N_i$. The space occupied by $FNG_i$ is $\mathcal{F}_i = \sum_{f \in F_i} |n_i(f)|$. Let $\mathcal{F} = \sum_{i=1}^{l} \mathcal{F}_i$. Therefore, the total space occupied by the data structure is $O(|V| + |E| + |F| + N + \mathcal{F})$. If we assume the graph to be a road network, then the space complexity would be $O(N + \mathcal{F})$.
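The two preprocessing steps can be sketched in Python under some simplifying assumptions that are not part of the original text: every facility of the type of interest is modelled as a vertex of the graph (for example by subdividing its edge at the facility position), "no other facility on the path" is taken to mean no other facility of that same type, and the names `facility_free_neighbours` and `preprocess` are hypothetical. The sketch runs a Dijkstra-style search from each start point that records a facility when it is reached but never expands through it.

```python
import heapq

INF = float("inf")

def facility_free_neighbours(adj, facilities, source):
    """Neighbours of source as (distance, facility), nearest first.

    The distance is the minimum weight of a path from source to the
    facility that has no other facility on it.
    """
    dist = {source: 0.0}
    heap = [(0.0, source)]
    found = []
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale queue entry
        if u in facilities and u != source:
            found.append((d, u))
            continue  # a facility blocks every path running through it
        for v, w in adj[u]:
            nd = d + w
            if nd < dist.get(v, INF):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return found  # vertices are popped in nondecreasing distance order

def preprocess(adj, facilities):
    """Step 1: neighbour list per vertex. Step 2: adjacency lists of FNG_i."""
    vertex_lists = {v: facility_free_neighbours(adj, facilities, v)
                    for v in adj if v not in facilities}
    fng = {f: facility_free_neighbours(adj, facilities, f)
           for f in facilities}
    return vertex_lists, fng

# Path network a - f1 - b - f2 - c - f3 with unit-length edges.
adj = {
    "a": [("f1", 1.0)],
    "f1": [("a", 1.0), ("b", 1.0)],
    "b": [("f1", 1.0), ("f2", 1.0)],
    "f2": [("b", 1.0), ("c", 1.0)],
    "c": [("f2", 1.0), ("f3", 1.0)],
    "f3": [("c", 1.0)],
}
vertex_lists, fng = preprocess(adj, {"f1", "f2", "f3"})
```

In this example every vertex and facility has at most two neighbours, so the stored lists stay short, matching the favourable situation of Figure 3.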

Figure 4: The blue circles represent facilities. The black circles represent vertices. Here each vertex has every facility as its neighbour, and each facility has every other facility as its neighbour. This makes our algorithm perform poorly.

4.2 Query Algorithm

Suppose the query is to find all the facilities of type $F_i$ which are within a network distance of $d_q$ from the given query point $q$. For simplicity assume that $q \in V$. This assumption can easily be removed later. The following steps are followed:

1. The list $n_i(q)$ is scanned to identify the facilities lying within network distance $d_q$. These facilities are pushed into a queue $Q$.
2. A Dijkstra-style algorithm is run on $FNG_i$. However, unlike the traditional Dijkstra's algorithm, where the queue initially holds only the source, here we start with the entries in $Q$. We keep exploring the graph until we cross the network distance $d_q$.

Query time. The second step dominates the running time. The time taken to run the Dijkstra-like algorithm would be $O\left(\sum_{j=1}^{k} |n_i(f_j)|\right)$, where $f_1, f_2, \dots, f_k$ are the facilities reported.

Proof of Correctness. Consider a node $v$ which is the query point $q$. Clearly, the neighbouring facilities of $v$ are reported only if they lie within the query distance. The property of FNG is that each edge represents the shortest facility-free path between its two facilities. Any shortest path from $v$ to a facility can be cut at the facilities it passes through, giving a facility-free path from $v$ to one of its neighbours followed by facility-free paths between consecutive facilities. Therefore, running Dijkstra's algorithm on FNG, seeded with the neighbours of $v$, yields the shortest distance from $v$ to each facility. Hence all the required facilities are reported.
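The two query steps can be sketched in Python as follows. The sketch assumes that the preprocessing output is already available in a hypothetical layout: `q_neighbours` is the list $n_i(q)$ as `(distance, facility)` pairs sorted by distance, and `fng` maps each facility to its $FNG_i$ edges as `(weight, facility)` pairs sorted by weight. The function name and the inclusive comparison with `d_q` are likewise assumptions.

```python
import heapq

INF = float("inf")

def categorical_range_query(q_neighbours, fng, d_q):
    """Return {facility: network distance} for facilities within d_q."""
    best = {}
    heap = []
    # Step 1: seed the queue with the neighbours of q that are in range.
    for d, f in q_neighbours:
        if d > d_q:
            break  # the list is sorted, so the rest are farther away
        if d < best.get(f, INF):
            best[f] = d
            heap.append((d, f))
    heapq.heapify(heap)
    # Step 2: Dijkstra-style expansion on the facility-neighbourhood graph.
    reported = {}
    while heap:
        d, f = heapq.heappop(heap)
        if f in reported:
            continue  # stale queue entry
        reported[f] = d
        for w, g in fng.get(f, []):
            nd = d + w
            if nd > d_q:
                break  # edges are sorted by weight, so stop early
            if g not in reported and nd < best.get(g, INF):
                best[g] = nd
                heapq.heappush(heap, (nd, g))
    return reported

# Facilities f1, f2, f3 lie on a path; q is one unit away from f1.
q_neighbours = [(1.0, "f1")]
fng = {
    "f1": [(2.0, "f2")],
    "f2": [(2.0, "f1"), (2.0, "f3")],
    "f3": [(2.0, "f2")],
}
within_3 = categorical_range_query(q_neighbours, fng, 3.0)
within_5 = categorical_range_query(q_neighbours, fng, 5.0)
```

Because both the neighbour list and the edge lists are sorted, the loops can stop at the first entry that exceeds `d_q`, so the work done is tied to the neighbourhoods of the reported facilities. The binary heap adds a logarithmic factor that the stated query time does not account for.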

5 Other Related Work

Euclidean space. For static objects there has been extensive work on range searching queries in the database and computational geometry communities. The orthogonal range searching query is perhaps one of the most widely studied problems in both communities due to its wide range of applications. Here the objects are points lying in a $d$-dimensional space and the query is an axis-parallel orthogonal box. Traditionally, researchers in computational geometry have been interested in solutions with worst-case efficient query time, i.e., ensuring that the query time is low for all possible values of the query. The most popular among them are the range tree [10] and the k-d tree [6]. The original range tree, when built on $n$ points in $d$-dimensional space, takes $O(n \log^{d-1} n)$ space and answers queries in $O(\log^{d} n + k)$ time, where $k$ is the number of points lying inside the query region $q$. By using the technique of fractional cascading we can reduce the query time by a log factor [8]. In a dynamic setting, the range tree uses $O(n \log^{d-1} n)$ space, answers queries in $O(\log^{d-1} n \log\log n + k)$ time and handles updates in $O(\log^{d-1} n \log\log n)$ time [8]. A k-d tree built on $n$ points in $d$-dimensional space takes up linear space and answers a range query in $O(n^{1-1/d} + k)$ time. Updates can be handled efficiently in it.

In the field of databases, a significant number of index structures have been proposed to answer an orthogonal range query. Researchers in the database community try to come up with practical solutions which are optimized to work well for average-case queries (with the assumption that the worst-case query will occur rarely). Perhaps no other structure has been more popular than the R-tree proposed in 1984 by Guttman [17]. According to Google Scholar, as of 16th June 2011 this paper has been cited 5260 times!
The R-tree is extremely popular for indexing multi-dimensional information. Its main advantage arises from the fact that it can handle multiple kinds of queries efficiently (range queries being one of them). Contrast this with a range tree, which is tailor-made for handling only range queries. Variants of the R-tree such as the R*-tree [5], the R+-tree [30], the Hilbert tree and the Priority R-tree are also quite popular index structures for storing multi-dimensional data. Circular range searching [7] and halfspace range searching [7] are some other important range queries, where the point set is queried with a disk and a halfspace, respectively.

Range queries on moving objects. In the field of moving object databases, there has been work on indexing moving objects to answer range queries or window queries. However, most of these assume the objects to have an unconstrained two-dimensional movement. A couple of index structures such as the FNR-tree [11], the MON-tree [9] and [25] assume the objects to be constrained to move along the spatial network and model the index structures accordingly. These structures store the past history of movement of all the objects, and range queries are asked on this past data. However, the range queries which they handle are different from what we are attempting in this paper. The range query which they want to answer is the following: given a spatio-temporal window, return all the objects whose movements overlap the query window. So in some sense they still consider the Euclidean distance metric.

Highway dimension. Recently a paper [1] introduced the notion of highway dimension. The authors observed that there are many heuristics for finding shortest paths among vertices which are based on a wide variety of ideas, such as arc flags [21, 18, 4], A* search with landmarks [15], highway hierarchies [27, 28], reach [14, 13, 16], transit nodes [2, 3], and contraction hierarchies [12]. In experiments on real-world data these heuristics perform a lot better than plain Dijkstra. Also, the preprocessing involved in them is practical and the space occupied is only slightly more than that of the road network alone. However, these were all purely experimental results with no good theoretical bounds. It is possible to construct inputs on which these heuristics perform very poorly. In [1] the authors bring all these techniques within a theoretical framework and identify factors which lead to good performance of these heuristic techniques. In this process they define the notion of highway dimension and show that having a small highway dimension gives a theoretical guarantee of good query performance for these heuristic algorithms. Interestingly, the line of approach of this work seems promising for coming up with good theoretical guarantees for algorithms which perform range queries on spatial networks.

Other problems. Apart from the shortest path query, the nearest neighbour or k-nearest neighbour query in terms of network distance is an active field of research [31, 19, 26, 23]. It looks possible to borrow ideas from shortest path and k-nearest neighbour algorithms for answering the range query problem.
In [32] the authors provide various techniques to estimate the number of nodes and edges which lie within distance $d_q$ of the query point $q$, which might help us in predicting the space and query time requirements for range queries. In some scenarios users of location-based services would not want to reveal their location information. In [20] the authors answer nearest neighbour and range search queries on a spatial network under network distance without revealing the location information of the user. In this work we are looking at static range queries, but there has also been work on dynamic range queries in spatial networks [22].

6 Future Work

In future we plan to build cost models for analysing the performance of our algorithm. We also plan to perform experiments and compare the results with existing solutions.

References

[1] Ittai Abraham, Amos Fiat, Andrew V. Goldberg, and Renato Fonseca F. Werneck. Highway dimension, shortest paths, and provably efficient algorithms. In SODA, pages 782-793, 2010.
[2] H. Bast, S. Funke, and D. Matijevic. Ultrafast shortest-path queries via transit nodes. In C. Demetrescu, A. V. Goldberg, and D. S. Johnson, editors, The Shortest Path Problem: Ninth DIMACS Implementation Challenge, AMS, pages 175-192, 2009.
[3] H. Bast, S. Funke, D. Matijevic, P. Sanders, and D. Schultes. In transit to constant time shortest-path queries in road networks. In Proc. 9th International Workshop on Algorithm Engineering and Experiments, SIAM, pages 46-59, 2006. Available at http://www.mpi-inf.mpg.de/~bast/tmp/transit.pdf.
[4] Reinhard Bauer and Daniel Delling. SHARC: Fast and robust unidirectional routing. In ALENEX, pages 13-26, 2008.
[5] Norbert Beckmann, Hans-Peter Kriegel, Ralf Schneider, and Bernhard Seeger. The R*-tree: An efficient and robust access method for points and rectangles. In SIGMOD Conference, pages 322-331, 1990.
[6] Jon Louis Bentley. Multidimensional binary search trees used for associative searching. Commun. ACM, 18(9):509-517, 1975.
[7] Bernard Chazelle. Filtering search: A new approach to query-answering. SIAM J. Comput., 15(3):703-724, 1986.
[8] Y. Chiang and R. Tamassia. Dynamic algorithms in computational geometry. Proceedings of the IEEE, Special Issue on Computational Geometry, G. Toussaint (Ed.), 80(9):1412-1434, 1992.
[9] Victor Teixeira de Almeida and Ralf Hartmut Güting. Indexing the trajectories of moving objects in networks. GeoInformatica, 9(1):33-60, 2005.
[10] M. de Berg, M. van Kreveld, M. Overmars, and O. Schwarzkopf. Computational Geometry. Springer Verlag, 2nd edition, 2000.
[11] Elias Frentzos. Indexing objects moving on fixed networks. In SSTD, pages 289-305, 2003.
[12] Robert Geisberger, Peter Sanders, Dominik Schultes, and Daniel Delling. Contraction hierarchies: Faster and simpler hierarchical routing in road networks. In WEA, pages 319-333, 2008.
[13] A. V. Goldberg, H. Kaplan, and R. F. Werneck. Reach for A*: Efficient point-to-point shortest path algorithms. In Proc. 8th International Workshop on Algorithm Engineering and Experiments, SIAM, pages 38-51, 2006.
[14] A. V. Goldberg, H. Kaplan, and R. F. Werneck. Reach for A*: Shortest path algorithms with preprocessing. In C. Demetrescu, A. V. Goldberg, and D. S. Johnson, editors, The Shortest Path Problem: Ninth DIMACS Implementation Challenge, AMS, pages 93-140, 2009.
[15] Andrew V. Goldberg and Chris Harrelson. Computing the shortest path: A* search meets graph theory. In SODA, pages 156-165, 2005.
[16] Ronald J. Gutman. Reach-based routing: A new approach to shortest path algorithms optimized for road networks. In ALENEX/ANALC, pages 100-111, 2004.
[17] Antonin Guttman. R-trees: A dynamic index structure for spatial searching. In SIGMOD Conference, pages 47-57, 1984.
[18] M. Hilger, E. Köhler, R. H. Möhring, and H. Schilling. Fast point-to-point shortest path computations with arc-flags. In C. Demetrescu, A. V. Goldberg, and D. S. Johnson, editors, The Shortest Path Problem: Ninth DIMACS Implementation Challenge, pages 73-92, 2009.
[19] Mohammad R. Kolahdouzan and Cyrus Shahabi. Voronoi-based k nearest neighbor search for spatial network databases. In VLDB, pages 840-851, 2004.
[20] Wei-Shinn Ku, Roger Zimmermann, Wen-Chih Peng, and Sushama Shroff. Privacy protected query processing on spatial networks. In ICDE Workshops, pages 215-220, 2007.
[21] U. Lauther. An extremely fast, exact algorithm for finding shortest paths in static networks with geographical background. In IfGIprints 22, Institut fuer Geoinformatik, Universitaet Muenster (ISBN 3-936616-22-1), pages 219-230, 2004.
[22] Fuyu Liu, Tai T. Do, and Kien A. Hua. Dynamic range query in spatial network environments. In DEXA, pages 254-265, 2006.
[23] Kyriakos Mouratidis, Man Lung Yiu, Dimitris Papadias, and Nikos Mamoulis. Continuous nearest neighbor monitoring in road networks. In VLDB, pages 43-54, 2006.
[24] Dimitris Papadias, Jun Zhang, Nikos Mamoulis, and Yufei Tao. Query processing in spatial network databases. In VLDB, pages 802-813, 2003.
[25] Dieter Pfoser and Christian S. Jensen. Indexing of network constrained moving objects. In GIS, pages 25-32, 2003.
[26] Hanan Samet, Jagan Sankaranarayanan, and Houman Alborzi. Scalable network distance browsing in spatial databases. In SIGMOD Conference, pages 43-54, 2008.
[27] Peter Sanders and Dominik Schultes. Highway hierarchies hasten exact shortest path queries. In ESA, pages 568-579, 2005.
[28] Peter Sanders and Dominik Schultes. Engineering highway hierarchies. In ESA, pages 804-816, 2006.
[29] Jagan Sankaranarayanan, Houman Alborzi, and Hanan Samet. Efficient query processing on spatial networks. In GIS, pages 200-209, 2005.
[30] Timos K. Sellis, Nick Roussopoulos, and Christos Faloutsos. The R+-tree: A dynamic index for multi-dimensional objects. In Peter M. Stocker, William Kent, and Peter Hammersley, editors, VLDB 87, Proceedings of 13th International Conference on Very Large Data Bases, September 1-4, 1987, Brighton, England, pages 507-518. Morgan Kaufmann, 1987.
[31] Cyrus Shahabi, Mohammad R. Kolahdouzan, and Mehdi Sharifzadeh. A road network embedding technique for k-nearest neighbor search in moving object databases. In ACM-GIS, pages 94-100, 2002.
[32] Eleftherios Tiakas, Apostolos N. Papadopoulos, Alexandros Nanopoulos, and Yannis Manolopoulos. Node and edge selectivity estimation for range queries in spatial networks. Inf. Syst., 34(3), 2009.