ISSUES IN SPATIAL DATABASES AND GEOGRAPHICAL INFORMATION SYSTEMS (GIS) HANAN SAMET

Similar documents
Extending the SAND Spatial Database System for the Visualization of Three- Dimensional Scientific Data

Visualizing and Animating Search Operations on Quadtrees on the Worldwide Web

SILC: Efficient Query Processing on Spatial Networks

Foundations of Nearest Neighbor Queries in Euclidean Space

hjs SORTING SPATIAL DATA BY SPATIAL OCCUPANCY

Speeding up Bulk-Loading of Quadtrees

Cartographic Labeling

QUERY PROCESSING AND OPTIMIZATION FOR PICTORIAL QUERY TREES HANAN SAMET

Finite-Resolution Simplicial Complexes

AHybridShortestPathAlgorithmfor Intra-Regional Queries on Hierarchical Networks

Proceedings of the 4th Symposium on Spatial Databases, Portland, Maine, Aug. 1995, eds. Egenhofer and Herring, Lecture Notes in Computer Science 951,

Finding Shortest Path on Land Surface

SPATIAL RANGE QUERY. Rooma Rathore Graduate Student University of Minnesota

Speeding up Queries in a Leaf Image Database

Visualizing the inner structure of n-body data using skeletonization

Distance Browsing in Spatial Databases

(Master Course) Mohammad Farshi Department of Computer Science, Yazd University. Yazd Univ. Computational Geometry.

Augmenting SAND with a Spherical Data Model (EXTENDED ABSTRACT)

SORTING IN SPACE HANAN SAMET

Distance Join Queries on Spatial Networks

Online Document Clustering Using the GPU

Distributed Database Management Systems M. Tamer Özsu and Patrick Valduriez

Planning Techniques for Robotics Planning Representations: Skeletonization- and Grid-based Graphs

Transactions on Information and Communications Technologies vol 18, 1998 WIT Press, ISSN

Line Simplification in the Presence of Non-Planar Topological Relationships

Foundations of Multidimensional and Metric Data Structures

Raster Data. James Frew ESM 263 Winter

Querying Shortest Distance on Large Graphs

Announcements. Data Sources a list of data files and their sources, an example of what I am looking for:

Applications of Geometric Spanner

3-Dimensional Object Modeling with Mesh Simplification Based Resolution Adjustment

A New Approach for Data Conversion from CAD to GIS

Chapter 1: Introduction to Spatial Databases

Spatial data structures in 2D

Proceedings of the 5th WSEAS International Conference on Telecommunications and Informatics, Istanbul, Turkey, May 27-29, 2006 (pp )

SAS Web Report Studio 3.1

A Multi Join Algorithm Utilizing Double Indices

Chapter 7 Spatial Operation

Programming Assignment 1: A Data Structure For VLSI Applications 1

Geometric and Thematic Integration of Spatial Data into Maps

Lecturer 2: Spatial Concepts and Data Models

CHAPTER 10 Creating and maintaining geographic databases Overview Learning objectives Keywords and concepts Overview Chapter summary

LECTURE 2 SPATIAL DATA MODELS

Spatial Data Management

An Initial Seed Selection Algorithm for K-means Clustering of Georeferenced Data to Improve

Geometric Data Structures Multi-dimensional queries Nearest neighbour problem References. Notes on Computational Geometry and Data Structures

Lecture 6: GIS Spatial Analysis. GE 118: INTRODUCTION TO GIS Engr. Meriam M. Santillan Caraga State University

Generating Tool Paths for Free-Form Pocket Machining Using z-buffer-based Voronoi Diagrams

Chapter 9. Attribute joins

Novel Materialized View Selection in a Multidimensional Database

Spatial Processing using Oracle Table Functions

QGIS LAB SERIES GST 102: Spatial Analysis Lab 3: Advanced Attributes and Spatial Queries for Data Exploration

Sorting in Space and Network Distance Browsing

Sorting in Space. Hanan Samet hjs

ICS RESEARCH TECHNICAL TALK DRAKE TETREAULT, ICS H197 FALL 2013

Principles of Data Management. Lecture #14 (Spatial Data Management)

Distance Browsing in Spatial Databases 1

A Real Time GIS Approximation Approach for Multiphase Spatial Query Processing Using Hierarchical-Partitioned-Indexing Technique

Distributed Database Management Systems

An Efficient Technique for Distance Computation in Road Networks

Progress in Image Analysis and Processing III, pp , World Scientic, Singapore, AUTOMATIC INTERPRETATION OF FLOOR PLANS USING

Intelligent Traffic System: Road Networks with Time-Weighted Graphs

Voronoi-Based K Nearest Neighbor Search for Spatial Network Databases

Maps as Numbers: Data Models

General Image Database Model

Generating Traffic Data

Efficient Spatial Query Processing in Geographic Database Systems

6SDFH3URJUDP/DQJXDJH63/64/ IRUWKH 5HODWLRQDO$SSURDFKRIWKH6SDWLDO'DWDEDVHV

Interoperating Geographic Information Systems, INTEROP'99, Berlin, 1999, pp. 115{

6. Concluding Remarks

K-Nearest Neighbor Finding Using the MaxNearestDist Estimator

Spatial Data Management

Proposal: k-d Tree Algorithm for k-point Matching

Interactive Scientific Visualization of Polygonal Knots

Space-based Partitioning Data Structures and their Algorithms. Jon Leonard

Implementation Techniques

Indexing Land Surface for Efficient knn Query

Object-Based Classification & ecognition. Zutao Ouyang 11/17/2015

Remote Access to Large Spatial Databases

Bichromatic Line Segment Intersection Counting in O(n log n) Time

Using the Holey Brick Tree for Spatial Data. in General Purpose DBMSs. Northeastern University

Handling Multiple K-Nearest Neighbor Query Verifications on Road Networks under Multiple Data Owners

M. Andrea Rodríguez-Tastets. I Semester 2008

Lecture 12 Level Sets & Parametric Transforms. sec & ch. 11 of Machine Vision by Wesley E. Snyder & Hairong Qi

Data Mining Classification: Alternative Techniques. Lecture Notes for Chapter 4. Instance-Based Learning. Introduction to Data Mining, 2 nd Edition

Nearest Neighbor Queries

Lecture 25 November 26, 2013

A Sweep-Line Algorithm for the Inclusion Hierarchy among Circles

CS535 Fall Department of Computer Science Purdue University

Efficient Construction of Safe Regions for Moving knn Queries Over Dynamic Datasets

Hot topics and Open problems in Computational Geometry. My (limited) perspective. Class lecture for CSE 546,

Spatial Index Keyword Search in Multi- Dimensional Database

Unity In diversity. ArcGIS JS API as an Integration Tool. RICARDO BANDEIRA - IplanRio

Configuration Management for Component-based Systems

Fast Distance Transform Computation using Dual Scan Line Propagation

2) For any triangle edge not on the boundary, there is exactly one neighboring

The Effects of Dimensionality Curse in High Dimensional knn Search

MODIFIED SEQUENTIAL ALGORITHM USING EUCLIDEAN DISTANCE FUNCTION FOR SEED FILLING

Raster Data. James Frew ESM 263 Winter

Domain Independent Prediction with Evolutionary Nearest Neighbors.

Transcription:

zk0 ISSUES IN SPATIAL DATABASES AND GEOGRAPHICAL INFORMATION SYSTEMS (GIS) HANAN SAMET COMPUTER SCIENCE DEPARTMENT AND CENTER FOR AUTOMATION RESEARCH AND INSTITUTE FOR ADVANCED COMPUTER STUDIES UNIVERSITY OF MARYLAND COLLEGE PARK, MARYLAND 20742-3411 USA Copyright 2003 Hanan Samet These notes may not be reproduced by any means (mechanical or electronic or any other) without the express written permission of Hanan Samet

zk1 BACKGROUND (A PERSONAL VIEW!) 1. GIS originally focussed on paper map as output anything is better than drawing by hand no great emphasis on execution time 2. Paper output supports high resolution display screen is of limited resolution can admit less precise algorithms Ex: buffer zone computation (spatial range query) a. usually use a Euclidean distance metric (L 2 ) takes a long time b. can be sped up using a quadtree and a Chessboard distance metric (L ) not as accurate as Euclidean but may not be able to perceive the difference on a display screen! as much as 3 orders of magnitude faster 3. Users accustomed to spreadsheets GIS should work like a spreadsheet fast response time ability to ask what if questions and see the results incorporate a database for seamless integration of spatial and nonspatial (i.e., attribute data)

zk2 GENERAL SPATIAL DATABASE ISSUES 1. Why do we want a database? to store data so that it can be retrieved efficiently should not lose sight of this purpose 2. How to integrate spatial data with nonspatial data 3. Long fields in relational database are not the answer a stopgap solution as just a repository for data does not aid in retrieving the data if data is large in volume, then breaks down as tuples get very large 4. A database is really a collection of records with fields corresponding to attributes of different types records are like points in higher dimensional space a. some adaptations take advantage of this analogy b. however, can act like a straight jacket in case of relational model 5. Retrieval is facilitated by building an index need to find a way to sort the data index should be compatible with data being stored choose an appropriate zero or reference point need an implicit rather than an explicit index a. impossible to foresee all possible queries in advance b. explicit would sort two-dimensional points on the basis of distance from a particular point P impractical as sort is inapplicable to points different from P

zk3 6. Identify the possible queries and find their analogs in conventional databases e.g., a map in a spatial database is like a relation in a conventional database (also known as spatial relation) a. difference is the presence of spatial attribute(s) b. also presence of spatial output 7. How do we interact with the database? SQL may not be easy to adapt graphical query language output may be visual in which case a browsing capability (e.g., an iterator) is useful 8. What strategy do we use in answering a query that mixes traditional data with nontraditional data? need query optimization rules must define selectivity factors a. dependent on whether index exists on nontraditional data b. if no, then select on traditional data first Ex: find all cities within 100 miles of the Mississippi River with population in excess of 1 million a. spatial selection first if region is small (implies high spatial selectivity) b. relational selection first if very few cities with a large population (implies high relational selectivity)

zk4 SPECIFIC SPATIAL DATABASE ISSUES 1. Representation bounding boxes versus disjoint decomposition 2. How are spatial integrity constraints captured and assured? edges of a polygon link to form a complete object line segments do not intersect except at vertices contour lines should not cross 3. Interaction with the relational model spatial operations don t fit into SQL a. buffer b. nearest to... c. others... difficult to capture hierarchy of complex objects (e.g., nested definition) 4. Spatial input is visual need a graphical query language

zk5 5. Spatial output is visual unlike conventional databases, once operation is complete, want to browse entire output together rather than one tuple at-a-time don t want to wait for operation to complete before output a. partial visual output is preferable e.g., incremental spatial join and nearest neighbor b. multiresolution output is attractive 6. Functionality determining what people really want to do! 7. Performance not enough to just measure the execution time of an operation time to load a spatial index and build a spatiallyindexed output is important sequence of spatial operations as in a spatial spreadsheet a. output of one operation serves as input to another e.g., cascaded spatial join b. spatial join yields locations of objects and not just the object pairs

CHALLENGES: zk6 1. Incorporation of geometry into database queries without user being aware of it! find geometric analogs of conventional database operations (e.g., ranking semi-join yields discrete Voronoi diagram) extension of browser concept to permit more general browsing units based on connectivity (e.g., shortest path), frequency, etc. 2. Spatial query optimization different query execution plans use spatial selectivity factors to choose among them 3. Graphical query specification instead of SQL 4. Incorporation of time-varying data how to represent rates? 5. Incorporation of imagery 6. Develop spatial indices that support both locationbased ( what is at X?) and feature-based queries ( where is Y?) 7. Incorporate rendering attributes into database objects or relations queries based on the rendering attributes Ex: find all red regions query by content (e.g., image databases) 8. GIS on the Web and distributed data and algorithms 9. Knowledge discovery 10. Interoperability

SELECTED REFERENCES (Also see http://www.cs.umd.edu/~hjs/pubs.html) zk7 1. F. Brabec and H. Samet. Visualizing and Animating R-trees and Spatial Operations in Spatial Databases on the Worldwide Web. in Visual Database Systems 4 (VDB4), Chapman & Hall, London, 1998, pp. 123-140. http://www.cs.umd.edu/~hjs/quadtree http://www.cs.umd.edu/~hjs/pubs/v4s.pdf 2. C. Esperança and H. Samet. Experience with SAND/Tcl: a scripting tool for spatial databases. Journal of Visual Languages and Computing, 13(2):229-255, April 2002. http://www.cs.umd.edu/~hjs/pubs/sandtcl.ref.pdf 3. G. R. Hjaltason and H. Samet. Ranking in Spatial Databases. Advances in Spatial Databases 4th Symposium, SSD 95, Lecture Notes in Computer Science 951, Springer-Verlag, Berlin, 1995, 83-95. http://www.cs.umd.edu/~hjs/pubs/incnear.pdf 4. G. R. Hjaltason and H. Samet. Incremental distance join algorithms for spatial databases. Proceedings of the ACM SIGMOD Conference, pages 237-248, Seattle, WA, June 1998. http://www.cs.umd.edu/~hjs/pubs/incjoin.pdf 5. G. R. Hjaltason and H. Samet. Distance browsing in spatial databases. ACM Transactions on Database Systems, 24(2):265-318, June1999. http://www.cs.umd.edu/~hjs/pubs/incnear2.pdf 6. G. R. Hjaltason and H. Samet. Speeding up construction of PMR quadtree-based spatial indexes. VLDB Journal, 11(2):109-137, November 2002. http://www.cs.umd.edu/~hjs/pubs/bulkload.pdf 7. H. Samet, H. Alborzi, F. Brabec, C. Esperança, G. R. Hjaltason, F. Morgan, and E. Tanin. Use of the SAND

zk8 spatial browser for digital government applications. Communications of the ACM, 46(1):63-66, January 2003.http://www.cs.umd.edu/~hjs/pubs/cacm03.pdf 8. H. Samet and F. Brabec. Remote thin-client access to spatial database systems. Proceedings of the 2nd National Conference on Digital Government Research, pages 75-82, 409, Los Angeles, CA, May 2002. http://www.cs.umd.edu/~hjs/pubs/dgo02.ps 9. H. Samet, F. Brabec, and G. R. Hjaltason. Interfacing the SAND spatial browser with FedStats data. Proceedings of the 1st National Conference on Digital Government Research, pages 41-47, Los Angeles, CA, May 2001. http://www.cs.umd.edu/~hjs/pubs/dgo01.pdf 10. E. Tanin, F. Brabec, and H. Samet. Remote access to large spatial databases. Proceedings of the 10th International Symposium on Advances in Geographic Information Systems, pages 5-10, McLean, VA, November 2002. http://www.cs.umd.edu/~hjs/pubs/acmgis02.ps 11. H. Samet. Applications of Spatial Data Structures: Computer Graphics, Image Processing, and GIS, Addison-Wesley, Reading, MA, 1990. ISBN 0-201- 50300-0. 12. H. Samet. The Design and Analysis of Spatial Data Structures, Addison-Wesley, Reading, MA, 1990. ISBN 0-201-50255-0. 13. H. Samet. Spatial Data Structures. in Modern Database Systems: The Object Model, Interoperability, and Beyond, W. Kim, Ed., Addison- Wesley/ACM Press, 1995, 361-385. http://www.cs.umd.edu/~hjs/pubs/kim.ps

14. H. Samet and W. G. Aref. Spatial Data Models and Query Processing. in Modern Database Systems: The Object Model, Interoperability, and Beyond, W. Kim, Ed., Addison-Wesley/ACM Press, 1995, 338-360. http://www.cs.umd.edu/~hjs/pubs/kim2.ps zk9