Geography 43 / 3
Introduction to Geographic Information Science

Raster Data and Tessellations

Schedule
  Some Updates

Last Lecture
  We finished DBMS and learned about the storage of data in complex databases
  Relational database models
  Importance of redundancy, consistency, robustness
  The idea of normalization (1st - 3rd normal forms) and its importance for DBMS

Today's Outline
  Raster data as spatial information representation in grid cells
  Properties, attribute tables and structure of raster datasets
  Encoding of raster values and associated problems (raster/vector conversions)
  Compression and resampling

Learning Objectives
  Understand raster data and their representation in grid cells
  Discuss properties and structure of raster datasets
  Learn about encoding of raster data and raster/vector conversions
  Understand important compression algorithms and resampling

Not New: Raster Data Models
  The most common raster is composed of squares, called grid cells
  Analogous to pixels in remote sensing images & computer graphics
  tessella (Latin): small cubical piece of clay, stone or glass used to make mosaics (Webster); "small square"
  (tessera - Greek for four)

Raster Resolution
  The distance that one side of a grid cell represents on the ground
  Better to use the adjectives fine & coarse rather than high & low
  Finer resolution data have smaller/larger grid cells and lower/higher precision, but greater/lower cost in data storage than coarser data.

Discrete or Continuous Features
  [Figure labels: discrete, continuous, Raster]
  Qualifies them automatically to represent different conceptualizations of spatial phenomena

The Mixed Pixel Problem
  Task: create a simple land cover map with 2 classes: land and water
  A = ?   B = ?   C = ?   D = ?
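
The mixed pixel task can be made concrete with a small sketch. It assumes we already know what fraction of each cell's area is water; the function name, the rule names and the example fractions for cells A to D are hypothetical and are not taken from the slide figure. It only illustrates that the assigned class depends on the coding rule chosen.

```python
def classify_cell(water_fraction, rule="majority"):
    """Assign 'land' or 'water' to a single grid cell.

    water_fraction: share of the cell's area covered by water (0.0 to 1.0).
    rule: 'majority' assigns the class covering more than half the cell;
          'presence' assigns water if any water occurs in the cell.
    Both rules are illustrative assumptions, not the lecture's prescribed method.
    """
    if not 0.0 <= water_fraction <= 1.0:
        raise ValueError("water_fraction must lie between 0 and 1")
    if rule == "majority":
        # An exact 50/50 tie goes to land here; this is an arbitrary choice.
        return "water" if water_fraction > 0.5 else "land"
    if rule == "presence":
        return "water" if water_fraction > 0.0 else "land"
    raise ValueError("unknown rule: " + rule)


# Hypothetical water fractions for four cells labelled A to D.
cells = {"A": 0.0, "B": 0.1, "C": 0.6, "D": 1.0}
for name, fraction in cells.items():
    print(name, classify_cell(fraction, "majority"), classify_cell(fraction, "presence"))
```

In this made-up example, cell B is land under the majority rule but water under the presence rule, which is the kind of ambiguity the mixed pixel problem refers to.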

Understanding Gridded Data: Coding/Conversion
  Encoding real-world entities in a raster data model can also be thought of as conversion if the data are already in a vector data format
  Vector (points/lines/polygons) to raster
  Raster to vector

Vector points to raster
  Presence/absence coding method:
    1 if the object is present anywhere in the grid cell
    0 if it is absent
  What's an obvious implication of the "anywhere" rule?

Vector lines to raster
  Rules of connectivity of raster cells (4/) can be enforced
  Proportion of line within cell / proximity to cell center
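
The presence/absence coding of points described above can be sketched as follows. This is a minimal illustration, assuming a north-up grid defined by its upper-left corner and a square cell size; the function name and parameters are hypothetical and do not come from any particular GIS package.

```python
def rasterize_points(points, x_min, y_max, cell_size, n_rows, n_cols):
    """Presence/absence coding: a cell gets 1 if any point falls in it, else 0.

    points: iterable of (x, y) map coordinates.
    x_min, y_max: map coordinates of the grid's upper-left corner (assumed).
    Row 0 is the top (northernmost) row.
    """
    grid = [[0] * n_cols for _ in range(n_rows)]
    for x, y in points:
        col = int((x - x_min) // cell_size)
        row = int((y_max - y) // cell_size)
        # Points outside the grid extent are ignored.
        if 0 <= row < n_rows and 0 <= col < n_cols:
            grid[row][col] = 1
    return grid


# Two of the three points fall in the same cell and collapse into a single 1.
grid = rasterize_points([(0.5, 9.5), (0.7, 9.2), (9.9, 0.1)],
                        x_min=0, y_max=10, cell_size=5, n_rows=2, n_cols=2)
print(grid)
```

Under these assumptions, the position of a point inside its cell and the number of points per cell are both lost, which is one way to read the question about the "anywhere" rule.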

Vector lines to raster
  3 themes, 3 layers
  What happens if we try to represent different categories in one layer?

Vector point values to raster
  Representing measured points of a continuous phenomenon: interpolation from point data to surface
  [Figure: grid of sample point values]

Vector polygons to raster
  Boundaries of polygons
  Dependence on spatial resolution
  Attribute tables
  How efficient is the representation of large homogeneous polygons with very high resolution?

Attribute tables in raster data

Conversion: Raster to vector
  What are the implications here for the precision of the location of our vector line?
  How can we decide about the flow of the vector line segments through raster cells?

Registration Problems
  Geo-registration and co-registration of rasters are problematic when pixels don't line up...
  To solve this, the data structure needs to be changed

Raster Resampling
  Involves reassigning cell values when changing raster coordinates or geometry (cell size)
  Rasters often model continuous features using sampled points
  Resampling methods:
    Nearest neighbor
    Majority rule
    Bilinear interpolation
    Cubic convolution
  Bilinear interpolation: distance-weighted averaging
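
To illustrate the difference between nearest neighbor and bilinear resampling, here is a minimal sketch. It assumes the raster is a list of rows and that the output location is given as a fractional (row, col) position measured between cell centers; the function names are hypothetical and this is not how any specific GIS package implements resampling.

```python
import math


def nearest_neighbor(grid, row, col):
    """Return the value of the cell whose center is closest to (row, col)."""
    return grid[int(row + 0.5)][int(col + 0.5)]


def bilinear(grid, row, col):
    """Distance-weighted average of the four cell centers surrounding (row, col)."""
    n_rows, n_cols = len(grid), len(grid[0])
    if not (0 <= row <= n_rows - 1 and 0 <= col <= n_cols - 1):
        raise ValueError("position lies outside the cell centers of the grid")
    r0, c0 = int(math.floor(row)), int(math.floor(col))
    r1, c1 = min(r0 + 1, n_rows - 1), min(c0 + 1, n_cols - 1)
    fr, fc = row - r0, col - c0  # fractional offsets from the upper-left center
    # Each neighbor's weight shrinks linearly with its distance from the target.
    return ((1 - fr) * (1 - fc) * grid[r0][c0]
            + (1 - fr) * fc * grid[r0][c1]
            + fr * (1 - fc) * grid[r1][c0]
            + fr * fc * grid[r1][c1])


grid = [[10, 20],
        [30, 40]]
print(nearest_neighbor(grid, 0.4, 0.6))  # picks an existing cell value
print(bilinear(grid, 0.5, 0.5))          # averages the four neighbors
```

Nearest neighbor only ever returns values that already exist in the input, which is why it is commonly preferred for categorical rasters, while bilinear interpolation produces new intermediate values and suits continuous data.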

The storage space/resolution tradeoff
  Decreasing the grid cell size by one-half results in a 4-fold increase in the storage space required (twice as many rows and twice as many columns)

Data Compression
  Goal: reduce storage size on disk
  Common lossless methods:
    run-length encoding
    value point encoding
    quadtrees
  What's an example of a lossy compression method?

Compression: Run-Length Encoding
  What's the problem/drawback here?
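
Run-length encoding can be sketched for a single raster row as below. This is a minimal illustration, assuming each run is stored as a (value, run length) pair; other layouts exist, and the function names are hypothetical.

```python
def rle_encode_row(row):
    """Encode one raster row as a list of (value, run_length) pairs."""
    runs = []
    for value in row:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1          # extend the current run
        else:
            runs.append([value, 1])   # start a new run
    return [(value, length) for value, length in runs]


def rle_decode_row(runs):
    """Rebuild the original row; the method is lossless."""
    row = []
    for value, length in runs:
        row.extend([value] * length)
    return row


homogeneous = [1, 1, 1, 0, 0, 1]
heterogeneous = [1, 2, 3, 4]
print(rle_encode_row(homogeneous))    # 3 runs for 6 cells
print(rle_encode_row(heterogeneous))  # 4 runs for 4 cells
```

In the second row every cell starts a new run, so the encoded form holds eight numbers for four cells. With this pair layout, highly variable data can therefore grow rather than shrink.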

Compression: Value Point Encoding
  Point values are positions (row, col) in the grid

Compression: Quadtrees
  A partitioning of heterogeneous space into quarter sections that are homogeneous (search for the largest homogeneous regions)

Quadtrees
  Node: a quadrant that is heterogeneous
  Leaf: a quadrant that is homogeneous
  Quadrants are assigned an ID number according to their position and level
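
The node/leaf idea behind quadtrees can be sketched recursively. This is a minimal illustration, assuming a square grid whose side length is a power of two; the tuple representation and the NW, NE, SW, SE child order are assumptions made here and are not the ID numbering scheme from the lecture.

```python
def build_quadtree(grid, row=0, col=0, size=None):
    """Return ('leaf', value) for a homogeneous quadrant,
    or ('node', [NW, NE, SW, SE]) for a heterogeneous one."""
    if size is None:
        size = len(grid)
    first = grid[row][col]
    homogeneous = all(grid[r][c] == first
                      for r in range(row, row + size)
                      for c in range(col, col + size))
    if homogeneous:
        return ("leaf", first)
    half = size // 2
    children = [build_quadtree(grid, row, col, half),                # NW
                build_quadtree(grid, row, col + half, half),         # NE
                build_quadtree(grid, row + half, col, half),         # SW
                build_quadtree(grid, row + half, col + half, half)]  # SE
    return ("node", children)


def count_leaves(tree):
    """Count the leaves, i.e. the number of values that must be stored."""
    kind, payload = tree
    if kind == "leaf":
        return 1
    return sum(count_leaves(child) for child in payload)


grid = [[1, 1, 0, 0],
        [1, 1, 0, 0],
        [1, 1, 1, 0],
        [1, 1, 0, 0]]
tree = build_quadtree(grid)
print(count_leaves(tree), "leaves for", len(grid) * len(grid[0]), "cells")
```

Three of the four top-level quadrants in this example are homogeneous and become single leaves; only the south-east quadrant has to be split further.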

Quadtrees
  Advantages
    Efficient: rapid data search
    Variable resolution, can generalize data easily for zooming in and out
  Disadvantages
    Complex: building the structure takes time
    Difficult to modify/update
    Not efficient if data are heterogeneous

Raster vs. Vector
  Positional precision
    Vector: Limited only by quality of measurements
    Raster: Defined by cell size
  Spatial analysis
    Vector: Good for topological operations/queries, shape analyses, network analyses. Slow overlays.
    Raster: Spatial query more difficult, good for local neighborhoods, continuous variable modeling. Rapid overlays.
  Data structure
    Vector: Usually complex
    Raster: Usually simple, easy to modify or program
  Storage requirements
    Vector: Relatively small
    Raster: Often quite large if not compressed
  Coordinate conversion
    Vector: Usually well-supported & simple
    Raster: Slow due to data volumes & resampling
  Display & output
    Vector: Very good, map-like, but poor for images
    Raster: Good for images, but stair-step edges

What data models can be used to represent elevation in a GIS?
  Multiple representations of elevation