DATA MODELS IN GIS. Prachi Misra Sahoo I.A.S.R.I., New Delhi

1. Introduction

GIS depicts the real world through models involving geometry, attributes, relations, and data quality. This section describes how those models are realized, with emphasis on geometric spatial information, attributes, and relations.

A prerequisite for describing the real world with a GIS is that the different types of geographical information can be stored in a computer. All operations in a computer are based on the storage and handling of numbers, which is why the data stored in computers is known as digital data. A GIS needs to store graphical figures, images, numerical values, and plain text, and all of these forms of data must be converted into a digital representation.

In principle, only two different numerical symbols or signals can be stored in a computer: 0 and 1. The numerical system based on 0 and 1 is known as the binary system, and each binary digit is called a bit. To separate different numbers from each other, the stream of 0s and 1s is divided into groups of 8 bits; each group is known as a byte. Decimal figures are commonly stored in 4 bytes (32 bits) using floating-point notation, in which the number's mantissa and exponent are stored separately.

Text is stored digitally with the help of a code system called ASCII (American Standard Code for Information Interchange). Each number between 0 and 127 corresponds to one character, including the signs on the computer's keyboard. For example, uppercase letters A through Z are represented by the numbers 65 to 90, and lowercase a through z by the numbers 97 to 122. To handle special national signs, 8-bit variations of the system have been developed that accommodate up to 256 different signs (codes 0 to 255).

Geometric presentations are commonly called digital maps. Strictly speaking, a digital map would be peculiar because it would comprise only numbers (digits).
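The storage conventions described in the Introduction can be checked with a few lines of Python. This is a minimal sketch: the sample text "GIS" and the value 0.15625 are arbitrary choices, and the 32-bit layout shown is the IEEE 754 single-precision format, which is assumed here as the usual realization of the 4-byte mantissa-and-exponent storage mentioned in the text.

```python
import struct

def to_bits(byte_values):
    """Render a sequence of bytes as groups of 8 binary digits."""
    return " ".join(f"{b:08b}" for b in byte_values)

# ASCII: each character corresponds to a number between 0 and 127.
text = "GIS"
codes = [ord(ch) for ch in text]
ascii_bits = to_bits(text.encode("ascii"))

# A decimal figure stored in 4 bytes (big-endian IEEE 754 single precision).
packed = struct.pack(">f", 0.15625)
float_bits = to_bits(packed)

# Split the 32 bits into sign (1 bit), exponent (8 bits), mantissa (23 bits).
sign = packed[0] >> 7
exponent = ((packed[0] & 0x7F) << 1) | (packed[1] >> 7)  # stored with a bias of 127
mantissa = int.from_bytes(packed, "big") & 0x7FFFFF

print(codes, ascii_bits)
print(float_bits, sign, exponent - 127, mantissa)
```

The exponent is printed with the bias removed, so the value 0.15625 (1.25 times 2 to the power -3) shows an exponent of -3.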
By their very nature, maps are analog, whether they are drawn by hand or by machine, and whether they appear on paper or on a screen. Technically, a GIS does not produce digital maps; it produces analog maps from digital map data. Nonetheless, the term "digital map" is now so widely used that the distinction is well understood.

Spatial information is presented geometrically in two ways: as vector data in the form of points, lines, and areas (polygons), or as raster data in the form of uniform, systematically organized cells. The vector model and the raster model are discussed in the following sections.

2. Basic Data Models in GIS

A data model is a set of guidelines for converting real-world entities into digitally and logically represented spatial objects consisting of attributes and geometry. The attributes are managed by a thematic or semantic structure, while the geometry is represented by a geometric-topological structure. There are two major types of geometric data model, the vector model and the raster model, as shown in Figure 1.

Figure 1. Vector and raster data models

2.1 Vector Data Model

The vector model uses discrete points, lines, and/or areas corresponding to discrete objects, each identified by a name or code number to which attributes are attached. Given a map, you can tell what the map features look like and how they are related to one another spatially.

2.1.1 Geometry of Vector Data Model

The vector data model consists of three types of geometric objects: point, line, and area. A point may represent a gravel pit, a line may represent a stream, and an area may represent a vegetated area.

A point has zero dimensions. A point feature occupies a location and is separate from other features (Figure 2).

A line is one-dimensional and has the property of length. A line feature is made of points: a beginning point and an end point (the nodes), and a series of points marking the shape of the line, which may be a smooth curve or a connection of straight-line segments. Smooth curves are typically generated or fitted by mathematical equations, such as cubic polynomial equations. Straight-line segments may represent human-made features or approximations of curves in data entry. Points that mark the shape of a line feature but are not nodes are called vertices. Line features may intersect or join with other lines and may form a network (Figure 3).

An area is two-dimensional and has the properties of area and boundary. The boundary of an area feature separates the interior area from the exterior area. Area features may be isolated or connected. An isolated area feature typically has a single node serving as both the beginning and the end node. Area features may be surrounded by other areas and form holes within them. Area features may also overlap one another and create overlapped areas. For example, the areas burned by previous forest fires may overlap each other (Figure 4).
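The three geometric object types and their properties (location, length, area and boundary) can be sketched in Python as plain coordinate lists. The feature names and coordinates below are hypothetical, and the shoelace formula is used here as one common way to compute the area of a closed ring; the source does not prescribe a particular method.

```python
import math

# Hypothetical features in arbitrary map units.
gravel_pit = (2.0, 3.0)                              # point: a single location
stream = [(0.0, 0.0), (3.0, 4.0), (6.0, 4.0)]        # line: two nodes, one vertex between
stand = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0),
         (0.0, 3.0), (0.0, 0.0)]                     # area: closed ring, first == last

def line_length(coords):
    """Sum of the straight-line segment lengths between consecutive points."""
    return sum(math.dist(a, b) for a, b in zip(coords, coords[1:]))

def ring_area(ring):
    """Area enclosed by a closed ring, computed with the shoelace formula."""
    twice_area = sum(x1 * y2 - x2 * y1
                     for (x1, y1), (x2, y2) in zip(ring, ring[1:]))
    return abs(twice_area) / 2.0

print(line_length(stream))   # length of the line feature
print(ring_area(stand))      # area of the area feature
print(line_length(stand))    # length of its boundary
```

In this sketch the isolated area feature starts and ends at the same coordinate pair, which plays the role of the single node mentioned in the text.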
Representing vector data as points, lines, and areas is not always straightforward, because the choice may depend on map scale and, occasionally, on criteria established by government mapping agencies. A city on a 1:1,000,000-scale map is represented as a point, but the same city is shown as an area on a 1:24,000-scale map. A stream is shown as a single line near its headwaters but as an area along its lower reaches. In this case, the width of the stream determines how it should be represented on a map.

2.1.2 Topology of Vector Data Model

Topology expresses explicitly the spatial relationships between geometric objects. The vector data model in ARC/INFO supports three basic topological concepts:

1. Connectivity: Arcs connect to each other at nodes.
2. Area definition: An area is defined by a series of connected arcs.
3. Contiguity: Arcs have directions and left and right polygons.

Figure 2. Points with x-, y-coordinates

Figure 3. The data structure of a line data model

Figure 4. The data structure of an area data model
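The three topological concepts of Section 2.1.2 (connectivity, area definition, contiguity) can be illustrated with a small arc table in Python. This is only a sketch under stated assumptions: the record layout, the names `Arc`, `ARCS`, and `OUTSIDE`, and the two-polygon example are hypothetical and are not the actual ARC/INFO file format.

```python
from collections import namedtuple

# Each arc has a direction (from_node -> to_node) and a polygon on each side.
Arc = namedtuple("Arc", "arc_id from_node to_node left_poly right_poly")

OUTSIDE = 0  # hypothetical code for the region outside all polygons

# Two adjacent polygons, "A" (west) and "B" (east), meeting along arc 3.
# Node 1 is the southern end of the shared edge, node 2 the northern end.
ARCS = [
    Arc(1, 2, 1, "A", OUTSIDE),  # outer edge of A, running counterclockwise
    Arc(2, 1, 2, "B", OUTSIDE),  # outer edge of B, running counterclockwise
    Arc(3, 1, 2, "A", "B"),      # shared edge, running south to north
]

def arcs_at_node(node):
    """Connectivity: the arcs that meet at a given node."""
    return sorted(a.arc_id for a in ARCS if node in (a.from_node, a.to_node))

def arcs_of_polygon(poly):
    """Area definition: the arcs that bound a given polygon."""
    return sorted(a.arc_id for a in ARCS if poly in (a.left_poly, a.right_poly))

def neighbours(poly):
    """Contiguity: polygons on the other side of this polygon's arcs."""
    found = set()
    for a in ARCS:
        if a.left_poly == poly:
            found.add(a.right_poly)
        elif a.right_poly == poly:
            found.add(a.left_poly)
    found.discard(OUTSIDE)
    return found

print(arcs_at_node(1), arcs_of_polygon("A"), neighbours("A"))
```

Because adjacency is stored with each arc, the neighbours of a polygon are found by a table lookup rather than by comparing coordinates.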

2.1.3 Advantages and Disadvantages of Vector Data Models

The advantages of the vector data model are:

1. Good representation of entity data models. Compact data structure.
2. Topology can be described explicitly, which makes the model well suited to network analysis.
3. Coordinate transformation and rubber sheeting are easy.
4. Accurate graphic representation at all scales.
5. Retrieval, updating, and generalization of graphics and attributes are possible.

The disadvantages of the vector data model are:

1. Complex data structures.
2. Combining several polygon networks by intersection and overlay is difficult and requires considerable computer power.
3. Display and plotting may be time consuming and expensive, particularly for high-quality drawing, colouring, and shading.
4. Spatial analysis within basic units such as polygons is impossible without extra data, because the units are considered to be internally homogeneous.
5. Simulation modeling of processes of spatial interaction over paths not defined by explicit topology is more difficult than with raster structures, because each spatial entity has a different shape and form.

2.2 Raster Format

The raster model uses regularly spaced grid cells arranged in a specific sequence. An element of the grid is called a pixel (picture element). The conventional sequence runs from left to right within each line, and line by line from top to bottom. Every location is given in two-dimensional image coordinates (pixel number and line number), and each cell contains a single attribute value.

2.2.1 Geometry of Raster Data

The geometry of raster data is given by point, line, and area objects as follows (see Figure 5):

a. Point Objects: A point is given by a point ID, its coordinates (i, j), and the attributes.
b. Line Objects: A line is given by a line ID, the series of coordinates forming the line, and the attributes.
c. Area Objects: An area segment is given by an area ID, the group of coordinates forming the area, and the attributes.
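The storage sequence and image coordinates described for the raster model can be sketched as follows. The grid values are hypothetical class codes, and zero-based line and pixel numbers are an assumption of this sketch (the text does not say whether counting starts at 0 or 1).

```python
# A small raster: 2 lines (rows) of 3 pixels (columns) each.
grid = [
    [1, 1, 2],
    [1, 2, 2],
]
N_COLS = len(grid[0])

def flatten(raster):
    """Store the cells left to right within a line, then line by line from the top."""
    return [value for row in raster for value in row]

def to_index(line, pixel, n_cols):
    """Position of cell (line, pixel) in the stored sequence."""
    return line * n_cols + pixel

def to_line_pixel(index, n_cols):
    """Image coordinates (line, pixel) of the cell at a given sequence position."""
    return divmod(index, n_cols)

stored = flatten(grid)
print(stored)
print(to_index(1, 0, N_COLS), to_line_pixel(4, N_COLS))
```

Each cell holds exactly one attribute value, so looking up a location is a single index calculation.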
Area objects in the raster model are typically stored as run-length codes, which rearrange the raster into a sequence of lengths (numbers of pixels) of each class, as shown in Figure 5. The topology of the raster model is rather simple compared with that of the vector model (Figure 5). The topology of a line object is given by the sequence of pixels forming the line segments. The topology of an area object is usually given by the run-length structure, which records the start line number with (start pixel number, number of pixels), then the second line number with (start pixel number, number of pixels), and so on.
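The run-length structure described above can be sketched in Python. The function names and the sample grid are hypothetical; `area_runs` returns one (line number, start pixel number, number of pixels) triple per run, which is one plausible reading of the structure in the text, with zero-based numbering assumed.

```python
def run_length_rows(grid):
    """Encode every line of the raster as a list of (class, number of pixels) runs."""
    encoded = []
    for row in grid:
        runs = []
        for value in row:
            if runs and runs[-1][0] == value:
                runs[-1][1] += 1          # extend the current run
            else:
                runs.append([value, 1])   # start a new run
        encoded.append([tuple(run) for run in runs])
    return encoded

def area_runs(grid, cls):
    """Describe one area object as (line, start pixel, number of pixels) runs."""
    runs = []
    for line, row in enumerate(grid):
        start = None
        # The trailing None acts as a sentinel that closes a run ending at the edge.
        for pixel, value in enumerate(list(row) + [None]):
            if value == cls and start is None:
                start = pixel
            elif value != cls and start is not None:
                runs.append((line, start, pixel - start))
                start = None
    return runs

grid = [
    ["A", "A", "B", "B"],
    ["A", "B", "B", "B"],
    ["A", "A", "A", "B"],
]
print(run_length_rows(grid))
print(area_runs(grid, "B"))
```

For rasters with large homogeneous areas, the run list is much shorter than the list of individual cells.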

2.2.2 Topology of Raster Data

Figure 5. Geometry and topology of raster data

One of the weak points of the raster model is the difficulty of network and spatial analysis compared with the vector model. For example, although a line is easily identified as the group of pixels that form it, tracing the sequence of connected pixels as a chain is more difficult. In the case of polygons, each polygon is easily identified, but the boundaries and the nodes (where three or more polygons meet) must be traced or detected.

a. Flow Directions
A line with direction can be represented using four directions, known as the Rook's move in chess, or eight directions, known as the Queen's move, as shown in Figure 6 (a), (b), (c). Figure 6 (c) shows an example of flow directions using the Queen's move. Water flow, links of a network, roads, etc. can be represented by flow directions (also called the Freeman chain code).

b. Boundary
A boundary is defined as a 2 x 2 pixel window that contains two different classes, as shown in Figure 7 (a). If the window is traced in the direction shown in Figure 7 (a), the boundary can be identified.

c. Node
A node in the polygon model can be defined as a 2 x 2 window that contains three or more different classes, as shown in Figure 7 (b). Figure 7 (c) and (d) show an example of identifying pixels on a boundary and at a node.

Figure 6. Flow directions
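The 2 x 2 window test for boundaries and nodes can be sketched as follows. The function name, the labels, and the sample grid are hypothetical. The sketch treats a window with exactly two classes as a boundary and a window with three or more classes as a node; that threshold is an interpretation of the text, not a definition confirmed by the source.

```python
def classify_windows(grid):
    """Label every 2 x 2 window by the number of distinct classes it contains.

    Keys are the (line, pixel) coordinates of each window's upper-left cell.
    """
    labels = {}
    for r in range(len(grid) - 1):
        for c in range(len(grid[0]) - 1):
            classes = {grid[r][c], grid[r][c + 1],
                       grid[r + 1][c], grid[r + 1][c + 1]}
            if len(classes) >= 3:      # assumed threshold for a node
                labels[(r, c)] = "node"
            elif len(classes) == 2:
                labels[(r, c)] = "boundary"
            else:
                labels[(r, c)] = "interior"
    return labels

# Three polygons A, B, and C that meet near the lower right of the grid.
grid = [
    ["A", "A", "B"],
    ["A", "A", "B"],
    ["C", "C", "B"],
]
for window, label in sorted(classify_windows(grid).items()):
    print(window, label)
```

Sliding the window over the whole raster in this way finds boundary and node candidates, but the pixels still have to be chained together afterwards, which is the tracing step the text describes as difficult.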

Figure 7. Identification of boundary and node

2.2.3 Advantages and Disadvantages of Raster Data Models

The advantages of the raster data model are:

1. Simple data structures.
2. Location-specific manipulation of attribute data is easy.
3. Many kinds of spatial analysis and filtering may be used.
4. Mathematical modeling is easy because all spatial entities have a simple, regular shape.
5. The technology is cheap.
6. Many forms of data are available.

The disadvantages of the raster data model are:

1. Large data volumes.
2. Using large grid cells to reduce data volumes reduces spatial resolution, resulting in loss of information and an inability to recognize phenomenologically defined structures.
3. Crude raster maps are inelegant, though graphic elegance is becoming much less of a problem today.
4. Coordinate transformations are difficult and time consuming unless special algorithms and hardware are used, and even then they may result in loss of information or distortion of grid cell shape.

2.3 Quadtree Data Model

Traditionally, the raster model is based on dividing the real world into equal-sized rectangular cells. However, in many cases it can be more practical to use a model with varying cell size. Larger cells (lower resolution) may be used to represent larger homogeneous areas, and smaller cells (higher resolution) may be used for more finely detailed areas. This approach, known as the quad-tree representation, is a refinement of the block code method.

When a given area is represented by a regular raster, the amount of data is proportional to the square of the resolution (the number of cells along each side). Because the quad-tree model avoids storing every cell of a homogeneous area, it is practical for the storage of both small and large volumes of data.

The quad-tree paradigm divides a geographical area into square cells of sizes varying from relatively large down to that of the smallest cell of the raster. Each square is quartered into four smaller squares, and the quartering continues until a square is found to be so homogeneous that it no longer needs to be divided and the data on it can be stored as a unit. A larger square may therefore comprise several raster cells having the same value. Homogeneous areas that are not square, or that do not coincide with the pattern of squares employed, are further divided into homogeneous squares.
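The quartering procedure described above can be sketched as a short recursive function. This is a minimal sketch, assuming a square raster whose side is a power of two; the nested-list tree layout and the names `build_quadtree` and `count_leaves` are hypothetical choices for illustration.

```python
def build_quadtree(grid, row=0, col=0, size=None):
    """Return a leaf value for a homogeneous square, else a list of four subtrees.

    The four children are ordered NW, NE, SW, SE.
    """
    if size is None:
        size = len(grid)
    first = grid[row][col]
    homogeneous = all(grid[r][c] == first
                      for r in range(row, row + size)
                      for c in range(col, col + size))
    if homogeneous:
        return first                      # stored as a single unit
    half = size // 2
    return [
        build_quadtree(grid, row, col, half),
        build_quadtree(grid, row, col + half, half),
        build_quadtree(grid, row + half, col, half),
        build_quadtree(grid, row + half, col + half, half),
    ]

def count_leaves(node):
    """Number of stored squares in the tree."""
    if isinstance(node, list):
        return sum(count_leaves(child) for child in node)
    return 1

grid = [
    [1, 1, 2, 2],
    [1, 1, 2, 2],
    [1, 1, 3, 4],
    [1, 1, 4, 4],
]
tree = build_quadtree(grid)
print(tree)
print(count_leaves(tree), "squares instead of", len(grid) ** 2, "cells")
```

Only the quadrant that mixes classes 3 and 4 is divided down to single cells; the three homogeneous quadrants are each stored once.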
The structure of the quad-tree resembles an inverted tree, whose leaves are pointers to the attributes of homogeneous squares and whose branch forks are pointers to smaller squares; hence the name quad-tree (Figure 8).

2.3.1 Advantages and Disadvantages of Quadtree Data Models

The advantages of the quad-tree model are:

1. Rapid data manipulation, because homogeneous areas are not divided into the smallest cells used.
2. Rapid search, because larger homogeneous areas are located higher up in the tree structure.
3. Compact storage, because homogeneous squares are stored as units.
4. Efficient storage structure for certain operations, including searching for neighboring squares or for a square containing a specific point.

The disadvantages of the quad-tree model are:

1. Establishing the structure requires considerable processing time.
2. Alterations and updating may require protracted processing.
3. Data entered must be relatively homogeneous.
4. Complex data may require more storage capacity than ordinary raster storage.

Figure 8. Quadtree data model

3. Advanced Data Models in GIS

In GIS, continuous surfaces such as terrain, meteorological observations (rainfall, temperature, pressure, etc.), and population density need to be modeled. Because sample points are observed at discrete intervals, a surface model representing the three-dimensional shape z = f(x, y) should be built to allow interpolation of values at arbitrary points of interest. Usually the following four types of sampling point structure are modeled into a DEM (digital elevation model).

3.1 Grid Model

A systematic grid, or raster, of spot heights at fixed mutual spacing is often used to describe terrain. Elevation is assumed constant within each cell of the grid; that is, the area represented by each cell is shown as a flat area in the model. Thus, small cells detail terrain more accurately than large cells. The size of the cells is constant in a model, so areas with a greater variation of terrain may be described less accurately than those with less variation. The grid model is most suitable for describing random variations in the terrain, whereas systematic linear structures can easily disappear or be deformed. A possible solution is to store the data as individual points and generate grids of varying density as required.

It is debatable whether the grid model represents samples on a grid, and can therefore be called a point model, or represents an average across raster cells. In the United States the former interpretation seems to be the most usual. Elevation values are stored in a matrix, and the contiguity between points is thus expressed through the column and line numbers.

Different interpolation techniques are used to generate an elevation grid from source data such as points, contour lines, and break lines. In interpolating elevation values for the cells, it is usual to assume that nearby points are more alike in elevation than points located at a distance. The elevation assigned to a grid point can therefore be the average of the elevations of the source points close to it, within a given circle or square, weighted in inverse proportion to their distances from the grid point. More advanced statistical methods can replace this kind of simple weighting in order to obtain the best possible model of the terrain from the available data. When the data relate to profiles or contours, grid point elevations are interpolated in the same way from the elevations at the intersections of the original data lines and the lines of the grid.

3.2 TIN Model

An area model is an array of triangular areas with their corners stationed at selected points of most importance, for which the elevations are known. The inclination of the terrain is assumed to be constant within each triangle. The areas of the triangles may vary, with the smallest representing those areas in which the terrain varies most. The resulting model is called the triangulated irregular network (TIN). Insofar as possible, small equilateral triangles are preferable.

To construct a TIN, all measured points are built into the model, which thus represents break lines, single points, and random variations in the terrain. The triangles are established by triangulation in such a way that no other point lies within any triangle's circumscribed circle. In the TIN model, the x-y-z coordinates of all points, as well as the triangle attributes of inclination and direction, are stored.
The triangles are stored in a topological vector data storage structure comprising polygons and nodes, thereby preserving the triangles' contiguity, which eases the calculation of z values for new points.

3.3 Contour Lines

Interpolation based on proportional distance between adjacent contours is used. A TIN can also be used.

3.4 Profile

Profiles are observed perpendicular to an alignment or a curve, such as a highway. If the alignment is a straight line, grid points are interpolated. If the alignment is a curve, a TIN is generated. Figure 9 shows different types of DEMs.
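Because the inclination is assumed constant within each TIN triangle, the z value of a new point inside a triangle lies on the plane through the triangle's three corners. The sketch below computes it with barycentric weights, which is one standard way to evaluate that plane; the function names and the sample coordinates are hypothetical.

```python
def barycentric(p, a, b, c):
    """Weights of corners a, b, c for point p, using x and y only."""
    x, y = p
    (x1, y1), (x2, y2), (x3, y3) = a[:2], b[:2], c[:2]
    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    w1 = ((y2 - y3) * (x - x3) + (x3 - x2) * (y - y3)) / det
    w2 = ((y3 - y1) * (x - x3) + (x1 - x3) * (y - y3)) / det
    return w1, w2, 1.0 - w1 - w2

def interpolate_z(p, a, b, c):
    """Elevation at p on the plane of triangle (a, b, c); None if p is outside."""
    weights = barycentric(p, a, b, c)
    if min(weights) < 0.0:
        return None
    return sum(w * corner[2] for w, corner in zip(weights, (a, b, c)))

# Triangle corners as (x, y, z); they lie on the plane z = 10 + x + 2y.
a, b, c = (0.0, 0.0, 10.0), (10.0, 0.0, 20.0), (0.0, 10.0, 30.0)
print(interpolate_z((2.0, 3.0), a, b, c))
print(interpolate_z((20.0, 20.0), a, b, c))
```

In a full TIN the stored contiguity between triangles would be used to locate the triangle containing the new point before this calculation is applied.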

Figure 9. Different types of DEMs
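The simple distance weighting described for the grid model in Section 3.1 can be sketched as follows. This is a minimal sketch: the function name `idw`, the circular search radius, the `power` parameter (1.0 gives weights in plain inverse proportion to distance), and the sample points are all assumptions made for illustration.

```python
import math

def idw(x, y, samples, radius, power=1.0):
    """Inverse-distance-weighted elevation at (x, y).

    samples is a list of (x, y, z) source points. Only points within the
    given radius are used. Returns None if no source point is close enough.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for sx, sy, sz in samples:
        d = math.hypot(sx - x, sy - y)
        if d == 0.0:
            return sz                 # grid point coincides with a sample
        if d <= radius:
            w = 1.0 / d ** power
            weighted_sum += w * sz
            weight_total += w
    return weighted_sum / weight_total if weight_total else None

samples = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0), (10.0, 10.0, 100.0)]

# Fill a small elevation grid; rows are lines (y = 0, 1), columns are x = 0, 1, 2.
dem = [[idw(float(col), float(row), samples, radius=3.0) for col in range(3)]
       for row in range(2)]
print(dem)
```

The distant third sample falls outside the search radius for every grid point here, so it does not influence the interpolated values.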
