Statistical surfaces and interpolation. This is lecture ten

Data models for representation of surfaces So far we have considered field and object data models (represented by raster and vector data structures). A further type of data model is widely used in GIS: models that approximate the Earth's surface, known as digital elevation models (DEM) or digital terrain models (DTM).

DEM

What is a DEM? DEMs are traditionally presented as two-dimensional computer arrays (like putting graph paper over the terrain). Data may be stored in ASCII (character text) or 2-byte integer binary formats. Elevations are also recorded as points or spot heights, such as at mountain peaks, lake surfaces, stream confluences, and cultural landmarks (e.g. airports, cities, geodetic control markers).
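
The array view of a DEM can be sketched as follows. This is only an illustration: the 3 x 4 grid, the 30 m cell size, the upper-left origin and the function name `elevation_at` are all assumptions, not values from the lecture.

```python
# Hypothetical DEM: rows run north to south, columns run west to east.
dem = [
    [120, 122, 125, 130],
    [118, 121, 124, 128],
    [115, 117, 120, 123],
]
CELL_SIZE = 30.0                           # assumed cell size in metres
X_ORIGIN, Y_ORIGIN = 500000.0, 4100000.0   # assumed upper-left corner of the grid


def elevation_at(x, y):
    """Return the elevation of the cell that contains map coordinate (x, y)."""
    col = int((x - X_ORIGIN) // CELL_SIZE)
    row = int((Y_ORIGIN - y) // CELL_SIZE)
    if not (0 <= row < len(dem) and 0 <= col < len(dem[0])):
        raise ValueError("coordinate lies outside the DEM")
    return dem[row][col]
```

Only the elevations are stored; the position of each cell follows from its row and column number together with the origin and cell size.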

Point data for DEMs Points may be determined by cadastral survey, GPS survey, photo interpretation or other techniques. For bathymetry, physical line-drop measurements are used; these are presumed to be vertical but often deviate slightly from the vertical because of currents in the water, movement of the ship from which measurements are made, the slope of the bottom surface, etc.

Hill shaded relief

Slope draped over a DEM

TIN and GRID These are called numeric or statistical surfaces, because they are generated from numbers (coordinate points). TIN and gridded networks are the predominant methods of representing surfaces.

Limitations to DTMs There is no way that we could know all elevation values, so we interpolate intermediate values. The constraint that faces us is that the geometry only has three values associated with it: x, y, z. But you cannot describe a continuous surface using only three variables at discrete intervals. The result is that all DTMs are only approximations of the terrain.

2.5 D vs 3D surfaces What a DTM represents is a number of x,y,z points. We use existing points to compute new spot heights between existing data points. Often DTMs are referred to as 2.5 dimensional rather than 3 dimensional. The reason for this is that each x,y point can only store one z value. In a real three dimensional surface, several z values can be stored with each x,y coordinate pair. E.g. elevation heights, a population density surface and the height of the tree-cover or a building could all be stored with one x,y pair: (x, y, z1, z2, z3, z4, ...).

Differences between 2.5D and 3D A real 3-D surface can handle different layers of elevations. Another advantage of 3-D models is that they can be used to calculate volumes. On the other hand, 2.5 D models are excellent for visualization of data. It is easy to envisage how terrain looks from above using a DTM. Thus for geographical purposes, a 2.5 D model is fine.

Role of interpolation We use interpolation (or inference based on known values) to compute unknown values on our surface. If we didn't structure the points in some way, then all known data points would have to be searched in order to calculate a new point at a given x,y coordinate pair. So instead, we build topology or contiguity relations into the DTM in order to restrict the search to the relevant points used to calculate new points.

GRID and TIN model There are two data structures used for DTMs: TIN and raster. TIN stores topology explicitly while raster stores it implicitly.

GRID GRID models for terrain. With grid terrain models, each cell of the grid is assumed to have a homogeneous elevation, so the system is most accurate when the cells are small relative to the area being represented. Since the size of the cells is constant, areas with great terrain variation might have too few cells representing them while areas with very regular elevation might be over-represented. The grid model works best when there aren't any dramatic shifts in the terrain, like faults or cliffs.

GRID 2 One way to make the grid model more flexible is to store the data as individual points and then generate grids of varying size, depending on the terrain. So an area of tumultuous terrain would be represented by very small cells and a low-lying plain would be represented by large cells. Thus the grid model for terrain representation can be thought of in two ways: (i) a point model that is displayed by grid cells; or (ii) a true grid model (raster) that is based on values averaged over the square cells.

Topology of GRID model In the grid model, the elevation values are stored in a matrix. The topology or contiguity is expressed by the column and row numbers. It is an implicit topology.
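
As a rough sketch of what implicit topology means in practice, the neighbours of a cell can be found from row and column arithmetic alone, with no stored adjacency records. The function name `neighbours` and the eight-cell neighbourhood are assumptions made for this illustration.

```python
def neighbours(grid, row, col):
    """Return (row, col, value) for the up-to-eight cells adjacent to a cell.

    Contiguity is never stored: it is derived from the row and column numbers.
    """
    n_rows, n_cols = len(grid), len(grid[0])
    result = []
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue  # skip the cell itself
            r, c = row + dr, col + dc
            if 0 <= r < n_rows and 0 <= c < n_cols:
                result.append((r, c, grid[r][c]))
    return result
```

A corner cell has three neighbours and an interior cell has eight, which is all the search restriction a grid needs when interpolating.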

Assigning values to GRID There are two ways of determining the values associated with each square of the grid system: (i) area search based on a point cloud; and (ii) line search from intersections with contour lines.

Point cloud and line search The assumption with the point cloud method is that the terrain varies most in areas where many data points have been collected. Thus the cloud can be used either (i) to determine the raster size; or (ii) to pick the elevation values for a fixed raster size. With the line search method, the intersections with contour lines are used to assign values to grid cells. Note that in cases where there are multiple intersections that pertain to a given cell, the points can be averaged.
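
Option (ii) of the point cloud method, picking values for a fixed raster size, might look like the sketch below. It assumes a lower-left origin at (0, 0) and simple averaging of all points in a cell; the function name `grid_from_points` is hypothetical.

```python
from collections import defaultdict


def grid_from_points(points, cell_size, n_rows, n_cols):
    """Average the z values of all (x, y, z) points that fall in each cell.

    Assumes the grid origin is the lower-left corner at (0, 0).
    Cells that contain no points are left as None.
    """
    sums = defaultdict(lambda: [0.0, 0])  # (row, col) -> [sum of z, count]
    for x, y, z in points:
        row, col = int(y // cell_size), int(x // cell_size)
        if 0 <= row < n_rows and 0 <= col < n_cols:
            sums[(row, col)][0] += z
            sums[(row, col)][1] += 1
    grid = [[None] * n_cols for _ in range(n_rows)]
    for (row, col), (total, count) in sums.items():
        grid[row][col] = total / count
    return grid
```

The same averaging applies to the line search method if the input points are the contour intersections rather than surveyed points.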

Questions What about cells that don't intersect with any contour lines? Why does the line intersection method exist?

TINs for representing terrain A TIN (triangulated irregular network) is an array of triangular areas with their corners located at points of significance in describing terrain. The areas of the triangles may vary and, similar to the assumption of homogeneity associated with cell values in the grid system, the angle of inclination is considered constant across each triangular face.
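
Because each face has a constant inclination, the elevation anywhere inside a triangle follows from the plane through its three corners. The sketch below uses barycentric weights to do this; the function name and the choice to return None outside the triangle are assumptions for illustration.

```python
def interpolate_in_triangle(p, a, b, c):
    """Elevation at p = (x, y) on the plane through corners a, b, c = (x, y, z).

    Returns None when p lies outside the triangle.
    """
    x, y = p
    (x1, y1, z1), (x2, y2, z2), (x3, y3, z3) = a, b, c
    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    if det == 0:
        raise ValueError("triangle corners are collinear")
    # Barycentric weights: each is 1 at its own corner and 0 on the opposite edge.
    w1 = ((y2 - y3) * (x - x3) + (x3 - x2) * (y - y3)) / det
    w2 = ((y3 - y1) * (x - x3) + (x1 - x3) * (y - y3)) / det
    w3 = 1.0 - w1 - w2
    if min(w1, w2, w3) < 0:
        return None
    return w1 * z1 + w2 * z2 + w3 * z3
```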

Delaunay triangulation We prefer triangles that are close to equilateral in a TIN, and a TIN in which all the triangles are as equilateral as possible is called a Delaunay triangulation. The triangles for a TIN are stored in a topological data structure with topology explicitly defined (i.e. the relationships between all arcs, nodes and their neighbours). The TIN is more difficult to build than a grid model but, once established, is more efficient to store because it takes up less space: only the points that describe elevation shifts are stored.

Simple Delaunay Triangulation and Voronoi network.

Two Adjacent Triangles Which (a) Violate and (b) Honour the Delaunay Criterion. No vertex of the network may lie inside the circumcircle of any Delaunay triangle.
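
The criterion can be checked with the standard in-circle determinant. This is a sketch only: the function name `in_circumcircle` is hypothetical, and points exactly on the circle are treated as not violating the criterion.

```python
def in_circumcircle(a, b, c, d):
    """True if point d lies strictly inside the circumcircle of triangle a, b, c.

    All points are (x, y) pairs. The triangle may be given in either
    clockwise or counter-clockwise order.
    """
    # Sign of the triangle's orientation (positive for counter-clockwise).
    orient = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    if orient == 0:
        raise ValueError("triangle corners are collinear")
    ax, ay = a[0] - d[0], a[1] - d[1]
    bx, by = b[0] - d[0], b[1] - d[1]
    cx, cy = c[0] - d[0], c[1] - d[1]
    det = ((ax * ax + ay * ay) * (bx * cy - cx * by)
           - (bx * bx + by * by) * (ax * cy - cx * ay)
           + (cx * cx + cy * cy) * (ax * by - bx * ay))
    # For a counter-clockwise triangle det > 0 means d is inside.
    return det * orient > 0
```

For two adjacent triangles, testing the far vertex of one against the circumcircle of the other tells whether the shared edge honours or violates the criterion.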

A triangulation of spot heights

A wire frame developed using triangulation of spot heights.

Isolines for terrain models Isolines, or lines connecting points of equal elevation, can be used to depict terrain. This is the basis of the traditional topo map. Ideally, you would have the densest point sets in areas where there is most variation in the terrain. It is also possible to use a combination of isolines and point elevations, e.g. survey elevation marks together with isolines.

Introduction to interpolation Spatial interpolation is the procedure of estimating the value of properties at unsampled sites within the area covered by existing observations. Interpolation is a type of proximity operation in which attribute values are assigned to new points on the basis of values of existing points. Usually the operation defines a search area in which it looks for points to use in order to assign the new values.

4 main steps in interpolation
1. Identify a base point
2. Define or compute the search area
3. Select or search for objects (existing data and/or points to assign new values)
4. Manipulate the attribute data in accordance with the interpolation technique chosen
Note: in almost all cases the property must be interval or ratio scaled
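
The four steps can be sketched with the simplest technique available, a plain mean of the samples inside a circular search area. The circular search, the mean, and the function name `interpolate_local_mean` are assumptions chosen for illustration; any other technique would replace step 4.

```python
import math


def interpolate_local_mean(base, samples, radius):
    """Estimate a value at base = (x, y) from (x, y, z) samples.

    Returns None when no sample falls inside the search area.
    """
    # Step 1: the base point is the location passed in by the caller.
    bx, by = base
    # Step 2: the search area is a circle of the given radius around it.
    # Step 3: select the existing points that fall inside the search area.
    selected = [z for x, y, z in samples
                if math.hypot(x - bx, y - by) <= radius]
    if not selected:
        return None
    # Step 4: manipulate the attribute values (here: a simple mean).
    return sum(selected) / len(selected)
```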

Rationale for interpolation Interpolation can be thought of as the reverse of the process used to select the few points from a DEM which accurately represent the surface. The rationale behind spatial interpolation is the observation that points close together in space are more likely to have similar values than points far apart (Tobler's First Law of Geography).

Uses for interpolation
to provide contours for displaying data graphically
to calculate some property of the surface at a given point
to change the unit of comparison when using different data structures in different layers
frequently used as an aid in the spatial decision making process both in physical and human geography, and in related disciplines such as mineral prospecting and hydrocarbon exploration
many of the techniques of spatial interpolation are two-dimensional developments of the one-dimensional methods originally developed for time series analysis

Types of interpolation processes There are several different ways to classify spatial interpolation procedures.

Point interpolation/areal interpolation Given a number of points whose locations and values are known, determine the values of other points at predetermined locations. Point interpolation is used for data which can be collected at point locations, e.g. weather station readings, spot heights, oil well readings, porosity measurements.

Point based 2 Interpolated grid points are often used as the data input to computer contouring algorithms. Once the grid of points has been determined, isolines (e.g. contours) can be threaded between them using a linear interpolation on the straight line between each pair of grid points. This is the most frequent type of interpolation in GIS.
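
Threading a contour between two grid points reduces to a one-dimensional linear interpolation along the line joining them. A minimal sketch, with a hypothetical function name and made-up values in the test:

```python
def contour_crossing(p1, z1, p2, z2, level):
    """Return the (x, y) where a contour of the given level crosses the
    straight line between grid points p1 and p2, or None if it does not.

    z1 and z2 are the interpolated elevations at p1 and p2.
    """
    if z1 == z2 or not (min(z1, z2) <= level <= max(z1, z2)):
        return None
    t = (level - z1) / (z2 - z1)  # fraction of the way from p1 to p2
    return (p1[0] + t * (p2[0] - p1[0]), p1[1] + t * (p2[1] - p1[1]))
```

Repeating this for every pair of adjacent grid points and joining the crossings gives the isoline.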

Line to point e.g. contours to elevation grids. We use this when data is taken from existing contour (topo) maps.

Areal interpolation Given a set of data mapped on one set of source zones, determine the values of the data for a different set of target zones, e.g. given population counts for census tracts, estimate populations for electoral districts.
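
One simple way to do this is area weighting, sketched below. It assumes the count is spread uniformly within each source zone, which the lecture does not state and which is often unrealistic for population. The function name, the dictionary layout and the zone labels in the usage are hypothetical.

```python
def areal_weighting(source_values, source_areas, overlap_areas):
    """Estimate target-zone totals from source-zone totals by area weighting.

    source_values: {source_id: count}
    source_areas:  {source_id: total area of the source zone}
    overlap_areas: {(source_id, target_id): area of their intersection}
    Assumes counts are uniformly distributed within each source zone.
    """
    targets = {}
    for (source_id, target_id), area in overlap_areas.items():
        share = area / source_areas[source_id]  # fraction of the source zone
        targets[target_id] = (targets.get(target_id, 0.0)
                              + source_values[source_id] * share)
    return targets
```

For example, a tract of 1000 people with 40 percent of its area inside a district contributes 400 people to that district's estimate.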

Areal interpolation from census areas to a single grid cell

Global vs local interpolators
Global interpolators determine a single function which is mapped across the whole region
o a change in one input value affects the entire map
Local interpolators apply an algorithm repeatedly to a small portion of the total set of points
o a change in an input value only affects the result within the window

Local interpolation

Global Interpolation

Global/local 2
Global algorithms tend to produce smoother surfaces with less abrupt changes
o are used when there is a hypothesis about the form of the surface, e.g. a trend
Some local interpolators may be extended to include a large proportion of the data points in the set, thus making them in a sense global
The distinction between global and local interpolators is thus a continuum and not a dichotomy
o this has led to some confusion and controversy in the literature
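
As an illustration of a global interpolator, the sketch below fits a single plane z = a + b*x + c*y to all samples by least squares. Treating the trend as a first-order plane, and the function names, are assumptions made for this example; higher-order surfaces follow the same idea.

```python
def fit_trend_plane(samples):
    """Least-squares fit of z = a + b*x + c*y to (x, y, z) samples.

    Returns the coefficients (a, b, c). Every sample contributes to every
    coefficient, so changing one input value changes the whole surface.
    """
    n = float(len(samples))
    sx = sum(x for x, _, _ in samples)
    sy = sum(y for _, y, _ in samples)
    sz = sum(z for _, _, z in samples)
    sxx = sum(x * x for x, _, _ in samples)
    syy = sum(y * y for _, y, _ in samples)
    sxy = sum(x * y for x, y, _ in samples)
    sxz = sum(x * z for x, _, z in samples)
    syz = sum(y * z for _, y, z in samples)
    # Augmented matrix of the normal equations.
    m = [[n, sx, sy, sz], [sx, sxx, sxy, sxz], [sy, sxy, syy, syz]]
    # Gauss-Jordan elimination with partial pivoting.
    for i in range(3):
        pivot = max(range(i, 3), key=lambda r: abs(m[r][i]))
        if abs(m[pivot][i]) < 1e-12:
            raise ValueError("samples are too few or collinear")
        m[i], m[pivot] = m[pivot], m[i]
        for r in range(3):
            if r != i:
                factor = m[r][i] / m[i][i]
                m[r] = [vr - factor * vi for vr, vi in zip(m[r], m[i])]
    return tuple(m[i][3] / m[i][i] for i in range(3))


def trend_value(coeffs, x, y):
    """Evaluate the fitted plane at (x, y)."""
    a, b, c = coeffs
    return a + b * x + c * y
```

In general the fitted plane does not pass through the sample points, so this is also an example of an approximate rather than an exact interpolator.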

Exact vs Approximate interpolators
Exact interpolators honor the data points upon which the interpolation is based
o the surface passes through all points whose values are known
o honoring data points is seen as an important feature in many applications, e.g. the oil industry
o proximal interpolators, B-splines and Kriging methods all honor the given data points
Kriging, as discussed below, may incorporate a nugget effect; if this is the case, the concept of an exact interpolator ceases to be appropriate.
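
A proximal interpolator is perhaps the easiest exact interpolator to sketch: each location takes the value of its nearest sample, so the surface necessarily passes through every known point. The function name `proximal` is hypothetical.

```python
import math


def proximal(base, samples):
    """Return the z value of the (x, y, z) sample nearest to base = (x, y)."""
    if not samples:
        raise ValueError("at least one sample is required")
    nearest = min(samples,
                  key=lambda s: math.hypot(s[0] - base[0], s[1] - base[1]))
    return nearest[2]
```

Evaluated at a sample location the distance is zero, so the sample's own value is returned, which is what honoring the data points means.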

Exact Interpolation

Approximate interpolators
Approximate interpolators are used when there is some uncertainty about the given surface values
o this utilizes the belief that in many data sets there are global trends, which vary slowly, overlain by local fluctuations, which vary rapidly and produce uncertainty (error) in the recorded values
o the effect of smoothing will therefore be to reduce the effects of error on the resulting surface

Stochastic and deterministic models We divide discussion of terrain models into two elements: random (stochastic) and systematic (deterministic). Stochastic models: each new value (elevation in this case) is determined without reference to existing values. The continuously varying relief generated in a DTM is, in the simplest sense, the product of random number generation, in that we estimate the in-between values.

Deterministic models The second element that goes into making terrain models is systematic. This is the part that is based on real elevation data and allows us to take into account steep cliffs and faults and depressions cut out to let roads pass through mountains. These systematic elements that comprise a DTM are expressed in data as single points or a series of isolines associated with a given value.

Stochastic/Deterministic interpolators
Stochastic methods incorporate the concept of randomness
o the interpolated surface is conceptualized as one of many that might have been observed, all of which could have produced the known data points
Stochastic interpolators include trend surface analysis, Fourier analysis and Kriging
Procedures such as trend surface analysis allow the statistical significance of the surface and the uncertainty of the predicted values to be calculated.

Gradual/Abrupt interpolators
A typical example of a gradual interpolator is inverse distance weighting (IDW).
o usually produces an interpolated surface with gradual changes
o however, if the number of points used in the moving average is reduced to a small number, or even one, there would be abrupt changes in the surface
o IDW is a simple way of guessing the values of a field at locations where measurements are not available.
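
A minimal IDW sketch is given below. The power of 2 and the optional limit `k` on the number of nearest points are common choices but are assumptions here, as is the function name `idw`.

```python
import math


def idw(base, samples, power=2, k=None):
    """Inverse distance weighted estimate at base = (x, y).

    samples: list of (x, y, z). If k is given, only the k nearest samples
    are used in the moving average.
    """
    if not samples:
        raise ValueError("at least one sample is required")
    ranked = sorted(samples,
                    key=lambda s: math.hypot(s[0] - base[0], s[1] - base[1]))
    if k is not None:
        ranked = ranked[:k]
    numerator = denominator = 0.0
    for x, y, z in ranked:
        d = math.hypot(x - base[0], y - base[1])
        if d == 0:
            return float(z)  # base coincides with a sample: use its value
        weight = 1.0 / d ** power
        numerator += weight * z
        denominator += weight
    return numerator / denominator
```

With `k=1` the estimate jumps from one sample's value to the next as the base point moves, which is the abrupt behaviour described above; with all samples included the surface changes gradually.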

Abrupt interpolators It may be necessary to include barriers in the interpolation process:
o semipermeable barriers, e.g. weather fronts, will produce quickly changing but continuous values
o impermeable barriers, e.g. geologic faults, will produce abrupt changes

Kriging (an example of interpolation) Based on the principle that a variable may be too complex to be modelled by a smooth mathematical function BUT has spatial dependence. It is an exact interpolator. It is based on the assumption that the spatial variation of a variable can be related to three influences: (i) structural; (ii) random spatial correlation; and (iii) random noise or error. It takes into account a general trend component.

Kriging of rainfall data in Australia

Common uses of DEMs
Creation of orthophoto maps
Cut and fill problems in road design and other civil engineering and military engineering applications
Analysis of cross country visibility
Planning of roads, locations of dams etc.
Statistical analysis and comparison of terrain
Source data for derived maps of aspect, profile curvature, shaded relief and hydrological and ecological modelling
Background for thematic maps
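
Derived maps are computed from the elevations of neighbouring cells. As one hedged example, slope at an interior cell of a gridded DEM can be estimated with central differences; this is only one of several finite-difference schemes in use, and the function name is hypothetical.

```python
import math


def slope_degrees(dem, row, col, cell_size):
    """Slope in degrees at an interior cell of a gridded DEM.

    Uses central differences on the four edge-adjacent neighbours.
    """
    if not (0 < row < len(dem) - 1 and 0 < col < len(dem[0]) - 1):
        raise ValueError("slope is only defined here for interior cells")
    dz_dx = (dem[row][col + 1] - dem[row][col - 1]) / (2.0 * cell_size)
    dz_dy = (dem[row + 1][col] - dem[row - 1][col]) / (2.0 * cell_size)
    return math.degrees(math.atan(math.hypot(dz_dx, dz_dy)))
```

A surface that rises 10 m per 10 m cell in one direction gives a slope of 45 degrees.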