Geodatabases Dr. Zhang SPRING 2016 GISC 1401 10/03/2016
- Using and making maps
- Navigating GIS maps
- Map design
- Working with spatial data
- Spatial data infrastructure
- Interactive maps
- Map animations
- Map layouts
- Geoprocessing
- Digitizing
- File geodatabases
- Geocoding
Outline
- Tables
- Geocodes
- Data Table Joins
- Spatial Joins
- Spatial Data Formats
- Geodatabases
- Calculating Geometry
TABLES
Two kinds of tables in ArcGIS
- Feature attribute table of a map layer
  - Attribute data is part of the map layer
- Data table with geocodes (such as census IDs)
  - Can be added as a table to ArcMap
  - Can be joined to a map layer to add more attributes to the layer
  - Join via the same geocode values in both the data table and the map layer's attribute table
- Census data example: there are too many census variables to supply them all in the feature attribute table, so download a custom table and join it to the appropriate polygon layer
Data table format
- Rectangular table with one value per cell
- Columns (fields) are attributes
- Rows are observations (records)
Data table format
- First row must have column names that are self-documenting labels, e.g., Shape, POP2000
- First character of an attribute name must be a letter
- Remaining characters can be any letter, digit, or the underscore character (but no blanks)
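The naming rule above can be checked mechanically before a table is imported. The following is a minimal sketch, assuming plain ASCII names; `is_valid_field_name` is a hypothetical helper for illustration, not an ArcGIS function.

```python
import re

# Rule from the slide: the first character is a letter; the remaining
# characters are letters, digits, or underscores (no blanks).
FIELD_NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def is_valid_field_name(name: str) -> bool:
    """Return True if `name` follows the column-naming rule."""
    return FIELD_NAME_PATTERN.fullmatch(name) is not None


for candidate in ["POP2000", "Shape", "2000POP", "Pop 2000", "Median_Age"]:
    print(candidate, is_valid_field_name(candidate))
```

Names that start with a digit or contain a blank are rejected; the rest pass.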
Data table format
- All additional rows of a data table must contain only attribute values (raw data)
- None of the rows can be sums, averages, or other statistics for raw data rows
Primary keys
- Each table has a primary key attribute with two properties
  - Each value is unique
  - There are no null values
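Both primary key properties can be verified with a single pass over the table. This is a sketch under stated assumptions: rows are represented as Python dictionaries, an empty string is treated as null, and the tract IDs and populations are made-up sample values.

```python
def check_primary_key(rows, key):
    """Return a list of problems found in the `key` column of `rows`."""
    problems = []
    seen = set()
    for i, row in enumerate(rows):
        value = row.get(key)
        if value is None or value == "":
            # Property 2 violated: the key has no value.
            problems.append(f"row {i}: null key")
        elif value in seen:
            # Property 1 violated: the value was already used.
            problems.append(f"row {i}: duplicate key {value!r}")
        else:
            seen.add(value)
    return problems


# Hypothetical sample table.
tracts = [
    {"TRACT_ID": "42003191700", "POP": 2100},
    {"TRACT_ID": "42003191800", "POP": 1850},
    {"TRACT_ID": "42003191700", "POP": 990},   # duplicate
    {"TRACT_ID": None, "POP": 400},            # null
]
print(check_primary_key(tracts, "TRACT_ID"))
```

An empty result list means the column can serve as a primary key.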
Field calculator
- Adds computed columns in ArcGIS
- ArcGIS does not have the query capacity of relational database packages to compute new columns on the fly, so you must create permanent new columns
- Full range of computation
  - Can add, multiply, etc.
  - Has numeric and text functions
  - Can concatenate text values
Field calculator (numeric)
Field calculator (text)
- Concatenate house number and street fields
External table file formats for import into ArcGIS
- Plain ASCII text with comma-separated values (.csv)
  - Very transportable format, very large files
  - Each table record is a row terminated with a line-break character (invisible, non-printing value)
  - Has values separated by a delimiter, usually a comma
  - For data values that contain the delimiter, enclose the value in double quotes
  - Sometimes columns get the wrong data type on import (use double quotes to force the text data type for digits, say for house numbers)
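The quoting rules above can be tried out with Python's standard `csv` module. This sketch only shows the file mechanics; the field names and the owner values are hypothetical, and whether a given importer treats quoted digits as text is the slide's claim, not something the code verifies.

```python
import csv
import io

rows = [
    ["HOUSE_NUM", "STREET", "OWNER"],
    ["1690", "Seaton St", "Smith, Jane"],   # value contains the delimiter
    ["0042", "Main St", "Lee"],             # leading zeros must survive
]

buffer = io.StringIO()
# QUOTE_NONNUMERIC wraps every string value in double quotes.
writer = csv.writer(buffer, quoting=csv.QUOTE_NONNUMERIC)
writer.writerows(rows)
text = buffer.getvalue()
print(text)

# Reading back: csv.reader returns every value as text by default,
# so "0042" keeps its leading zeros and "Smith, Jane" stays one value.
records = list(csv.reader(io.StringIO(text)))
print(records[1])
```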
External table file formats for import into ArcGIS
- Excel (.xls, .xlsx)
  - Excel 2003: up to 65,536 rows and 256 columns
  - Excel 2007: up to 1,048,576 rows and 16,384 columns
- dBASE database table (.dbf)
  - Legacy format
  - ArcMap truncates field names to the first 10 characters
  - dBASE IV has a maximum of 255 columns
  - Can open a dBASE file in Excel but cannot save dBASE from Excel
- Microsoft Access database (.mdb)
  - Up to 2 GB file size
  - See the following for other limits: http://www.databasedev.co.uk/access_specifications.html
GEOCODES
Geocodes (2000)
- Federal Information Processing Standards (FIPS)
  - Developed by the National Institute of Standards and Technology
  - Codes for place names throughout the world
    - countries
    - states/provinces
    - counties
    - metropolitan statistical areas (MSAs)
    - cities
    - places: Indian reservations, airports, and post offices in the US
- See http://www.genesys-sampling.com/pages/template2/site2/61/default.aspx for additional geocodes.
Geocodes: hierarchical
- FIPS codes (political boundaries)
  - Country: US
  - State: 42 (Pennsylvania)
  - County: 003 (Allegheny)
  - Minor Civil Division: 4200361000 (Pittsburgh)
- Census codes (statistical boundaries)
  - Tract: 1917
  - Block Group: 003
  - Block: 005 (US420031917003005)
- Local government cadastral data (legal boundaries)
  - Parcel Block & Lot number: 0096-P-00210000000 (1690 Seaton St, Pittsburgh, PA 15226)
[Figures: World and US; US and state 42; State 42 and county 003; County 003 and municipality 61000; Municipality 61000 and tract 1917; Tract 1917 and block group 003; Block group 003 and block 005]
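Because each level of the hierarchy is a fixed-width piece of the full code, a code can be split into its levels, or truncated to identify the enclosing area. The sketch below uses the component widths of the slide's example (US420031917003005); this is an assumption made for illustration, and the official Census block GEOID uses different widths. `split_geocode` and `prefixes` are hypothetical helper names.

```python
# Component widths follow the slide's example, not the official
# Census GEOID layout (an assumption made for illustration).
LEVELS = [("state", 2), ("county", 3), ("tract", 4),
          ("block_group", 3), ("block", 3)]


def split_geocode(code: str) -> dict:
    """Split a code such as 'US420031917003005' into its levels."""
    parts = {"country": code[:2]}
    position = 2
    for name, width in LEVELS:
        parts[name] = code[position:position + width]
        position += width
    if position != len(code):
        raise ValueError(f"unexpected code length: {code!r}")
    return parts


def prefixes(code: str) -> list:
    """Return the code truncated at each level, coarsest first."""
    result, position = [code[:2]], 2
    for _, width in LEVELS:
        position += width
        result.append(code[:position])
    return result


print(split_geocode("US420031917003005"))
print(prefixes("US420031917003005"))
```

Two blocks that share the prefix `US42003` lie in the same county, which is what makes hierarchical codes convenient for aggregation.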
Geocodes (2010)
- ANSI codes: American National Standards Institute codes
- Replace the Federal Information Processing Standards (FIPS) codes
- The entities covered include:
  - states and statistically equivalent entities
  - counties and statistically equivalent entities
  - named populated and related location entities (such as places and county subdivisions)
  - American Indian and Alaska Native areas
- See http://www.census.gov/geo/www/ansi/ansi.html
DATA TABLE JOINS
Review: table joins
- Puts two tables together, on the fly, to make one table
- One-to-one join (e.g., join state attribute data to a state shapefile by StateName)
- One-to-many join (e.g., join a code table to a feature attribute table to add code descriptions; many records can use the same code value)
- Each table in a join must have a key attribute for matching
- The key must have the same values and data type in both tables
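The matching logic of an attribute join can be sketched with a dictionary lookup. This is an illustration of the idea, not how ArcGIS implements joins; `join_tables`, the parcel records, and the land-use codes are all hypothetical.

```python
def join_tables(left_rows, right_rows, left_key, right_key):
    """Attach matching right-table attributes to each left-table row."""
    # Index the right table by its key for fast matching.
    lookup = {row[right_key]: row for row in right_rows}
    joined = []
    for row in left_rows:
        match = lookup.get(row[left_key], {})
        merged = dict(row)
        for field, value in match.items():
            if field != right_key:
                merged[field] = value
        joined.append(merged)
    return joined


# Many features share one code value, as in the code-table example.
parcels = [
    {"PARCEL": "A1", "LU_CODE": "R"},
    {"PARCEL": "A2", "LU_CODE": "C"},
    {"PARCEL": "A3", "LU_CODE": "R"},
]
land_use_codes = [
    {"CODE": "R", "DESCRIPTION": "Residential"},
    {"CODE": "C", "DESCRIPTION": "Commercial"},
]
for row in join_tables(parcels, land_use_codes, "LU_CODE", "CODE"):
    print(row)
```

Rows without a match keep only their original fields. A key stored as the number 42 does not match a key stored as the text "42", which is why the data types must agree.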
Example join (figure: table + table = joined table)
Problems with Joins
- Field types are different (e.g., one is numeric and one is text)
- Text values left-align while numeric values right-align
Solution
- Create a new field of the same type and use the field calculator
Solution
- Both tables now have the same field type
Problems with Joins
- Data format varies
- Must remove dashes
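Both join problems come down to making the key values identical in type and format. A minimal sketch of that cleanup follows; `normalize_key` is a hypothetical helper, and the optional zero-padding width is an added assumption for numeric codes that lost their leading zeros.

```python
def normalize_key(value, width: int = 0) -> str:
    """Convert a key to text, drop dashes and surrounding blanks,
    and optionally left-pad with zeros to `width` characters."""
    return str(value).strip().replace("-", "").zfill(width)


# Same parcel, stored with and without dashes.
print(normalize_key("0096-P-00210000000"))   # 0096P00210000000

# Same county code, stored as a number in one table and text in the other.
print(normalize_key(42003) == normalize_key("42003"))   # True

# A numeric column drops leading zeros; padding restores them.
print(normalize_key(1001, width=5))   # 01001
```

In ArcGIS the equivalent step is to add a new field of the right type and fill it with the field calculator, as the slides describe.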
SPATIAL JOINS
Spatial Joins
- Joins using shape (not an attribute field)
- Enables data aggregation (counting or summing points by polygon)
- Common spatial joins
  - Points to polygons (counts)
  - Polygons to points (adds text)
  - Points to points (distances)
Points to polygons
- How many businesses are in each neighborhood?
- Start with:
  - Business points
  - Neighborhood polygons
Points to polygons
- Right-click neighborhoods > Joins and Relates > Join
Spatial join result
- New polygon layer with count of points (number of architects and engineers)
Spatial join result
- Show as a choropleth map, with labels, or as a table

  Neighborhood Name          Count
  Central Business District     53
  Southside Flats               14
  Shadyside                      9
  Bloomfield                     8
  Lower Lawrenceville            8
  North Shore                    8
  Squirrel Hill South            6
  Strip District                 6
  Point Breeze                   4
  Squirrel Hill North            4
  Garfield                       3
  South Oakland                  3
  Friendship                     2
  North Oakland                  2
  Carrick                        2
  Central Lawrenceville          2
  East Allegheny                 2
  Mount Washington               2
  East Liberty                   1
  Central Northside              1
  Westwood                       1
  Banksville                     1
  Brookline                      1
  Perry North                    1
  Highland Park                  1
  Larimer                        1
  Allegheny West                 1
  Middle Hill                    1
  Bluff                          1
  Southside Slopes               1
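The counting behind a points-to-polygons join can be sketched with a point-in-polygon test. This is an illustration under stated assumptions: simple polygons in planar coordinates, the ray-casting test (not necessarily the method ArcGIS uses), and two toy square "neighborhoods" with made-up business locations.

```python
def point_in_polygon(x, y, ring):
    """Ray-casting test; `ring` is a list of (x, y) vertices."""
    inside = False
    j = len(ring) - 1
    for i in range(len(ring)):
        xi, yi = ring[i]
        xj, yj = ring[j]
        # Does a horizontal ray going right from (x, y) cross edge j -> i?
        if (yi > y) != (yj > y):
            x_cross = xi + (y - yi) * (xj - xi) / (yj - yi)
            if x < x_cross:
                inside = not inside
        j = i
    return inside


def count_points_by_polygon(points, polygons):
    """Count how many points fall in each named polygon."""
    counts = {name: 0 for name in polygons}
    for x, y in points:
        for name, ring in polygons.items():
            if point_in_polygon(x, y, ring):
                counts[name] += 1
                break   # assign each point to at most one polygon
    return counts


# Hypothetical data: two adjacent square neighborhoods.
neighborhoods = {
    "West": [(0, 0), (5, 0), (5, 5), (0, 5)],
    "East": [(5, 0), (10, 0), (10, 5), (5, 5)],
}
businesses = [(1, 1), (2, 4), (7, 2), (12, 3)]   # last point is outside both
print(count_points_by_polygon(businesses, neighborhoods))
```

The resulting counts per polygon are what the join writes into the new layer's Count field.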
Polygons to points
- What neighborhood is a business in?
- Start with:
  - Business points
  - Neighborhood polygons
Polygons to points
- Right-click business points > Joins and Relates > Join
Spatial join result
- Point shapefile with neighborhood data on each business
Points to points
- How close is the nearest bus stop to a business?
- Start with:
  - Business points
  - Bus stop points
Points to points
- Right-click business points > Joins and Relates > Join
Result
- Distance field added to the new layer that joins businesses and stops
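The distance field produced by a points-to-points join is the distance from each point to its nearest neighbor in the other layer. A minimal sketch, assuming projected (planar) coordinates in a linear unit such as feet rather than longitude/latitude; the stops, businesses, and `nearest_stop` helper are hypothetical.

```python
import math


def nearest_stop(business, stops):
    """Return (stop_name, distance) for the stop closest to `business`."""
    bx, by = business
    return min(
        ((name, math.hypot(sx - bx, sy - by)) for name, (sx, sy) in stops.items()),
        key=lambda pair: pair[1],
    )


# Hypothetical planar coordinates.
bus_stops = {"Stop A": (0.0, 0.0), "Stop B": (300.0, 400.0)}
businesses = {"Cafe": (30.0, 40.0), "Studio": (290.0, 400.0)}

for name, location in businesses.items():
    stop, distance = nearest_stop(location, bus_stops)
    print(f"{name}: nearest stop {stop} at {distance:.1f} units")
```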
SPATIAL DATA FORMATS
Esri legacy format: Coverage
- Folder with multiple files
- Can have points, lines, and/or polygons
- Has several intermediate data products (topology) to speed up processing (now calculated on the fly)
Esri legacy format: Shapefile
- Multiple files, all with the same name but different file extensions
- No intermediate data products, but has indices to speed data processing
- Widely used to share spatial data files
Geodatabases
- A geodatabase is a container used to hold a collection of datasets (GIS features, tables, raster images, and other objects)
- Example: World.gdb contains a Country layer and a Graticule layer
The architecture of a geodatabase
- The geodatabase storage model is based on a series of simple yet essential relational database concepts and leverages the strengths of the underlying database management system (DBMS).
- Simple tables and well-defined attribute types are used to store the schema, rule, base, and spatial attribute data for each geographic dataset.
- This approach provides a formal model for storing and working with your data.
- Through this approach, Structured Query Language (SQL), a series of relational functions and operators, can be used to create, modify, and query tables and their data elements.
The architecture of a geodatabase
- You can see how this works by examining how a feature with polygon geometry is modeled in the geodatabase.
- A feature class is stored as a table, often referred to as the base or business table.
- Each row in the table represents one feature.
- The shape column stores the polygon geometry for each feature.
- The contents of this table, including the shape when stored as a SQL spatial type, can be accessed through SQL.
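The idea of a feature class as an ordinary table, one row per feature with a shape column, can be tried out in any relational database. The sketch below uses Python's built-in `sqlite3` as a stand-in DBMS and stores the geometry as well-known text (WKT) in a plain text column. This is an assumption for illustration only: a real geodatabase uses its own system tables and a spatial column type, and the table, column names, and data here are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE neighborhoods (
        objectid INTEGER PRIMARY KEY,   -- one row per feature
        name     TEXT NOT NULL,
        pop2000  INTEGER,
        shape    TEXT                   -- polygon geometry as WKT text
    )
    """
)
conn.executemany(
    "INSERT INTO neighborhoods (name, pop2000, shape) VALUES (?, ?, ?)",
    [
        ("West", 1200, "POLYGON ((0 0, 5 0, 5 5, 0 5, 0 0))"),
        ("East", 800, "POLYGON ((5 0, 10 0, 10 5, 5 5, 5 0))"),
    ],
)

# Attributes and shape are queried together through SQL.
rows = conn.execute(
    "SELECT name, shape FROM neighborhoods WHERE pop2000 > ? ORDER BY name",
    (1000,),
).fetchall()
print(rows)
conn.close()
```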
Geodatabase storage in relational databases
- Two primary sets of tables: system tables and dataset tables.
- At the core of the geodatabase is a standard relational database schema (a series of standard database tables, column types, indexes, and other database objects).
- The schema is persisted in a collection of geodatabase system tables in the DBMS that defines the integrity and behavior of the geographic information.
- Well-defined column types are used to store traditional tabular attributes.
- When the geodatabase is stored within a DBMS, spatial representations, most commonly represented by vectors or rasters, are generally stored using an extended spatial type.
Dataset tables
- Each dataset in the geodatabase is stored in one or more tables.
- The dataset tables work with the system tables to manage data.
Enterprise geodatabases
- Practically unlimited size and multiple simultaneous users
- Use enterprise data management systems
- Store spatial datasets in a number of DBMSs: IBM DB2, Microsoft SQL Server, Oracle, or PostgreSQL
Personal geodatabase
- Parallels the enterprise geodatabase but on a PC
- Stores datasets in a Microsoft Access .mdb file
- Limited to 2 GB
- Much overhead in space and extra structure
- Tempting to apply one's own Access skills, but needs the ArcCatalog utility for manipulation
File geodatabase
- Esri's replacement for shapefiles
- Vector and raster map layers
- Other objects (tables)
- Stores one or more datasets in a folder of files with a .gdb extension
- Can be up to 1 TB in size
- Can be used across platforms
- Can be compressed and encrypted for read-only, secure use
View geodatabases
- Cannot identify dataset names in Windows Explorer
- Must use ArcCatalog
Non-Esri vector formats
- Interoperability
  - Ability of different vendors' hardware and software to share data
  - Driven by the Internet, with standards evolving for open data access (International Organization for Standardization, Open Geospatial Consortium, US Federal Geographic Data Committee)
- Over 110 vector file formats available in the ArcGIS Data Interoperability extension (http://www.esri.com/library/fliers/pdfs/data-interop-formats.pdf)
KML (Keyhole Markup Language)
- XML schema for Internet-based maps
- Originally created by Keyhole, Inc. for satellite images; Keyhole was purchased by Google and its viewer became Google Earth
- Provides a set of features (points, lines, polygons, images, text, etc.) with lat/long coordinates plus altitude for 3D viewing
- KMZ is zipped KML and associated files, needed for upload to Google Maps
- Portability
  - Can import and export KML/KMZ via ArcToolbox in ArcGIS
  - Can upload to Google Maps from your computer
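To make the format concrete, the sketch below builds a one-placemark KML document with Python's standard library and zips it into a KMZ archive in memory. It is a minimal illustration, not a complete KML writer; the placemark name and the approximate Pittsburgh coordinates are sample values, and `placemark_kml` is a hypothetical helper.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"
ET.register_namespace("", KML_NS)   # write KML as the default namespace


def placemark_kml(name, lon, lat, alt=0.0):
    """Build a one-placemark KML document and return it as text."""
    kml = ET.Element(f"{{{KML_NS}}}kml")
    placemark = ET.SubElement(kml, f"{{{KML_NS}}}Placemark")
    ET.SubElement(placemark, f"{{{KML_NS}}}name").text = name
    point = ET.SubElement(placemark, f"{{{KML_NS}}}Point")
    # KML coordinate order is longitude,latitude,altitude.
    ET.SubElement(point, f"{{{KML_NS}}}coordinates").text = f"{lon},{lat},{alt}"
    return ET.tostring(kml, encoding="unicode")


kml_text = placemark_kml("Pittsburgh", -79.9959, 40.4406)
print(kml_text)

# A KMZ file is a zip archive; doc.kml is the conventional main file name.
kmz_bytes = io.BytesIO()
with zipfile.ZipFile(kmz_bytes, "w", zipfile.ZIP_DEFLATED) as kmz:
    kmz.writestr("doc.kml", kml_text)
```

Note that KML lists longitude before latitude, the reverse of the "lat/long" order used in everyday speech.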
XY Data
- Point data table with X and Y attributes
- Increasingly popular to include X and Y with data
- Commonly used for GPS data
CALCULATING GEOMETRY
Point centroids
- When displaying or analyzing small polygons, it is often better to use point centroids
Calculate XY fields
- Add new X and Y fields in the attribute table
Calculate XY fields
- Calculate geometry for the X field, then repeat for Y
XY field results
- Results are X and Y values based on map properties (e.g., long/lat or XY feet)
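The X and Y values of a polygon centroid can also be computed directly from the polygon's vertices. The sketch below uses the standard area-weighted centroid formula for a simple polygon in planar coordinates; it is an illustration of the calculation, not necessarily what ArcGIS does internally, and the rectangle is a made-up example.

```python
def polygon_centroid(ring):
    """Area-weighted centroid of a simple polygon given as (x, y) vertices."""
    area2 = 0.0          # twice the signed area
    cx = cy = 0.0
    n = len(ring)
    for i in range(n):
        x0, y0 = ring[i]
        x1, y1 = ring[(i + 1) % n]   # wrap around to close the ring
        cross = x0 * y1 - x1 * y0
        area2 += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    if area2 == 0:
        raise ValueError("degenerate polygon")
    return cx / (3 * area2), cy / (3 * area2)


# Hypothetical 4 x 2 rectangle.
x, y = polygon_centroid([(0, 0), (4, 0), (4, 2), (0, 2)])
print(x, y)
```

For a concave polygon this centroid can fall outside the shape, which matters if the point is later used in a spatial join.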
Export table with XY values
Add XY data table
Export features
- XY events should be exported as a permanent shapefile or feature class
Count point centroids
- Population can be spatially joined to a buffer around polluting companies
Other geometry calculations
- Area
- Perimeter
- Length
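Area, perimeter, and length follow from the vertex coordinates. A minimal sketch, assuming simple shapes in planar coordinates so the results come out in the map's linear units (and squared units for area); the function names and sample shapes are hypothetical.

```python
import math


def polygon_area(ring):
    """Area of a simple polygon by the shoelace formula."""
    n = len(ring)
    total = 0.0
    for i in range(n):
        x0, y0 = ring[i]
        x1, y1 = ring[(i + 1) % n]
        total += x0 * y1 - x1 * y0
    return abs(total) / 2.0


def line_length(vertices):
    """Length of a polyline: the sum of its segment lengths."""
    return sum(math.dist(a, b) for a, b in zip(vertices, vertices[1:]))


def polygon_perimeter(ring):
    """Perimeter: length of the ring closed back to its first vertex."""
    return line_length(list(ring) + [ring[0]])


rectangle = [(0, 0), (4, 0), (4, 3), (0, 3)]
print(polygon_area(rectangle))          # 12.0
print(polygon_perimeter(rectangle))     # 14.0
print(line_length([(0, 0), (3, 4)]))    # 5.0
```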
Summary
- Tables
- Geocodes
- Data Table Joins
- Spatial Joins
- Spatial Data Formats
- Geodatabases
- Calculating Geometry
Lab and Assignments
- Lab: Tutorial 4-1. 4-6
- Lab: Assignment 4.1, 4.2
- Due: 10/03/2016 Project outline