Geocoding: The Basics

Karyn Backus
CT Dept of Public Health
November 2012
ArcGIS 10

This guide is NOT an introduction to GIS or to geocoding. To use it, you are expected to be familiar with ArcGIS software and general cartography concepts. If you are not familiar with the software, I recommend that you read through the extensive Web-based ArcGIS Help. Some of the concepts can be confusing at first (e.g., coordinate systems and projections), and it is important that you clearly understand them before progressing with your projects; otherwise your geocode results may be invalid without your realizing it.

As of November 29, 2012: The custom DPH address locator is available on the U drive within the TA_Streets.gdb (geodatabase) folder:

U:\SHAREDOC\A_DPH_GIS_User_Group\Geocoding\Address_Locator\TA_Streets.gdb

This is an ESRI/ArcGIS-specific file type, and details about its contents can be viewed through ArcCatalog or ArcMap. Please do not edit or move the .gdb folder or its component files.

TABLE OF CONTENTS

Introduction
File Preparation
Raw Address Data
Address Locators
Geocoding an Address Dataset
Rematching Addresses
Joining Geocoded Points to Towns and Census Levels
Summarizing Joined Data
Adding Latitude and Longitude Coordinates

INTRODUCTION

Geocoding is the process of assigning a location, usually in the form of coordinate values, to an address by comparing the descriptive location elements in the address to those present in the reference material (excerpt from ESRI Help).

The key to this sentence is that there are three components to geocoding. The first is the set of addresses that you want to locate on a map. In this guide, these addresses are referred to as raw because they have yet to be geoprocessed. The second component is the reference material. Currently, Tele Atlas is the company that supplies the street reference database for Connecticut's street network. Third, ArcGIS uses an address locator utility to define how the raw address is matched to the street reference database so that a location/coordinate can be assigned. When all three of these elements come together, data can be geocoded.

The first chapter of the guide lays out the process of preparing your data for geocoding. File preparation is the most complicated and time-consuming portion of the geocoding process. This chapter discusses the need to review and (re)format the variables within your three components so they are consistent with one another prior to geocoding. Since the street reference database is supplied by an external source (Tele Atlas), it is preferred that the raw address data be adjusted to the format of the reference data rather than vice versa. After your raw data is properly prepared, this guide explains what the Address Locator is and how it works.

The second chapter is a step-by-step guide to the process of geocoding in ArcMap. At this stage, the geocoding utility is opened and your inputs are set in accordance with your file preparations. Once the inputs are defined, ArcMap processes the raw addresses and displays the resulting geocoded points as a new layer in your map. You may then review and edit the results as needed.

The third chapter covers joining and summarizing geocoded data. After your addresses become x/y coordinate points on the map, it is usually preferable to summarize the data into a meaningful presentation. By joining the individual address points to a specific geography, such as a census tract or a flood plain, the points can be summarized (e.g., counts or averages) and displayed using the new geography.

The last chapter explains how to add latitude and longitude coordinates to your results. The x,y coordinates of your geocoded points are based on a projected coordinate system (a flat, two-dimensional surface, just like a paper map) specific to Connecticut. A geographic coordinate system (GCS) uses a three-dimensional spherical surface (like a globe) to define locations on the earth and is referenced by its longitude and latitude values. When sharing data with others, particularly outside of CT, it is often preferable to provide geocoded data using the more universal system of latitude and longitude. Adding latitude and longitude coordinates to your data table requires re-projecting your map to a GCS using the Project tool in ArcToolbox and then adding XY coordinates using the Add XY tool in ArcToolbox.

FILE PREPARATION

The process of geocoding a dataset requires three major elements:

1) A street reference database that contains street segment information for the region of interest. This is the Tele Atlas street centerline file already prepared for your use.
2) An address file that contains the address information that you want to represent spatially. This is the raw address data you want to convert to points on a map.
3) An address locator that standardizes the information in the address file and compares it to the information in the street reference database using pattern matching algorithms. An address locator has been prepared for use throughout the agency.

Preparing the raw addresses for geocoding

Basically, geocoding is like taking a pin and tacking it to where it belongs on a map. For our purposes, the pin (a single point) is a residential address and the map is a network of all of the streets in Connecticut. Where the pin is tacked to the map is determined by referencing a database of all of the streets in Connecticut. This database does not contain geographic information for every single house or building in Connecticut. Instead, it contains a series of street segments that have already been mapped to geographic coordinates. Each segment is defined using an address range (e.g., house numbers). Perhaps Main Street in your town runs from 1 to 99. In the reference database, Main Street may be represented with 3 entries: 1-33 Main St, 34-60 Main St, and 61-99 Main St.

Geocoding takes an address (e.g., 14 Main St), matches it to a street and specific segment (1-33 Main St), and then interpolates the position of the address within the range along the segment. Interpolation is an estimate. Since 14 is a little less than halfway between 1 and 33, the address would (probably) be placed a little less than halfway along the street segment. Although the interpolation is an estimate, the address is mapped to that specific geographic coordinate.
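
The interpolation step can be illustrated with a minimal Python sketch. It assumes a perfectly straight segment and simple linear interpolation; the function name and the segment coordinates are made up for illustration and are not taken from ArcGIS or the Tele Atlas data, whose internal method may differ.

```python
def interpolate_address(house_number, range_low, range_high, start_xy, end_xy):
    """Estimate where a house number falls along a straight street segment.

    Assumption: addresses are spread evenly between range_low and range_high.
    Returns the fraction of the way along the segment and the (x, y) estimate.
    """
    if not range_low <= house_number <= range_high:
        raise ValueError("house number is outside the segment's address range")
    span = range_high - range_low
    fraction = 0.0 if span == 0 else (house_number - range_low) / span
    x = start_xy[0] + fraction * (end_xy[0] - start_xy[0])
    y = start_xy[1] + fraction * (end_xy[1] - start_xy[1])
    return fraction, (x, y)


# 14 Main St on the 1-33 Main St segment; the endpoints are hypothetical.
fraction, point = interpolate_address(14, 1, 33, (0.0, 0.0), (320.0, 0.0))
print(fraction, point)
```

With these made-up endpoints, house number 14 lands about 41 percent of the way along the segment, which agrees with the "a little less than halfway" description above.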

Thus, the process of geocoding is based on matching. An address has several elements that must be consistent with a reference address in order for a match to occur.

House Number  Pre-directional  Street Name  Street Type  Post-directional  Town         State  Zipcode
1572                           Rhey         Ave                            Wallingford  CT     06492
17            N                Main         St                             Mystic       CT     06355
259           S                Main         St           Ext               Hartford     CT     06105

To achieve consistency, changes to the format or structure of the address data may be required. This is often referred to as cleaning or reformatting the data.

1) Variable Structure: address data fields must be in the proper format

Street information should be provided as a city-style address: House Number, Prefix or Pre-Directional, Street Name, Street Type, Suffix or Post-Directional.
- Secondary address information (apartment, unit, building, etc.) is not a part of the street network, so it is not useful for geocoding. Sometimes secondary information interferes with the matching process, so it is best to leave it out or delete it when possible.
- PO boxes, rural routes, and place names (like apartment complexes or nursing homes) are not city-style addresses and cannot be geocoded as is.

Zones used for geocoding the address should be in separate fields.
- Zones are determined by what is available in the street reference database.
- Our street reference database contains Town Name, Town Code and Zip (postal) Code.
- State is not utilized as a zone because we are geocoding CT resident addresses. Streets that surround the state borders may be included in the reference dataset.

Street and Zone are the only fields necessary to geocode. It is preferable, however, to keep the identification number as well so that, in the future, the geocode results may be linked with other non-address variables.

It is possible to keep all of the dataset's variables in your address dataset, but it can be more efficient to keep the geocode dataset separate from the primary dataset. The geocoded file contains many fields that will not be needed in the final dataset. Keeping the geocode dataset separate also ensures that the primary dataset is not compromised during the geocoding process, since the dbf format used by ArcGIS can sometimes truncate variable names and field lengths and can change numbers stored as text back to numeric formats.

2) Pattern Matching

The geocoding process with ArcGIS uses pattern matching to determine if your address is the same as an address in the street reference database. Because pattern matching is used, it is possible to match addresses that are not exactly identical. This is very useful when there are minor differences, such as a transposed letter in the street name.

The pattern matching system can also interfere with successful geocoding when unexpected values are detected:
- Numbers outside or possibly between the street ranges
- Dashes and unusual characters
- Missing street elements, like no house number or no street type

- Incorrectly placed street elements: if an apartment number is placed at the beginning of the city-style street address, the pattern matching system may incorrectly interpret the apartment information to be the street information (24 A23 Main St may become 24 A St).
- PO Boxes and Rural Routes are not city-style addresses. They do not map to a location on the street map and so they cannot be geocoded. Occasionally, the pattern matching software may think it has matched a PO Box or a Rural Route, but this is in error. To prevent such instances, it is recommended that non-city-style addresses be removed from the address file prior to geocoding.

3) Variable Types/Consistency

Confirm that the raw address variable formats are consistent with the street reference file variable formats. Some variables are defined as text (aka character or string) and some as numeric, even when they all appear to be numbers. The ArcGIS geocoding system may not recognize two variables as having the same value if one variable is numeric and the other is character.

ArcGIS can read a variety of data file formats; however, text and numeric format definitions may not be retained by all data file formats. This is particularly relevant for CT Zipcodes, where the preceding zero may be dropped when read in as a numeric variable.

It is recommended that all data be saved as .dbf for input into ArcGIS. Dbf is the format that ArcGIS uses for its system files and output files. By using the same file format every time, conversion errors are minimized. You may have to check and recheck that the .dbf file is properly saving and properly loading the variables you have defined. It may take some work to get the variables to save and load in the proper format. Be aware that dbf may truncate variable names or field lengths.
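
Two of the cleaning steps above, removing non-city-style addresses and restoring the preceding zero on Zipcodes, can be sketched in Python before the data is saved for ArcGIS. This is a rough illustration only: the function names and regular expressions are assumptions, and the patterns will not catch every way a PO Box or Rural Route can be written.

```python
import re

# Assumed patterns for common spellings of PO Box and Rural Route addresses.
PO_BOX = re.compile(r"^\s*(p\.?\s*o\.?\s*box|post\s+office\s+box)\b", re.IGNORECASE)
RURAL_ROUTE = re.compile(r"^\s*(rr|r\.r\.|rural\s+route)\s*#?\s*\d+", re.IGNORECASE)


def is_city_style(street):
    """Return False for addresses that look like a PO Box or a Rural Route."""
    return not (PO_BOX.search(street) or RURAL_ROUTE.search(street))


def normalize_zip(value):
    """Return a five-character text Zipcode, restoring a dropped leading zero."""
    if isinstance(value, float):
        value = int(value)  # e.g. 6492.0 read from a spreadsheet
    return str(value).strip().zfill(5)


streets = ["17 N Main St", "PO Box 12", "RR 2 Box 15", "1572 Rhey Ave"]
print([s for s in streets if is_city_style(s)])
print(normalize_zip(6492), normalize_zip("06105"))
```

Storing the result of normalize_zip as text, not as a number, is the point of the exercise: a numeric field would drop the zero again.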

Understanding the Address Locator Structure

A custom address locator has been prepared for DPH by Karyn Backus. This locator is based on a customized dual streets locator style that was prepared for Karyn by ESRI. This address locator is called a "local address locator" because it resides on our agency network. Do not use the default, online locators provided by ArcGIS 10 with confidential addresses or related information.

Since the custom DPH address locator has already been developed, this section will walk through the key parameters that influence the locator but will not demonstrate creating one.

address locator
1. [ESRI software] A dataset in ArcGIS that stores the address attributes, associated indexes, and rules that define the process for translating nonspatial descriptions of places, such as street addresses, into spatial data that can be displayed as features on a map. An address locator contains a snapshot of the reference data used for geocoding, and parameters for standardizing addresses, searching for match locations, and creating output. Address locator files have a .loc file extension.

address locator style
1. [geocoding] A template on which an address locator is built. Each template is designed to accommodate a specific format of address and reference data, and geocoding parameters. The address locator style template file has a .lot file extension.

The US Address Dual Ranges address locator style is used for the majority of common United States street addresses. This address locator style permits you to provide a range of house number values for both sides of a street segment. With this, the address locator can not only deliver a location along the street segment but can also determine the side of the street segment where the address is located.

address range
1. [geocoding] Street numbers running from lowest to highest along a street or street segment. Address ranges are generally stored as fields in the attribute table of a street data layer. They often indicate ranges on the left and right sides of streets.

Each feature in the reference data represents a street segment with two ranges of addresses that fall along that street segment, one for each side of the street.

[Figure: Each street segment in the reference dataset is depicted with a separate color for illustration purposes. Green background represents Cheshire town. Grey background represents Wallingford town.]

address element
1. [geocoding] One of the components that comprise an address. House numbers, street names, street types, and street directions are examples of address elements.

Each of the US street address locator styles has the same requirements for input address data. Tables of addresses that can be geocoded using these address locators must have an address field containing the street number and street name in addition to the street's prefix direction, prefix type, street type, or suffix direction, if any.

Basic characteristics of address locator styles provided with ArcGIS

US Address Dual Ranges
  Typical reference dataset geometry: Lines
  Typical reference dataset representation: Address range for both sides of street segment
  Address search parameters: All address elements in a single field
  Examples: 320 Madison St.; N2W1700 County Rd.; 105-30 Union St.
  Applications: Finding a house on a specific side of the street

US Address One Range
  Typical reference dataset geometry: Lines
  Typical reference dataset representation: One range for each street segment
  Address search parameters: All address elements in a single field
  Examples: 2 Summit Rd.; N5200 County Rd PP; 115-19 Post St.
  Applications: Finding a house on a street where side is not needed

US Address Single House
  Typical reference dataset geometry: Points or polygons
  Typical reference dataset representation: Each feature represents an address
  Address search parameters: All address elements in a single field
  Examples: 71 Cherry Ln.; W1700 Rock Rd.; 38-76 Carson Rd.
  Applications: Finding parcels, buildings, or address points

US Address ZIP 5 Digit
  Typical reference dataset geometry: Points or polygons
  Typical reference dataset representation: ZIP Code region or centroid
  Address search parameters: Five-digit ZIP Code
  Examples: 22066
  Applications: Finding a specific ZIP Code location

General City State Country
  Typical reference dataset geometry: Points or polygons
  Typical reference dataset representation: City within a state and country
  Address search parameters: City name, state name or abbreviation
  Examples: Rice, WA, USA
  Applications: Finding a specific city in a State and Country

zone
1. [geocoding] Additional information about a location or address, used to narrow a geocoding search and increase search speed. Address elements and their related locations such as city, postal code, or country all can act as a zone.

Many times, additional fields are found in the reference data that further clarify the location of the attribute, including postal codes, states, or countries. This type of information is referred to as zone information and can be used to increase the likelihood of a correct match. Although zone fields are optional when creating an address locator, including zones such as City, State, and ZIP fields is helpful to facilitate nationwide geocoding.

The custom DPH address locator uses two zones to geocode addresses: town and zip. Town zone is defined as the official 169 Connecticut towns. Zipcode zone is the postal code value that was provided by Tele Atlas for each segment.

alternate name (See Also: alias)
1. [geocoding] A name for an address element, usually a street name, that is different from the official or most common name. For example, a highway number might be an alternate name for a street name.

Using alternate street names allows you to match an address to a feature using one of many names for the feature. The alternate names are provided in the Tele Atlas street segment dataset. Some segments may have as many as 5 alternate names. Each of these is listed in the alternate names table. The custom DPH address locator has the alternate names option included.

For example: If Bridge Street is also known as Slash Road, you can find the same location using 266 Bridge Street as you can using 266 Slash Road.

place-name alias
2. [geocoding] The formal or common name of a location, such as the name of a school, hospital, or other landmark. For example, "Memorial Hospital" is the place name for the address "893 Memorial Drive." In geocoding, the address locator can be set to accommodate the use of place-name aliases in place of their addresses for matching.

Searching for a location can be done either by the address or by its place-name alias. In a place-name alias table, each record represents one place-name and its associated address. When a place-name is entered as an input address, the address locator searches for the location based on the alias name's corresponding address.

Alias field
The place-name alias table must contain a field that stores the place-names. These are the names that will potentially be entered as the input address. For example, if the table contains a list of schools with their associated addresses, the field in the table that contains the actual school name is used as the Alias field. If the same address has multiple place-names, each name with the same corresponding address should be added to the table. If different addresses have the same place-name, additional zone information, such as City, State, or ZIP Code, should be provided in the table. For example, the table can have a record for Public Library with its address in Atlanta, GA, and another record for Public Library in Dallas, TX.

Address fields
Based on the address locator style you choose, the place-name alias table should contain the same set of address input fields used by the address locator. For example, if an address locator specifies Streets, City, State, and ZIP as the input fields for matching, the place-name alias table should have the same set of fields. These fields contain the actual addresses for the alias names.

The custom DPH address locator does NOT have the place name option included. However, the place name table can be added by the user when setting the final parameters for geocoding. See Karyn for more information.
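
The behavior of a place-name alias table can be pictured with a small Python sketch. This is not how ArcGIS stores or searches the table; the field names, the resolve_alias function, and the two Public Library street addresses are hypothetical and exist only to show how zone information separates identical place-names.

```python
# Hypothetical alias table: one record per place-name and address.
ALIAS_TABLE = [
    {"alias": "Memorial Hospital", "address": "893 Memorial Drive", "city": "", "state": ""},
    {"alias": "Public Library", "address": "1 Example St", "city": "Atlanta", "state": "GA"},
    {"alias": "Public Library", "address": "2 Example Ave", "city": "Dallas", "state": "TX"},
]


def resolve_alias(text, city="", state="", table=ALIAS_TABLE):
    """Swap a place-name for its address; return the input unchanged otherwise."""
    wanted = text.strip().upper()
    hits = [row for row in table if row["alias"].upper() == wanted]
    if len(hits) > 1:
        # The same place-name exists at several addresses: use the zones to choose.
        hits = [
            row for row in hits
            if row["city"].upper() == city.upper() and row["state"].upper() == state.upper()
        ]
    return hits[0]["address"] if len(hits) == 1 else text


print(resolve_alias("Memorial Hospital"))
print(resolve_alias("Public Library", city="Dallas", state="TX"))
print(resolve_alias("14 Main St"))
```

An ordinary street address passes through untouched, and an ambiguous place-name with no zone information is left as entered and not guessed.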

Understanding the Address Locator Parameters

address locator property
1. [geocoding] A parameter in an address locator that defines the process of geocoding.

These are the fields that are defined by the user for the current geocoding session. They should be edited for each geocode session by the individual user.

- Street or Intersection: The name of the field in your input dataset that contains the street elements.
- City or Placename: The name of the field in your input dataset that contains the town information. This can be either the Town Code or the Town Name.
- ZIP Code: The name of the field in your input dataset that contains the zip code. This field must have the preceding zeros: the address locator will not match 6450 to 06450.
- Output shapefile: Remember to set the location and name of your geocoded shapefile.

The geocoding options parameters for the custom DPH locator are defaulted to the values that result in the highest quality matches as determined by Karyn. These can be edited for each geocode session by the individual user.

Spelling Sensitivity: 80
This setting controls how much variation the address locator allows when it searches for likely candidates in the reference data. The spelling sensitivity setting for an address locator is a value between 0 and 100. A low value for spelling sensitivity allows Universty or Universe to be treated as match candidates for University. A higher value restricts candidates to exact matches. The spelling sensitivity does not affect the match score of each candidate; it only controls how many candidates the address locator considers. The geocoding process takes longer when you use a lower setting because the address locator has to process and compute scores for more candidates.

Minimum Candidate Score: 20
When an address locator searches for likely candidates in the reference data, it uses this threshold to filter the results presented. Locations that yield a score lower than this threshold are not presented. The minimum candidate score for an address locator is a value between 0 and 100. The minimum candidate score determines which candidates are presented in the Interactive Review and Find dialog boxes.

Minimum Match Score: 90
The minimum match score setting lets you control how closely addresses have to match their most likely candidate in the reference data to be considered a match for the address. The minimum match score for an address locator is a value between 0 and 100. A perfect match yields a score of 100. An address below the minimum match score is considered to have no match. When batch geocoding, the minimum match score must be met or exceeded to be considered a match. If more than one match is found, the candidate with the highest match score is assigned.
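
How the two score thresholds interact can be sketched in Python. This is a simplified picture and not the ArcGIS implementation: the pick_match function, the "matched"/"unmatched" labels, and the candidate scores are invented for illustration, and ties are not given any special handling here.

```python
def pick_match(candidates, min_candidate_score=20, min_match_score=90):
    """Apply the two thresholds to a list of (address, score) candidates.

    Candidates scoring below min_candidate_score are not presented at all.
    The best remaining candidate is assigned only if it meets min_match_score.
    """
    shown = sorted(
        (c for c in candidates if c[1] >= min_candidate_score),
        key=lambda c: c[1],
        reverse=True,
    )
    if not shown or shown[0][1] < min_match_score:
        return "unmatched", None, shown
    return "matched", shown[0][0], shown


# Made-up candidates and scores for the input address "14 Main St".
candidates = [("14 Main St", 96), ("14 Maine St", 81), ("14 Mann St", 15)]
status, best, shown = pick_match(candidates)
print(status, best, shown)
```

In this made-up case the 15-point candidate is never presented, the 81-point candidate is presented for review but is not good enough to be assigned, and the 96-point candidate becomes the match.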

Side Offset: 20 feet
An adjustable value that dictates how far away from either the left or right side of a line feature an address location should be placed. A side offset prevents a point feature from being placed directly over a line feature. A side offset that is too low can make it difficult to accurately join the points to polygons when the point is located on or near a boundary line. A side offset that is too high can result in a point being placed so far from the centerline that it lands in a neighboring polygon or parcel.

End Offset: 3 percent
An adjustable value that dictates how far away from the end of a line an address location should be placed. Using an end offset prevents the point from being placed directly over the intersection of cross streets if the address happens to fall at the beginning or end of the street.

Match If Candidates Tie: Unchecked
This option allows the geocoder to automatically match the input address to a candidate address when two or more candidates tie for the highest match score. The choice among the tied candidates will be arbitrary. Uncheck this option to prevent arbitrary matches when candidates are tied. Candidates that tie based on address elements but have the same x/y coordinates will be automatically matched even when this option is unchecked.

Output the X and Y Coordinates: Checked
This option is used to populate the geocoded data table with a field for X and a field for Y. The coordinates that are output will be in the projection that the current map is in at the time of geocoding unless otherwise specified in the Advanced Geometry Options window before geocoding (see next section).

composite address locator
1. [geocoding] A locator that will cycle through several individual locators. Addresses are matched using the style and settings of each locator.

Since it is not possible to re-run the locator on just a portion of the input dataset, a composite locator allows the user to define a hierarchy of locators. The locators in the hierarchy may use:
- More than one style (e.g., odd/even numbering, mixed numbering)
- More than one reference database (e.g., roof top, centerlines)
- More than one zone field (e.g., town name, town code, postal name)
- Different sets of parameters (e.g., change in spelling sensitivities)

The custom DPH locator is a composite that cycles through two locators based on the dual-streets style with alternate names: once using Town Code and Zipcode and once using Town Name and Zipcode. This gives users the flexibility to supply their input data with either town name or town code.
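
The idea of cycling through a hierarchy of locators can be sketched in Python as a simple fallback loop. Everything here is a toy stand-in: the lookup tables, the town code value, the coordinates, and the function names are hypothetical, and exact dictionary lookups replace the pattern matching that a real locator performs.

```python
# Hypothetical reference data; keys and coordinates are made up for illustration.
BY_TOWN_CODE = {("14 MAIN ST", "148", "06492"): (1000.0, 2000.0)}
BY_TOWN_NAME = {("14 MAIN ST", "WALLINGFORD", "06492"): (1000.0, 2000.0)}


def make_locator(table):
    """Build a toy locator that looks up street, town and zip in one table."""
    def locate(record):
        key = (record["street"].upper(), record["town"].upper(), record["zip"])
        return table.get(key)
    return locate


def geocode_composite(record, locators):
    """Try each locator in order and stop at the first one that finds a match."""
    for name, locate in locators:
        point = locate(record)
        if point is not None:
            return name, point
    return None, None


LOCATORS = [
    ("town_code", make_locator(BY_TOWN_CODE)),
    ("town_name", make_locator(BY_TOWN_NAME)),
]

record = {"street": "14 Main St", "town": "Wallingford", "zip": "06492"}
used, point = geocode_composite(record, LOCATORS)
print(used, point)
```

Because the record carries a town name and not a town code, the first locator finds nothing and the second one supplies the match, which is the flexibility the composite is meant to provide.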

GEOCODING AN ADDRESS DATASET

Whenever I start a new map, I always load the town layer first to set the projection to CT State Plane. Since we will be geocoding data, I also add the street centerline shapefile layer to the map so that the streets are viewable during interactive match.

Now add to the map the address tables that you want to geocode. Sample_addresses is a .dbf file. Hospital_addresses is an .xls file. Although ArcMap works with both formats, it prefers to use .dbf files.

Right click on the address table you want to geocode and select Geocode Addresses. This will open the Address Locator selection window. If the address locator you want to use is not listed, you may add one that has been previously created. Click Add and navigate to the intended address locator and add it to the listing. Now select your locator of choice and click OK.

Review the address locator properties as they pertain to this dataset.
- The Address Table will be the file that you right clicked on to geocode. If the file has more than one table, like this .xls file, you will be able to choose which table (or worksheet) you want to geocode.
- Review the address fields to be sure any auto-populated values are correct. If they are not auto-populated, use the drop downs to choose the appropriate fields. It is not required that you use all of the input fields. E.g., if you want to geocode on Street and Zip only, you can leave City and State as <None>.
- Set the path and filename for your geocoded results, which will be output as a shapefile with the geocoded information included as part of the attribute table.

Click on Advanced Geometry Options to set the spatial reference to our CT standard: NAD_1983_StatePlane_Connecticut_FIPS_0600_Feet.

If you want to change any of your predefined address locator properties, you may do so in the Geocoding Options window. E.g., should you want to initially geocode your data with a Minimum Match Score of 95, you can change that here.

Click OK to geocode the addresses. A status window will pop up with the match results.

Click on the Rematch button to open the interactive window to review your results. Alternatively, if you want to review the results of a file that has been previously geocoded, right click on the geocoded file and select Data > Review/Rematch Addresses.

This is the Interactive Rematch window that will serve as your one-stop spot for reviewing and editing your address matches.

From ESRI Help: Rematching with the Interactive Rematch dialog box

A typical workflow

Rematching a geocoded feature class can be done by using the Interactive Rematch dialog box in either ArcMap or ArcCatalog; however, working in ArcMap allows more options, such as viewing candidates and results on the map, rematching in an edit session, and reverse geocoding with the Pick Address from Map tool. A geocoded feature class is required to rematch an address.

When the Interactive Rematch dialog box is open in ArcMap, you can still interact with the map document. This capability allows you to use other tools to inspect candidates more thoroughly. For example, you can pan, zoom in and out, and use the Identify tool to ensure the address is placed in the correct area. You can also resize, minimize, and maximize the Interactive Rematch dialog box to make it easier to work with the rest of ArcMap.

The Interactive Rematch dialog box is shown below, and it is numbered to demonstrate a typical workflow. The steps are outlined below the graphic.

1. The Statistics panel shows how many of your original addresses are matched, tied, or unmatched.
2. The Geocoding results table shows the records from the geocoded feature class. It contains the original address data and attributes indicating the status, score, and matched address. You choose the address you want to interactively rematch by clicking a record in the table or using the record selector on the lower left side of the panel.
3. The Address panel displays the address that serves as input for matching. You can edit the information in the text box to possibly find a better match.
4. The candidates discovered for the address you selected in step 2 and modified in step 3 are displayed on the Candidates panel. You can examine the list of candidates and choose the one that you think matches your original address the best.
5. The Candidate details panel shows you the same attributes as the Candidates list but displays only one record at a time so it is easier to read.
6. Click the Match button to rematch the address you selected in step 2 to the candidate you chose in step 4. The output attributes (Status, Score, Match_type, Side, and Match_addr) are updated for the selected record in the Geocoding results table.
7. Select another record and repeat steps 3 to 6.

REMATCHING ADDRESSES

Rematching a geocoded feature class in ArcMap allows you to interact with the map while rematching the addresses.
o Click the address in the Geocoding results table that you want to rematch.
o Edit the input address, if necessary, in the Address text box or boxes.
o Click the Search button to search for candidates or refresh the list of candidates. The candidates are highlighted on the map.
o Click Zoom to Candidates to zoom to the set of candidates for the address.
o Click the candidate in the Candidates list that you want to match the address to. The candidate that you choose is highlighted on the map in yellow; the others are in blue.
o Click Match.
o Click the next record in the Geocoding results table or click the left or right arrow of the Record selector to navigate the list and repeat the rematch process.
o Click Refresh to update the list based on the specified result set query.

Once you have finished your interactive matching, you may move on to joining and summarizing the data.

NOTE: If you create new layers (e.g., join your data to another layer) using your geocode results and then subsequently make edits to your geocoded results, the edits will not be reflected in the new layer. You will have to re-create the layer to include edits to your geocoded points. Thus, it is recommended that you finalize your geocode results before continuing with your project.

JOINING GEOCODE POINTS TO SPATIAL BOUNDARIES

Although each geocoded address now has coordinates, we might also like to know which census tract each point falls into. When you join the geocoded results to a census tract shapefile, the program adds the census tract to the address table based on which census tract polygon the geocoded point falls within.

1) Add the shapefile of interest (e.g., census tract) to the map.
2) To JOIN the data points to the shapefile:
o Right click on the Geocoding Result layer.
o Select JOINS and RELATES > JOIN.
o In the JOINS window, select the option to join based on spatial location.
o Choose the layer you want to join your addresses to. In this example, it is TeleAtlas_Tract.
o Choose the location you want the joined output file to be saved to and then edit the default file name. Perhaps name the layer deaths06_joined_teleatlas_tract.
o Click OK. The joining process will take a while to complete. You can monitor the progress in the lower left corner of the ArcGIS window.
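The idea behind a join based on spatial location can be sketched in a few lines of plain Python. This is only an illustration of the "point falls inside polygon" rule using a standard ray-casting test, not the algorithm ArcMap itself runs; the function names, the square "tracts", and the tract numbers are invented for the example.

```python
def point_in_polygon(x, y, ring):
    """Ray-casting test: True if (x, y) lies inside the polygon ring."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point count.
        if (y1 > y) != (y2 > y):
            # x-coordinate where the edge crosses that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def spatial_join(points, tracts):
    """Copy each point record and add the TRACT whose polygon contains it."""
    joined = []
    for rec in points:
        tract_id = None  # stays None if the point is outside every tract
        for tid, ring in tracts.items():
            if point_in_polygon(rec["X"], rec["Y"], ring):
                tract_id = tid
                break
        joined.append({**rec, "TRACT": tract_id})
    return joined


# Hypothetical tracts (two adjacent squares) and geocoded points.
tracts = {
    "000100": [(0, 0), (10, 0), (10, 10), (0, 10)],
    "000200": [(10, 0), (20, 0), (20, 10), (10, 10)],
}
points = [
    {"ID": 1, "X": 3.0, "Y": 4.0},
    {"ID": 2, "X": 15.0, "Y": 2.5},
    {"ID": 3, "X": 25.0, "Y": 5.0},
]
joined = spatial_join(points, tracts)
for rec in joined:
    print(rec["ID"], rec["TRACT"])
```

As in the ArcMap join, each output record keeps its original fields and gains the tract attribute; a point outside all polygons simply gets no tract value.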

To join the geocoded addresses to the census tracts:
o Choose the option to join the data based on spatial location.
o Choose the spatial layer you want the data joined with.
o Leave the option set to "it falls inside".
o Choose the name and location to save the joined dataset.
o Click OK.

The joined file will now be added as a layer in ArcMap, and the census tract variables will be added to the end of the data table. Right-click on the joined dataset and choose the Open Attribute Table option. The table of all of the variables appears in a new window. Scroll to the right and find the variable of interest. In our case, it is TRACT. Right click on TRACT and choose the Summarize option.

SUMMARIZING JOINED DATA and DISPLAYING IT

After joining individual addresses to a shapefile like the census tract, you may want to summarize the number of cases that fall within each tract and display these counts in a map.

To summarize the address counts by census tract:
o Right click on your joined layer and select Open Attribute Table.
o Right click on the TRACT column heading and choose Summarize. TRACT will appear as the default field for summarizing. You can change the field here to whichever variable you want to be the primary summary variable.
o The summary statistic by default is N (number of cases), so you will not need to select an option here.
o In the next option, you can choose additional fields for which you want a summary value and the statistic by which you want them summarized.
o Choose the location you want the Summary output file saved to and the file name. Perhaps name this Summary_death06_by_tract.
o Click OK. Then, say YES to add the result to the existing map.

To join the census tracts to the count summary:
o Right click on the layer you want the summary variable to be mapped onto and choose JOINS and RELATES > JOIN. In our example, you want to right-click on the TeleAtlas_Tract shapefile layer.
o In the first option of the Join Data window, choose Join attributes from a table.
o Choose the appropriate layer for the 2nd option first (this will define the choices for the 1st and 3rd options). In our example, you want to join the census tract to the summary_deaths_by_tract data layer.
o For the 1st option, you want to join the layers by the TRACT variable.
o For the 3rd option, you should choose the same variable (TRACT) as in the 1st option, or the two datasets won't have a common value by which the tables can be joined.
o Click OK.

Finally, it is necessary to symbolize the counts so they display in the map.
o Right click on the census tract layer and select Properties. Within this window, choose the Symbology tab.
o In the menu on the left, change the default symbol option from Features > Single Symbol to Quantities > Graduated Colors.
o For the Fields options, change the Value field to the appropriate summary variable. In our example, you want to select the Count_TRACT variable since this contains the count of all the cases for each census tract.
o No normalization parameter is needed unless otherwise determined by you.
o You can select the color ramp and classification breaks as needed for your display goals.

Click Apply. To view only the graduated-color census tracts, remove the individually mapped cases from the display by unchecking those layers in the layer list. You may also find it helpful to display the census tract number on the map. To do this, right click on the TeleAtlas_Tract layer and click Label Features so that a check mark appears on the left.

In the Summarize window, select the primary variable you want to summarize for option 1 (by default, this will be the variable you right-clicked on). The summary statistic for this variable is the count of cases by tract. Choose any additional variables that you may want to summarize geographically. Note that only a limited set of summary statistics is available. Name and locate your output file. You will probably want to use this output summary file outside of ArcMap since it contains all of the joined and summary information. Click OK. Then click YES to add the summary table into the existing map so that the counts can be mapped.
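What the Summarize option computes for the primary variable is simply a count of records per distinct value. A minimal Python sketch of that idea follows; the helper summarize_by and the sample records are hypothetical, and the output field is named Count_TRACT only to mirror the field used in this handout.

```python
from collections import Counter


def summarize_by(records, field):
    """Count records per distinct value of `field`, skipping missing values."""
    counts = Counter(rec[field] for rec in records if rec[field] is not None)
    # One output row per distinct value, sorted for a stable table order.
    return [
        {field: value, "Count_" + field: n}
        for value, n in sorted(counts.items())
    ]


# Hypothetical joined records: each geocoded case with its census tract.
joined_records = [
    {"ID": 1, "TRACT": "000100"},
    {"ID": 2, "TRACT": "000200"},
    {"ID": 3, "TRACT": "000100"},
    {"ID": 4, "TRACT": None},  # a point that fell outside every tract
]

summary = summarize_by(joined_records, "TRACT")
for row in summary:
    print(row)
```

The resulting table has one row per tract, which is what makes it suitable for joining back to the tract polygons by the TRACT field.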

To map the count summaries by census tract:
o Right-click on the census_tract_projected layer and choose JOINS and RELATES > JOIN.
o Change the join-to option to Join attributes from a table.
o Then define the 2nd option to be the summary table of the deaths by census tract.
o Now choose the variable by which the layers will be joined. For both the 1st and 3rd options, select TRACT as the variable to base the join on.
o Click OK.

NOTE: It will look like nothing has changed in your ArcMap window, but the layers have been joined.

To display the count summaries by tract, you must properly symbolize the joined census_tract_projected layer.
o Right-click on the census_tract_projected layer and select Properties at the bottom of the menu.
o Click on the Symbology tab.
o Change the menu option on the left from Features > Single Symbol to Quantities > Graduated Colors.
o Change the value in the Fields section to the appropriate summary variable, which is Count_TRACT in our example.
o Change the color ramp and classification breaks if needed.
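The attribute join and the graduated-color classes can be pictured with a short Python sketch. It joins a summary table to tract rows on a shared TRACT key and then assigns each count to an equal-interval class. Both functions and all sample values are assumptions for illustration; equal interval is used here only because it is easy to follow, and it is not necessarily the classification method ArcMap applies by default.

```python
def attribute_join(target_rows, join_rows, key):
    """Append the join table's fields to each target row with the same key."""
    lookup = {row[key]: row for row in join_rows}
    join_fields = [name for name in join_rows[0] if name != key]
    joined = []
    for row in target_rows:
        match = lookup.get(row[key])
        merged = dict(row)
        for name in join_fields:
            # Tracts with no summary row get None, like a null after a join.
            merged[name] = match[name] if match else None
        joined.append(merged)
    return joined


def equal_interval_class(value, vmin, vmax, n_classes):
    """Return the 0-based class index of `value` for equal-width classes."""
    if vmax == vmin:
        return 0
    width = (vmax - vmin) / n_classes
    # The maximum value would land one past the end, so clamp it.
    return min(int((value - vmin) / width), n_classes - 1)


# Hypothetical tract layer rows and summary table.
tract_rows = [{"TRACT": "000100"}, {"TRACT": "000200"}, {"TRACT": "000300"}]
summary_rows = [
    {"TRACT": "000100", "Count_TRACT": 10},
    {"TRACT": "000200", "Count_TRACT": 4},
]

joined_tracts = attribute_join(tract_rows, summary_rows, "TRACT")
counts = [r["Count_TRACT"] for r in joined_tracts if r["Count_TRACT"] is not None]
for r in joined_tracts:
    if r["Count_TRACT"] is not None:
        r["color_class"] = equal_interval_class(
            r["Count_TRACT"], min(counts), max(counts), 3
        )
    print(r)
```

The sketch also shows why both sides of the join must use the same TRACT values: a tract with no matching key in the summary table ends up with an empty count and no color class.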

ADDING LATITUDE and LONGITUDE TO YOUR DATASET

Since lat/long coordinates are universal (unlike the CT State Plane coordinate system that was used to create the X/Y coordinates in your current table), you may want to add Latitude and Longitude to the dataset as additional fields.

1) In order to add Latitude and Longitude coordinates to your data table, the data must be projected out of the CT State Plane projection and into the geographic coordinate system GCS_North_American_1983 (North American Datum 1983).
o Open ArcToolbox > Data Management Tools > Projections and Transformations > Features > Project.
o Select your joined dataset to Project.
o Name and place your output shapefile.
o Choose GCS_North_American_1983 for your Output Coordinate System.
o Click OK.

2) Once the data is re-projected, run the Add XY Coordinates tool: ArcToolbox > Data Management Tools > Features > Add XY Coordinates. This tool adds the X/Y coordinates for every point to the table based on the layer's current coordinate system. Since the current coordinate system is GCS_North_American_1983, the coordinates added are Longitude (X) and Latitude (Y).

The newly added coordinates are usually labeled as Point_X and Point_Y and appear at the end of the data table. (Longitude is Point_X and Latitude is Point_Y.) These Lat/Long coordinates should not be confused with the existing X and Y State Plane coordinates that remain in the table. I recommend that you set the alias names for the Point_X and Point_Y coordinates to X_Longitude and Y_Latitude to prevent confusion. You may also choose to alias X and Y as X_CtSP and Y_CtSP to indicate that these coordinates are CT State Plane. If you set alias names for the fields, these changes will not be saved to the layer's source file (T:\GIS_Projects\ \deaths06_joined_teleatlas_tract_wlatlon.shp). They will appear only in the map view.
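Because the table now holds two sets of coordinates, a quick range check can catch a mix-up between longitude, latitude, and State Plane feet. The sketch below is an assumption-laden illustration: the bounding box for Connecticut is approximate, and the function name and sample values are invented for the example.

```python
# Approximate bounding box for Connecticut in decimal degrees (assumption).
CT_LON_RANGE = (-73.8, -71.7)
CT_LAT_RANGE = (40.9, 42.1)


def looks_like_ct_lonlat(point_x, point_y):
    """True if point_x is a plausible CT longitude and point_y a CT latitude."""
    lon_ok = CT_LON_RANGE[0] <= point_x <= CT_LON_RANGE[1]
    lat_ok = CT_LAT_RANGE[0] <= point_y <= CT_LAT_RANGE[1]
    return lon_ok and lat_ok


# Hypothetical rows: a correct pair, a swapped pair, and State Plane feet.
print(looks_like_ct_lonlat(-72.68, 41.76))         # X = longitude, Y = latitude
print(looks_like_ct_lonlat(41.76, -72.68))         # swapped by mistake
print(looks_like_ct_lonlat(1010000.0, 840000.0))   # still in State Plane feet
```

A check like this is also a reminder of the naming convention above: X carries longitude (negative in Connecticut) and Y carries latitude.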

The final map looks like this and the final dataset looks like this.

Additional Notes/Comments: Why use relative versus absolute paths?

Using absolute paths, the following are true:
o You can move the document or toolbox anywhere on your computer and the data will be found when you reopen the document or tool.
o On most personal computers, the location of data is usually constant. That is, you typically don't move your data around much on your personal computer. In such cases, absolute paths are preferred.
o You can reference data on other disk drives.

Using relative paths, these adjustments are necessary:
o When moving a map document or toolbox, you must also move the referenced data.
o When delivering documents, toolboxes, and data to another user, relative paths should be used. Otherwise, the recipient's computer must have the same directory structure as yours.
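The difference between the two path styles can be seen by resolving a stored data path against the location of the map document. The following Python sketch uses the standard ntpath module so that Windows-style paths behave the same on any machine; the function resolve_data_path and every folder and file name in it are hypothetical.

```python
import ntpath  # Windows path rules, usable on any operating system


def resolve_data_path(document_path, stored_path):
    """Return the full path a document would use to find its data.

    An absolute stored path is used as-is; a relative stored path is
    interpreted from the folder that contains the document.
    """
    if ntpath.isabs(stored_path):
        return ntpath.normpath(stored_path)
    document_folder = ntpath.dirname(document_path)
    return ntpath.normpath(ntpath.join(document_folder, stored_path))


absolute_ref = r"C:\work\project\data\tracts.shp"
relative_ref = r"data\tracts.shp"

# The document in its original folder, then after delivery to another drive.
for doc in (r"C:\work\project\map.mxd", r"D:\delivered\project\map.mxd"):
    print(doc)
    print("  absolute ->", resolve_data_path(doc, absolute_ref))
    print("  relative ->", resolve_data_path(doc, relative_ref))
```

With the absolute reference, the document can move but the data must stay where it is; with the relative reference, the data must travel alongside the document, which is why relative paths suit projects handed to another user.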