Assembling Datasets for Species Distribution Models GIS Cyberinfrastructure Course Day 3
Objectives
Assemble specimen-level data and associated covariate information for use in a species distribution model
Begin processing data without detailed instructions for previously learned tools
Effective workflow and file management
Effective troubleshooting
Workflow
Data (facts): what is available. Derived Datasets (information): the murky, in-between stuff. Knowledge: the solution and the power!
Work the process backwards: What is the question? What datasets are necessary to answer the question? What data is available to create the datasets?
Courtesy of: GeoSpatial Technologies at Work
Workflow End Product: point layer including specimen point observations and associated environmental covariates Desired Data: specimen observations, elevation, climate, land use/land cover, roads, hydrography Data Sources: IPANE, USGS NED, WorldClim, Coastal Change Analysis Program (CCAP), ISCGM, National Atlas What is your rough workflow? What should we set up before we begin?
DEM Data You downloaded New England DEM data from the course website last time. This data was assembled from USGS NED data. Before proceeding, can you determine: What is the NED source data? What is the spatial resolution? What are the datum and projection? Check the NED FAQ page (http://seamless.usgs.gov/faq_listing.php?id=2) to help answer these questions. For you, the original NED data was aggregated to 90m resolution using the Resample tool and reprojected to Albers Equal Area projection using the Project Raster tool.
Climate Data ASCII files are compatible with ArcMap, but must be imported (not added directly) What are ASCII files? Go to the course website and click Data: New England Climate to download the zip data file. The data source is WorldClim Once the download is complete, extract the files
Climate Data In ArcMap, open the ASCII to Raster tool (Conversion Tools > To Raster > ASCII to Raster). Navigate to your NE Climate ASCII folder and select bio1_ne.txt as the Input ASCII raster file. Name the Output raster bio1_ne. This will create a Grid dataset. You can select either Integer or Floating Point data types. What data type should you select if you don't know the data format? How could you investigate the data type?
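One way to investigate the data type outside ArcMap is to read the text file directly. The sketch below is a minimal Python example and not part of the course materials. It assumes the file follows the usual Esri ASCII grid layout (header lines such as ncols and NODATA_value, then rows of numbers). The function name inspect_ascii_grid and the sample values are made up for illustration.

```python
import io

def inspect_ascii_grid(lines, max_rows=50):
    """Read an Esri ASCII grid header and guess whether cells are integer or float."""
    header = {}
    rows_seen = 0
    has_fraction = False
    for line in lines:
        parts = line.split()
        if not parts:
            continue
        # Header lines look like "ncols 3"; data lines start with a number.
        if parts[0][0].isalpha():
            header[parts[0].lower()] = float(parts[1])
            continue
        nodata = header.get("nodata_value")
        for token in parts:
            value = float(token)
            if nodata is not None and value == nodata:
                continue  # skip NoData cells
            if value != int(value):
                has_fraction = True
        rows_seen += 1
        if rows_seen >= max_rows:
            break  # a sample of rows is enough for a first guess
    return header, ("float" if has_fraction else "integer")

# Made-up sample standing in for a file such as bio1_ne.txt.
sample = io.StringIO(
    "ncols 3\nnrows 2\nxllcorner -73.5\nyllcorner 41.0\n"
    "cellsize 0.5\nNODATA_value -9999\n"
    "71 68 -9999\n65 60 58\n"
)
header, guess = inspect_ascii_grid(sample)
print(header["ncols"], header["nrows"], guess)
```

To try it on a real file, pass an open file object instead of the sample, for example open("bio1_ne.txt"). Only the first max_rows data rows are checked, so a file with no fractions in that sample is only probably integer.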
Climate Data Once you've decided the proper data type, select it from the dropdown list and click OK to run the tool. The output will be automatically added to the map. Open the Layer Properties Source tab. What is the spatial reference?
Climate Data Where can we find the WorldClim spatial reference information? Once we have the spatial reference information, we need to assign it to the new raster. Open the Define Projection tool (Data Management > Projections and Transformations > Define Projection). Select bio1_ne as the Input Dataset and set the appropriate Coordinate System. Click OK to run the tool. Does the raster align with the other map layers? Are the datum and projections the same?
Climate Data We now know that the spatial reference of the climate data is in geographic coordinates using WGS 84 datum, which does not match our other data. Use the Project Raster tool to reproject into the proper projection. Import the other climate ASCII file that corresponds to Annual Precipitation and project the layer with the Project Raster Tool.
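Define Projection only labels the data with its existing coordinate system, while Project Raster recalculates the coordinates themselves. The sketch below illustrates that recalculation for a single point using the spherical form of the Albers Equal Area equations. It is a simplified illustration, not what the Project Raster tool does internally: it assumes a spherical Earth, and the standard parallels, origin, and central meridian are illustrative values similar to those often used for the contiguous US, not necessarily the ones used in the course data.

```python
import math

# ASSUMPTIONS: spherical Earth and illustrative Albers parameters;
# the projection used for the course layers may differ.
R = 6371000.0                # mean Earth radius in metres
LAT_1, LAT_2 = 29.5, 45.5    # standard parallels (degrees)
LAT_0, LON_0 = 23.0, -96.0   # latitude of origin, central meridian (degrees)

def albers_forward(lon_deg, lat_deg):
    """Convert geographic degrees to Albers Equal Area x, y in metres (sphere)."""
    phi1, phi2 = math.radians(LAT_1), math.radians(LAT_2)
    phi0, lam0 = math.radians(LAT_0), math.radians(LON_0)
    phi, lam = math.radians(lat_deg), math.radians(lon_deg)
    n = (math.sin(phi1) + math.sin(phi2)) / 2
    c = math.cos(phi1) ** 2 + 2 * n * math.sin(phi1)
    rho = R * math.sqrt(c - 2 * n * math.sin(phi)) / n
    rho0 = R * math.sqrt(c - 2 * n * math.sin(phi0)) / n
    theta = n * (lam - lam0)
    return rho * math.sin(theta), rho0 - rho * math.cos(theta)

# A made-up point in New England, given as longitude and latitude.
x, y = albers_forward(-72.25, 41.81)
print(round(x), round(y))
```

The input is in degrees and the output is in metres, which is why a raster that is only defined as WGS 84 will not line up with Albers layers until it is actually projected.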
Climate Data Right click on one climate layer in the Table of Contents and select Zoom to Layer Does the raster align with the other map layers? Are the datum and projection now the same as the other layers?
LULC Data Go to the course website: http://web2.uconn.edu/cyberinfra/module1/outline.html Save and extract the New England LULC data provided for Day 3. The data source is the 2006 CCAP data (http://www.csc.noaa.gov/digitalcoast/data/ccapregional/index.html). Be sure to check the data (ArcCatalog and/or ArcMap). Are these raster or vector data? Where do the data come from? What is the spatial resolution? What are the datum and projection? Reproject if needed.
LULC Data Open the LULC layer attribute table How many classes are included? Change the symbology (hint: open the layer properties) to Unique Values and display the Description field. Click Add All Values if they are not displayed in the preview screen Click OK
Transportation Networks Go to the ISCGM website: http://www.iscgm.org/cgi-bin/fswiki/wiki.cgi Select Download from the left-hand panel Click Go to registration page under the login boxes Fill in the required information and then check your email Once you receive your user id and password, return to the download page and log in From the available options, select Global Map V.1/V.2
Transportation Networks Select North America from the Continent list and select United States from the Country list Select the Trans option and save the file. The download will take a few minutes.
Transportation Networks Unzip the downloaded folder (gm-usa-trans_1_0) and add the area:trans layer to the map (navigate down a few layers in the files to find this) What was added to the map? What is the extent of the added data? What is the spatial reference for the data? We will focus on the roadl layer. What is this layer?
Transportation Networks Clip the roadl layer if necessary. You will need your New England Boundary file in the proper projection for this. Be sure the projections match BEFORE processing! Suggested processing: reproject New England Boundary file to match roadl layer then clip. BEWARE: remove the : from the output file path when running tools! Reproject the clipped roadl layer if necessary Why is it more efficient to reproject the data after clipping? Is this order of operations always an option?
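The colon warning above applies to any output name built from a layer name such as area:trans. As a minimal sketch, a small Python helper can replace unsafe characters before the name is used in an output path. The helper name safe_output_name is hypothetical, and the rule it applies (letters, digits, and underscores only) is a conservative assumption rather than an ArcMap requirement.

```python
import re

def safe_output_name(layer_name):
    """Replace characters that are unsafe in output file names with underscores."""
    # Keep letters, digits, and underscores; anything else (":", spaces, "-") becomes "_".
    return re.sub(r"[^A-Za-z0-9_]", "_", layer_name)

print(safe_output_name("area:trans"))  # prints area_trans
```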
Hydrography Data Navigate to the National Atlas: http://www-atlas.usgs.gov/atlasftp-na.html#hydro0m Select the North American Atlas- Hydrography shapefile and save the file
Hydrography Data Extract the saved folder, then extract the folder within that. Add the layers hydro_p and hydro_l to the map. Reproject both layers to WGS84. Once reprojected, clip the hydro_p and hydro_l layers to the reprojected New England Boundary file you just created to clip roads.
Hydrography Data Change the symbology of the hydro_p layer to Unique Values and set the Value for the display to TYPE The map still looks a bit odd, so open the symbology options again Right click the value 9999 and select remove value. Repeat this for value 15. Also, uncheck the <all other values> option. Click OK You should now have only 2 types of hydro polygon features displayed: 16 and 18 What feature types are represented by these codes?
Hydrography Data Changing the symbology only changes how the data are displayed on the map. What we really want is to subset the polygon hydrography layer to include only types 16 and 18. Open the Select by Attributes tool (Selection menu > Select by Attributes). Set the Layer as the reprojected and clipped polygon hydro layer. Set the Method to Create a new selection. Create the expression using the operator and Get Unique Values buttons. Click OK to run the tool.
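The logic of the selection can be sketched in a few lines of Python. The records below are made up and stand in for rows of the polygon hydro attribute table. The ArcMap expression would be something along the lines of "TYPE" = 16 OR "TYPE" = 18, although the exact field delimiters depend on the data source.

```python
# Made-up attribute records standing in for the polygon hydro layer.
records = [
    {"FID": 0, "TYPE": 16},
    {"FID": 1, "TYPE": 9999},
    {"FID": 2, "TYPE": 18},
    {"FID": 3, "TYPE": 15},
    {"FID": 4, "TYPE": 16},
]

# Same idea as the expression: "TYPE" = 16 OR "TYPE" = 18
keep_types = {16, 18}
selected = [rec for rec in records if rec["TYPE"] in keep_types]

print([rec["FID"] for rec in selected])  # prints [0, 2, 4]
```

Unlike the symbology change, this produces an actual subset of records that can be exported as a new dataset.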
Hydrography Selection Once the tool is complete, open the attribute table and click the lower button to display only the selected records. Verify that your selections include only the types you desire. Create a layer from the selected features (Recall: right click on the layer name in the Table of Contents > Selection > Create Layer from Selected Features).
Hydrography Selection Recall that this process creates only a temporary layer file that must be converted to a shapefile! Previously, we have used the Feature Class to Shapefile (multiple) tool. Here we will use a second method to make a shapefile (not better or worse, just different) Right click on the new layer you just created from the polygon hydro selection Select Data, then Export Data
Hydrography Selection Export All features and use the same coordinate system as the layer's source data. Set the file path to your day 3 folder (and whatever subfolder you think is appropriate) and name the shapefile (for example: hydro_poly_wgs84_ne.shp). Click OK. When prompted, add the new shapefile to the map. Remove the temporary layer. Clear all selections. How would you check to see if the selected features were cleared? Reproject both hydro layers into Albers.
Specimen-level Data Pick an IPANE species and download data Bring table into ArcMap as done previously and create a new shapefile Quality control the data Check for and correct (if possible) or delete errant points What datum and projection are used for the IPANE data? Reproject the IPANE points to match the other data layers if necessary
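A first pass at quality control can be automated before looking at the points on the map. The sketch below flags records that fall outside a rough bounding box. It assumes the coordinates are decimal degrees, and the bounding box values are an approximation of New England chosen for illustration; both should be checked against the actual IPANE download. The function name flag_errant_points and the sample records are made up.

```python
# ASSUMPTION: a rough bounding box for New England in decimal degrees.
MIN_LON, MAX_LON = -73.8, -66.8
MIN_LAT, MAX_LAT = 40.9, 47.5

def flag_errant_points(points):
    """Split (id, lon, lat) records into plausible and suspect lists."""
    ok, suspect = [], []
    for point_id, lon, lat in points:
        inside = MIN_LON <= lon <= MAX_LON and MIN_LAT <= lat <= MAX_LAT
        (ok if inside else suspect).append((point_id, lon, lat))
    return ok, suspect

# Made-up records showing common coordinate errors.
sample = [
    ("a", -72.25, 41.81),   # plausible
    ("b", 72.25, 41.81),    # longitude sign dropped
    ("c", 41.81, -72.25),   # latitude and longitude swapped
    ("d", 0.0, 0.0),        # missing coordinates stored as zero
]
ok, suspect = flag_errant_points(sample)
print([p[0] for p in suspect])  # prints ['b', 'c', 'd']
```

Points flagged this way still need a human decision: some can be corrected (a dropped sign, swapped columns) and others can only be deleted.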
Data Check
You should now have 8 working data files for New England:
1 elevation raster
2 climate rasters
1 LULC raster
1 road layer
1 polygon hydrography layer
1 line hydrography layer
1 species occurrence file
Ensure that all data files are in the same datum and projection (Albers). Reproject any mismatched layers. Assess for yourself how well you have managed your data files. Are there ways that you could have stored them more efficiently or in a more organized way?
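Keeping a simple inventory makes this check easier. The sketch below assumes you have written down each layer's coordinate system by hand from its Source tab. The layer names and coordinate system labels are hypothetical, and one layer is deliberately left in geographic coordinates to show what a mismatch looks like.

```python
# Hypothetical inventory: layer name -> coordinate system noted from the Source tab.
layers = {
    "dem": "Albers",
    "climate_temp": "Albers",
    "climate_precip": "GCS_WGS_1984",   # deliberately mismatched for the example
    "lulc": "Albers",
    "roads": "Albers",
    "hydro_poly": "Albers",
    "hydro_line": "Albers",
    "species_points": "Albers",
}

TARGET = "Albers"
mismatched = sorted(name for name, crs in layers.items() if crs != TARGET)

print(len(layers), mismatched)  # prints 8 ['climate_precip']
```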
Assemble Dataset Use the Extract Multi Values to Points tool to add data from the two climate layers, the DEM, and the LULC layer to the specimen-level observations. Open the resulting attribute table. Do all of the appended values make sense?
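Conceptually, extracting raster values to points means finding the cell that contains each point and copying that cell's value into the point's attribute table. The sketch below shows the idea in plain Python under simplifying assumptions: small made-up grids stored as lists of rows (north row first), a shared lower-left corner and cell size, and no interpolation. It is an illustration of the concept, not the tool's implementation, and sample_grid is a hypothetical helper.

```python
def sample_grid(grid, xll, yll, cellsize, x, y, nodata=None):
    """Return the value of the cell containing (x, y), or None if outside the grid."""
    nrows, ncols = len(grid), len(grid[0])
    col = int((x - xll) // cellsize)
    # Row 0 is the northern edge, so count rows down from the top.
    ytop = yll + nrows * cellsize
    row = int((ytop - y) // cellsize)
    if not (0 <= row < nrows and 0 <= col < ncols):
        return None
    value = grid[row][col]
    return None if value == nodata else value

# Made-up 2 x 2 rasters with 90 m cells and a lower-left corner at (0, 0).
dem = [[120.5, 130.0], [110.2, 95.7]]
lulc = [[2, 2], [11, 5]]
points = [("p1", 45.0, 135.0), ("p2", 100.0, 10.0), ("p3", 500.0, 10.0)]

table = [
    {"id": pid,
     "dem": sample_grid(dem, 0.0, 0.0, 90.0, x, y),
     "lulc": sample_grid(lulc, 0.0, 0.0, 90.0, x, y)}
    for pid, x, y in points
]
for row in table:
    print(row)
```

The third point falls outside the grid and gets no value, which is one of the things to look for when checking whether the appended values make sense.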
Results We will continue work on this dataset next week, so be sure to save and organize your work! Ultimately, we will have an analyzable dataset for a species distribution model.
Floating Point Rasters Try to open the attribute table for the DEM layer. You will notice that it is not selectable. Open the Source tab of the Layer Properties. You should already know that this is a raster layer, but the pixel type is also important. Here, the pixel type is floating point. This means that fractional values are allowed. This also means that an attribute table cannot be constructed. How can you access the data contained in the DEM?
DEM Attribute Table If you are willing to lose some precision, you can create an attribute table. First, convert the pixel type from floating point to integer. Use the Int tool (Spatial Analyst > Math > Int). Name the new raster NE_DEM_int. Click OK (this will process slowly).
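The precision lost depends on how the conversion is done. The Int tool truncates values rather than rounding them, and Python's int() behaves the same way, so the effect can be previewed on a few made-up elevations. The rounding variant shown for comparison (adding 0.5 before taking the floor) is a common workaround, included here as an illustration rather than a course step.

```python
import math

# Made-up floating point elevation values in metres.
elevations = [102.7, 0.4, -0.6, 1916.9]

# Truncation drops the fractional part (toward zero).
truncated = [int(v) for v in elevations]

# Rounding to the nearest whole number, for comparison.
rounded = [math.floor(v + 0.5) for v in elevations]

print(truncated)  # prints [102, 0, 0, 1916]
print(rounded)    # prints [103, 0, -1, 1917]
```

With truncation the error is always less than 1 m per cell, but it is biased in one direction for positive elevations.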
DEM Attribute Table The resulting raster should automatically have an attribute table. If not, we can build one. From ArcToolbox, select Data Management Tools > Raster > Raster Properties > Build Raster Attribute Table. Select NE_DEM_int from the Input Raster dropdown list and click OK to run the tool.
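A raster attribute table is essentially a list of the unique cell values with a count of cells for each. The sketch below builds such a table for a tiny made-up integer grid. The Value and Count column names mirror what ArcMap displays, but the code is only an illustration of the idea.

```python
from collections import Counter

# Made-up integer DEM, one inner list per raster row.
int_dem = [
    [102, 102, 98],
    [98, 98, 101],
]

# Count how many cells hold each unique value.
counts = Counter(value for row in int_dem for value in row)
table = [{"Value": v, "Count": counts[v]} for v in sorted(counts)]

for record in table:
    print(record)
```

In a floating point raster nearly every cell can hold a different value, so a table like this would have close to one row per cell, which helps explain why no attribute table is offered for that pixel type.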
DEM Comparison Use the Identify tool to compare values of the floating point and integer DEM rasters. For most applications, do you think having access to the attribute table is necessary? Is the precision reduction acceptable?
Skills Summary
Import ASCII files
Define projection vs. reprojection
Use additional data servers
Modify display classes
Select by Attributes
Create shapefiles using the Export Data function
Manipulate attribute table fields
Raster pixel type conversion
Effective workflow and data management
Effective troubleshooting