I. Project Title
Light Detection and Ranging (LIDAR) Processing

II. Lead Investigator
Ryan P. Lanclos
Research Specialist
107 Stewart Hall
Department of Geography
University of Missouri-Columbia
Columbia, Missouri 65211
Ph: 573-882-2149
Email: LanclosR@missouri.edu

III. Research Goals / Objectives
In the Spring of 2000, the Institute for the Development of Commercial Remote Sensing Technologies (ICREST) at the University of Missouri acquired Light Detection and Ranging (LIDAR) data over 70 square kilometers, centered on the City of Springfield, Missouri. Numerous surface modeling methods were evaluated for vertical accuracy against the source data and other ancillary data, including USGS 7.5-minute DEMs, to determine the most viable option for modeling. Once that option was determined, high-accuracy first-return and bare-earth surface models were generated for the flight area. Using these surface models, automated building extraction was performed on the data, yielding 69,144 building polygons after first-iteration cleaning, for use in the City's Geographic Information System (GIS). This presentation discusses the LIDAR processing protocol from start to finish, including raw ASCII text transformations, surface model creation, automated building extraction, polygon cleaning, and LIDAR data limitations.

IV. Summary of Research Activities

Elevation Accuracy Assessment
Our first goal was to assess the accuracy of the first-return elevation values delivered for the flight by TerraPoint, LLC. Using a truth set consisting of 60 static Global Positioning System (GPS) points and approximately 49,000 kinematic points collected in the Urban Validation Site of Springfield, Missouri, various surface modeling techniques were assessed to determine the most accurate surface model. The GPS survey was conducted using a survey-grade Ashtech Z-Surveyor GPS system and post-processed to sub-meter accuracy.
Raw first-return LIDAR data for the Urban Validation Site (UVS) was processed and converted into various surface models, including a Triangulated Irregular Network (TIN), lattice, point grid, topo grid, and multiple kriging variations. As a control check, the USGS 30-meter DEM for the UVS was tested along with the elevations derived from the static and kinematic GPS points. To perform this accuracy assessment, the following protocol was used for each surface. First, a profile was taken across each surface model in the same location and densified to add vertices. The densified arc was converted to a point file containing the location and elevation of each point along that line. The resulting point file
elevations were assessed against the truth to determine the absolute difference in elevation, and thus the overall elevation accuracy of the surface. The underlying assumption of this method is that the data are continuous, but we found that, due to signal absorption and errors in the data, that is not always the case. The method did, however, allow us to determine the best surface for future analysis and should be considered the standard when conducting future work with the data set. Results shown in Figure 1 indicate that, after completion of the elevation accuracy assessment, the lattice model yielded the best accuracy for the entire UVS, with an overall absolute elevation difference of 0.620 meters. Elevation accuracy for the USGS DEM was much worse, with an absolute difference of 2.056 meters.

Figure 1: Surface Model Accuracy for the Urban Validation Site.

UVS Surface Model Accuracy Results (absolute elevation difference, m):

Data Source      TIN/Lattice   Point Grid   Topo Grid   USGS 30-m DEM
LIDAR (raw)      0.620         0.628        0.756       2.056
Static GPS       0.346         0.392        0.386       0.655
Kinematic GPS    1.529         1.726        1.592       1.664

LIDAR Processing
Each raw LIDAR text file was uncompressed using WinZip and transferred via File Transfer Protocol (FTP) to a workspace on an RS6000 UNIX machine. The raw text file was checked with the Unix wc command to determine its number of lines. If a file contained more than 5 million lines, it was split into smaller files of no more than 5 million lines each for ease of processing. Each sub-file was processed with an in-house Arc Macro Language (AML) program, which rearranged the text into the comma-delimited format necessary for generation of an Arc coverage. The resulting comma-delimited files were loaded into ArcView as tables and added as Event Themes. Each Event Theme was then converted to an ArcView shapefile and then to a coverage in ArcInfo using the shapearc command.
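The line-counting, splitting, and comma-delimited rearrangement steps above were performed with wc and an in-house AML program; a minimal Python sketch of the same idea is below. The raw column layout (x, y, z in the first three whitespace-separated fields) is an assumption, since the actual TerraPoint record format is not given here.

```python
import os

MAX_LINES = 5_000_000  # split threshold used in the protocol


def split_and_reformat(path, max_lines=MAX_LINES):
    """Split a raw whitespace-delimited LIDAR text file into
    comma-delimited sub-files of at most max_lines lines each.

    Assumption: the first three fields of each record are x, y, z;
    the real TerraPoint layout may differ.
    """
    part, out, written = 0, None, 0
    with open(path) as src:
        for line in src:
            fields = line.split()
            if len(fields) < 3:
                continue  # skip malformed or blank lines
            if out is None or written >= max_lines:
                if out:
                    out.close()
                part += 1
                out = open(f"{path}.part{part}.csv", "w")
                written = 0
            x, y, z = fields[0], fields[1], fields[2]
            out.write(f"{x},{y},{z}\n")
            written += 1
    if out:
        out.close()
    return part  # number of sub-files written
```

Each resulting `.csv` sub-file can then be loaded as a table, as the Event Theme step describes.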
In the point attribute table, both the system number and ID attributes were altered to make the output width large enough to accommodate a coverage of ~39 million points. The positional and elevation attributes of each file were renamed X, Y, and Z, after which all the sub-file coverages were appended in ArcInfo, creating one coverage of the raw LIDAR points. At that point, all other files were deleted to clear the disk space necessary for the subsequent surface modeling. A surface model was developed for each LIDAR point cloud using the following protocol. In ArcInfo, a Triangulated Irregular Network (TIN) model was created from the point coverage using the createtin command. This intermediary model was then used to create a lattice model with the tinlattice command, which generated an equally spaced grid from the point cloud, with each grid cell's surface sloped based on surrounding elevation values. This methodology was developed after the elevation
accuracy assessment determined the lattice model to produce the most accurate results. To save disk space and reduce file size, all floating-point grids were converted to integer grids by multiplying all cell values by 100. The resulting grids are substantially smaller than before and still maintain the precision needed for later modeling. This process was performed on all of the raw LIDAR data and on the TerraPoint bald-earth LIDAR data, resulting in all 16 quarter-quads in grid format.

Building Extraction
Using the bald-earth digital elevation model (DEM) and the raw first-return DEM, a normalized DEM was created by subtracting the bald-earth surface from the first-return surface. The normalized DEM, which represents all features above the bare surface, including vegetation and buildings, was then used for automated building extraction from the LIDAR data. Setting several user parameters and running several iterations of the in-house AML program allowed the extraction of building footprints from the normalized DEM. These footprints were cleaned using the buildingsimplify command in ArcInfo and further cleaned by hand using heads-up digitizing in ArcInfo. A second method of automated building extraction was also assessed. This process uses morphology to extract buildings from the normalized DEM in the image-processing package ENVI. On a small test area, the in-house building extraction program was compared to the ENVI morphology method to determine the more accurate extraction method. Based on the test area and two different parameter sets for each method, the correctness of the in-house building extraction program averaged 95.5%, while the ENVI morphology method averaged only 82.5%. Correctness was found by comparing the extracted buildings to a truth set of building footprints extracted using heads-up digitizing in ArcInfo from a 0.25-meter resolution aerial photography base, georectified using GPS ground control points.
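The surface-modeling, integer-conversion, and normalization steps described above can be sketched with NumPy and SciPy. This is a rough analog, not the ArcInfo workflow itself: SciPy's Delaunay-based linear interpolation stands in for the createtin/tinlattice sequence, and the 2-meter building threshold and 1-meter cell size are hypothetical parameters.

```python
import numpy as np
from scipy.interpolate import griddata


def lattice_from_points(xyz, cell=1.0):
    """Interpolate scattered (x, y, z) LIDAR returns onto a regular grid.
    Delaunay-based linear interpolation is a rough stand-in for
    ArcInfo's createtin followed by tinlattice."""
    x, y, z = xyz[:, 0], xyz[:, 1], xyz[:, 2]
    gx = np.arange(x.min(), x.max() + cell, cell)
    gy = np.arange(y.min(), y.max() + cell, cell)
    gxx, gyy = np.meshgrid(gx, gy)
    return griddata((x, y), z, (gxx, gyy), method="linear")


def to_integer_grid(grid):
    """Store elevations as centimeters (values x 100) in int32,
    mirroring the disk-saving conversion described above."""
    return np.round(grid * 100).astype(np.int32)


def building_mask(first_return, bare_earth, min_height_m=2.0):
    """Normalized DEM = first return minus bare earth; cells taller
    than min_height_m (a hypothetical threshold) are candidate
    building or vegetation features."""
    ndsm = first_return - bare_earth
    return ndsm > min_height_m
```

In practice the thresholded mask would still need the iterative parameter tuning and polygon cleaning the report describes before yielding usable footprints.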
Of the 75 reference buildings in the truth set, the in-house program produced only 3.5 false buildings on average, whereas the ENVI morphology method averaged 17 false buildings. Only the ENVI morphology method missed any reference buildings, missing 5 on average; the in-house program extracted all reference buildings from the data set.

Processing Cookbook
To streamline production of the lattice models from raw ASCII text files, a LIDAR cookbook was prepared and distributed to the students performing the processing. The cookbook is broken into separate chapters and follows the protocol described above. The result is an Adobe PDF file that can be transferred to parties interested in processing large LIDAR data sets using off-the-shelf software such as the ESRI products used here. The cookbook proved useful at every step of processing by documenting the commands used and the process by which the protocol was established. Because student hourly workers often split their time among several projects, no single person typically completed an assigned task alone. With the cookbook, no matter where one student left off, another could pick up from the exact spot and continue processing to completion with little start-up time.
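The correctness figures quoted above are consistent with the standard definition of correctness as matched extractions divided by all extractions; a minimal sketch using the averaged counts from the test-area comparison:

```python
def correctness(matched, false_positives):
    """Correctness = matched extracted buildings / all extracted buildings.
    Counts are the averaged values from the test-area comparison above."""
    return matched / (matched + false_positives)


# In-house program: all 75 reference buildings found, 3.5 false on average.
in_house = correctness(75, 3.5)   # ~0.955, matching the reported 95.5%

# ENVI morphology: 5 reference buildings missed (70 found), 17 false.
envi = correctness(70, 17)        # ~0.805
```

Note that computing the ENVI value from the averaged counts gives roughly 80.5% rather than the reported 82.5% average; this is expected if the reported figure averages the correctness of the two parameter sets separately rather than pooling their counts.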
V. Conclusions
Upon completion of the LIDAR processing, a data transfer was completed with the City of Springfield, Missouri. The transfer was performed in January 2002 and contained the following data deliverables on CD:

16 - First-Return Arc Grids
16 - Bare Earth Arc Grids
16 - Normalized Arc Grids
16 - Building Extraction Coverages
1 - Appended Building Coverage

Total data size was approximately 15 gigabytes, comprising 48 grids (3 for each quarter-quad) and 17 building files, with the appended coverage containing 69,144 building polygons for the city (Figure 2).

Figure 2: Building polygons extracted from LIDAR data numbered 69,144 for the Springfield, Missouri area.
From start to finish, the most time-intensive processing occurred at the very beginning, when attempting to model the extremely large data sets (30+ million points each). During the first attempt, the data were found to contain errors, resulting in a return to the vendor and a re-issue of new data. When processing these data via the original protocol, several limitations were reached in ArcInfo when calculating attribute values for the files. At an undocumented limit, ArcInfo would not calculate values for the attribute and simply left the elevation value as zero. This produced bands of no-data in the surface models covering large areas of the data. Once the problem was determined to be a software issue rather than a data issue, the current protocol was established and used to model all grids for the study area. The following series of graphics are examples of the types of grids modeled from the LIDAR data. In order, they are the raw first-return lattice model (Figure 3), the bare-earth model for the same area (Figure 4), and the normalized lattice model used for building extraction, showing all features above the surface (Figure 5).

Figure 3: Raw first-return LIDAR lattice model. Notice buildings and surface features as well as areas of signal absorption.
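A sanity check of the kind that would expose the zero-elevation bands described above can be sketched with NumPy. This assumes the failed cells were left at exactly zero and that the bands run along grid rows, both of which are inferences from the symptom described, not documented behavior.

```python
import numpy as np


def find_zero_bands(grid):
    """Return indices of grid rows whose cells are all zero -- the
    symptom of the attribute-calculation failure described above.
    Assumes zero is never a legitimate elevation in the study area."""
    return [i for i in range(grid.shape[0]) if np.all(grid[i] == 0)]
```

Running such a check on each quarter-quad grid after modeling would flag the affected surfaces before they reached the building-extraction stage.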
Figure 4: Bare Earth surface model from LIDAR. Notice that remnants of urban features are noticeable in the image.

VI. Presentations / Publications
"LIDAR Surface Modeling and Automated Building Extraction Over Springfield, Missouri," Workshop on Three-Dimensional Analysis of Forest Structure and Terrain Using LIDAR Technology, March 14-15, 2002, Vancouver, British Columbia.
"Evaluation of Elevation Accuracy from LIDAR Derived Surface Models," ASPRS Annual Conference, April 23-27, 2001, St. Louis, Missouri.
"Extracting Building Features from LIDAR Imagery Sources," Missouri GIS Conference, March 26-28, 2001, Columbia, Missouri.
Figure 5: Normalized surface model showing all features above the surface. Used primarily for extraction of buildings and vegetation.

VII. Students Supported
Chilukuri, Venkata - Student Hourly
Cooper, Ryan - Graduate Research Assistant
Falke, Thomas - Graduate Research Assistant
Lanclos, Ryan - Graduate Research Assistant
Smith, Derek - Graduate Research Assistant

VIII. Subject Inventions
No inventions or other intellectual property were developed with support from this grant.