Introduction to LiDAR

Our goal here is to introduce you to LiDAR data. LiDAR data are becoming common, provide ground, building, and vegetation heights at high accuracy and detail, and are available statewide. Although using them requires downloading and a fair bit of preprocessing, we've done the first steps because our main goal is to introduce you to the ideas of LiDAR point clouds and extracting useful heights from them.

Data are delivered in tiles, with a set of points providing x and y coordinates and a z value (height above a datum). Densities are typically one to several points per square meter, although not all the points represent ground heights. Since the data are very detailed, data volumes get large quickly, and so they are often stored in a tiling scheme. We often need to merge several tiles to obtain complete coverage of an area, and that is true of our St. Paul Campus.

We've stored data in the 4295W directory on the class share drive, under data/lidar. We've downloaded data for:

- the 1m DEM (named CampusDEM_large)
- the 1m hillshade (named Campus_HSL)
- the raw lidar point cloud data, in the CampusLAZ directory; these will be .las or .laz (a compressed form)

You'll see the DEM and hillshade are rasters, and the raw lidar point data are in files with strange codes and a .laz extension. The codes are tile numbers. LAS datasets are typically large, and so are usually tiled, and .laz is a compressed format that typically reduces the file size by about 90%.

We also processed the data, and have a set of tiles whose names have a height label appended at the end. These have been processed to a set of point shapefiles so that the vector points each contain the height above ground for the LiDAR return.

You should copy the DEM, hillshade, and the building and height subdirectories and files from the data directory to a local drive, either on the computer or a portable USB drive.
You don't need all the tiles, so if you wish you may copy only those for your study area rather than all of them. Each file is large, will multiply through processing, slow things down, and clutter your workspace, so subsetting isn't a bad idea in general, but here you'll probably pay little penalty for skipping it. To subset, refer to the image below, and only copy the files you need for your project area. The boundaries correspond to the tile borders, and the labels to the original .laz/.las file names.

We've also provided copies of the .laz files converted to .las. You don't need to use them, but notice the .laz files are much smaller than the .las files. Because lidar data are often so large, there is usually a tradeoff between saving space and processing speed. We're working with a small area, so it doesn't much matter, but in many real projects it does.
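Choosing which tiles to copy amounts to a bounding-box overlap test between each tile and your project area. The sketch below is only an illustration of that idea: the tile names, the extents, and the helper names (Extent, overlaps, tiles_for_area) are all hypothetical, and in practice you would read the tile boundaries from the tile index image.

```python
from __future__ import annotations

from typing import NamedTuple


class Extent(NamedTuple):
    """Axis-aligned bounding box in projected coordinates (meters)."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def overlaps(a: Extent, b: Extent) -> bool:
    """True when the two boxes share some area (touching edges do not count)."""
    return (a.xmin < b.xmax and b.xmin < a.xmax
            and a.ymin < b.ymax and b.ymin < a.ymax)


def tiles_for_area(tiles: dict[str, Extent], study_area: Extent) -> list[str]:
    """Return the sorted names of the tiles that overlap the study area."""
    return sorted(name for name, ext in tiles.items()
                  if overlaps(ext, study_area))


# Hypothetical tile index: names and extents are made up for illustration.
TILES = {
    "tile_A": Extent(0, 0, 1000, 1000),
    "tile_B": Extent(1000, 0, 2000, 1000),
    "tile_C": Extent(0, 1000, 1000, 2000),
    "tile_D": Extent(1000, 1000, 2000, 2000),
}

if __name__ == "__main__":
    # A study area that straddles the border between tile_A and tile_B.
    study = Extent(800, 200, 1300, 700)
    print(tiles_for_area(TILES, study))
```

A study area that crosses a tile border returns more than one tile, which is the case where you need to copy (and later merge or batch process) several files.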
Display the DEM and hillshade surfaces. These come largely from an automated process that first tries to identify the bare earth, or ground-only, returns, and then builds a DEM and a hillshade. Notice the odd shapes, sometimes like triangles, on the hillshade surface within the building footprints. These are artifacts of lidar data processing. The algorithms have a hard time identifying the ground near buildings perfectly, so there are often triangular shapes on or near buildings. You will also see roughness and occasional small bumps over grass or forest areas; these are also artifacts of the processing.

Also notice the DEM rasters are in UTM zone 15, NAD83(1986). The LiDAR point clouds are also in this datum, so using the DEM and LiDAR points together in the same coordinate system will be fine, but when you are done you will need to transform them to NAD83(CORS96) UTM zone 15. You could transform before we begin our LiDAR processing, but if you leave them in NAD83(1986) and only transform the output, there will be fewer data sets to transform and less time spent doing it.

You should know a few more things about LiDAR data. Remember, these LiDAR data come from pulses of light sent down from a plane, then reflected back from a target. Precise geometry allows operators to determine the x, y, and z coordinates of each object that reflects back a pulse to within a few inches. There can be many returns measured from each pulse, although we're most often interested in the first returns, to measure object heights, and the last returns, to measure ground elevation. The data are processed before they're delivered, and the collecting organization usually applies algorithms to try to detect ground hits (not all last returns are from the ground), object heights (not all first returns are from the tops of objects), and to provide other information about each data point.
Most times each return point contains information on the return number (first, last, or in between), the feature classification (2 = ground, 4 = mid-height vegetation, 5 = high vegetation, 6 = building, 9 = water, 0 = unclassified), as well as the scan angle, the strength of the return signal, and other information about the point.

Most of your study areas will only use two or three LiDAR tiles. Some may cover four or more. Once you've established your workflow, you may want to use batch processing, described in the course materials, for multiple tiles.
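To make the per-point attributes concrete, here is a small sketch of selecting points by classification code and by return number. It uses plain Python dictionaries rather than a real LAS reader. The key names (cls, return_number, number_of_returns) and the sample values are assumptions made for illustration, and the class code table repeats only the codes listed above.

```python
from collections import Counter

# Class codes as listed in the text above.
CLASS_NAMES = {
    0: "unclassified",
    2: "ground",
    4: "mid-height vegetation",
    5: "high vegetation",
    6: "building",
    9: "water",
}


def select_class(points, code):
    """Keep only the points with the given classification code."""
    return [p for p in points if p["cls"] == code]


def first_returns(points):
    """Keep the first return of each pulse (useful for object tops)."""
    return [p for p in points if p["return_number"] == 1]


def last_returns(points):
    """Keep the last return of each pulse (candidates for ground hits)."""
    return [p for p in points
            if p["return_number"] == p["number_of_returns"]]


def count_by_class(points):
    """Count points per class name, for a quick look at a tile's content."""
    return Counter(CLASS_NAMES.get(p["cls"], "other") for p in points)


# Hypothetical sample points; coordinates and heights are made up.
POINTS = [
    {"x": 1.0, "y": 1.0, "z": 290.1, "return_number": 1, "number_of_returns": 1, "cls": 6},
    {"x": 2.0, "y": 1.0, "z": 301.4, "return_number": 1, "number_of_returns": 2, "cls": 5},
    {"x": 2.0, "y": 1.0, "z": 280.2, "return_number": 2, "number_of_returns": 2, "cls": 2},
    {"x": 3.0, "y": 2.0, "z": 279.9, "return_number": 1, "number_of_returns": 1, "cls": 2},
]

if __name__ == "__main__":
    print(count_by_class(POINTS))
    print(len(select_class(POINTS, 6)), "building point(s)")
```

The Buildings and TallVeg point files provided for this exercise are the result of this kind of selection on classes 6 and 5.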
Look in the CampusLiDAR directory, in the Data directory of the class L drive. Note that there are two subdirectories, one named Buildings, and the other named TallVeg. These have partially processed LiDAR data, as point files.

Create a LiDAR directory on your local or USB drive. Create two subdirectories within it, one named Buildings, and the other named TallVeg. Identify the tiles that cover your area in the class TallVeg and Buildings folders, and copy them from the class drive to their corresponding subdirectories on your local drive, that is, Buildings tiles to Buildings, and TallVeg tiles to TallVeg.

Load the Buildings and TallVeg files into a data view for one of your tiles. Verify for each that the proper points have been selected, that is, that most of the building points are concentrated over buildings, and most of the vegetation points are concentrated over forests. You can best verify at large scales (when zoomed in), because the point classifier isn't perfect, and given the dot size, the common buildings-as-trees errors may dominate over a building, or the reverse. It is best to zoom in to a building or two with adjacent trees, and verify that the two files have the points mostly correct.

Calculate geometry for z heights to TallVeg
Calculate tree heights by subtracting raster value
Clean up naming, transfer these and buildings data/directories to class share drive.

You can see the Building points, here in yellow, more or less match the building locations. The slight mismatch is because of building lean in the aerial photographs. You might remember that orthophotos correct all locations as if they were on the ground, and the tops of buildings aren't. Thus, the tops get displaced, usually outward from the image center. That's why skyscrapers seem to lean in most vertical orthophotos.
Now we should look at heights. First, we need to create a variable to hold our eventual height, measured from the ground to the top of the object. Open the table for each of your point shapefiles, and add a new field named zheight, double or float, precision 12, scale 1. This may take some time for each tile, so you could try to use the AddField command as an ArcMap batch job. You are strongly encouraged to do this and the following operations as batch jobs (see course resources, or ask Andy or Paul for help). This will greatly ease your work.

Next, we need to extract heights from the points. It is usually more efficient to do this as a batch job with the Add Geometry Attributes tool. Specify meters and square meters for units, and our project coordinate system. Note that this tool will also add the x and y coordinates for each point.

When the process is finished, use the identify button to query a few building heights (z values). Remember, the values are in meters. Do these values make sense? Are the buildings over 200 meters (650 ft) tall? Why do you think you're getting these values?

The z value represents the orthometric height of the top of the building. This is the height relative to our standard surface, near sea level, and not relative to the local ground surface. We need to subtract the local ground elevation from the building z value to get building height. One way is to query the DEM raster below each point, and add the result to the table record for each point. We can then subtract the DEM height from the z values we calculated with Add Geometry Attributes, and get a building height.

The Extract Values to Points tool, found at ArcToolbox -> Spatial Analyst Tools -> Extraction -> Extract Values to Points, does just this. Given the input and output layers, it will query the cell below each point, and add a column to the new data layer specified. This may take quite a long time, so be patient, and again, this process can be run as a batch job for multiple files.
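The subtraction described above is simple arithmetic, and a worked example may help you check your own tables. The sketch below is not the ArcMap workflow itself, only the calculation it performs. The key names z and ground_elev, the sample values, and the -9999 marker for points that fall outside the DEM are assumptions for illustration.

```python
NODATA = -9999.0  # assumed marker for points with no DEM cell beneath them


def height_above_ground(z, ground_elev, nodata=NODATA):
    """Return object height (z minus ground elevation), rounded to 0.1 m.

    Returns None when no usable ground elevation is available.
    """
    if ground_elev is None or ground_elev == nodata:
        return None
    return round(z - ground_elev, 1)


def add_heights(records):
    """Return copies of the records with a 'zheight' value filled in."""
    return [
        {**r, "zheight": height_above_ground(r["z"], r["ground_elev"])}
        for r in records
    ]


# Hypothetical points: z is orthometric height, ground_elev comes from the DEM.
RECORDS = [
    {"id": 1, "z": 293.4, "ground_elev": 280.2},   # roof point
    {"id": 2, "z": 301.0, "ground_elev": 279.5},   # tree-top point
    {"id": 3, "z": 295.0, "ground_elev": NODATA},  # outside the DEM
]

if __name__ == "__main__":
    for rec in add_heights(RECORDS):
        print(rec["id"], rec["zheight"])
```

A z value near 290 m with a ground elevation near 280 m gives a height of roughly 10 to 15 m, which is a plausible building height, whereas the raw z value alone is not.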
The Extract Values to Points tool generates a new column named RASTERVALU that contains the ground elevation at each point. (Ignore the zheight value in the table at left. This is from a test run; in your files it will be zero or null/not assigned.)

Now you can calculate the building heights for each point. You can do this manually, but it is better to do it as a batch job with the Calculate Field command. Create a new field called something like build_hght for the height above ground sampled at each point, and then batch job the field calculation. When you are done, you should have a table that looks something like the one to the right.

Now, calculate the average height from point samples for one of your buildings. There are several ways to do this, and the easiest would be the Spatial Join tool. However, it doesn't work much of the time, freezing up or, worse, returning erroneous answers with no warning that the values are wrong. There is an alternative that appears to work in most cases, and that involves multiple steps, starting with the Intersect tool applied to the building polygons and the processed LiDAR point file.

First, we need polygons for a buildings layer, with IDs you've assigned (typically short integer), and for clarity, a building name. You should already have created this building layer as part of your database. You may have to display both the LiDAR points and an image of the study area with your building, and adjust the building footprint polygon a bit to account for building lean.

After ensuring you have a proper buildings file, read the Intersect tool's documentation. Start the Intersect tool, and specify your building layer and your processed (heights calculated) LiDAR points.
If you specify All for the join attributes, it will join the attributes by spatial location into a point file, with the building assigned to each point. We can see the attribute table for the point features below. The last three columns are from the buildings layer, with the building FID, ID, and name, and all the columns before, from FID* through RASTERVALU, are from the point file, including the height for each LiDAR point that landed on the building.
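Conceptually, intersecting points with building polygons is a point-in-polygon test that copies the polygon's attributes onto each point that lands inside it. The sketch below illustrates that idea with a standard ray-casting test. It is not how the Intersect tool is implemented, and the building outlines, field names (bldg_id, bldg_name), and sample points are hypothetical.

```python
def point_in_polygon(x, y, ring):
    """Ray-casting test; ring is a list of (x, y) vertices, closed implicitly."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Does this edge cross the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def tag_points(points, buildings):
    """Return copies of the points inside a building, with its ID and name."""
    tagged = []
    for p in points:
        for b in buildings:
            if point_in_polygon(p["x"], p["y"], b["ring"]):
                tagged.append({**p, "bldg_id": b["id"], "bldg_name": b["name"]})
                break  # assume footprints do not overlap
    return tagged


# Hypothetical footprints and LiDAR points (coordinates in meters).
BUILDINGS = [
    {"id": 1, "name": "Hall A", "ring": [(0, 0), (10, 0), (10, 10), (0, 10)]},
    {"id": 2, "name": "Hall B", "ring": [(20, 0), (30, 0), (30, 10), (20, 10)]},
]
POINTS = [
    {"x": 5.0, "y": 5.0, "height": 12.0},
    {"x": 25.0, "y": 5.0, "height": 8.0},
    {"x": 15.0, "y": 5.0, "height": 0.3},  # between the buildings, dropped
]

if __name__ == "__main__":
    for rec in tag_points(POINTS, BUILDINGS):
        print(rec["bldg_id"], rec["bldg_name"], rec["height"])
```

Points that fall outside every footprint are dropped from the output, which is why a footprint shifted by building lean can lose roof points or pick up ground points.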
Now we need to aggregate the points for each building. The easiest way is through the Summary Statistics tool. This tool summarizes features, and I can assign case attributes. It will calculate the summary statistics for every different value of the case attribute. If I make the building ID my case attribute, it will create a table with summary statistics for each unique building. The tool is shown below.

Here I've specified my intersected building/lidar point set as the input, an output table, and the field, statistics, and case attribute. I am calculating the mean height for the building. You might argue I want the tallest point for a building, but most roofs don't vary much in height, and I was concerned that a building might have a tower or scaffold on top, which would produce a large error. For tree canopies, a maximum statistic would probably make more sense.

Running this creates an output table. Note there is a mean height for each case of the building ID. I can then simply join this to my building layer by the ID column, and copy/save the joined file to get my height for each building.

There are some nuances in applying this method. Multi-leveled buildings may have to be split into their various pieces, and the heights calculated separately, to make much sense. If the building footprints are too large, or there are many errors in classification, then there may be a bit of downward bias. Generally, these errors appear to be small in our application, and this measurement is good enough for our use.

Perform these processing steps for both the buildings and the canopy data layer, so that each feature has a LiDAR height. As noted earlier, you should probably calculate the maximum statistic for the canopy data layer, and the mean statistic for the buildings layer, but evaluate and decide for yourself.

Turn in a map showing your subset LiDAR data for your project area and your building polygons, with each building labeled with your calculated average building height. Use an image for the background, either one of the high resolution images on the class drive, or one of the WMS layers. Include the usual title, legend, scale, and other standard map elements.

Do this same exercise for the trees in your project area. Again, create and turn in a second map of your results, with an image background and the tree polygons, labeled by height.
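The per-building summary described in this section is a group-by calculation: collect the point heights for each case value, then apply one statistic to each group. The sketch below shows the idea in plain Python so you can sanity check the numbers in your output table. The function name summarize, the field names, and the sample heights are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def summarize(records, case_field, value_field, statistic="mean"):
    """Group records by case_field and apply one statistic to value_field."""
    funcs = {"mean": mean, "max": max, "min": min}
    func = funcs[statistic]
    groups = defaultdict(list)
    for r in records:
        groups[r[case_field]].append(r[value_field])
    return {case: func(values) for case, values in groups.items()}


# Hypothetical intersected points: building 1 has one return from a rooftop tower.
POINTS = [
    {"bldg_id": 1, "build_hght": 12.0},
    {"bldg_id": 1, "build_hght": 12.5},
    {"bldg_id": 1, "build_hght": 11.5},
    {"bldg_id": 1, "build_hght": 12.0},
    {"bldg_id": 1, "build_hght": 20.0},  # tower or scaffold
    {"bldg_id": 2, "build_hght": 8.0},
    {"bldg_id": 2, "build_hght": 8.5},
    {"bldg_id": 2, "build_hght": 7.5},
]

if __name__ == "__main__":
    print("mean:", summarize(POINTS, "bldg_id", "build_hght", "mean"))
    print("max: ", summarize(POINTS, "bldg_id", "build_hght", "max"))
```

For building 1 the single tower return pulls the maximum well above the roof level, while the mean moves much less. This is the reasoning behind using the mean for buildings and considering the maximum for tree canopies.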