How to perform a quality check of a new dataset
QGIS Tutorials and Tips
ZanSea zansea@suza.ac.tz
1 Objective

GIS datasets can come from many different sources:
- From a website
- From a USB key given by a colleague
- From GPS points
- From a commercial data provider
- From an open data portal

Before a dataset can become part of your general database, its quality must be assessed. This tutorial describes the steps needed to check the quality of a new dataset and to harvest metadata when possible.

2 Procedure

step 1. Place the dataset in a specific folder. The location depends on your workflow. We recommend keeping a DATA_IN folder where you place all datasets you receive from external sources. The following architecture could be followed.

step 2. Check the files you have received or downloaded before you open them in QGIS. This gives a first indication of the quality of your dataset.
a. If it is a shapefile, you must have at least the following files:
   i. .shp: shape format; the feature geometry itself
   ii. .shx: shape index format; a positional index of the feature geometry to allow seeking forwards and backwards quickly
   iii. .dbf: attribute format; columnar attributes for each shape, in dBase IV format
   If one of them is missing, your data is not valid.

step 3. Open the dataset in QGIS. If you are prompted for a CRS, use WGS84 by default and click OK. Your data should now open.

step 4. Let's explore the Properties and Metadata of the file.
a. Open the Properties and select the General tab. Check the relevant information you can find.
b. Check whether it contains any relevant information about your file, such as the projection or the file name. Then select the Metadata tab.
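The shapefile completeness check from step 2 can also be scripted before the data is opened in QGIS. The following is a minimal sketch using only the Python standard library; the function name `missing_shapefile_parts` and the example path are hypothetical, and the check only confirms that the three mandatory files exist, not that their content is valid.

```python
from pathlib import Path

# Mandatory shapefile components listed in step 2.
REQUIRED_EXTENSIONS = (".shp", ".shx", ".dbf")


def missing_shapefile_parts(shp_path):
    """Return the mandatory extensions that are absent next to shp_path."""
    base = Path(shp_path)
    folder = base.parent
    if not folder.is_dir():
        # Nothing can be present if the folder itself does not exist.
        return list(REQUIRED_EXTENSIONS)
    # Extensions of all files sharing the base name, compared case-insensitively.
    present = {
        p.suffix.lower()
        for p in folder.iterdir()
        if p.is_file() and p.stem == base.stem
    }
    return [ext for ext in REQUIRED_EXTENSIONS if ext not in present]
```

For example, `missing_shapefile_parts("DATA_IN/roads.shp")` (a hypothetical file) returns an empty list when the .shp, .shx and .dbf files are all present, and otherwise lists the missing extensions.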
step 5. Check the relevant information you can find. Often, you will not find any in the Metadata tab. In that case, you will have to get the relevant information from the source of the data:
a. From the website you downloaded the data from
b. From the institution that gave you the file

step 6. After having checked the source, let's explore the geometry of the features and answer the following questions:
a. Is it a point, line or polygon layer?
b. How many features are contained in the file?

step 7. Open a reference file of your study area (if you work on Tanzania, open the administrative layer of Tanzania). You can also add a web map such as OpenStreetMap, Bing Maps or Google Maps.

step 8. Let's explore the spatial properties of the file. Explore the layer you have loaded and answer the following questions:
a. Does the dataset cover my study area?
b. Does the dataset cover a more global area?
c. Is the dataset well located compared to my reference layer?
d. Are the features evenly distributed or strongly clustered (some areas very well covered, others less so)?

step 9. Now let's explore the attribute table. Select the Fields tab in the Layer Properties. It shows a list and a description of all fields contained in the layer attribute table.
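Questions a and b of step 8 can be roughly pre-checked by comparing the extent of the new layer with the extent of the reference layer, both of which can be read from the layer properties in QGIS. The sketch below is only an illustration: the `BBox` type and `describe_coverage` function are hypothetical names, and it assumes both extents are expressed in the same CRS. It does not replace the visual check against the reference layer.

```python
from typing import NamedTuple


class BBox(NamedTuple):
    """Layer extent as (xmin, ymin, xmax, ymax), in one common CRS."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def describe_coverage(dataset: BBox, reference: BBox) -> str:
    """Classify the dataset extent relative to the reference extent."""
    disjoint = (
        dataset.xmax < reference.xmin or dataset.xmin > reference.xmax
        or dataset.ymax < reference.ymin or dataset.ymin > reference.ymax
    )
    if disjoint:
        return "outside"   # wrong area, or possibly a wrong CRS
    within = (
        dataset.xmin >= reference.xmin and dataset.xmax <= reference.xmax
        and dataset.ymin >= reference.ymin and dataset.ymax <= reference.ymax
    )
    if within:
        return "within"    # the dataset lies inside the study area
    contains = (
        dataset.xmin <= reference.xmin and dataset.xmax >= reference.xmax
        and dataset.ymin <= reference.ymin and dataset.ymax >= reference.ymax
    )
    if contains:
        return "larger"    # the dataset covers a more global area
    return "partial"       # the extents overlap only partly
```

A result of "outside" for a dataset that is supposed to cover the study area is often a hint that the CRS chosen in step 3 does not match the coordinates stored in the file.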
step 10. To get a better understanding of the content of the attributes, open the attribute table: right-click the layer name and select Open Attribute Table.
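As a complement to browsing the attribute table by eye, the share of empty values in each field can be summarised with a short script. This is a sketch under stated assumptions: the attribute table has first been exported to CSV, the function names are hypothetical, and the list of markers treated as "empty" is an assumption to adapt to your data.

```python
import csv

# Assumption: values treated as "no information"; adapt to your data.
EMPTY_MARKERS = {"", "null", "none", "n/a"}


def empty_value_ratio(rows):
    """Return {field: share of rows with an empty value} for dict rows."""
    empty_counts = {}
    total = 0
    for row in rows:
        total += 1
        for field, value in row.items():
            empty_counts.setdefault(field, 0)
            if value is None or str(value).strip().lower() in EMPTY_MARKERS:
                empty_counts[field] += 1
    if total == 0:
        return {}
    return {field: count / total for field, count in empty_counts.items()}


def read_csv_rows(csv_path):
    """Yield one dict per record from a CSV export of the attribute table."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        yield from csv.DictReader(handle)
```

Calling `empty_value_ratio(read_csv_rows("attributes.csv"))` (a hypothetical file name) gives a value between 0 and 1 for each field. A field close to 1 contains almost no information.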
step 11. Inspect all fields and judge whether the information provided is relevant by asking yourself the following questions:
a. Do the fields contain information?
b. Do the numeric fields show values?
c. Are the titles of the fields

step 12. At the end of this quality check, if you think the data is of bad quality, just leave it in DATA_IN or delete it.

step 13. If the quality check shows that the data is relevant for the general DATABASE, do the following:
a. Copy the file into the DATABASE, in a folder you will call SRC_source.
b. Fill in the METADATA_LIST with as many details as you can. As a minimum, you should have the following information:
   - folder: the folder where the data is stored (SRC_Name of the main source)
   - file_name: the GIS file name of your file
   - Title: the name of the file as it will appear in GeoNode
   - Abstract: a description of the file
   - Purpose: the purpose of this file
   - Restrictions: indicate whether there are any restrictions
   - License: is there any specific license attached?
   - Public Domain: is this data in the public domain?
   - Spatial Representation Type: indicate the spatial representation type (vector, raster)
   - Data quality statement: give as many indications as you can on the quality of the layer
   - Point of contact: who has created this layer?
   - Metadata author: who has created the metadata?
   - Category: what is the category of this layer (follow the GeoNode categories)?
   - scale
   - original_source
   - URL

3 Questions

4 Exercise yourself
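Whether an entry of the METADATA_LIST from step 13 contains the minimum information can be verified with a few lines of Python. This is a minimal sketch: it assumes one entry is represented as a dictionary keyed by the field names listed in step 13 b, and the function name `missing_metadata` is hypothetical. The actual format of the METADATA_LIST is not specified here, so adapt the loading step to your own file.

```python
# Minimum METADATA_LIST fields named in step 13 b.
REQUIRED_FIELDS = [
    "folder", "file_name", "Title", "Abstract", "Purpose", "Restrictions",
    "License", "Public Domain", "Spatial Representation Type",
    "Data quality statement", "Point of contact", "Metadata author",
    "Category", "scale", "original_source", "URL",
]


def missing_metadata(record):
    """Return the required fields that are absent or blank in a record."""
    return [
        field for field in REQUIRED_FIELDS
        if not str(record.get(field) or "").strip()
    ]
```

An empty result means every minimum field has a value. Otherwise the returned names show which details still have to be collected from the data source.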