Towards an Ontology-based Soil Information System

Similar documents
The role of vocabularies for estimating carbon footprint for food recipies using Linked Open Data

Linked Data Practices for the Geospatial Community

Making Ontology Documentation with LODE

Linked Data: What Now? Maine Library Association 2017

Reducing Consumer Uncertainty

Enrichment of Sensor Descriptions and Measurements Using Semantic Technologies. Student: Alexandra Moraru Mentor: Prof. Dr.

OWLS-SLR An OWL-S Service Profile Matchmaker

H1 Spring B. Programmers need to learn the SOAP schema so as to offer and use Web services.

Linked Open Data: a short introduction

Linked Data: Fast, low cost semantic interoperability for health care?

Semantic Web Fundamentals

SEXTANT 1. Purpose of the Application

a paradigm for the Introduction to Semantic Web Semantic Web Angelica Lo Duca IIT-CNR Linked Open Data:

Semantic Web Fundamentals

Helmi Ben Hmida Hannover University, Germany

A Tutorial of Viewing and Querying the Ontology of Soil Properties and Processes

SEMANTIC WEB AND COMPARATIVE ANALYSIS OF INFERENCE ENGINES

DBpedia-An Advancement Towards Content Extraction From Wikipedia

ISA Action 1.17: A Reusable INSPIRE Reference Platform (ARE3NA)

Development of guidelines for publishing statistical data as linked open data

Semantic challenges in sharing dataset metadata and creating federated dataset catalogs

Racer: An OWL Reasoning Agent for the Semantic Web

Ontological Decision-Making for Disaster Response Operation Planning

Extracting Finite Sets of Entailments from OWL Ontologies

Accessing information about Linked Data vocabularies with vocab.cc

Design & Manage Persistent URIs

Standardization of Ontologies

Linked Data and RDF. COMP60421 Sean Bechhofer

H1 Spring C. A service-oriented architecture is frequently deployed in practice without a service registry

The Semantic Web Revisited. Nigel Shadbolt Tim Berners-Lee Wendy Hall

An Introduction to the Semantic Web. Jeff Heflin Lehigh University

Ontological Modeling: Part 2

COMBINING X3D WITH SEMANTIC WEB TECHNOLOGIES FOR INTERIOR DESIGN

Probabilistic Information Integration and Retrieval in the Semantic Web

Semantics. Matthew J. Graham CACR. Methods of Computational Science Caltech, 2011 May 10. matthew graham

Linked Data: Standard s convergence

TrOWL: Tractable OWL 2 Reasoning Infrastructure

Enhancing Security Exchange Commission Data Sets Querying by Using Ontology Web Language

Semantic Web and Linked Data

Linked data and its role in the semantic web. Dave Reynolds, Epimorphics

Combining Government and Linked Open Data in Emergency Management

Ontology Servers and Metadata Vocabulary Repositories

Linked Data Evolving the Web into a Global Data Space

USING DECISION MODELS METAMODEL FOR INFORMATION RETRIEVAL SABINA CRISTIANA MIHALACHE *

Querying Description Logics

Semantic Web. Ontology Pattern. Gerd Gröner, Matthias Thimm. Institute for Web Science and Technologies (WeST) University of Koblenz-Landau

Reducing Consumer Uncertainty Towards a Vocabulary for User-centric Geospatial Metadata

When using this architecture for accessing distributed services, however, query broker and/or caches are recommendable for performance reasons.

Observations and Measurements

Integrating SysML and OWL

APPLYING KNOWLEDGE BASED AI TO MODERN DATA MANAGEMENT. Mani Keeran, CFA Gi Kim, CFA Preeti Sharma

Ontology-Based Configuration of Construction Processes Using Process Patterns

SEPA SPARQL Event Processing Architecture

SRI International, Artificial Intelligence Center Menlo Park, USA, 24 July 2009

Jeffery S. Horsburgh. Utah Water Research Laboratory Utah State University

Research Article A Method of Extracting Ontology Module Using Concept Relations for Sharing Knowledge in Mobile Cloud Computing Environment

Mastro Studio: a system for Ontology-Based Data Management

Semantic Integration with Apache Jena and Apache Stanbol

Leveraging metadata standards in ArcGIS to support Interoperability. David Danko and Aleta Vienneau

Querying the Semantic Web

CSc 8711 Report: OWL API

Semantic Technologies and CDISC Standards. Frederik Malfait, Information Architect, IMOS Consulting Scott Bahlavooni, Independent

Introduction to Linked Data

Opening, Closing Worlds On Integrity Constraints

Grid Resources Search Engine based on Ontology

Ontology Refinement and Evaluation based on is-a Hierarchy Similarity

OWL extended with Meta-modelling

Novel System Architectures for Semantic Based Sensor Networks Integraion

An FCA Framework for Knowledge Discovery in SPARQL Query Answers

The Emerging Web of Linked Data

Publishing Statistical Data and Geospatial Data as Linked Data Creating a Semantic Data Platform

Falcon-AO: Aligning Ontologies with Falcon

Semantic Web: vision and reality

Towards an Integrated Information Framework for Service Technicians

Study and guidelines on Geospatial Linked Data as part of ISA Action 1.17 Resource Description Framework

a paradigm for the Semantic Web Linked Data Angelica Lo Duca IIT-CNR Linked Open Data:

Global Soil Data Task - part of Earth Data Sets (IN-02-C2) Vincent van Engelen - ISRIC Task Point of Contact

Representing Product Designs Using a Description Graph Extension to OWL 2

Contents. G52IWS: The Semantic Web. The Semantic Web. Semantic web elements. Semantic Web technologies. Semantic Web Services

CHAPTER 1 INTRODUCTION

Library of Congress BIBFRAME Pilot. NOTSL Fall Meeting October 30, 2015

Database Refactoring to Increase/Retrieve Information From Precision Agriculture Information System

ARKIVO: an Ontology for Describing Archival Resources

Proposal for Implementing Linked Open Data on Libraries Catalogue

Ontology Matching and the Semantic Web

THE DESCRIPTION LOGIC HANDBOOK: Theory, implementation, and applications

Lecture Telecooperation. D. Fensel Leopold-Franzens- Universität Innsbruck

Maintaining Integrity Constraints in Relational to OWL Transformations

Energy-related data integration using Semantic data models for energy efficient retrofitting projects

THE GETTY VOCABULARIES TECHNICAL UPDATE

Bridging the Gap between Semantic Web and Networked Sensors: A Position Paper

Cross-Fertilizing Data through Web of Things APIs with JSON-LD

Interlinking the web of data: challenges and solutions

Knowledge Representation in Social Context. CS227 Spring 2011

Enriching Query Routing Processes in PDMS with Semantic Information and Information Quality

The European Commission s science and knowledge service. Joint Research Centre

ONTOSEARCH2: SEARCHING AND QUERYING WEB ONTOLOGIES

Semi-automatic Composition of Web Services using Semantic Descriptions

Ontology Creation and Development Model

OWL DL / Full Compatability

Transcription:

21st International Congress on Modelling and Simulation, Gold Coast, Australia, 29 Nov to 4 Dec 2015 www.mssanz.org.au/modsim2015 Towards an Ontology-based Soil Information System Yanfeng Shu a and Qing Liu a a Private Bag 12, Hobart, TAS 7001, Australia Email: Yanfeng.Shu@csiro.au Abstract: Environmental information is critical to the sustainable use and management of the world s resources. Soils are a fundamental part of the environmental information requirement, and appropriate soil data and information are crucial to support evidence-based policy, planning and resource management decisions. For data to be useful, one basic requirement is that they be interpretable. Sufficient information should be provided to allow data to be unambiguously interpreted and used. Examples of such information include the location at which the soil was sampled, the property that was measured, the unit of measure, and quality assurance and quality control information. Furthermore, data should be easily integrated with other data sources, which is required in many modelling applications. For example, simulation of crop production may require, besides soil data, also weather, crop and fertilizer data. To meet these requirements, we have developed a soil ontology for modelling soil information. In this paper, we focus on the design of the ontology and its potential applications. We describe the use of the ontology to facilitate data access by mapping data to the ontology and making them available as Linked Data. We also discuss applications of the ontology for data integration, data classification and data validation. Keywords: Soil information, ontology, Linked Data, data integration, data classification 1462

1 INTRODUCTION Environmental information is critical to the sustainable use and management of the world s resources. Soils are a fundamental part of the environmental information requirement. They underpin agricultural sustainability and food production, climate change and carbon sequestration, biodiversity conservation and the provision of ecosystem services, human health and infrastructure development (DAFF, 2011). Appropriate soil data and information are crucial to support evidence-based policy, planning and resource management decisions (Wilson, 2012). For data to be useful, one basic requirement is that they be interpretable. Sufficient information should be provided to allow data to be unambiguously interpreted and used. Examples of such information include the location at which the soil was sampled, the property that was measured, the unit of measure, and quality assurance and quality control information. Furthermore, data should be easily integrated with other data sources, which is required in many modelling applications. For example, simulation of crop production may require, besides soil data, also weather, crop and fertilizer data. To meet these requirements, we have developed a soil ontology for modelling soil information. We define the ontology by capturing the essential semantics of soil data in terms of concepts and relationships. We express the ontology using the Web Ontology Language (OWL) 1. OWL is the language recommended by the World Wide Web consortium (W3C) for representing ontologies. The logical foundation of OWL is provided by description logics (DLs) (Baader et al., 2003). Being formal logic-based, OWL enables precise representations of knowledge. Moreover, one can reason about representations in OWL and check whether they are consistent or not. As such, OWL is an excellent candidate for a lingua franca in the soil domain, and able to inject the necessary automatization for soil data interpretation and integration. In this paper, we focus on the design of the soil ontology and its potential applications. We describe the use of the ontology to facilitate data access by transforming data into ontology individuals and making them available as Linked Data (Bizer et al., 2009). We also discuss applications of the ontology for data integration, data classification and data validation. The rest of the paper is organised as follows. Section 2 gives details of the soil ontology. Section 3 describes the use of the ontology for transforming soil data into Linked Data. Section 4 outlines several other applications of the ontology. And finally, Section 5 concludes the paper and points out future work. 2 THE SOIL ONTOLOGY We construct the ontology by leveraging existing modelling efforts, including the Open Geospatial Consortium (OGC) s O&M model (Cox, 2007a,b): a conceptual model for observations, measurements and sampling features. ANZSoilML 2 : a GML-based soil information model developed by CSIRO and Landcare Research (NZ). During the construction, we also reuse existing ontologies for describing locations and time: the wgs84 pos vocabulary 3 : modelling anything in space and its geo-location. the W3C Time ontology 4 : modelling temporal units, temporal entities, instants, intervals, etc. Figure 1 shows part of the ontology. A key concept of the ontology is SoilFeature, which represents a conceptual feature that is hypothesized to exist coherently in the world. We define SoilProfileElement as a type of SoilFeature and SoilLayer as a type of SoilProfileElement. Related to SoilFeature are concepts such as Metadata which describes some contextual information of a SoilFeature, e.g. the data source information; SoilClassification which specifies the soil classification terms from an appropriate soil classification scheme (e.g. the Australian Soil Classification); 1 http://www.w3.org/tr/owl-ref/ 2 http://www.clw.csiro.au/aclep/anzsoilml/trunk/docs/html/index.html 3 http://www.w3.org/2003/01/geo/wgs84_pos 4 http://www.w3.org/tr/owl-time/ 1463

Site SpatialEntity SoilColour SoilMaterial composition Category SoilTexture geo:location occurrence soilcolour composition geo:point Metadata metadata SoilFeature SoilProfile Element SoilLayer classifier related Sample SoilClassification SoilSample Property Physical Property propery Quantitative Measure depthrange depthrange QuantityRange uom related Observation observed Property property Result AustralianSoil Classification time:instant resulttime Observation observedresult Quantity uom UnitOfMeasure Figure 1. Portion of the soil ontology. SoilColour which describes the soil colour; and SpatialEntity which describes the spatial information of a SoilFeature. We define Site as a subclass of SpatialEntity, and model its location information by geo:point defined by wgs84 pos. A SoilFeature may have associated observations and measurements. We capture this information using the concept QuantitativeMeasure, and associate a QuantitativeMeasure with a depth range QuantityRange, an observed Property and an observed result Quantity. Examples of soil properties include Bulk Density (BD), drained upper limit (DUL), saturation (SAT), ph and EC. A SoilFeature may also have associated SoilSamples. Given a SoilSample, there may exist an Observation, which is associated with an observed Property, an observed result Quantity, and an observation time time:instant defined by the W3C Time ontology. 3 MAKING SOIL DATA AVAILABLE AS LINKED DATA In this section, we show how to facilitate soil data access by using the ontology described earlier. To facilitate data access and use, we would like to have data available as Linked Data. Linked Data refers to data published on the Web in such a way that it is machine-readable, its meaning is explicitly defined, it is linked to other external data sets, and can in turn be linked to from external data sets (Bizer et al., 2009). We use APSoil as an example to illustrate our approach. APSoil 5 is a database of soil water characteristics enabling estimation of Plant Available Water Capacity (PAWC) for individual soils and crops. It covers many cropping regions of Australia and is designed for use in simulation modelling and agronomic practice. Figure 2 shows the interface of APSoil data. To publish APSoil data as Linked Data, we map data to the ontology and translate data into a set of ontology individuals. During translation, we follow the Linked Data principles 6, i.e. Use URIs as names for things. Use HTTP URIs so that people can look up those names. When someone looks up a URI, provide useful information, using the standards (RDF*, SPARQL). Include links to other URIs. so that they can discover more things. In particular, we employ the following URI scheme for the generated individuals: The base URI for all generated individuals is http://www.csiro.au/soil/resource/. 5 https://www.apsim.info/products/apsoil.aspx 6 http://www.w3.org/designissues/linkeddata.html 1464

Figure 2. APSoil data. RDF/HTML browsers SPARQL clients Jetty config file Pubby SPARQL endpoint (Fuseki) csv APSoil TDB triple store Data to Ontology mapper Soil ontology Figure 3. The system architecture for publishing APSoil data as Linked Data. The local ID of the URI of an individual is formed by concatenating the concept name and the unique APSoil number of each site, with a possible extension of the soil layer number or the observed property name if required. We use patterned local IDs. The patterns used include <concept name> <APSoil number>, <concept name> <APSoil number> <soil layer number>, and <concept name> at <APSoil number> <soil layer number> on <observed property name>. Suppose a SoilProfileElement whose corresponding AP- Soil number is 781 and soil layer number is 1. Then we can identify it by http://www.csiro. au/soil/resource/soilprofileelement_781_1. Suppose again a BD measurement was made on it. Then we can identify the measurement by http://www.csiro.au/soil/resource/ QuantitativeMeasure_at_781_1_on_BD. In addition, we link site information (i.e. country and state) to DBPedia 7. We then store the generated RDF in a TDB 8 triple store. Finally, we expose data through a Fuseki 9 -facilitated SPARQL endpoint and integrate the endpoint with Pubby 10. In this way, we publish data as Linked Data and make them available for humans and computers. Figure 3 shows the implemented system architecture. There are two ways to access and explore data. One is to send SPARQL 11 queries to the endpoint. For example, 7 http://dbpedia.org/ 8 https://jena.apache.org/documentation/tdb/ 9 https://jena.apache.org/documentation/fuseki2/ 10 http://www.wiwiss.fu-berlin.de/pubby/ 11 http://www.w3.org/tr/sparql11-query/ 1465

Figure 4. The Linked Data browser facilitated by Pubby. to retrieve the latitude and longitude information of all Tasmanian sites, we can submit the following query: PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> PREFIX soil: <http://www.csiro.au/soil/resource/> PREFIX soil-owl: <http://www.csiro.au/ontology/soil#> PREFIX geo: <http://www.w3.org/2003/01/geo/wgs84 pos#> PREFIX dbpedia: <http://dbpedia.org/resource/> SELECT DISTINCT?point?lat?long WHERE{?x rdf:type soil-owl:soilprofileelement.?x soil-owl:occurrence?site.?site soil-owl:state dbpedia:tasmania.?site geo:location?point.?point geo:lat?lat.?point geo:long?long. } ORDER BY?point Alternatively, one can access data through a RDF/HTML browser facilitated by Pubby which adds a Linked Data interface to the endpoint as shown in Figure 4. 4 OTHER POTENTIAL APPLICATIONS OF THE SOIL ONTOLOGY Besides data access, we can also use the ontology for data integration, data classification and data validation. In this section, we discuss applications of the ontology in these aspects. 4.1 Data Integration One challenge in integrating multiple data sources is resolving heterogeneity issues. Data from different sources typically have different formats or structures. Also, it is common for different sources to use different terminologies and value representations. To address these issues, we can use the ontology as the global model and map data to the ontology. In doing so, we move from different structures, terminologies and value representations of data to structurally flat, precise and consistent representations of data. Suppose that in one data source, BD was measured in g/cc, while in another source BD was measured in kg/cc. In such a case, we can use the ontology to specify the standard unit of measure for BD, and meanwhile model the relationship between g/cc and kg/cc, including how one can be converted into another. In this way, we can support automatic unit conversion during data translation or query processing and thus have consistent value representations for BD. 1466

4.2 Data Classification The basic reasoning tasks for OWL are checking whether a concept is subsumed by another concept, or whether an individual is an instance of a concept. We can automate these reasoning tasks using standard OWL reasoners such as FACT++ 12 and Pellet 13 (Sirin et al., 2007). Suppose we want to classify soils into 3 groups: acidic (ph < 5.5), neutral (5.5 ph 8), alkaline (ph > 8). To do this, we introduce 3 concepts to the ontology, i.e. AcidicSoil, NeutralSoil and AlkalineSoil, and define the following SWRL 14 rules: (a) SoilFeature(?s) QuantitativeMeasure(?p) physicalproperty(?s,?p) propertyname(?p, ph) propertyresult(?p?r) Quantity(?r) value(?r,?v) swrlb:lessthan(?v, 5.5) AcidicSoil(?x) (b) SoilFeature(?s) QuantitativeMeasure(?p) physicalproperty(?s,?p) propertyname(?p, ph) propertyresult(?p?r) Quantity(?r) value(?r,?v) swrlb:greaterthanorequal(?v, 5.5) swrlb:lessthanorequal(?v, 8) NeutralSoil(?x) (c) SoilFeature(?s) QuantitativeMeasure(?p) physicalproperty(?s,?p) propertyname(?p, ph) propertyresult(?p?r) Quantity(?r) value(?r,?v) swrlb:greaterthan(?v, 5.5) AlkalineSoil(?x) Then an OWL reasoner (with the SWRL support) will automatically classify soil features into right groups. 4.3 Data Validation Data validation is an important part of data integration and analysis tasks. In a data validation application, complete knowledge is typically assumed for some or all parts of the application domain. This closed-world assumption, however, contradicts to the open-world semantics of OWL which has the following characteristics (Tao et al., 2010): the presence of the Open World Assumption(OWA): a statement cannot be inferred to be false on the basis of failures to prove it. the absence of the Unique Name Assumption(UNA): two different name may refer to the same object. With such a semantics, conditions that trigger constraint violations in closed world systems would generate new inferences in standard OWL systems. To use OWL for the data validation purpose, one way is to employ a closed-world semantics for OWL, and interpret OWL axioms based only on the information explicitly present in the ontology. As such, we can reduce constraint checking to SPARQL query answering by translating each constraint represented as an OWL axiom into a SPARQL ASK query such that when the constraint is violated, the query is entailed, i.e. a non-empty result is returned (Tao et al., 2010). Suppose that we want to ensure that every soil feature has at least one classifier. We can express this constraint by the axiom: SoilFeature 1 classifier.soilclassification, and then check satisfaction of the constraint by translating it into the following SPAQRL query: PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> PREFIX soil-owl: <http://www.csiro.au/ontology/soil#> ASK WHERE {?x rdf:type soil-owl:soilfeature. FILTER NOT EXISTS {?x soil-owl:classifier?y.?y rdf:type soil-owl:soilclassification.} } When the query is executed, a non-empty result indicates that the constraint is violated, i.e. there exists a soil feature without any classifier. 12 http://owl.man.ac.uk/factplusplus/ 13 http://pellet.owldl.com/ 14 http://www.w3.org/submission/swrl/ 1467

5 CONCLUSIONS In this paper, we presented a soil ontology and discussed its potential applications. In particular, we described the use of the ontology to facilitate data access by transforming data into Linked Data. Our ongoing and future work includes further development of the ontology (e.g. extending it with plant ontology), using it to link more datasets, and applying it in agricultural applications. REFERENCES Baader, F., D. Calvanese, D. McGuinness, D. Nardi, and P. Patel-Schneider (2003). The Description Logic Handbook: Theory, Implementation, and Applications. In Cambridge University Press. Bizer, C., T. Heath, and T. Berners-Lee (2009). Linked Data - The Story So Far. International Journal on Semantic Web and Information Systems 5(3), 1 22. Cox, S. (2007a). Observations and Measurements - Part 1 - Observation schema Version 1.0 OGC. In OGC document 07-022r1. Cox, S. (2007b). Observations and Measurements - Part 2 - Sampling Features Version 1.0 OGC. In OGC document 06-188r1. DAFF (2011). A Stocktake of Australias Current Investment in Soils Research, Development and Extension: A Snapshot for 2010-11. Department of Agriculture, Fisheries and Forestry. Sirin, E., B. Parsia, B. C. Grau, A. Kalyanpur, and Y. Katz (2007). Pellet: A Practical OWL-DL Reasoner. Journal of Web Semantics: Science, Services and Agents on the World Wide Web 5(2), 51 53. Tao, J., E. Sirin, J. Bao, and D. L. McGuinness (2010). Integrity Constraints in OWL. In Proceedings of the 24th AAAI Conference on Artificial Intelligence (AAAI 10). Wilson, P. (2012). Improving Australian Soil Data and Information Governance. CSIRO Technical Report. 1468