Graphe-Based Rules For XML Data Conversion to OWL Ontology

Similar documents
Setup of epiphytic assistance systems with SEPIA

Syrtis: New Perspectives for Semantic Web Adoption

Tacked Link List - An Improved Linked List for Advance Resource Reservation

Natural Language Based User Interface for On-Demand Service Composition

An FCA Framework for Knowledge Discovery in SPARQL Query Answers

Fault-Tolerant Storage Servers for the Databases of Redundant Web Servers in a Computing Grid

Change Detection System for the Maintenance of Automated Testing

Multimedia CTI Services for Telecommunication Systems

Assisted Policy Management for SPARQL Endpoints Access Control

DANCer: Dynamic Attributed Network with Community Structure Generator

Teaching Encapsulation and Modularity in Object-Oriented Languages with Access Graphs

Formal modelling of ontologies within Event-B

Linked data from your pocket: The Android RDFContentProvider

Scalewelis: a Scalable Query-based Faceted Search System on Top of SPARQL Endpoints

Improving Collaborations in Neuroscientist Community

BoxPlot++ Zeina Azmeh, Fady Hamoui, Marianne Huchard. To cite this version: HAL Id: lirmm

The New Territory of Lightweight Security in a Cloud Computing Environment

Performance Oriented Decision Making to Guide Web Service Lifecycle

Reverse-engineering of UML 2.0 Sequence Diagrams from Execution Traces

YAM++ : A multi-strategy based approach for Ontology matching task

UsiXML Extension for Awareness Support

Blind Browsing on Hand-Held Devices: Touching the Web... to Understand it Better

QAKiS: an Open Domain QA System based on Relational Patterns

KeyGlasses : Semi-transparent keys to optimize text input on virtual keyboard

Structuring the First Steps of Requirements Elicitation

Multi-atlas labeling with population-specific template and non-local patch-based label fusion

Deformetrica: a software for statistical analysis of anatomical shapes

A Resource Discovery Algorithm in Mobile Grid Computing based on IP-paging Scheme

Generative Programming from a Domain-Specific Language Viewpoint

XBenchMatch: a Benchmark for XML Schema Matching Tools

Service Reconfiguration in the DANAH Assistive System

LaHC at CLEF 2015 SBS Lab

Fuzzy sensor for the perception of colour

Taking Benefit from the User Density in Large Cities for Delivering SMS

Leveraging ambient applications interactions with their environment to improve services selection relevancy

Linux: Understanding Process-Level Power Consumption

Sewelis: Exploring and Editing an RDF Base in an Expressive and Interactive Way

XML Document Classification using SVM

Using a Medical Thesaurus to Predict Query Difficulty

How to simulate a volume-controlled flooding with mathematical morphology operators?

Quality of Service Enhancement by Using an Integer Bloom Filter Based Data Deduplication Mechanism in the Cloud Storage Environment

Comparator: A Tool for Quantifying Behavioural Compatibility

Open Digital Forms. Hiep Le, Thomas Rebele, Fabian Suchanek. HAL Id: hal

Malware models for network and service management

Mokka, main guidelines and future

Regularization parameter estimation for non-negative hyperspectral image deconvolution:supplementary material

Semantic Building Information Model and Multimedia for Facility Management

Every 3-connected, essentially 11-connected line graph is hamiltonian

Relabeling nodes according to the structure of the graph

Catalogue of architectural patterns characterized by constraint components, Version 1.0

A Methodology for Improving Software Design Lifecycle in Embedded Control Systems

A Practical Evaluation Method of Network Traffic Load for Capacity Planning

QuickRanking: Fast Algorithm For Sorting And Ranking Data

Geometric Search in TGTP

SIM-Mee - Mobilizing your social network

Study on Feebly Open Set with Respect to an Ideal Topological Spaces

The HMatch 2.0 Suite for Ontology Matchmaking

MUTE: A Peer-to-Peer Web-based Real-time Collaborative Editor

ASAP.V2 and ASAP.V3: Sequential optimization of an Algorithm Selector and a Scheduler

YANG-Based Configuration Modeling - The SecSIP IPS Case Study

Traffic Grooming in Bidirectional WDM Ring Networks

From medical imaging to numerical simulations

A N-dimensional Stochastic Control Algorithm for Electricity Asset Management on PC cluster and Blue Gene Supercomputer

Very Tight Coupling between LTE and WiFi: a Practical Analysis

Comparison of spatial indexes

[Demo] A webtool for analyzing land-use planning documents

The Connectivity Order of Links

Preliminary analysis of the drive system of the CTA LST Telescope and its integration in the whole PLC architecture

Simulations of VANET Scenarios with OPNET and SUMO

Hierarchical Multi-Views Software Architecture

IntroClassJava: A Benchmark of 297 Small and Buggy Java Programs

Implementing an Automatic Functional Test Pattern Generation for Mixed-Signal Boards in a Maintenance Context

Moveability and Collision Analysis for Fully-Parallel Manipulators

Real-Time Collision Detection for Dynamic Virtual Environments

A Logical Pattern for Integrating Business Intelligence into Information Systems Design

An SCA-Based Middleware Platform for Mobile Devices

Application of RMAN Backup Technology in the Agricultural Products Wholesale Market System

NP versus PSPACE. Frank Vega. To cite this version: HAL Id: hal

HySCaS: Hybrid Stereoscopic Calibration Software

X-Kaapi C programming interface

The Sissy Electro-thermal Simulation System - Based on Modern Software Technologies

The optimal routing of augmented cubes.

Real-Time and Resilient Intrusion Detection: A Flow-Based Approach

DSM GENERATION FROM STEREOSCOPIC IMAGERY FOR DAMAGE MAPPING, APPLICATION ON THE TOHOKU TSUNAMI

Branch-and-price algorithms for the Bi-Objective Vehicle Routing Problem with Time Windows

Fueling Time Machine: Information Extraction from Retro-Digitised Address Directories

FIT IoT-LAB: The Largest IoT Open Experimental Testbed

BugMaps-Granger: A Tool for Causality Analysis between Source Code Metrics and Bugs

The Proportional Colouring Problem: Optimizing Buffers in Radio Mesh Networks

lambda-min Decoding Algorithm of Regular and Irregular LDPC Codes

Modularity for Java and How OSGi Can Help

Representation of Finite Games as Network Congestion Games

Framework for Hierarchical and Distributed Smart Grid Management

Complexity Comparison of Non-Binary LDPC Decoders

A Voronoi-Based Hybrid Meshing Method

On Code Coverage of Extended FSM Based Test Suites: An Initial Assessment

Validating Ontologies against OWL 2 Profiles with the SPARQL Template Transformation Language

Robust IP and UDP-lite header recovery for packetized multimedia transmission

The SANTE Tool: Value Analysis, Program Slicing and Test Generation for C Program Debugging

RETIN AL: An Active Learning Strategy for Image Category Retrieval

Transcription:

Graphe-Based Rules For XML Data Conversion to OWL Ontology Christophe Cruz, Christophe Nicolle To cite this version: Christophe Cruz, Christophe Nicolle. Graphe-Based Rules For XML Data Conversion to OWL Ontology. WEBIST 2010, Apr 2010, Valencia, Spain. INSTICC Press, pp.175-178, isbn: 978-989-674-025-2, 2010. <hal-00639558> HAL Id: hal-00639558 https://hal-univ-bourgogne.archives-ouvertes.fr/hal-00639558 Submitted on 9 Nov 2011 HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers. L archive ouverte pluridisciplinaire HAL, est destinée au dépôt et à la diffusion de documents scientifiques de niveau recherche, publiés ou non, émanant des établissements d enseignement et de recherche français ou étrangers, des laboratoires publics ou privés.

GRAPH-BASED RULES FOR XML DATA CONVERSION TO OWL ONTOLOGY Christophe CRUZ, Christophe Nicolle Laboratory Le2i, UMR-5158 CNRS, Université de Bourgogne B.P. 47870, 21078, Dijon CEDEX, France {christophe.cruz, cnicolle}@u-bourgogne.fr Keywords: Ontology population, ontology enrichment, OWL ontology, XML data, RDF, semantic annotation. Abstract: The paper presents a flexible method to enrich and populate an existing OWL ontology from XML data based on graph-based rules. These rules are defined in order to populate automatically a new version of an OWL ontology. Today, most of data exchanged between information systems is done with the help of the XML document. Leading researches in the domain of database systems are moving to semantic model in order to store data and its semantics definition. This flexible method consists in populating an existing OWL ontology from XML data. In paper we present such a method based on the definition of a graph which represents rules that drive the populating process. 1 INTRODUCTION Ontologies are aimed at representing knowledge about a specific domain that are understandable by both developers and computers. For this, ontologies enumerate concepts and relations between concepts (Guarino, 1995) and define properties, functions, constraints and axioms (Studer, 1998). The major issues in ontology development include ontology representation, ontology acquisition, evaluation and ontology maintenance (zhou, 2007). Ontology representation is the main issue in ontology development because its representation has to be understandable by computers and humans. Consequently, an ontology representation language should provide representation adequacy for humans and inference efficiency for computers. Ontology dialects based on description logic (DL) provide a frame-based knowledge representation and profit from the expressiveness of DL reasoning systems. Ontology acquisition refers to the process of the ontology creation such as concepts, relations, individuals and axioms. From an empirical point of view, there are two kinds of ontology modeling processes. The first one is the ontology modeling, which is traditionally carried out by knowledge engineers or domain experts. Actually, these ontologies are built by humans for humans. The second one is in fact the point of view of the semantic Web according to which ontologies are built automatically by computers for computers within sources such as dictionaries, Web documents and database schemas. It has to be noticed that the resulting ontologies are still understandable by humans. As a result, ontology acquisition can benefit significantly from ontology learning (Ding, 2002). Ontology evaluation aims at enhancing the quality of ontologies in order to improve the interoperability among systems and to increase the adoption of ontologies. Ontologies can be evaluated in different ways (Staab, 2004) using measures such as completeness, consistency and correctness (Gomez, 1995). Ontology maintenance concerns the organization, the search and the update process on existing ontologies. The constant evolution of the environment of ontologies makes it very important for ontologies to be evaluated and maintained (Sure, 2002) in order to keep up with the change. To reach this goal, this article presents an automatic population process from XML data to OWL ontologies, a process which is based on a manual mapping between the XML schema elements and the OWL schema elements. If the OWL schema does not contain the required elements

then the ontology has to be enriched by the system manager. The ontology enrichment is the activity of extending an ontology by adding new elements (e.g. concepts, relations, properties, axioms) (Castano, 2007). Our enrichment process consists in annotating knowledge which is contained in XML schemas in order to define the ontology schema (Faatz, 2004). Some automatic processes from ontology learning can be used but this point is beyond the scope of this paper. The ontology population is the activity of adding new instances or individuals to an ontology (Castano, 2007). 3 THE XSD2OWL TOOL The principle of our solution (Matthias, 2004), (Bohing, 2005) consists in annotating and linking the semantic level (OWL schema) and the schematic level (XML schema). The graphical interface used to realize this is incorporated in the tool protégé from Stanford as a plug-in (e.g. fig 1) in order to populate an existing ontology. Once the graph of mapping rules has been defined, the population process is automatic. The user has only to select a list of XML documents which can be validated by the XSD schema. 3.1 The graph-based rule definition The process of annotating consists in defining Basic Mapping Rules (BMR) which appear in the graph as nodes or boxes. These boxes represent annotations on the XSD and OWL documents. Some of the annotations are defined on the XSD schema and are represented by grey boxes. ( xsd:element, xsd:attribute ). The other boxes are annotations on the OWL-DL schema ( owl:class, owl:objectproperty, owl:datatypeproperty ). The color of these boxes follows the colors defined in the application protégé (orange for Concept, blue for ObjectProperties and green for DatatypeProperties, e.g. figure 1). The links between XSD annotations are subelement relationships which are added automatically by the process because these relationships already exist in the XSD schema. In addition, links between OWL annotations are also added automatically because these relationships already exist in the OWL ontology. The process that consists in defining links between annotations of the XSD schema and annotations of the OWL-DL is called Advanced Mapping Rules. These rules which are represented graphically are added manually by the user. Fig. 1. Snapshot of the XSD2OWL plug-in.

An RDF document is generated from the defined rules. This document is used to store all information required during the population process. RDF rules Case 2: Population of a concept associated to n DatatypeProperties (n not null) from an XSD complex element that contains m sub-elements simple and (n-m) attributes. Advanced Mapping Rules Basic Mapping Rules Basic Mapping Rules XML schemas XML instances Fig. 2. Relations between processes and RDF rules. The objective of figure 2 is to describe the relationships between the components of the RDF rules. They are composed of BMR on XML schemas and BMR on OWL schemas. These BMR are used to identify elements required for the mapping process. Advanced Mapping Rules are defined in order to allow the conversion of data from XML data to OWL instances. 3.2 Population process Concerning the population process more than ten cases of use have been identified, but due to a lack of space only few of them can be presented. Case 1: Population of an isolated concept from an XSD element Fig. 3. Use case 1. E In this graph the grey box purchaseorder contains an annotation of the XSD element purchaseorder. In addition the orange box Order contains an annotation to the concept Order in the OWL schema. In order to populate the concept Order from the purchaseorder element data, a link amr:is_a between both boxes is defined. This link is an advanced mapping rule. P OWL schema OWL instances Fig. 4. Use case 2. The population of the ontology is an automatic process based on the mapping graphs. To realize this process, we have defined an algorithm that takes into account the type bmr:id in order to avoid duplicated instances of the ontology. First, it determines all classes that have to be populated. Secondly, all datatypeproperties of each concept are provided to the instances. In the example given in this paper no references are made to the management of restrictions on the properties. Some rules can be defined in order to specify which constraints have to be verified. If these rules are not defined then the restrictions are not checked. The limitation of our solution becomes apparent by the fact that we do not generate an XSL document in order to enrich and populate the ontology. However, the process is complex enough so that, for the moment, it is not relevant to add the generation of an XSL document for the population process. 4 CONCLUSION This paper presents a flexible method to enrich and populate an OWL ontology for the integration of XML data. Basic mapping rules and advanced mapping rules are defined by users and can be reused for other conversions and populations of ontologies. This conversion is the first part of our work. The second part consists in improving the process and in making some suggestions in order to facilitate the mapping to the user. The RDF rules can be used for the automatic extraction of certain elements of the XML schemas that can be converted

in order to help users during the mapping. For instance, a string that contains a date can be detected automatically to guide the user during the conversion. According to (Cruz, 2004), (Klein, 2002), (Lakshmannan, 2003), (Cruz, 2006), data integration can be undertaken by defining rules of mapping between information sources and the ontological level. These rules consist in adding a semantic layer to source elements. They thus provide these elements with semantic definition with regard to a consensual definition of the meaning. For that purpose, ontologies are useful in order to define a common semantic. Furthermore, schema matching is a well studied field that allows to find out automatically identical resources in the different schemas. Schema matching is a manipulation process on schemas that takes two heterogeneous schemas as input and produces as output a set of mapping rules that identifies relations between the elements of the two schemas (Huynh, 2008). This is required in many database applications, such as integration of web data sources, data warehouse loading and XML message mapping. As a future work, we would like to focus to an automatic process by reusing a set of previous RDF rules. In fact, it consists in reusing the mapping knowledge capitalized during different mapping processes. In addition the concatenation rules and the regular expression rules are being prototyped. This implies that new boxes have to be defined and will be connected to XSD boxes and OWL boxes. ACKNOWLEDGEMENTS Authors would like to thank Romain Brochot, Yoan Chabot, Florian Genton for their important contribution on the application instantiation. REFERENCES Bohring, H.; Auer, S., 2005. Mapping XML to OWL Ontologies, Leipziger Informatik-Tage (LIT 2005), Sep. 21-23, 2005, Lecture Notes in Informatics Castano, S., Espinosa, S., Ferrara, A., Karkaletsis, V., Kaya, a., Melzer, S., Moller, R., Montanelli S., Petasis, G., 2007. Ontology Dynamics with Multimedia Information: The BOEMIE Evolution Methodology, International Workshop on Ontology Dynamics (IWOD) ESWC 2007 Workshop, Innsbruck, Austria Cruz, C., Nicolle, C., 2006. Ontology-Based Integration of XML data, Webist, Setubal, Portugal, pp. 30-37 Cruz, I. F., Xiao, H., Hsu, F., 2004. An Ontology-based Framework for Semantic Interoperability between XML Sources, In Eighth International Database Engineering & Applications Symposium (IDEAS) Ding, Y., 2002. Ontology research and development part1 A review of ontology generation, Journal of Information Science, 28, 123 136 Faatz, A., and Steinmetz, R., 2004. Precision and recall for ontology enrichment. ECAI-2004 Workshop on Ontology Learning and Population, Valencia, Spain, Aug. Gomez-Perez, A., 1995. Some ideas and examples to evaluate ontologies, Artificial Intelligence for Applications Guarino, N., 1995. Formal ontology, conceptual analysis and knowledge representation, International Journal of Human-Computer Studies 43, 625 640 Huynh Quyet Thang, Vo Sy Nam, 2008. XML Schema Automatic Matching Solution, International journal on Information Systems Science and Engineering, vo.l 4, number 1 Klein, M., 2002. Interpreting XML via an RDF schema. In ECAI workshop on Semantic Authoring, Annotation & Knowledge Markup (SAAKM 2002), Lyon, France Lakshmannan, L. V., Sadri, F., 2003, Interoperability on XML Data, In Proceeding of the 2nd International Semantic Web Conference. Matthias Ferdinand and Christian Zirpins and D. Trastour, 2004, Lifting XML Schema to OWL, 4th International Conference, ICWE 2004, Munich, Germany, July 26-30, Proceedings, Springer Heidelberg, pp. 354-358 Staab, S., Gomez-Perez, A., Daelemana, W., Reinberger, M.-L. and Noy, N.F., 2004. Why evaluate ontology technologies? Because it works!, Intelligent Systems, IEEE 19, 74 8 Studer, R. Benjamins, R. and Fensel, D., Knowledge engineering: Principles and methods, Data and Knowledge Engineering 25, 161 197 Sure, Y., Staab, S. and Studer, R., 2002. Methodology for development and employment of ontology based knowledge management applications, SIGMOD Rec 31, 18 34 Zhou, L., 2007, Ontology learning: state of the art and open issues, Information Technology and Management archive Volume 8, Issue 3, 241-252