GenTax: A Generic Methodology for Deriving OWL and RDF-S Ontologies from Hierarchical Classifications, Thesauri, and Inconsistent Taxonomies

Similar documents
Products and Services Ontologies: A Methodology for Deriving. OWL Ontologies from Industrial Categorization Standards. Martin Hepp

Access rights and collaborative ontology integration for reuse across security domains

Mapping between Digital Identity Ontologies through SISM

Opus: University of Bath Online Publication Store

SWAD-Europe Deliverable 8.1 Core RDF Vocabularies for Thesauri

! " # Formal Classification. Logics for Data and Knowledge Representation. Classification Hierarchies (1) Classification Hierarchies (2)

SAWSDL Status and relation to WSMO

Ontology-based Product Catalogues: An Example Implementation

ProdLight: A Lightweight Ontology for Product Description Based on Datatype Properties

5 RDF and Inferencing

Open Research Online The Open University s repository of research publications and other research outputs

Taxonomy browsing and ontology evaluation for Wikidata

0.1 Knowledge Organization Systems for Semantic Web

Metadata Common Vocabulary: a journey from a glossary to an ontology of statistical metadata, and back

Harvesting Wiki Consensus - Using Wikipedia Entries as Ontology Elements

Semantic MediaWiki A Tool for Collaborative Vocabulary Development Harold Solbrig Division of Biomedical Informatics Mayo Clinic

INTERNATIONAL JOURNAL OF COMPUTER ENGINEERING & TECHNOLOGY (IJCET) APPLYING SEMANTIC WEB SERVICES. Sidi-Bel-Abbes University, Algeria)

Report from the W3C Semantic Web Best Practices Working Group

Enhancing information services using machine to machine terminology services

Ontological Modeling: Part 7

Copyright 2012 Taxonomy Strategies. All rights reserved. Semantic Metadata. A Tale of Two Types of Vocabularies

Semantic Web and Natural Language Processing

Lightweight Semantic Web Motivated Reasoning in Prolog

EFFICIENT INTEGRATION OF SEMANTIC TECHNOLOGIES FOR PROFESSIONAL IMAGE ANNOTATION AND SEARCH

Wondering about either OWL ontologies or SKOS vocabularies? You need both!

Publishing Vocabularies on the Web. Guus Schreiber Antoine Isaac Vrije Universiteit Amsterdam

Lecture Telecooperation. D. Fensel Leopold-Franzens- Universität Innsbruck

6. RDFS Modeling Patterns Semantic Web

Taxonomy Tools: Collaboration, Creation & Integration. Dow Jones & Company

GraphOnto: OWL-Based Ontology Management and Multimedia Annotation in the DS-MIRF Framework

Mustafa Jarrar: Lecture Notes on RDF Schema Birzeit University, Version 3. RDFS RDF Schema. Mustafa Jarrar. Birzeit University

Ontology Transformation Using Generalised Formalism

SKOS. COMP62342 Sean Bechhofer

Helmi Ben Hmida Hannover University, Germany

The OASIS Applications Semantic (Inter-) Connection Framework Dionisis Kehagias, CERTH/ITI

SEMANTIC WEB DATA MANAGEMENT. from Web 1.0 to Web 3.0

Ontologies SKOS. COMP62342 Sean Bechhofer

On Supporting HCOME-3O Ontology Argumentation Using Semantic Wiki Technology

Semantic Image Retrieval Based on Ontology and SPARQL Query

Metadata Standards and Applications. 6. Vocabularies: Attributes and Values

Semantic Web. Tahani Aljehani

ARISTOTLE UNIVERSITY OF THESSALONIKI. Department of Computer Science. Technical Report

Lecture 1/2. Copyright 2007 STI - INNSBRUCK

WSMO Working Draft 04 October 2004

Common Pitfalls in Ontology Development

A GML SCHEMA MAPPING APPROACH TO OVERCOME SEMANTIC HETEROGENEITY IN GIS

Ontology Design Patterns

Lessons Learned from a Greenhorn Ontologist

Structural representations of unstructured knowledge

Collaborative Ontology Evolution and Data Quality - An Empirical Analysis

Logical reconstruction of RDF and ontology languages

Taking a view on bio-ontologies. Simon Jupp Functional Genomics Production Team ICBO, 2012 Graz, Austria

A Double Classification of Common Pitfalls in Ontologies

Data formats for exchanging classifications UNSD

Chapter 8: Enhanced ER Model

Enhanced Entity-Relationship (EER) Modeling

SEMANTIC DISCOVERY CACHING: PROTOTYPE & USE CASE EVALUATION

Publishing Official Classifications in Linked Open Data

Open Research Online The Open University s repository of research publications and other research outputs

H1 Spring B. Programmers need to learn the SOAP schema so as to offer and use Web services.

ISO Original purpose and possible future

Describing Customizable Products on the Web of Data

Linguaggi Logiche e Tecnologie per la Gestione Semantica dei testi

Interoperability of Protégé using RDF(S) as Interchange Language

Semantic Web Technologies

Logic and Reasoning in the Semantic Web (part I RDF/RDFS)

MASWS Natural Language and the Semantic Web

The Hoonoh Ontology for describing Trust Relationships in Information Seeking

Vocabulary and Semantics in the Virtual Observatory

Process Mediation in Semantic Web Services

Towards a Vocabulary for Data Quality Management in Semantic Web Architectures

OOPS! OntOlogy Pitfalls Scanner!

Produce and Consume Linked Data with Drupal!

The table metaphor: A representation of a class and its instances

SEMANTIC WEB POWERED PORTAL INFRASTRUCTURE

Semantic Interoperability Courses

Ontology Development and Engineering. Manolis Koubarakis Knowledge Technologies

ORES-2010 Ontology Repositories and Editors for the Semantic Web

Software Implementation for Taxonomy Browsing and Ontology Evaluation for the case of Wikidata

Ontology Summit2007 Survey Response Analysis. Ken Baclawski Northeastern University

A Lightweight Language for Software Product Lines Architecture Description

A Loose Coupling Approach for Combining OWL Ontologies and Business Rules

RaDON Repair and Diagnosis in Ontology Networks

Towards an ontological analysis of BPMN

Simplifying the Web Service Discovery Process

Adding formal semantics to the Web

The Results of Falcon-AO in the OAEI 2006 Campaign

Scalable Web Service Composition with Partial Matches

Semantic Web. Ontology Pattern. Gerd Gröner, Matthias Thimm. Institute for Web Science and Technologies (WeST) University of Koblenz-Landau

Enterprise Multimedia Integration and Search

SEMANTIC WEB LANGUAGES - STRENGTHS AND WEAKNESS

INTERCONNECTING AND MANAGING MULTILINGUAL LEXICAL LINKED DATA. Ernesto William De Luca

Enhanced retrieval using semantic technologies:

CEN/ISSS WS/eCAT. Terminology for ecatalogues and Product Description and Classification

A faceted lightweight ontology for Earthquake Engineering Research Projects and Experiments

D5.1 v0.1 WSMO Web Service Discovery

KOSO: A Reference-Ontology for Reuse of Existing Knowledge Organization Systems

SKOS Standards and Best Practises for USING Knowledge Organisation Systems ON THE Semantic Web

3. Finding Components in Component Repositories

3. Finding Components in Component Repositories Component Search. Obligatory Literature. References

Transcription:

Leopold Franzens Universität Innsbruck GenTax: A Generic Methodology for Deriving OWL and RDF-S Ontologies from Hierarchical Classifications, Thesauri, and Inconsistent Taxonomies Martin HEPP DERI Innsbruck University of Innsbruck Jos de Bruijn Faculty of Computer Science Free University of Bolzano Copyright 2006 DERI Innsbruck www.deri.at

Goal: Derive OWL and RDF-S ontologies from any hierachical classification Being able to derive consistent RDF-S,OWL, and WSML ontologies from hierarchical classifications High degree of automation, i.e., without the need for manual analysis of conceptual elements Ability to transform SKOS vocabularies into RDF- S, OWL, or WSML

Motivation: Hierarchical classifications as a major resource Hierarchical classification systems are a major resource for structuring information Well established means in information management However, reuse for building ontologies not trivial Fuzzy notion of class membership Context-dependent semantics

Assets for the Semantic Web: A Wealth of Consensual Concepts UNSPSC, http://www.unspsc.org 20,700 classes, 55 top-level categories ecl@ss, http://www.eclass.de 25,000 classes, 25 top-level categories eotd, http://www.eotd.org 59,000 classes, 79 top-level categories DMOZ Wikipedia Categories etc.

Definition: Hierarchical classifications We view a hierarchical categorization schema as a directed graph where nodes represent categories and edges represents the narrower term or has subcategory relation. Depending on the context, a set is related to each category. This set represents the items associated with the category in a particular context. 2007 Vacations Work Italy Spain ISWC

Observations Classifications do not require a context independent definition of their intended meaning We can use the same classification in multiple distinct contexts with very different meanings, as long as we keep the contexts apart Invoices vs. Documents vs. Tasks When we use the labels as definitions for sets, we interprete them over the context of usage The meaning of a label and the meaning of the edges are mutually dependent We may face several anomalies

Anomalies Local labels Pictures->Italy->Summer 2006 Pictures-> Pictures.Italy -> Pictures.Italy.Summer_2006 Hierarchy: Depending on the context over which we interprete a label, the original set of arcs may or may not constitute a valid subsumption hierarchy

The Semantics of Category Labels 1 2 Concept in some context (Example: Pictures of Italy ) 1 2 Category Concept (Example: Anything that can in any reasonable context be subsumed under the label Pictures.Italy )

Naїve Approach: One ontology class per each label (leads to ontologies limited in use) Peter Miller, a sales represenntative with expertise in the Radio and TV domain instanceof Invoice # 1234, which was mainly about TV-related costs mysony Widescreen An actual TV set mysonywidescreen.jpg An image showing a TV set instanceof Radio and TV instanceof instanceof TV Set SubClassOf SubClassOf TV Maintenance Example for two contexts that differ in whether the original edges would constitute a valid subsumption hierarchy! Example: Anything that can in any reasonable context be subsumed under the label Radio and TV Examples for Contexts: Products User s Manuals Employees with Sales Expertise

Solution 1:One class for each node in the target context (but impossible in OWL) Radio and TV Meant as an actual product described by this label TV Set ex:taxonomysubclassof ex:taxonomysubclassof TV Maintenance Still, you have to select one target context! FORALL A ex:taxonomysubclassof B AND B ex:taxonomysubclassof C A ex:taxonomysubclassof C

The GenTax Idea 1. Define two Master Concepts: a) MC1 any item for which the orginal hierarchy was intended. b) MC2 the set of all entities in the target context 2. Check whether the orginal hierarchy is a valid subsumption hierarchy if you interprete the categories as label MC1 3. Check whether each label MC2 is a proper subclass of this label MC1 4. For steps 3 and 4, we take representative samples only! Category v2 MC2 rdfs:subclassof Category v1 MC2 instance 1 rdf:type 1 1 rdf:type instance 2 Category v1 MC1 Category v2 MC1 rdfs:subclassof 2 rdfs:subclassof 2

Special (Lucky) Case rdfs:subclassof Category v1 MC1 2 instance 1 rdf:type Category v1 MC2 rdfs:subclassof 1 rdfs:subclassof Category v2 MC1 2 rdfs:subclassof Category v2 MC2 1 instance 2 rdf:type

Nice Properties Automatic creation of lightweight ontologies possible that require only subclassof as a modeling element Resulting ontologies can be expressed in most popular ontology languages Original hierarchy can be preserved while still being able to design more specific ontology classes for each label Only a small sample necessary to decide upon proper conceptual modeling No need to manually analyze each single element of large classifications

Statistical Diagnosis of Anomalies and Modeling Choices Idea We draw a representative sample of the input classification We ask a human to decide for this small sample whether for this element certain modeling choices are correct e.g. whether, as categories for expenses, TV maintenance is a subclass of TV Set same for local labels and other anomalies We accept or reject that modeling choice for the full classification based on the sample Advantages We have a solid statistical basis for the decision We can chose a suitable degree of confidence depending on the target domain of the ontology classifying Web documents vs. life sciences

Tooling: SKOS2GenTax Converter Will be online shortly at http://www.heppnetz.de/skos2gentax/

Full Paper Martin Hepp, Jos de Bruijn: GenTax: A Generic Methodology for Deriving OWL and RDF-S Ontologies from Hierarchical Classifications, Thesauri, and Inconsistent Taxonomies, Proceedings of the 4th European Semantic Web Conference (ESWC 2007), June 3-7, Innsbruck, Austria, in: E. Fraconi, M. Kifer, and W. May (Eds.): ESWC 2007, LNCS 4519, Springer 2007, pp.129-144.

Leopold Franzens Universität Innsbruck Thank you. Martin HEPP DERI Innsbruck University of Innsbruck Jos de Bruijn Faculty of Computer Science Free University of Bolzano Copyright 2006 DERI Innsbruck www.deri.at