Advanced Information Systems Laboratory
Department of Computer Science and Systems Engineering
http://iaaa.cps.unizar.es

Cost Action C2
Converting a thesaurus into an ontology: the use case of URBISOC
J. Nogueras-Iso, J. Lacasta
Alcalá de Henares, 4-5 May 2007

Outline
1. Motivation
2. Conversion into SKOS
3. Conversion into Towntology format
4. Conversion into OWL
5. Conclusions

1. Motivation
Definition of thesaurus
- A set of terms that describe the vocabulary of a controlled indexing language, formally organized so that the a priori relationships between concepts are made explicit (BT/NT, RT, TR, USE/UF, ...)
- A term is a word or phrase that represents a conceptual category
Traditionally used in digital libraries to improve the precision and recall of information retrieval systems
- Provide a specialized vocabulary for the homogeneous classification of resources
- Supply users with a suitable vocabulary for retrieval

Thesaurus vs ontology
Ontology: explicit specification of a conceptualization
Categorization of ontologies
- Linguistic/terminological ontologies (glossaries, controlled vocabularies, taxonomies, thesauri)
- Implementation-driven ontologies (conceptual schemas, knowledge bases)
- Formal ontologies
Thesauri are ontologies (lexical ontologies) with weak semantics
- The semantics of their relations is quite ambiguous (RT, NT, BT, ...)

Contribution of thesauri to the development of formal ontologies
Thesauri are an interesting source for the development of more formal ontologies
- Problem: there is no standardized representation for exchange
- Thesauri cannot be compared to detect commonalities
Purpose of this presentation
- Conversion of thesauri into a standardized representation
- This facilitates
  o the comparison of ontologies
  o the integration of concepts into formal ontologies

2. Conversion into SKOS
Simple Knowledge Organisation System (SKOS)
- A W3C initiative for the representation of knowledge organisation systems (KOS) such as thesauri, classification schemes, subject heading lists, taxonomies, and other types of controlled vocabulary
- It provides a standard way to represent knowledge organisation systems using the Resource Description Framework (RDF)
- RDF encoding facilitates the interoperability of different computer applications using or sharing the same knowledge base
- It is becoming a de facto standard

SKOS
SKOS is a collection of 3 different RDF Schema application profiles
- SKOS-Core: stores common properties and relations
- SKOS-Mapping: describes relations between different KOS
- SKOS-Extension: indicates specific relations and properties found only in some types of KOS

SKOS-Core
A KOS consists of a set of concepts (typed as skos:Concept) that are grouped by a concept scheme (skos:ConceptScheme)
- The concept scheme is identified by means of a URI and can be described with Dublin Core metadata
- The concept scheme is linked to the most general concepts (top concepts) of the KOS through the skos:hasTopConcept relation
Each concept consists of a URI and a set of properties and relations to other concepts
Concept properties
- skos:prefLabel: the label that best identifies a concept (for thesauri it must be unique)
- skos:altLabel: alternative label (synonyms or spelling variations of the preferred label)
- skos:scopeNote: annotations about the ways to use a concept
- skos:definition: definition of the concept

SKOS-Core
Concept properties (continued)
- skos:example: examples of use in different languages
- skos:prefSymbol, skos:altSymbol: preferred and alternative graphic symbols (e.g. graphical representation of a mathematical formula)
- skos:notation: representation of multiple encodings of concepts
Concept relations
- skos:inScheme: indicates the concept scheme that includes the concept
- skos:broader / skos:narrower: reciprocal relations representing generalization-specialization
- skos:related: associative relationship indicating that two concepts are related in some way

URBISOC original format
The alphabetical list of terms (http://pci204.cindoc.csic.es/tesauros/tes_urba/tes_urba.htm) is transformed into text files [ISO-2788 relationships + URI]

    ESPACIO
        NT CIUDADES
        NT PERCEPCION DEL ESPACIO
        NT TERRITORIO
        RT GEOGRAFIA
        URI http://pci204.cindoc.csic.es/tesauros/tes_urba/html/urb_e8.htm#espacio
    ESPACIO COTIDIANO
        BT CAMPO DE PERCEPCION
        URI http://pci204.cindoc.csic.es/tesauros/tes_urba/html/urb_e8.htm#espaciocotidiano
    ESPACIO EXTERIOR
        BT CAMPO DE PERCEPCION
        RT ESPACIO INTERIOR
        URI http://pci204.cindoc.csic.es/tesauros/tes_urba/html/urb_e8.htm#espacioexterior
    Espacio Forestal
        SYN ZONAS FORESTALES
        URI http://pci204.cindoc.csic.es/tesauros/tes_urba/html/urb_e8.htm#espacioforestal
    ESPACIO IMAGINARIO
        BT CAMPO DE PERCEPCION
        URI http://pci204.cindoc.csic.es/tesauros/tes_urba/html/urb_e8.htm#espacioimaginario

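A minimal sketch of how such a text file could be read into records before conversion. It assumes one item per line (a term line followed by its BT/NT/RT/SYN/URI lines); the slides do not state the exact file layout, and the function name `parse_entries` is hypothetical.

```python
# Relationship tags seen in the sample entries, plus the URI line.
TAGS = ("BT", "NT", "RT", "SYN", "URI")

SAMPLE = """\
ESPACIO
NT CIUDADES
NT PERCEPCION DEL ESPACIO
NT TERRITORIO
RT GEOGRAFIA
URI http://pci204.cindoc.csic.es/tesauros/tes_urba/html/urb_e8.htm#espacio
Espacio Forestal
SYN ZONAS FORESTALES
URI http://pci204.cindoc.csic.es/tesauros/tes_urba/html/urb_e8.htm#espacioforestal
"""


def parse_entries(text):
    """Group lines into one record per term (assumed layout: one item per line)."""
    entries = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        tag, _, value = line.partition(" ")
        if tag in TAGS and value:
            if current is None:
                raise ValueError("relationship line before any term: " + line)
            if tag == "URI":
                current["uri"] = value
            else:
                current["relations"].setdefault(tag, []).append(value)
        else:
            # Any line that does not start with a known tag opens a new term.
            current = {"term": line, "uri": None, "relations": {}}
            entries.append(current)
    return entries


if __name__ == "__main__":
    for entry in parse_entries(SAMPLE):
        print(entry["term"], entry["relations"])
```

A term whose first word is itself one of the tags would be misread by this sketch; a real converter would need a stricter delimiter.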
Mapping between URBISOC and SKOS
[Diagram: SKOS elements used as targets of the mapping]
- skos:ConceptScheme groups 0..n skos:Concept, each identified by its URI
- Per language (lang): skos:prefLabel 0..1, skos:altLabel 0..n, skos:scopeNote 0..n, skos:definition 0..1
- broader, narrower, related map to skos:broader, skos:narrower, skos:related
- skos:prefSymbol 0..1, skos:altSymbol 0..n

Example of SKOS generated file
Source entry:

    ESPACIO
        NT CIUDADES
        NT PERCEPCION DEL ESPACIO
        NT TERRITORIO
        RT GEOGRAFIA
        URI http://pci204.cindoc.csic.es/tesauros/tes_urba/html/urb_e8.htm#espacio

Generated SKOS (RDF/XML):

    ...
    <rdf:Description rdf:about="http://pci204.cindoc.csic.es/tesauros/tes_urba/html/urb_e8.htm#espacio">
        <skos:related rdf:resource="http://pci204.cindoc.csic.es/tesauros/tes_urba/html/urb_g0.htm#geografia"/>
        <skos:narrower rdf:resource="http://pci204.cindoc.csic.es/tesauros/tes_urba/html/urb_p3.htm#percepciondelespacio"/>
        <skos:inScheme rdf:resource="http://pci204.cindoc.csic.es/tesauros/tes_urba/urbisoc"/>
        <skos:prefLabel xml:lang="es">espacio</skos:prefLabel>
        <skos:narrower rdf:resource="http://pci204.cindoc.csic.es/tesauros/tes_urba/html/urb_t.htm#territorio"/>
        <rdf:type rdf:resource="http://www.w3.org/2004/02/skos/core#Concept"/>
        <skos:narrower rdf:resource="http://pci204.cindoc.csic.es/tesauros/tes_urba/html/urb_c5.htm#ciudades"/>
    </rdf:Description>
    ...

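The generation step can be sketched with the Python standard library. This is an illustration under assumptions, not the tool used by the authors: the function name `concept_element` and the BT/NT/RT lookup table are hypothetical, and the target URIs are taken from the example above.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS = "http://www.w3.org/2004/02/skos/core#"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"
SCHEME = "http://pci204.cindoc.csic.es/tesauros/tes_urba/urbisoc"
BASE = "http://pci204.cindoc.csic.es/tesauros/tes_urba/html/"

# Make the serializer emit the usual rdf: and skos: prefixes.
ET.register_namespace("rdf", RDF)
ET.register_namespace("skos", SKOS)

# Assumed lookup from thesaurus relationship tags to SKOS relations.
RELATION_TO_SKOS = {"BT": "broader", "NT": "narrower", "RT": "related"}


def concept_element(uri, pref_label, relations, lang="es"):
    """Build one rdf:Description for a concept.

    relations is a list of (tag, target_uri) pairs, e.g. ("NT", "...#territorio").
    """
    node = ET.Element(f"{{{RDF}}}Description", {f"{{{RDF}}}about": uri})
    ET.SubElement(node, f"{{{RDF}}}type", {f"{{{RDF}}}resource": SKOS + "Concept"})
    ET.SubElement(node, f"{{{SKOS}}}inScheme", {f"{{{RDF}}}resource": SCHEME})
    label = ET.SubElement(node, f"{{{SKOS}}}prefLabel", {XML_LANG: lang})
    label.text = pref_label
    for tag, target in relations:
        name = RELATION_TO_SKOS[tag]
        ET.SubElement(node, f"{{{SKOS}}}{name}", {f"{{{RDF}}}resource": target})
    return node


if __name__ == "__main__":
    espacio = concept_element(
        BASE + "urb_e8.htm#espacio",
        "espacio",
        [("RT", BASE + "urb_g0.htm#geografia"), ("NT", BASE + "urb_t.htm#territorio")],
    )
    print(ET.tostring(espacio, encoding="unicode"))
```

Resolving a related term name (e.g. TERRITORIO) to its URI requires a first pass over all entries; that lookup is left out here.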
SKOS format visualized through ThManager
Open source tool, http://thmanager.sourceforge.net/

3. Conversion into Towntology
Conversion from SKOS into Towntology format
Details of the conversion
- The conversion is an adaptation of the previous conversion (thesaurus -> SKOS)
- The main difference lies in the management of relationships
  o SKOS strictly defines the types of available properties and relations
  o The Towntology format defines some properties but allows users to define new relationships
  o In addition to the mapping between concepts, we need to define new relation types

Mapping between SKOS and Towntology
[Diagram: the SKOS model (skos:ConceptScheme with DC metadata, skos:Concept with URI, skos:prefLabel, skos:altLabel, skos:scopeNote, skos:definition, skos:prefSymbol, skos:altSymbol, and the broader/narrower/related relations) mapped onto the Towntology model]
Towntology elements shown in the diagram:
- Ontology: Head, Body
- Head: Title, Language, Custodian, Last_Modif_Date, RelationTypes
- Body: Relations, Concepts, Domains
- RelationType: ID (Attribute), Relation_Name, Relation_Def, Relation_Properties (Attr: #empty, symmetric (true/false), transitive (true/false), maybeoptional (true/false))
- Relation: Originator (Attribute), Insertion_Date (Attribute), Concept_Orig (Attr: ID), Concept_Dest (Attr: ID), Type (Attr: ID), Properties (Attr: #empty, optional (true/false))
- Domain: ID (Attribute), Domain_Name, Domain_Def
- Concept: ID (Attribute), Concept_Name, Concept_Domain (Attr: ID), Terms, ConceptDefs, Multimedia
- ConceptDef: Originator (Attribute), Insertion_Date (Attribute), Concept_Def_Text
- Resource: URI, Resource_Description
- ConceptDefSource: Ref, Autors
- ResourceSource: Ref, Autors
- Term, Autor

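Because Towntology lets users define relation types, the three SKOS relations have to be declared before concept links can be written. The sketch below reuses the element names from the diagram (RelationTypes, RelationType, Relation_Name, Relation_Def, Relation_Properties, Relation, Concept_Orig, Concept_Dest, Type). The XML nesting, the ID values such as `rt_broader`, the definition texts, and the concept IDs are assumptions made for illustration, not the actual Towntology serialization.

```python
import xml.etree.ElementTree as ET

# (name, symmetric, transitive): broader/narrower taken as transitive and
# related as symmetric, as in the SKOS-to-OWL mapping of this presentation.
SKOS_RELATIONS = [
    ("broader", False, True),
    ("narrower", False, True),
    ("related", True, False),
]


def relation_types():
    """Declare one RelationType per SKOS relation (IDs are hypothetical)."""
    container = ET.Element("RelationTypes")
    for name, symmetric, transitive in SKOS_RELATIONS:
        rel_type = ET.SubElement(container, "RelationType", {"ID": "rt_" + name})
        ET.SubElement(rel_type, "Relation_Name").text = name
        ET.SubElement(rel_type, "Relation_Def").text = "Derived from skos:" + name
        ET.SubElement(
            rel_type,
            "Relation_Properties",
            {
                "symmetric": str(symmetric).lower(),
                "transitive": str(transitive).lower(),
            },
        )
    return container


def relation(orig_id, dest_id, type_id):
    """Link two concepts through a previously declared relation type."""
    rel = ET.Element("Relation")
    ET.SubElement(rel, "Concept_Orig", {"ID": orig_id})
    ET.SubElement(rel, "Concept_Dest", {"ID": dest_id})
    ET.SubElement(rel, "Type", {"ID": type_id})
    return rel


if __name__ == "__main__":
    print(ET.tostring(relation_types(), encoding="unicode"))
    # Concept IDs below are placeholders for illustration only.
    print(ET.tostring(relation("c_espacio", "c_territorio", "rt_narrower"),
                      encoding="unicode"))
```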
URBISOC with Towntology tool

4. Conversion into OWL
Conversion from SKOS to OWL
Web Ontology Language (OWL)
- Created by the W3C Web Ontology Working Group
- Derived from the DAML+OIL language
- Based on RDF
- 3 layers
  o OWL Lite: extends RDF(S) and gathers the most common features of OWL; intended for users that only need to create class taxonomies and simple constraints
  o OWL DL: includes the complete OWL vocabulary
  o OWL Full: provides more flexibility than OWL DL

Main elements from OWL, classes
Classes for defining classes and restrictions
- owl:Class: specializes rdfs:Class
- owl:Restriction: specializes owl:Class and is used to define property restrictions for classes (number restrictions, existential restrictions, ...)
Classes for defining properties
- owl:ObjectProperty: defines properties that connect a class with another class
- owl:DatatypeProperty: defines properties that connect a class with a datatype
Classes for defining inequality among individuals, enumerations of datatypes, predefined classes, describing ontologies, ontology versioning

Main elements from OWL, properties
Properties for defining class expressions
- Conjunction (intersectionOf), disjunction (unionOf), negation (complementOf)
- Collection of individuals (oneOf)
- Property restrictions: name (onProperty) +
  o Value restriction (allValuesFrom)
  o Existential restriction (someValuesFrom)
  o Role fillers (hasValue)
  o Number restriction (cardinality, maxCardinality, minCardinality)

Details of conversion
There isn't a real change of format
- SKOS is based on RDF
- OWL is based on RDF
SKOS resources, properties and relation types are made to inherit from the structure of OWL
SKOS instances don't change with the conversion

SKOS and OWL
[Diagram: SKOS classes and properties declared as specializations of OWL classes and properties]
owl:Class
- skos:Concept
  o Restriction: maxCardinality (skos:broader) = 1
  o Restriction: min/maxCardinality (skos:inScheme) = 1
  o Restriction: maxCardinality (skos:prefLabel) = 1 x lang
  o Restriction: maxCardinality (skos:definition) = 1 x lang
  o Restriction: maxCardinality (skos:prefSymbol) = 1
- skos:ConceptScheme
  o Restriction: minCardinality (skos:hasTopConcept) = 1
  o DC metadata
owl:ObjectProperty (domain: owl:Class, range: owl:Class)
- skos:semanticRelation (domain: skos:Concept, range: skos:Concept)
  o skos:related: SymmetricProperty
  o skos:broader: TransitiveProperty, inverseOf skos:narrower
  o skos:narrower: TransitiveProperty, inverseOf skos:broader
- skos:hasTopConcept (domain: skos:ConceptScheme, range: skos:Concept)
- skos:inScheme (domain: skos:Concept, range: skos:ConceptScheme)
owl:DatatypeProperty (domain: owl:Class, range: datatype)
- skos:prefLabel, skos:altLabel, skos:example, skos:definition, skos:scopeNote (domain: skos:Concept, range: string, lang)
- skos:notation (domain: skos:Concept, range: string, type)
- skos:symbol (domain: skos:Concept, range: URI), specialized by skos:prefSymbol and skos:altSymbol

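What the property characteristics above buy in practice can be shown with a toy closure over triples. This is only a sketch of the entailments that an OWL reasoner would derive from the symmetric, transitive and inverseOf declarations listed on this slide; it is not a reasoner, and the concept names a, b, c, d are hypothetical.

```python
# Characteristics as declared in the SKOS-to-OWL mapping on this slide.
SYMMETRIC = {"skos:related"}
TRANSITIVE = {"skos:broader", "skos:narrower"}
INVERSE = {"skos:broader": "skos:narrower", "skos:narrower": "skos:broader"}


def infer(triples):
    """Return the asserted triples plus those entailed by the declarations."""
    facts = set(triples)
    while True:
        new = set()
        for s, p, o in facts:
            if p in SYMMETRIC:
                new.add((o, p, s))
            if p in INVERSE:
                new.add((o, INVERSE[p], s))
            if p in TRANSITIVE:
                # Chain s -p-> o -p-> o2 into s -p-> o2.
                for s2, p2, o2 in facts:
                    if p2 == p and s2 == o:
                        new.add((s, p, o2))
        if new <= facts:
            return facts
        facts |= new


if __name__ == "__main__":
    asserted = {
        ("a", "skos:broader", "b"),
        ("b", "skos:broader", "c"),
        ("a", "skos:related", "d"),
    }
    for triple in sorted(infer(asserted) - asserted):
        print(triple)
```

Since the SKOS instances do not change with the conversion, these extra statements come only from the OWL typing of the SKOS vocabulary.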
OWL format visualized through Protégé

OWL format visualized through Protégé (II)

5. Conclusions
The URBISOC thesaurus is available in different standardized formats, facilitating aligning, merging or other types of processing
The process can be reused for other thesauri following the same structure
- Main part: analyzing the original structure of the source format
In the SKOS and OWL representations each thesaurus concept/term is considered as an instance of a general Concept class
- Is this the right approach? Should we consider each concept/term as a separate class?
- The conversion to Towntology considers each concept/term as a separate class (no chance to include instances)
- Something in between?
- In general thesauri make no distinction between classes and individuals (NTI, narrower-term-instance, relations are not frequent)

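The two modelling options discussed above can be contrasted on a single BT link. This is a sketch under assumptions: the prefixed names `urb:espaciocotidiano` and `urb:campodepercepcion` are hypothetical abbreviations, and reading BT as rdfs:subClassOf in the class-based option is an assumption that only holds when the relation is truly generic (is-a) rather than part-of.

```python
def as_instance(concept, broader):
    """SKOS/OWL option: the term is an individual of the general Concept class."""
    return [
        (concept, "rdf:type", "skos:Concept"),
        (concept, "skos:broader", broader),
    ]


def as_class(concept, broader):
    """Class-based option: the term is itself a class (assumes BT means is-a)."""
    return [
        (concept, "rdf:type", "owl:Class"),
        (concept, "rdfs:subClassOf", broader),
    ]


if __name__ == "__main__":
    # ESPACIO COTIDIANO BT CAMPO DE PERCEPCION, with hypothetical short names.
    child, parent = "urb:espaciocotidiano", "urb:campodepercepcion"
    for option in (as_instance, as_class):
        print(option.__name__)
        for triple in option(child, parent):
            print("   ", triple)
```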
Possible future work
Conversion of URBISOC into an OWL ontology
- Classes with well-defined meaning (not only skos:Concept, ...), properties and relations (not only BT, NT, ...)
- Find patterns to better specify the meaning of relations: Partes de una Ciudad (parts of a city), Tipos arquitectónicos (architectural types)
