Making sense of content
Wondering about either OWL ontologies or SKOS vocabularies? You need both!
ISKO UK SKOS Event, London, 21st July 2008
bernard.vatant@mondeca.com
A few words about Mondeca
  Founded: 1999
  Founder and CEO: Jean Delahousse
  Team: 15 people, and steadily growing
  Tagline: «Making Sense of Content»
  Early adopter of Semantic Web languages and technologies
  Business view: software and services allowing companies and communities to organize, leverage, integrate, augment, publish and otherwise manage their enterprise knowledge assets
  Flagship product: ITM (Intelligent Topic Manager)
    Does what it says on the box: «Intelligent Topic Management»
    Quick demo, time permitting
Mondeca 2008 - ISKO UK - SKOS Event, 21st July 2008, London
What exactly is «Intelligent Topic Management»?
  A topic is any formal representation of a concept, thing, subject, and generally of anything that is identified and described in a structured, formal way
  Some topic types you might want to manage intelligently, if possible
    Classes and instances - as defined in ontologies and knowledge bases
    Concepts - as defined in thesauri and similar vocabularies
    Categories - as defined in classifications and taxonomies
    Named entities
  Topics are more or less explicitly defined in legacy resources
    Natural language: concepts defined in vocabularies or found in documents
    Data at large: a lot of implicit topics are trapped in databases!
  There are many ways to represent topics!
  How do you know which kind of representation is «intelligent»?
    Depends on what you want to do!
    A «good» representation is a good balance between
      some a priori ontological status of the concept (the business view of the world)
      the intended functional use of the concept in the information system
OWL vs SKOS
  In SKOS
    The focus is the relation between resources and vocabulary
    A librarian view of the world
    Organising vocabulary helps find stuff based on «about-ness»
    Main topics are concepts: formal instances of the class «skos:Concept»
    The main use of SKOS concepts is to index, classify, search and retrieve resources
    Based on a limited but extensible set of attributes and relationships
    Leveraging the legacy of documentation / library / thesaurus expertise
  In OWL (Web Ontology Language)
    The focus is the description of things in a domain (business objects)
    A knowledge representation / artificial intelligence / logic view of the world
    The topics are classes, properties and individuals
    Linked by arbitrarily complex constraints
    The main use of OWL ontologies is to structure business objects and data
    Allowing consistency checking, interface control, inference
Librarian vs Logician
  Librarian: «SKOS is cool because it is simple. I don't care about complex ontologies, I don't need inference.»
    Some ontologies are very simple
    Some SKOS concept schemes can be highly complex
    The Librarian should be aware of things other than documents and vocabulary
  Logician: «I don't understand what those SKOS concepts are. All concepts are classes. All I need is OWL.»
    The Logician does not really care about documents, vocabulary, terminology
    Sometimes he does not even care about individuals
    But at some point he will need to find a document among tens of millions
  Actually you need both Logician and Librarian expertise
  But it might be tricky
    To have them sit at the same table
    To have them agree on a common language
    To have them listen to each other
So you need an ontology (OWL) when
  You deal with individual business objects
    Typically to manage a back-office knowledge base
      Restaurants, hotels, tourism activities in Burgundy
      Olympic Games results
  You want to describe those business objects
    Using attributes specific to a class or generic to several classes
      Restaurants have geolocation, opening days and min/max menu prices
      Hotels have geolocation, opening days, category and number of rooms
      Competitions have discipline, participants, results, medals
    Using specific constraints on those attributes
      Allowing control of interfaces, control of integrity of data
  You want to use those descriptions
    To control consistency of data about your objects
    To perform formal queries on those data (of arbitrary complexity)
      Find a 3-star camping site near a river in Burgundy, proposing fishing activity, and close to a winery producing Chardonnay.
      Find British athletes who got a gold medal in Athletics at the Seoul Olympic Games.
And you need a vocabulary (SKOS) when
  You have vocabulary legacy
    Architectural styles, hotel categories, Olympic sports and disciplines
  You want to organize this vocabulary
    Using generic-specific and associative relationships
    Allowing synonyms and multiple languages
  You want to use this vocabulary
    To index, search and retrieve documents or any kind of resources
  Most of the time, you need both!
    You deal with business objects AND vocabulary concepts
    So you need an ontology (OWL) AND vocabularies (SKOS)
  Question: how do you use them together?
    Let's see some use cases and possible approaches
Use case #1 : Annuaire de l'Administration Française
  Context: the directory of the French Public Administration
    Notoriously of arbitrary complexity!
  User access requirements
    Navigation using both thematic and geographic «taxonomies»
    Standard website navigation experience for general public end users
    No technical vocabulary here!
  Office details are stored in a back-office knowledge base
    Based on a rather flat and simple ontology of the Who, What, Where
    Using a variety of authority lists
    > Presentation of the back-office ontology in an ontology editor
  Front publication uses a somewhat independent taxonomy
    > Presentation of the front Web http://lannuaire.service-public.fr/
Use case #2 : The Olympic Games
  Context
    Federate knowledge bases of the International Olympic Committee
      Disciplines, competitions, ceremonies, athletes, results, medals
    Providing reference for consistent indexing of multimedia documents
  Olympic disciplines/events are business objects
    Each discipline has a date of introduction in the Olympic Games, world Olympic records
    Each discipline is organized with specific events
      Athletics > 100m Men > Final
      This organization may vary over the history of the Olympic Games
    Each event for specific Olympic Games has participants, results, medals
    You need an ontology of disciplines!
  Olympic disciplines are also indexing concepts
    Photograph: Start of the Men's 100-Metre Final, Seoul, 24 September 1988. From left to right: Carl Lewis, Linford Christie, Calvin Smith and Ben Johnson.
    For indexing, a SKOS organisation of disciplines might be enough
    But the ontology can support consistent indexing!
      For example, knowing the participants of the Seoul final
Some Business Objects are also Concepts (or the other way round)
  1. Disciplines are concepts
  2. Disciplines have specific business properties
OWL classes hierarchy vs SKOS concepts hierarchy
  [Screenshot callouts: Concept labels; Concept class hierarchy (Here be OWL!); Concept hierarchy (Here be SKOS!); Concept hierarchy management]
Looking at RDF triples
  Elements of the OWL ontology: SKOS classes + specific attributes
    :Discipline    rdf:type            owl:Class
    :Event         rdf:type            owl:Class
    :Discipline    rdfs:subClassOf     skos:Concept
    :Event         rdfs:subClassOf     skos:Concept
    :olympicsince  rdfs:domain         :Discipline
    :longname      rdfs:subPropertyOf  skos:altLabel
  Instances description: SKOS concepts + specific attributes
    :AT      rdf:type        :Discipline
    :AT      skos:prefLabel  "Athletics"@en
    :AT      skos:prefLabel  "Athlétisme"@fr
    :ATM001  rdf:type        :Event
    :ATM001  skos:prefLabel  "100m (M)"@en
    :ATM001  skos:broader    :AT
    :ATM001  :longname       "Athletics/100m(M)"@en
    :AT      :olympicsince   "1896-04-06"
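A minimal sketch of how a SKOS-only application could read the triples on this slide, assuming they are held as plain Python tuples rather than in a real RDF store. The helper names (`objects`, `pref_label`, `alt_labels`, `is_concept`) are hypothetical, and the sub-class / sub-property lookups only go one level deep, which is enough for this data but is not general RDFS inference.

```python
# Triples from the slide as (subject, predicate, object) tuples.
# Literals are (text, language) pairs; everything else is a prefixed name.
TRIPLES = [
    (":Discipline", "rdfs:subClassOf", "skos:Concept"),
    (":Event", "rdfs:subClassOf", "skos:Concept"),
    (":longname", "rdfs:subPropertyOf", "skos:altLabel"),
    (":AT", "rdf:type", ":Discipline"),
    (":AT", "skos:prefLabel", ("Athletics", "en")),
    (":AT", "skos:prefLabel", ("Athlétisme", "fr")),
    (":ATM001", "rdf:type", ":Event"),
    (":ATM001", "skos:prefLabel", ("100m (M)", "en")),
    (":ATM001", "skos:broader", ":AT"),
    (":ATM001", ":longname", ("Athletics/100m(M)", "en")),
]


def objects(subject, predicate):
    """All objects of triples matching the given subject and predicate."""
    return [o for s, p, o in TRIPLES if s == subject and p == predicate]


def pref_label(concept, lang):
    """Preferred label of a concept in the requested language, or None."""
    for text, label_lang in objects(concept, "skos:prefLabel"):
        if label_lang == lang:
            return text
    return None


def alt_labels(concept):
    """skos:altLabel values, including those stated through a direct sub-property."""
    props = {"skos:altLabel"} | {
        s for s, p, o in TRIPLES
        if p == "rdfs:subPropertyOf" and o == "skos:altLabel"
    }
    return [o[0] for s, p, o in TRIPLES if s == concept and p in props]


def is_concept(resource):
    """True if the resource is typed skos:Concept directly or via a direct subclass."""
    return any(
        t == "skos:Concept" or "skos:Concept" in objects(t, "rdfs:subClassOf")
        for t in objects(resource, "rdf:type")
    )
```

Because `:longname` is declared a sub-property of `skos:altLabel`, the long name of `:ATM001` shows up as an ordinary alternative label, while `:olympicsince` stays invisible to SKOS-level code.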
OWL hierarchy vs SKOS hierarchy - continued
  Beware of hierarchy mismatch!
    Discipline and Event are both subclasses of skos:Concept
    The «broader» concept of an Event is a Discipline
      100m (M) BT Athletics
    But Event is not a subclass of Discipline!
      100m (M) IS NOT a Discipline
      Events do not inherit all properties of Disciplines
  SKOS applications can ignore the non-SKOS specifics
    e.g. define a ConceptScheme including all instances of «Sports Concept»
  OWL-aware applications can implement stronger constraints
    An Event is a concept whose broader concept is a Discipline
      :Event owl:equivalentClass [ a owl:Restriction ; owl:onProperty skos:broader ; owl:allValuesFrom :Discipline ] .
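The intent of the constraint, "the broader concept of an Event is a Discipline", can be sketched as a simple data check. This is a closed-world validation written for illustration, not what an OWL reasoner does with the axiom: under OWL's open-world semantics a reasoner would typically infer that the broader resource is a Discipline instead of reporting an error. The function name `broader_violations` and the resource `:ATM002` are hypothetical and not part of the original data.

```python
# Sample data: the two instances from the slides plus one deliberately bad record.
TRIPLES = [
    (":AT", "rdf:type", ":Discipline"),
    (":ATM001", "rdf:type", ":Event"),
    (":ATM001", "skos:broader", ":AT"),
    # Hypothetical bad data, added for illustration only:
    # an Event whose broader concept is another Event.
    (":ATM002", "rdf:type", ":Event"),
    (":ATM002", "skos:broader", ":ATM001"),
]


def broader_violations(triples):
    """Return (event, broader) pairs where the broader resource is not a :Discipline."""
    typed = {(s, o) for s, p, o in triples if p == "rdf:type"}
    return [
        (s, o)
        for s, p, o in triples
        if p == "skos:broader"
        and (s, ":Event") in typed
        and (o, ":Discipline") not in typed
    ]


if __name__ == "__main__":
    for event, broader in broader_violations(TRIPLES):
        print(f"{event} has broader {broader}, which is not a Discipline")
```

A SKOS-only application would accept both `skos:broader` links; only the OWL-level typing makes the second one suspicious.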
Use case #3 : Classifying tourism offer
  In the back office we have (OWL-driven)
    Hotels with their category, labels, number of rooms
    Restaurants with their min/max menu price, labels
    Shows with their category, min/max ticket price
  In the front we have simple navigation categories (in SKOS)
    Travel budget
      Low budget
      Affordable
      First class
      Luxury
  The front categories are populated with rules such as
    Hotel AND 2star => Low budget
    Hotel AND 3star => Affordable
    Restaurant AND menumini between 10 and 15 => Affordable
    Show AND tarifmini between 8 and 12 => Affordable
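The four rules above can be sketched as a small function over back-office records. This is only an illustration under assumptions: records are plain dictionaries, the keys `type`, `category`, `menumini` and `tarifmini` mirror the names used in the rules, "between" is read as inclusive, and the function name `budget_category` is hypothetical.

```python
def budget_category(item):
    """Map a back-office record to a front navigation category, or None if no rule fires."""
    kind = item.get("type")
    if kind == "Hotel":
        if item.get("category") == "2star":
            return "Low budget"
        if item.get("category") == "3star":
            return "Affordable"
    elif kind == "Restaurant":
        price = item.get("menumini")
        # Assumption: "between 10 and 15" includes both bounds.
        if price is not None and 10 <= price <= 15:
            return "Affordable"
    elif kind == "Show":
        price = item.get("tarifmini")
        # Assumption: "between 8 and 12" includes both bounds.
        if price is not None and 8 <= price <= 12:
            return "Affordable"
    return None


if __name__ == "__main__":
    # Hypothetical sample records, not taken from any real knowledge base.
    samples = [
        {"type": "Hotel", "category": "2star"},
        {"type": "Restaurant", "menumini": 12},
        {"type": "Show", "tarifmini": 30},
    ]
    for record in samples:
        print(record, "=>", budget_category(record))
```

Records that match no rule stay unclassified, which leaves room for further rules feeding the «First class» and «Luxury» categories.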
Use case #3 : Classification rules in SPARQL
  PREFIX dc: <http://purl.org/dc/elements/1.1/>
  PREFIX :   <http://example.org/tourism#>   # placeholder namespace, not from the original slide
  CONSTRUCT { ?x dc:subject :affordable }    # Classify as affordable
  WHERE {
    # 3-star hotels
    { ?x a :Hotel . ?x :category :_3Star . }
    UNION
    # or restaurants with menu from 10 to 15
    { ?x a :Restaurant . ?x :menumini ?m . FILTER (?m > 10) . FILTER (?m < 15) . }
    UNION
    # or shows with tarif from 8 to 12 (the price must be read from ?x, the resource being classified)
    { ?x a :Show . ?x :tarifmini ?t . FILTER (?t > 8) . FILTER (?t < 12) . }
  }
Conclusion
  There is a variety of ways to combine OWL and SKOS
  In most information management systems, you need both
  OWL is good for building a strong back-office structure
    With an ontology of business objects
  SKOS is good for front navigation and user experience
  The back-to-front transformation can be based on a variety of rules
    Including the use of SPARQL queries