Terminologies, Knowledge Organization Systems, Ontologies


Gerhard Budin, University of Vienna
TSS, July 2012, Vienna

Motivation and Purpose: Knowledge Organization Systems

In this unit of TSS 12, we focus on the role of terminologies as tools to organize and retrieve knowledge (-> Knowledge Organization Systems, KOS).

Major types of KOS are:
- Thesauri (in information science: controlled vocabularies for indexing and information retrieval)
- Classification systems (hierarchical concept systems, usually domain-specific, sometimes universal in scope)
- Taxonomies and nomenclatures (mostly in the natural sciences: systematic arrangements of terms seen as scientific names)
- Ontologies (in IT: formal, shared conceptual specifications; domain ontologies are often created by formalizing the previously listed types of KOS)

Knowledge (organization) systems
- Cognitive knowledge systems
- Collective knowledge systems, cultural systems, social systems, language and communication systems
- Formal knowledge systems, knowledge representation systems, semantic systems (Semantic Web)

Applications:
- Knowledge organization as part of knowledge management (Nonaka, Takeuchi, et al.)
- Knowledge organization as daily practice in libraries and information systems (for more than 2000 years)
- Knowledge organization as formal representations in collective knowledge systems -> Semantic Web applications

What is knowledge organization?
1. A part of information and library science, a part of philosophy of science and of epistemology, but also of knowledge management and knowledge engineering
   - Investigating and representing structures of knowledge
   - Epistemological aspects, cognitive science aspects
   - Linguistic and socio-cultural aspects (e.g. folk taxonomies)
   - Historical aspects (e.g. Leibniz, encyclopedism, administrative categorizations in ancient societies, history of science, etc.)
2. Practical work: creating and using knowledge organization systems (see further down)
3. Knowledge organization is also a crucial process in linguistic action (sprachliches Handeln)
   - Text organization both in reception and production

Functions of knowledge organization systems
1. Instruments for structuring and archiving the content of large-scale collections
2. Structural components of information systems
3. Support of targeted retrieval of information based on conceptual search criteria
4. Search aids, visual navigation, query languages
5. Communication support tools (cross-lingual, cross-disciplinary, cross-cultural)
6. Instruments of corporate knowledge management
7. Learning support, orientation support, didactic tools

Properties of knowledge organization systems
1. Conceptual structures (hierarchical and non-hierarchical structures)
2. Explicit statement of conceptual links and definitions (mono- or multilingual)
3. Terminological and linguistic standardization
4. Increasingly formalized and digital (in particular as ontologies)
5. Different scales, from small KOS to large ones (more than 200,000 concepts)
6. Increasingly with visualized structures and interactive user interfaces
7. Static or dynamic (e.g. ontologies for modelling business processes in companies)

Ontologies as formal knowledge systems

Computer science: from Ontology as a traditional field of philosophy (theory of being, existence, theory of objects, etc.) to formal, digitally represented concept systems / knowledge systems
- Concepts are explicitly defined; terms are assigned
- Relations between concepts are made explicit
- Terms are standardized
- Logical application rules and constraints are specified

Ontologies as knowledge representation systems

Domain-specific knowledge organization systems
- Medicine, health, bio- and life sciences
- Business, trade
- Industry, engineering
- Natural sciences
- Administration, government
- Culture
- Pedagogy
- Linguistics
- Etc.

Semantic Web
- The Web of data with meaning, in the sense that a computer program can learn enough about what the data means to process it.
- A common framework that allows data to be shared and reused across application, enterprise, and community boundaries.
- A collaborative effort led by the World Wide Web Consortium (W3C) with participation from a large number of researchers and industrial partners.
- Based on the Resource Description Framework (RDF), which integrates a variety of applications using XML for syntax and URIs for naming.
(Source: http://www.w3.org/people/berners-lee/weaving/glossary.html)

RDF

The Resource Description Framework (RDF) is a family of W3C specifications originally designed as a metadata data model. It has come to be used as a general method for conceptual description or modelling of information that is implemented in web resources, using a variety of syntax formats.

The RDF data model is based upon the idea of making statements about resources (in particular Web resources) in the form of triples. A triple expresses one statement as a subject-predicate-object expression. The subject denotes the resource; the predicate denotes a trait or aspect of the resource and expresses a relationship between the subject and the object.

RDF itself is an abstract data model. RDF/XML was its first standard serialization; other syntaxes such as Turtle and N-Triples encode the same triples.
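The triple model can be illustrated with a minimal sketch in plain Python (no RDF library). It stores two statements as (subject, predicate, object) tuples, prints them in N-Triples syntax, and answers a simple pattern query. The concept URI is the EuroVoc one quoted in these slides; the document URI under example.org, the `Literal` helper class, and the function names are assumptions made only for this illustration.

```python
# Property URIs from RDF Schema and Dublin Core Terms.
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
DCT_SUBJECT = "http://purl.org/dc/terms/subject"

CONCEPT = "http://eurovoc.europa.eu/5482"   # concept URI quoted in the slides
DOC = "http://example.org/doc/1"            # hypothetical document URI


class Literal(str):
    """Marks an object that is a plain string value rather than a URI."""


# Each statement is one (subject, predicate, object) triple.
triples = [
    (CONCEPT, RDFS_LABEL, Literal("climate change")),
    (DOC, DCT_SUBJECT, CONCEPT),
]


def to_ntriples(triples):
    """Serialize triples as N-Triples lines (sketch: no string escaping)."""
    lines = []
    for s, p, o in triples:
        obj = f'"{o}"' if isinstance(o, Literal) else f"<{o}>"
        lines.append(f"<{s}> <{p}> {obj} .")
    return "\n".join(lines)


def match(triples, s=None, p=None, o=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


print(to_ntriples(triples))
# Which resources are indexed with the concept?
print([s for s, _, _ in match(triples, p=DCT_SUBJECT, o=CONCEPT)])
```

The same two statements could equally be written in RDF/XML or Turtle; only the syntax changes, not the triples.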

OWL

The Web Ontology Language (OWL) is a family of knowledge representation languages for authoring ontologies. The languages are characterised by formal semantics and RDF/XML-based serializations for the Semantic Web.
- OWL is based on the RDF specification
- OWL 2 (with a family of formats) is the new version, also used in Protégé and other ontology editors

SKOS (Simple Knowledge Organization System)
- SKOS is based on the RDF specification and enables a migration towards OWL ontologies (the "missing link").
- SKOS is increasingly required by Web services.
- SKOS is not a formal knowledge representation language and not a formal ontology (no axioms, etc.).
- SKOS is used instead for modeling controlled vocabularies such as thesauri or classifications, which are of a different nature than formal ontologies.
- The ideas or meanings described by thesauri or other kinds of terminology are referred to as concepts.
-> "Skosification" of controlled vocabularies (thesauri, etc.) and other terminologies!
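As a rough illustration of what "skosification" produces, the sketch below writes one thesaurus entry as a SKOS concept in Turtle syntax, using only string formatting. The properties `skos:prefLabel` and `skos:altLabel` belong to the SKOS vocabulary; the function name and its arguments are assumptions for this example, and the labels are the EuroVoc "climate change" entry shown in the slides.

```python
SKOS_PREFIX = "@prefix skos: <http://www.w3.org/2004/02/skos/core#> ."


def skos_concept_turtle(uri, pref_labels, alt_labels):
    """Render one concept as Turtle.

    pref_labels: {language: label}; a dict keeps at most one preferred
                 label per language, as SKOS expects.
    alt_labels:  {language: [label, ...]} for non-preferred terms.
    Sketch only: labels are not escaped.
    """
    lines = [SKOS_PREFIX, "", f"<{uri}> a skos:Concept"]
    for lang, label in sorted(pref_labels.items()):
        lines.append(f'    ; skos:prefLabel "{label}"@{lang}')
    for lang, labels in sorted(alt_labels.items()):
        for label in labels:
            lines.append(f'    ; skos:altLabel "{label}"@{lang}')
    lines.append("    .")
    return "\n".join(lines)


ttl = skos_concept_turtle(
    "http://eurovoc.europa.eu/5482",
    {"en": "climate change"},
    {"en": ["global warming", "climatic change"]},
)
print(ttl)
```

Broader/narrower and associative links (BT/NT, RT) would be added in the same way with `skos:broader`, `skos:narrower`, and `skos:related`.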

Thesaurus example: EUROVOC, the thesaurus of the European Union
(using a PowerPoint presentation by Chr. Laaboudi-Spoiden at the 2010 EUROVOC conference)

- General and multidisciplinary
- 21 fields (domains): two-digit notation + heading
- 127 microthesauri: four-digit notation + heading
- Covers both the European Union and the national point of view, with emphasis on parliamentary activities
(Laaboudi-Spoiden 2010)

EuroVoc: multilingual content
- 22 official languages
- 2 other languages: Croatian and Serbian
(Laaboudi-Spoiden 2010)

EuroVoc: language equivalence
- Set of 6,797 concepts
- Preferred terms: 24 language equivalences
- Non-preferred terms: language-dependent
- Symmetric equivalence (preferred terms)
- Relationships (BT/NT, RT)
(Laaboudi-Spoiden 2010)

EuroVoc: previous data model
- One ID number per descriptor
- 22 language equivalents
- Standard thesaurus relationships: BT/NT (broader/narrower), RT (associative)
- Attributes: non-descriptors, scope notes
- Attributes share the descriptor ID
- False equivalence (USE/UF)

Descriptor:

    <RECORD>
      <DESCRIPTEUR_ID>5482</DESCRIPTEUR_ID>
      <LIBELLE>climate change</LIBELLE>
    </RECORD>

Non-descriptor:

    <RECORD>
      <DESCRIPTEUR_ID>5482</DESCRIPTEUR_ID>
      <UF>
        <UF_EL>global warming</UF_EL>
        <UF_EL>climatic change</UF_EL>
      </UF>
    </RECORD>

(Laaboudi-Spoiden 2010)
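The USE/UF mechanism of this record format can be sketched with Python's standard XML parser: non-descriptors are resolved to their descriptor through the shared ID. The tag names follow the slide (with opening and closing tags made consistent); the wrapping `<RECORDS>` element and the function name are assumptions, since the slide shows only individual records.

```python
import xml.etree.ElementTree as ET

# Two records as on the slide, wrapped in a hypothetical root element.
LEGACY_XML = """
<RECORDS>
  <RECORD>
    <DESCRIPTEUR_ID>5482</DESCRIPTEUR_ID>
    <LIBELLE>climate change</LIBELLE>
  </RECORD>
  <RECORD>
    <DESCRIPTEUR_ID>5482</DESCRIPTEUR_ID>
    <UF>
      <UF_EL>global warming</UF_EL>
      <UF_EL>climatic change</UF_EL>
    </UF>
  </RECORD>
</RECORDS>
"""


def build_use_index(xml_text):
    """Map each non-descriptor (UF) to the descriptor it points to (USE)."""
    root = ET.fromstring(xml_text.strip())
    descriptors = {}       # descriptor ID -> preferred label
    non_descriptors = {}   # descriptor ID -> list of UF labels
    for record in root.iter("RECORD"):
        descriptor_id = record.findtext("DESCRIPTEUR_ID")
        label = record.findtext("LIBELLE")
        if label is not None:
            descriptors[descriptor_id] = label
        for element in record.iter("UF_EL"):
            non_descriptors.setdefault(descriptor_id, []).append(element.text)
    # Join the two kinds of record on the shared descriptor ID.
    return {
        uf_label: descriptors[descriptor_id]
        for descriptor_id, uf_labels in non_descriptors.items()
        if descriptor_id in descriptors
        for uf_label in uf_labels
    }


use_index = build_use_index(LEGACY_XML)
print(use_index["global warming"])
```

Because terms have no identity of their own in this model, the shared descriptor ID is the only way to connect a non-descriptor to its descriptor.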

EuroVoc: previous data model, standards
- ISO 2788-1986: monolingual (1986)
- ISO 5964-1985: multilingual (1985)
- Revised by ISO 25964: Thesauri and interoperability with other vocabularies
- Terminology: Descriptor / Non-descriptor -> Preferred term / Non-preferred term
(Laaboudi-Spoiden 2010)

New data model: concepts
- Thesaurus concept (multilingual): URI = http://eurovoc.europa.eu/5482
(Laaboudi-Spoiden 2010)

New data model: thesaurus terms
- Thesaurus concept (multilingual): URI = http://eurovoc.europa.eu/5482
- Thesaurus terms (language-specific): URI = http://eurovoc.europa.eu/218409, http://eurovoc.europa.eu/125206
(Laaboudi-Spoiden 2010)

New data model: thesaurus terms
- URI = http://eurovoc.europa.eu/125206
- URI = http://eurovoc.europa.eu/125207
(Laaboudi-Spoiden 2010)

Data model summary

Concept (multilingual)
- Concept-term relation
- Standard relationships (BT/NT, RT)
- Relations to the microthesaurus
- Group of concepts

Thesaurus terms (language-dependent)
- Lexical representation of a concept in a given language
- 2 types of term: preferred term (PT), non-preferred term (NPT)
- Equivalence relation between PT and NPT (USE/UF)

What's new? One URI per concept and per term.
(Laaboudi-Spoiden 2010)
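The summary can be read as a small object model, sketched below with Python dataclasses: a multilingual concept holds language-dependent terms, and each of them carries its own URI. The class and method names are assumptions for illustration, and the term URIs under example.org are hypothetical because the slides do not say which label belongs to which term URI. Only the concept URI is taken from the slides.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Term:
    """Language-dependent lexical representation of a concept."""
    uri: str
    language: str
    label: str
    preferred: bool   # True = preferred term (PT), False = non-preferred (NPT)


@dataclass
class Concept:
    """Multilingual concept; relationships hold URIs of other concepts."""
    uri: str
    terms: list = field(default_factory=list)
    broader: list = field(default_factory=list)   # BT
    related: list = field(default_factory=list)   # RT

    def preferred_term(self, language):
        """Return the PT for a language, or None if there is none yet."""
        for term in self.terms:
            if term.language == language and term.preferred:
                return term
        return None

    def add_term(self, term):
        """Attach a term, allowing only one PT per language."""
        if term.preferred and self.preferred_term(term.language) is not None:
            raise ValueError(f"concept already has a preferred term in {term.language!r}")
        self.terms.append(term)

    def use_for(self, language):
        """Labels of the NPTs that point to this concept's PT (USE/UF)."""
        return [t.label for t in self.terms
                if t.language == language and not t.preferred]


concept = Concept("http://eurovoc.europa.eu/5482")
# Hypothetical term URIs, for illustration only.
concept.add_term(Term("http://example.org/term/1", "en", "climate change", True))
concept.add_term(Term("http://example.org/term/2", "en", "global warming", False))
concept.add_term(Term("http://example.org/term/3", "en", "climatic change", False))

print(concept.preferred_term("en").label, "UF", concept.use_for("en"))
```

Giving terms their own URIs means statements can be made about a single term (for example its status or source), which the previous ID-per-descriptor model could not express.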

EuroVoc website (screenshot): top menu, editorial content, left menu (Laaboudi-Spoiden 2010)

EuroVoc: architecture (diagram)
Labels recoverable from the diagram:
- Data model: OWL ontology; XML, RDF
- Terminologies: concepts and terms [URI]; SKOS/RDF; partial SKOS
- EuroVoc content: XHTML, PDF, XML/SKOS, XLS; workflow
- ITM (Intelligent Topic Manager)
- Drupal (Web content management)
- Oracle Text (search engine)
- EuroVoc Web site
(Laaboudi-Spoiden 2010)

EuroVoc website (screenshot): browse the subject-oriented version (Laaboudi-Spoiden 2010)

EuroVoc website (screenshot): thesaurus concept, with term details in the selected content language (Laaboudi-Spoiden 2010)

EuroVoc website (screenshot): term details, preferred term and non-preferred term (Laaboudi-Spoiden 2010)

EuroVoc website (screenshot): map (Laaboudi-Spoiden 2010)

Lisbon Treaty: EC -> EU (European Community -> European Union)

Source of the proposals:
- The European Parliament (15%)
- The national parliament libraries (SE, RO and SK)
- Terminologists from EC DGT
- EuroVoc users
- The Publications Office

EuroVoc 4.4, online publication: first semester 2011
- powers of the EC Institutions -> powers of the EU Institutions
- EC competition -> EU competition
- delegated decision
- non-legislative act
- legislative act
- implementing regulation
- special legislative procedure
(Laaboudi-Spoiden 2010)

Ongoing and future developments of EUROVOC
- Available in SKOS
- Visualization
- Online version
- Thesaurus mapping and alignment software support
- Update of content

Other major thesauri
- GEMET: General European Multilingual Environmental Thesaurus
- UNESCO Thesaurus
- CEDEFOP Thesaurus (vocational training)
- AAT: Art and Architecture Thesaurus
- AGROVOC Thesaurus (FAO)

General trends:
- Preparing for Semantic Web applications
- RDF, SKOS, Linked Data, ontologies
- Networking / mapping / interoperability

For more information on ontologies, knowledge organization systems, the projects mentioned above, further reading, related tools, etc., please contact:

Gerhard Budin
University of Vienna, Centre for Translation Studies
gerhard.budin@univie.ac.at