Languages and tools for building and using ontologies. Simon Jupp, James Malone

An overview of ontology technology
Languages and tools for building and using ontologies
Simon Jupp, James Malone
jupp@ebi.ac.uk, malone@ebi.ac.uk

Outline
- Languages: OWL and OBO
  - Classes, individuals, relations, labels, annotations, differences between languages
- Developing ontologies (Protégé)
- Viewing ontologies (BioPortal, OLS)*
- Tools for data annotation (Annotator, Whatizit, Zooma)*
- Ontologies in applications (Atlas, KupKB)*
* Goal: to think about which ontologies you might use given some data, and what criteria you use to decide

Terminology, classification and coding standards
- Linnaeus (18th century): classification of species
- Long tradition of building classification systems for medicine
  - Bertillon (19th century): classification of diseases that later became the International Classification of Diseases (ICD)
  - Others include SNOMED CT, MeSH
- WordNet: lexical database
- Cyc: knowledge base of common-sense knowledge

Knowledge representation systems
- Building terminologies presents some major challenges
  - Sharing terminologies
  - Interoperability of terminologies between applications
  - Managing consistency
  - Querying and inferring relationships
- Long history of research in Computer Science
  - Rule-based systems / expert systems
  - Frame-based systems
  - First-order logic
  - Description logics

Look away now!

Ontology lingua franca
- Lots of terminology borrowed from philosophy, mathematics and computer science
- Languages used in ontologies have basic components:
  - Classes / Concepts / Terms / Types
  - Individuals / Instances / Members
  - Relationships / Properties / Roles
- In addition there are some useful human-friendly components:
  - Labels (for names of things)
  - Annotations (for things like human-readable definitions)
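The components listed above can be pictured as plain data structures. The following is a minimal sketch in Python, not any library's data model; the class names, field names and `example.org` IRIs are all made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class OntologyClass:
    iri: str                                          # unique identifier
    label: str                                        # human-friendly name
    annotations: dict = field(default_factory=dict)   # e.g. a textual definition
    parents: list = field(default_factory=list)       # IRIs of superclasses


@dataclass
class Individual:
    iri: str
    types: list                                       # IRIs of classes it belongs to
    relations: list = field(default_factory=list)     # (property IRI, target IRI) pairs


# Hypothetical example: a class with a label and definition, and one instance of it.
heart = OntologyClass(
    iri="http://example.org/onto#Heart",
    label="heart",
    annotations={"definition": "A muscular organ that pumps blood."},
    parents=["http://example.org/onto#Organ"],
)
my_heart = Individual(
    iri="http://example.org/data#heart_001",
    types=[heart.iri],
    relations=[("http://example.org/onto#part_of",
                "http://example.org/data#patient_001")],
)
```

The point of the sketch is only that labels and annotations are for people, while IRIs, types and relations are what software works with.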

From Description Logics to OWL
- DLs are a family of knowledge representation languages
  - Precisely defined semantics for automated reasoning
  - Handle decidable inference problems
  - Efficient reasoning algorithms
- 1990s: several languages for representing DLs (DAML and OIL)
- 2001: W3C working group formed to develop a standard
- 2004: OWL (Web Ontology Language) specification released

Lovely OWL Squeeze the lovely axioms out of him http://www.w3.org/tr/owl-features/

Well, almost

OWL constructs
- OWL gives you a standard for describing ontologies
- Lots of constructs, but you probably won't need them all
- Class assertions: subclasses, equivalent classes, intersection, union and complement classes
- Individual assertions: types, relationships to other individuals
- Relationships: functional, transitive, symmetric and more

OWL syntax
- There are many syntaxes for representing an OWL ontology
- The W stands for Web
- OWL's primary syntax is built on a stack of web technologies defined by the W3C
  - XML at the bottom of the stack
  - Then RDF (but more on that tomorrow)
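To make the XML-then-RDF stack concrete, here is a small sketch that reads a hand-written RDF/XML fragment with Python's standard XML parser. The two classes and their `example.org` IRIs are invented for illustration. RDF/XML allows many equivalent ways of writing the same statements, so this only handles the one shape shown; real applications would normally use a dedicated library such as the OWL API or Jena.

```python
import xml.etree.ElementTree as ET

# Standard namespace IRIs for RDF, RDFS and OWL.
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"

# Hypothetical two-class ontology written in RDF/XML.
DOC = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:owl="http://www.w3.org/2002/07/owl#">
  <owl:Class rdf:about="http://example.org/onto#Organ">
    <rdfs:label>organ</rdfs:label>
  </owl:Class>
  <owl:Class rdf:about="http://example.org/onto#Heart">
    <rdfs:label>heart</rdfs:label>
    <rdfs:subClassOf rdf:resource="http://example.org/onto#Organ"/>
  </owl:Class>
</rdf:RDF>
"""


def read_classes(xml_text):
    """Return {class IRI: {"label": ..., "parents": [...]}} for top-level owl:Class elements."""
    root = ET.fromstring(xml_text)
    classes = {}
    for node in root.findall(f"{{{OWL}}}Class"):
        iri = node.get(f"{{{RDF}}}about")
        label = node.findtext(f"{{{RDFS}}}label")
        parents = [p.get(f"{{{RDF}}}resource")
                   for p in node.findall(f"{{{RDFS}}}subClassOf")]
        classes[iri] = {"label": label, "parents": parents}
    return classes


classes = read_classes(DOC)
```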

Working with OWL Luckily we have tools for working with OWL

Protégé - http://protege.stanford.edu Open source software for editing OWL ontologies

OWL API - http://owlapi.sourceforge.net

We can even use spreadsheets http://populous.org.uk

OWL reasoners
- This is where the real strength of OWL lies
- Reasoners (e.g. ELK) infer statements that are not explicitly stated in the ontology
- Subsumption, equivalence, consistency and instantiation testing
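As a toy illustration of inferring statements that are not explicitly stated, the sketch below follows asserted subclass links to find every implied superclass. This covers only the transitivity of named subclass axioms; real OWL reasoners also handle class expressions, equivalence, consistency and much more. The class names are hypothetical.

```python
def inferred_superclasses(asserted, cls):
    """Return every superclass of cls reachable through asserted subclass axioms."""
    seen = set()
    stack = list(asserted.get(cls, ()))
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(asserted.get(parent, ()))
    return seen


# Hypothetical asserted axioms: each key is a subclass of the listed classes.
asserted = {
    "LeftVentricle": ["HeartChamber"],
    "HeartChamber": ["AnatomicalCavity"],
    "AnatomicalCavity": ["AnatomicalEntity"],
}

# Only LeftVentricle -> HeartChamber is stated; the other two links are inferred.
supers = inferred_superclasses(asserted, "LeftVentricle")
```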

Reasoner plugins to Protégé

Showing unsatisfiable classes

Getting explanations

Inferring subsumption

Defining classes

After classification with a reasoner

Alternatives to OWL
- Simple Knowledge Organization System (SKOS): http://www.w3.org/2004/02/skos/
- W3C standard for thesauri, classification schemes, taxonomies, subject headings and other types of controlled vocabulary
- Lighter-weight semantics
- Less ability to do reasoning and consistency checking

So what about OBO?
- Open Biomedical Ontology language
- Born out of the needs of the life sciences
  - Less support for reasoning
  - Better support for metadata
  - Human-readable syntax
- Initially developed for the Gene Ontology
- OBO Foundry is now a coordinated effort for a wide range of reference ontologies of the life sciences: http://www.obofoundry.org
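The human-readable syntax is easy to show. Below is a sketch of a tiny OBO-style file and a deliberately minimal reader for its `[Term]` stanzas. The identifiers in the `EX:` namespace are invented, and the reader ignores most of the format (escaping, trailing modifiers, other stanza types), so treat it as an illustration rather than a conforming parser.

```python
# Hypothetical OBO-style text: a header, one [Term] stanza and one [Typedef] stanza.
SAMPLE = """format-version: 1.2

[Term]
id: EX:0000002
name: heart
def: "A muscular organ that pumps blood." []
is_a: EX:0000001 ! organ

[Typedef]
id: part_of
name: part of
"""


def parse_obo_terms(text):
    """Collect tag/value pairs from each [Term] stanza as a dict of lists."""
    terms, current = [], None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("["):
            # A new stanza starts; only [Term] stanzas are collected.
            current = {} if line == "[Term]" else None
            if current is not None:
                terms.append(current)
        elif current is not None and ": " in line:
            tag, value = line.split(": ", 1)
            value = value.split(" ! ", 1)[0]  # drop a trailing "! comment"
            current.setdefault(tag, []).append(value)
    return terms


terms = parse_obo_terms(SAMPLE)
```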

OBO-Edit: tool support

OBO and OWL
- Several attempts to map the languages
- OBO has a mapping to a subset of OWL
  - OBO is essentially another OWL syntax
  - A few OBO constructs are not possible in OWL
- OBO Ontology Release Tool (Oort)
  - Round-trips between OBO and OWL
  - OWL for reasoning and checking + OBO for development

Summary
- Ontology brings together multiple disciplines
  - Can be overwhelming
  - Lots of opinion (we have lots)
- OWL, reasoners, ontology editors
  - All maturing technologies
  - Lots of academic/research software
- Developing good ontologies is a lot like developing good software

Viewing Ontologies in BioPortal
http://bioportal.bioontology.org/
- Public repository of biomedical ontologies
- Hosted at the National Center for Biomedical Ontology (NCBO), Stanford University
- 426 ontologies covering around 5.8 million terms, though many of these will be duplicates
- Allows uploads of OWL and OBO ontologies

Viewing Ontologies in OLS
- Ontology Lookup Service (OLS), based at EBI
- Browser based on OBO versions of ontologies only
- Web site and web service search facility
- Visualisation of terms in a graph showing common OBO relations such as is_a and part_of

Viewing Ontologies in OLS http://www.ebi.ac.uk/ontology-lookup

Semantic Web Search - http://swoogle.umbc.edu/
- Indexes semantic web data and ontologies (more on the semantic web tomorrow)
- Can return results in a Google-like way
- Not curated: caveat emptor

Working out differences between ontologies
- Bubastis: www.ebi.ac.uk/efo/bubastis
- An OWL ontology is a collection of triples
- No requirement on ordering, so a traditional diff does not work
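Because ordering carries no meaning, one way to compare two versions is to treat each as a set of triples and take set differences. The sketch below shows that idea only; it is not a description of how Bubastis works internally, and the triples are hypothetical.

```python
def diff_triples(old, new):
    """Compare two ontology versions as unordered sets of (subject, predicate, object)."""
    old_set, new_set = set(old), set(new)
    return {"added": new_set - old_set, "removed": old_set - new_set}


# Hypothetical versions: same content in a different order, plus one change.
version_1 = [
    ("Heart", "subClassOf", "Organ"),
    ("Heart", "label", "heart"),
    ("Liver", "subClassOf", "Organ"),
]
version_2 = [
    ("Liver", "subClassOf", "Organ"),
    ("Heart", "label", "heart"),
    ("Heart", "subClassOf", "MuscularOrgan"),
]

changes = diff_triples(version_1, version_2)
```

A line-based diff of these two lists would flag nearly every line as changed, whereas the set comparison reports a single added triple and a single removed one.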

Ontologies in Applications
- Programming libraries exist to help utilise ontologies
- OWL-API is a Java library for creating, manipulating and serialising OWL ontologies
  - Mechanisms for asking for class annotations, subclasses, parent classes and axioms
  - Contains parsers and writers for several formats including RDF/XML, OWL/XML, Turtle and OBO
- Apache Jena can also handle OWL ontologies

Ontologies in Applications
- OWL-API used in several applications
- Expression Atlas uses it to produce a tree browser and query expansion
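Query expansion can be sketched in a few lines: a search for a term is widened to include all of its descendants in the hierarchy. This is a generic illustration with a made-up hierarchy, not the Expression Atlas implementation.

```python
def expand_query(term, children):
    """Return the term followed by all of its descendants, breadth first."""
    expanded, queue = [term], [term]
    while queue:
        current = queue.pop(0)
        for child in children.get(current, []):
            if child not in expanded:
                expanded.append(child)
                queue.append(child)
    return expanded


# Hypothetical hierarchy: each key maps to its direct subclasses.
children = {
    "leukocyte": ["lymphocyte", "monocyte"],
    "lymphocyte": ["B cell", "T cell"],
}

# A search for "leukocyte" would then also match data annotated with "T cell".
terms_to_search = expand_query("leukocyte", children)
```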

Tools for Data Annotation
- Effectively annotating data with ontology terms is an open research problem
- The primary consideration is the trade-off between coverage and precision, i.e. is the goal blanket coverage or high accuracy in annotations?
- Determining the balance is key, and it depends upon the use case
  - For a patient medical record you may want high precision even at the cost of low coverage
  - For tagging a paper, high coverage at the cost of lower precision may be acceptable
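The trade-off can be made concrete with two numbers. The sketch below assumes a hand-curated reference set of correct annotations is available and measures coverage against it (the same calculation as recall); that interpretation of coverage, and the placeholder annotation names, are assumptions for illustration.

```python
def precision_and_coverage(predicted, gold):
    """Precision: share of predictions that are correct.
    Coverage: share of the reference annotations that were found."""
    predicted, gold = set(predicted), set(gold)
    correct = predicted & gold
    precision = len(correct) / len(predicted) if predicted else 0.0
    coverage = len(correct) / len(gold) if gold else 0.0
    return precision, coverage


# Hypothetical reference annotations and two annotator configurations.
gold = {"a", "b", "c", "d"}
strict = {"a", "b"}                       # few annotations, all correct
loose = {"a", "b", "c", "x", "y", "z"}    # more found, but with noise

strict_scores = precision_and_coverage(strict, gold)
loose_scores = precision_and_coverage(loose, gold)
```

Here the strict configuration scores (1.0, 0.5) and the loose one (0.5, 0.75), which mirrors the medical record versus paper tagging contrast above.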

Tools for Data Annotation
NCBO Annotator http://bioportal.bioontology.org/annotator
- NCBO Annotator can utilise all the ontologies in BioPortal to match text to ontology labels and synonyms
- Uses a combination of metrics to match, including:
  - Semantic distance: matching meaning
  - Ontological distance: using is_a closures
  - Ontology mappings: two mapped terms may mean the same thing
- Produces a set of ontology classes which match parts of the text entered
- No scoring or ranking mechanism, so there is some overhead in interpreting the output
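The simplest part of this, matching text to labels and synonyms, can be sketched as a dictionary lookup. The code below does plain case-insensitive whole-word matching only; it does not reproduce the NCBO Annotator's algorithm or its distance and mapping metrics, and the class identifiers are made up.

```python
import re


def annotate(text, lexicon):
    """Return (matched text, class id) pairs for labels and synonyms found in text."""
    hits = []
    lowered = text.lower()
    for class_id, names in lexicon.items():
        for name in names:
            pattern = r"\b" + re.escape(name.lower()) + r"\b"
            for match in re.finditer(pattern, lowered):
                surface = text[match.start():match.end()]
                hits.append((match.start(), surface, class_id))
    # Report matches in the order they appear in the text.
    return [(surface, class_id) for _, surface, class_id in sorted(hits)]


# Hypothetical lexicon: class id -> label followed by synonyms.
lexicon = {
    "EX:0001": ["osteosarcoma", "osteogenic sarcoma"],
    "EX:0002": ["biopsy"],
}

annotations = annotate("Osteosarcoma biopsy from the femur", lexicon)
```

Like the real tools, even this toy version returns an unranked list, so deciding which matches to keep is left to the caller.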

Tools for Data Annotation WhatIzIt http://www.ebi.ac.uk/webservices/whatizit/

But noise remains

Terminizer http://terminizer.org/ Limited to OBO Foundry ontologies

Much overlap in bio-ontologies

Which Bio-ontologies?
- There are a number of tools for viewing and editing ontologies and building them into applications
  - Most are ontology agnostic
- So how does one find the ontology (or ontologies) that best fits one's needs?
- What is a good ontology?
  - An open research problem, many different opinions
- We need to identify needs first: what are they, in and of themselves?
- How can we measure the ontologies against these needs?
- We use competency questions and use cases

Use case
- Familiar paradigm to those working in software engineering
- In software engineering this is typically an interaction between user and system to achieve an objective
- In ontology engineering use cases can be subdivided:
  - Interaction with data and ontology
  - Interaction with system and ontology
  - Interaction with user and ontology
- These often require different considerations

Interaction with data and ontology
Typical competency questions:
- Does it cover all of my data (by class name)?
- Do the definitions of classes correspond to my data?
- Is x a subclass of y (because I need it to be)?
- Can I use this to integrate with Dr Smith's data?
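The first question, coverage by class name, is the easiest to check mechanically. A rough sketch, assuming you already have a list of the candidate ontology's labels (the labels below are invented) and accepting that exact, case-insensitive name matching will miss synonyms:

```python
def coverage_report(data_terms, ontology_labels):
    """Return the fraction of data terms with a matching label, and the unmatched terms."""
    labels = {label.lower() for label in ontology_labels}
    covered = [term for term in data_terms if term.lower() in labels]
    missing = [term for term in data_terms if term.lower() not in labels]
    fraction = len(covered) / len(data_terms) if data_terms else 0.0
    return fraction, missing


# Hypothetical data values and candidate ontology labels.
data_terms = ["Assay", "study", "microarray"]
ontology_labels = ["assay", "investigation", "microarray"]

fraction, missing = coverage_report(data_terms, ontology_labels)
```

A result such as "study" being unmatched while "investigation" exists is a prompt to check definitions, which is the second competency question.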

Interaction with system and ontology
Typical competency questions:
- What is the limit of querying possible?
- Will it allow for real-time querying?
- Is it so big it will break everything?
- Will an ontology update break my system?

Interaction with user and ontology
Typical competency questions:
- Does the hierarchy look like something the user would understand?
- Do the names of things correspond to user understanding?
- Do the definitions of things correspond to user understanding?

Task: Finding Ontologies for your data
- Ex1: for zebrafish, transcripts observed included GLUT, TOMM40 & TKT. Pathways relating to energy metabolism were also observed (1,100 metabolic, 10 glycolysis/gluconeogenesis, 30 pentose phosphate, 51 fructose and mannose metabolism, 52 galactose metabolism, and 71 fatty acid metabolism pathways)
- Ex2: look at the OBI, BioAssay & NCI Thesaurus ontologies for these terms: assay, study, microarray
- Ex3: clinical data: osteosarcoma biopsy, Homo sapiens, 4 years, MAP 2 presurgery, 60% necrosis, time until recurrence of 126.3 months