A cell-cycle knowledge integration framework

Transcription:

A cell-cycle knowledge integration framework
Erick Antezana
Dept. of Plant Systems Biology, Flanders Interuniversity Institute for Biotechnology/Ghent University, Ghent, Belgium
erant@psb.ugent.be
http://www.psb.ugent.be/cbd/

Overview
Introduction
Aim
Data integration pipeline
CCO engineering
Exploiting reasoning services
Conclusions
Future work

Introduction
The amount of data generated in biological experiments continues to grow exponentially.
A shortage of proper approaches or tools for analyzing this data has created a gap between raw data and knowledge.
The lack of structured documentation of knowledge leaves much of the information extracted from these raw data unused.
Differences in the technical languages used (synonymy and polysemy) have complicated the analysis and interpretation of the data.

Aim
Capture the knowledge of the cell-cycle (CC) process, especially the dynamic aspects of the terms and their interrelations, promote sharing and reuse, and enable better computational integration with existing resources.
Sample: Cyclin B (what) is located in Cytoplasm (where) during Interphase (when).
This will allow biologists to ask questions to the knowledge base (KB).
Four model organisms: At (Arabidopsis thaliana), Sc (Saccharomyces cerevisiae), Sp (Schizosaccharomyces pombe), Hs (Homo sapiens).
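The what/where/when pattern of the sample statement can be sketched as a small record type. This is only an illustration in Python: the names `Localization`, `FACTS` and `where_is` are hypothetical and are not part of CCO or its API.

```python
from dataclasses import dataclass
from typing import List


# Hypothetical record for one "what is located where, and when" statement.
@dataclass(frozen=True)
class Localization:
    what: str   # biomolecule, e.g. a protein
    where: str  # cellular component
    when: str   # cell-cycle phase


# Only the single sample fact from the slide; a real KB would hold many more.
FACTS = [Localization("Cyclin B", "Cytoplasm", "Interphase")]


def where_is(what: str, when: str, facts: List[Localization] = FACTS) -> List[str]:
    """Return the locations recorded for `what` during phase `when`."""
    return [f.where for f in facts if f.what == what and f.when == when]
```

Calling `where_is("Cyclin B", "Interphase")` answers the kind of question a biologist might put to the KB; an unknown phase simply yields an empty list.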

Method
CCO should capture the semantics of the temporal aspects and dynamics of the cell-cycle process.
CCO forms the knowledge base core.
Knowledge representation: OBO and OWL-DL.
Existing relationships have been extended.
Data sources:
  Association files (GO)
  PPI data: IntAct, BIND, DIP
  Reactome
  Cell-cycle functional data
  Data obtained using bioinformatics

OBO and OWL
Open Biomedical Ontologies (OBO):
  Standard
  Human readable
  Tools (e.g. OBOEdit)
  http://obo.sourceforge.net
Web Ontology Language (OWL), in three sublanguages (Full, DL, Lite):
  Reasoning capabilities vs. computational cost ratio
  Computer readable
  Formal foundation (Description Logics: http://dl.kr.org/)
  http://www.w3c.org/tr/2004/rec-owl-features-20040210

Pipeline
Stages shown in the diagram: ontology integration, format mapping, data integration, data annotation, consistency checking, maintenance, semantic improvement.

Reusing ontologies
GO only considers subsumption (is_a) and partonomic inclusion (part_of).
Maintainability issues in GO.
GO and the RO are the core ontologies in CCO.
All the processes from GO under the cell-cycle term (GO:0007049) were taken into account, while RO was imported completely.
304 terms were adopted from GO and 15 relationships from RO.
The CCO is updated daily and checked using data from GO.
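Taking "all the processes under the cell-cycle term" amounts to collecting the descendants of GO:0007049. A minimal sketch of that traversal follows, assuming the hierarchy is available as (child, parent) is_a pairs; the function name and the `EX:` identifiers are made up for illustration, and only GO:0007049 comes from the slides.

```python
from collections import defaultdict, deque


def descendants(root, is_a_edges):
    """Collect every term reachable from `root` by following is_a edges downward.

    `is_a_edges` is an iterable of (child, parent) pairs.
    """
    children = defaultdict(set)
    for child, parent in is_a_edges:
        children[parent].add(child)

    seen, queue = set(), deque([root])
    while queue:
        term = queue.popleft()
        for child in children[term]:
            if child not in seen:  # guards against revisiting shared subtrees
                seen.add(child)
                queue.append(child)
    return seen


# Toy edges: the EX: terms are placeholders, not real GO identifiers.
EDGES = [
    ("EX:0000001", "GO:0007049"),
    ("EX:0000002", "EX:0000001"),
    ("EX:0000003", "EX:9999999"),  # unrelated branch, must not be collected
]
```

The same traversal would also follow part_of edges if those pairs were added to the input; whether the CCO pipeline does so is not stated in the slides.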

Motivating scenarios
Molecular biologist: interacting components, events, roles that each component plays. Hypothesis evaluation.
Bioinformatician: data integration, annotation, modeling and simulation.
General audience: educational purposes.

Competency questions
What is an X-type CDK?
What is a Y-type cyclin?
In what events is CDK Z involved?
In what events does Rb participate?
Which CDKs are involved in the endoreduplication process?
Which proteins are phosphorylated by kinase X?
Which CDK pertains to the [G1 S G2 M] phase?

Format mapping: OBO<=>OWL
The mapping is not totally biunivocal (one-to-one); however, all the data has been preserved.
Missing properties in OWL relations: reflexivity, asymmetry, intransitivity and partonomic relationships.
Existential and universal restrictions cannot be explicitly represented in OBO => all are considered existential.
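The "consider all as existential" rule can be illustrated with a short sketch that renders an OBO relationship as an existential restriction. The output is written in OWL 2 functional-style syntax purely for brevity, which is an assumption: the slides do not show the serialization that obo2owl emits. The function names and `EX:` identifiers are hypothetical.

```python
def obo_id_to_name(obo_id: str) -> str:
    """Turn an OBO id such as 'EX:0000001' into a colon-free local name."""
    return obo_id.replace(":", "_")


def relationship_to_axiom(term_id: str, relation: str, target_id: str) -> str:
    """Render one OBO 'relationship:' tag as an existential restriction."""
    return "SubClassOf(:{} ObjectSomeValuesFrom(:{} :{}))".format(
        obo_id_to_name(term_id), relation, obo_id_to_name(target_id)
    )


def is_a_to_axiom(term_id: str, parent_id: str) -> str:
    """Render one OBO 'is_a:' tag as a plain subclass axiom."""
    return "SubClassOf(:{} :{})".format(
        obo_id_to_name(term_id), obo_id_to_name(parent_id)
    )
```

Because OBO gives no way to say "only", a reverse (owl2obo) step in this style would have nothing to map a universal restriction onto, which is one reason the mapping is not one-to-one.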

Mapping: obo2owl terms

Mapping: obo2owl relationships

CCO accession number
Format: CCO:[CPFRTIB]nnnnnnn (namespace, sub-namespace, 7 digits)
Sub-namespaces:
  C: cellular component
  P: biological process
  F: molecular function
  R: reference
  T: taxon
  I: interaction
  B: biomolecule
Examples in CCO:
  CCO:P0000056 (cell cycle)
  CCO:B0001314 (p53_human)
In other ontologies:
  OBO_REL:has_participant
  GO:0007049 (cell cycle)
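The accession format lends itself to a simple validity check. The sketch below encodes the pattern and the sub-namespace letters listed on the slide; the regular expression and the helper name `parse_cco_id` are illustrative assumptions, not taken from the CCO API.

```python
import re

# Sub-namespace letters as listed on the slide.
SUBNAMESPACES = {
    "C": "cellular component",
    "P": "biological process",
    "F": "molecular function",
    "R": "reference",
    "T": "taxon",
    "I": "interaction",
    "B": "biomolecule",
}

# Namespace "CCO", one sub-namespace letter, exactly seven digits.
CCO_ID = re.compile(r"CCO:([CPFRTIB])(\d{7})")


def parse_cco_id(accession):
    """Return (sub-namespace name, number) for a valid CCO id, else None."""
    match = CCO_ID.fullmatch(accession)
    if match is None:
        return None
    return SUBNAMESPACES[match.group(1)], int(match.group(2))
```

With this helper, `parse_cco_id("CCO:P0000056")` identifies a biological process entry, while identifiers from other ontologies such as `GO:0007049` are rejected.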

CCO entry CCO:P0000016

Reasoning capabilities
OWL-DL: mathematical foundation (description logics)
Automatic detection and handling of inconsistencies and misclassifications
Reasoners (e.g. RACER, Pellet)
Protégé (DIG interface)

Single inheritance principle
Principle: no class in a classification should have more than one is_a parent on the immediate higher level (Smith B. et al.).
Relationships that violate this rule are detected using a reasoner.
Solution: declare the terms at the same level of the structure disjoint.
32 problems found:
  4: part_of instead of is_a
  18: should stay without any change (FP)
  10: not consistent (used terminology)
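The slides detect violations with a reasoner over disjointness axioms. As a much simpler stand-in, the following sketch only scans asserted is_a links for terms with more than one direct parent; it is not the reasoner-based method, and the function name and term labels are hypothetical.

```python
from collections import defaultdict


def multiple_parents(is_a_edges):
    """Map each term with more than one direct is_a parent to its sorted parents.

    `is_a_edges` is an iterable of (child, parent) pairs.
    """
    parents = defaultdict(set)
    for child, parent in is_a_edges:
        parents[child].add(parent)
    return {child: sorted(ps) for child, ps in parents.items() if len(ps) > 1}


# Toy hierarchy: term "c" is asserted under both "a" and "b".
TOY_EDGES = [("a", "root"), ("b", "root"), ("c", "a"), ("c", "b"), ("d", "a")]
```

Each reported term would then need manual review, since, as the counts above indicate, some hits are modelling errors (part_of recorded as is_a) and others should stay unchanged.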

Upper Level Ontology Based on the concepts introduced by Smith et al.

CCO status
#relationships = #RO + #CCO = 15 + 5 = 20
#terms = 15 (ULO) + 304 (process branch) + 20 (xref, ref, etc.)
#interactions = 124 (IntAct)
#genes/proteins/transcripts = 1961
  TAIR: 228
  GeneDB_Spombe: 1032
  GOA Human: 1292
  SGD: 798

CCO in OBO Edit

CCO in Protégé (http://protege.stanford.edu)
Terms shown: Cell cycle, Cell cycle checkpoint, Cell cycle arrest

CCO API
Set of Perl modules influenced by go-perl.
Features:
  OBO parsing
  Ontology handling
  obo2owl, owl2obo
  XSL transformations
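To give a feel for what OBO parsing involves, here is a minimal sketch of a [Term] stanza reader. It is written in Python rather than Perl and does not reproduce the CCO API; the function name, the sample stanza and the `EX:` identifiers are hypothetical, and the comment handling (everything after the first `!` is dropped) is a simplification.

```python
def parse_obo_terms(text):
    """Parse [Term] stanzas into dicts mapping each tag to a list of values."""
    terms, current = [], None
    for raw in text.splitlines():
        line = raw.split("!", 1)[0].strip()  # drop trailing "! comment"
        if not line:
            continue
        if line.startswith("["):
            # Start a new stanza; anything other than [Term] is skipped.
            current = {} if line == "[Term]" else None
            if current is not None:
                terms.append(current)
            continue
        if current is None or ":" not in line:
            continue  # header lines and non-Term stanzas
        tag, value = line.split(":", 1)
        current.setdefault(tag.strip(), []).append(value.strip())
    return terms


# Made-up sample in OBO flat-file layout.
SAMPLE = """\
format-version: 1.2

[Term]
id: EX:0000001
name: example process
is_a: EX:0000002 ! example parent
relationship: part_of EX:0000003 ! example whole

[Typedef]
id: part_of
name: part of
"""
```

Tags are kept as lists because a term may carry several `is_a` or `relationship` lines.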

CCO availability
http://www.sbcellcycle.org/cco/html/index.html
Formats: OBO, OWL, XML and API (Perl)
Very soon: advanced queries
Very soon: http://www.cellcycleontology.org
Reference: A cell-cycle knowledge integration framework. Data Integration in Life Sciences, DILS 2006, LNBI 4075, pp. 19-34, 2006.

CCO online

Conclusions
A data integration pipeline prototype covering the entire life cycle of the knowledge base.
Concrete problems and initial results related to the implementation of automatic format mappings between ontologies and to inconsistency checking are shown.
Existing integration obstacles due to the diversity of data formats and the lack of formalization approaches are discussed, as well as the trade-offs that are common in the biological sciences.

Future work
The knowledge will be weighted or scored according to defined evidence codes expressing the kind of support, similar to those implemented in GO (experimental, electronically inferred, and so forth).
A query system and a web user interface are also foreseen.
The ultimate aim of the project is to support hypothesis evaluation about cell-cycle regulation issues.

Acknowledgments
Martin Kuiper (UGent/VIB)
Vladimir Mironov (UGent/VIB)
Elena Tsiporkova (UGent/VIB)
Mikel Egaña (Manchester University, Robert Stevens group)
