Making BioPAX SPARQL


hands on...
start a terminal
create a directory jena_workspace, move into that directory
download jena.jar (http://tinyurl.com/3vlp7rw)
download biopax data (http://www.biopax.org/junk/homosapiens.nt) or a smaller file (http://www.biopax.org/junk/Escherichia_coli.nt)
Andrea on hand to help...
Tutorial material: //sourceforge.net/apps/mediawiki/biopax/index.php?title=biopax/OWL_and_SPARQL#Tutorial_Material or //sourceforge.net/apps/mediawiki/biopax/index.php?title=harmony2011

Coming up...
Semantic Web Stack
RDF: URI based integration
RDFS: semantic integration
OWL: semantic integration
BioPAX SPARQL
Hands on
Questions/Follow up

Can BioPAX SPARQL?
Queries
Data representation format
Data exchange standards
A global naming scheme

Semantic Web
Data representation format

RDF - integration
Symmetric Properties (if x=y then y=x)
Transitive Properties (if x=y and y=z then x=z)

RDF - integration
Fundamentally, new data can be inferred from existing data by following logical rules, e.g. transitive closure
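To make "transitive closure" concrete, here is a minimal sketch in plain Python. The part-of facts are hypothetical illustration data, not taken from any BioPAX export, and a real triple store or reasoner would do this far more efficiently.

```python
def transitive_closure(pairs):
    """Return every (x, z) reachable by chaining the given (x, y) pairs."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closure):
            for (c, d) in list(closure):
                # (a, b) and (b, d) together imply (a, d)
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure

# Hypothetical part-of facts (assumption: names invented for illustration).
part_of = {
    ("subunit", "complex"),
    ("complex", "pathway_step"),
    ("pathway_step", "pathway"),
}
inferred = transitive_closure(part_of) - part_of
print(sorted(inferred))
```

The pairs left in `inferred` are exactly the "new data" of the slide: facts that were never asserted but follow from the rule.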

RDF - integration
BioPAX has an XML syntax and... N-triples
How do you query these triples? (quick demo)

RDF - XML

RDF - Ntriples
<http://www.reactome.org/biopax#mate_substrates intracellular_> <http://www.biopax.org/release/biopax-level3.owl#memberPhysicalEntity> <http://www.reactome.org/biopax#cimetidine intracellular_> .
<http://www.reactome.org/biopax#mate_substrates intracellular_> <http://www.biopax.org/release/biopax-level3.owl#memberPhysicalEntity> <http://www.reactome.org/biopax#creatinine intracellular_> .
<http://www.reactome.org/biopax#mate_substrates intracellular_> <http://www.biopax.org/release/biopax-level3.owl#memberPhysicalEntity> <http://www.reactome.org/biopax#guanidine intracellular_> .
<http://www.reactome.org/biopax#mate_substrates intracellular_> <http://www.biopax.org/release/biopax-level3.owl#memberPhysicalEntity> <http://www.reactome.org/biopax#metformin intracellular_> .
<http://www.reactome.org/biopax#mate_substrates intracellular_> <http://www.biopax.org/release/biopax-level3.owl#memberPhysicalEntity> <http://www.reactome.org/biopax#procainamide intracellular_> .
<http://www.reactome.org/biopax#mate_substrates intracellular_> <http://www.biopax.org/release/biopax-level3.owl#memberPhysicalEntity>

SPARQL - Graph Pattern 1
select all interactions in a pathway that are BiochemicalReactions:
    SELECT ?name
    WHERE {
      ?pathway rdf:type bp:Pathway .
      ?pathway bp:pathwayComponent ?c .
      ?c bp:name ?name .
      ?c rdf:type bp:BiochemicalReaction
    }

BioPAX uses RDFS 1st layer of Semantics

RDFS is about sets
RDF creates a graph structure to represent data (there is no meaning)
This is what SPARQL will query.
BioPAX uses RDFS constructs
RDFS is the vocabulary that is used in the RDF: individuals that are related to one another, and how...
Introduces the concept of a distinguished resource
Specific triples have a special meaning (specified inferences)
Type propagation: rdfs:subClassOf (is-a subsumption hierarchy)
Relationship propagation: rdfs:subPropertyOf
Usage: rdfs:domain, rdfs:range

BioPAX - Type Propagation
x rdf:type A . A rdfs:subClassOf y .
inference: x rdf:type y
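Type propagation can be sketched in a few lines of plain Python over triples held as tuples. This is only an illustration of the rdfs:subClassOf rule, not how Jena or any reasoner implements it; the instance name ex:reaction1 is hypothetical.

```python
RDF_TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"

def propagate_types(triples):
    """Add (x rdf:type Parent) whenever x rdf:type Child and Child rdfs:subClassOf Parent."""
    graph = set(triples)
    while True:
        new = {
            (x, RDF_TYPE, parent)
            for (x, p1, cls) in graph if p1 == RDF_TYPE
            for (child, p2, parent) in graph if p2 == SUBCLASS and child == cls
        }
        if new <= graph:  # nothing new was inferred, so we are done
            return graph
        graph |= new

triples = {
    ("bp:BiochemicalReaction", SUBCLASS, "bp:Conversion"),
    ("bp:Conversion", SUBCLASS, "bp:Interaction"),
    ("ex:reaction1", RDF_TYPE, "bp:BiochemicalReaction"),  # hypothetical instance
}
closed = propagate_types(triples)
print(sorted(t for t in closed - triples))
```

After propagation, a query for everything of type bp:Interaction would also find ex:reaction1, even though that triple was never asserted.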

BioPAX - Relationship Propagation
bp:Catalysis (domain) -- bp:controlled (object property) --> bp:Conversion (range)

BioPAX - Relationship Propagation
bp:controlled rdf:type owl:ObjectProperty
bp:controlled rdf:type owl:FunctionalProperty
bp:controlled rdfs:subPropertyOf bp:participant
bp:controlled rdfs:domain bp:Control
bp:controlled rdfs:range bp:Interaction, bp:Pathway
bp:Control, bp:Interaction, bp:Pathway rdf:type rdfs:Class
inference: IF bp:controlled rdfs:domain bp:Control and x bp:controlled y THEN x rdf:type bp:Control
NB: inference not inheritance
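The "inference not inheritance" point can be seen in a small sketch: rdfs:domain does not reject anything, it adds a type triple, and rdfs:subPropertyOf adds a second relationship. This is plain Python over tuples under the assumption that only these two rules are applied; ex:catalysis1 and ex:reaction1 are hypothetical instances.

```python
def apply_domain_and_subproperty(triples):
    """Apply the rdfs:domain and rdfs:subPropertyOf rules until nothing new appears."""
    graph = set(triples)
    while True:
        new = set()
        for (s, p, o) in graph:
            for (prop, rel, target) in graph:
                if prop != p:
                    continue
                if rel == "rdfs:domain":
                    # using property p implies the subject belongs to p's domain
                    new.add((s, "rdf:type", target))
                elif rel == "rdfs:subPropertyOf":
                    # the same pair is also related by the super-property
                    new.add((s, target, o))
        if new <= graph:
            return graph
        graph |= new

schema = {
    ("bp:controlled", "rdfs:domain", "bp:Control"),
    ("bp:controlled", "rdfs:subPropertyOf", "bp:participant"),
}
data = {("ex:catalysis1", "bp:controlled", "ex:reaction1")}  # hypothetical instances
result = apply_domain_and_subproperty(schema | data)
print(sorted(result - schema - data))
```

Note that ex:catalysis1 was never declared to be a bp:Control; the type is a consequence of using bp:controlled.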

RDFS - extended integration
disclaimer - set operations not directly supported
Set Intersection: C ⊆ A ∩ B
but it can be approximated: C rdfs:subClassOf A, C rdfs:subClassOf B
given x rdf:type C then inference: x rdf:type A and x rdf:type B
Set Union - integration! A ∪ B ⊆ C can be approximated in a similar way
Property intersection and property union can work in the same way using rdfs:subPropertyOf
e.g. dp:proteininteraction rdfs:subPropertyOf bp:molecularinteraction

RDFS - The semantics of domain and range
N.B. inference (not restrictions)
domain and range functions return sets which are to be interpreted as intersections
Conjunctive: properties can have any number of domains and ranges - they all apply
bp:absoluteRegion rdf:type owl:FunctionalProperty
bp:absoluteRegion rdfs:domain bp:DnaRegionReference
bp:absoluteRegion rdfs:domain bp:RnaRegionReference
bp:absoluteRegion rdfs:range bp:SequenceLocation
inference: IF x bp:absoluteRegion y THEN x rdf:type bp:DnaRegionReference AND x rdf:type bp:RnaRegionReference

2nd layer of Semantics BioPAX uses OWL

OWL
bp:bindsTo rdf:type owl:ObjectProperty
bp:bindsTo rdf:type owl:FunctionalProperty
bp:bindsTo rdf:type owl:InverseFunctionalProperty
bp:bindsTo rdf:type owl:SymmetricProperty
bp:bindsTo owl:inverseOf bp:bindsTo
bp:bindsTo rdfs:domain bp:BindingFeature
bp:bindsTo rdfs:range bp:BindingFeature
bp:BindingFeature rdfs:subClassOf bp:EntityFeature
bp:BindingFeature rdf:type rdfs:Class

OWL
Symmetric Properties
  owl:inverseOf owl:inverseOf owl:inverseOf .
  owl:inverseOf rdf:type owl:SymmetricProperty .
Transitive Properties
  (X partOf Y) ∧ (Y partOf Z) ⇒ (X partOf Z)
Functional Properties: only one value (for any value of x there can be only one value of x²)
  (x hasSquared A) ∧ (x hasSquared B) ∧ (hasSquared rdf:type owl:FunctionalProperty) ⇒ A owl:sameAs B
InverseFunctional Properties: think of keys in relational databases...
  (A componentOf x) ∧ (B componentOf x) ∧ (componentOf rdf:type owl:InverseFunctionalProperty) ⇒ A owl:sameAs B (or using x we can get both A and B)
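The functional and inverse-functional rules above can be sketched as a small function that reports which resources would be inferred to be owl:sameAs each other. This is an illustration in plain Python, not a reasoner; the property names follow the slide and the resource names (A, B, A2, B2, x, c) are placeholders.

```python
from itertools import combinations

def same_as_from_functional(triples, functional, inverse_functional):
    """Return sets {a, b} of resources inferred to be the same individual."""
    same = set()
    for (s1, p1, o1), (s2, p2, o2) in combinations(sorted(triples), 2):
        if p1 != p2:
            continue
        # functional: one subject cannot have two different values
        if p1 in functional and s1 == s2 and o1 != o2:
            same.add(frozenset((o1, o2)))
        # inverse functional: one value cannot belong to two different subjects
        if p1 in inverse_functional and o1 == o2 and s1 != s2:
            same.add(frozenset((s1, s2)))
    return same

triples = {
    ("x", "hasSquared", "A"),
    ("x", "hasSquared", "B"),
    ("A2", "componentOf", "c"),
    ("B2", "componentOf", "c"),
}
pairs = same_as_from_functional(
    triples, functional={"hasSquared"}, inverse_functional={"componentOf"}
)
print(sorted(sorted(pair) for pair in pairs))
```

Nothing is rejected here: two "conflicting" values are not an error under OWL semantics, they are simply concluded to denote the same thing.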

Caveats - Gotchas
BioPAX assumes a closed world
Integrity constraints (CWA)
  prevent incorrect values from being asserted in a model
  used for validation/parsing/data input
  single model that contains only the facts asserted in databases
  you are aiming for one version of the truth
BioPAX uses OWL constructs in strange ways, strange things can happen...
SPARQL does not understand OWL

OWL InverseFunctionalProperty
bp:component rdf:type owl:ObjectProperty
bp:component rdf:type owl:InverseFunctionalProperty
bp:component rdfs:domain bp:Complex
bp:component rdfs:range bp:PhysicalEntity
bp:Complex rdfs:subClassOf bp:PhysicalEntity
inference: IF A bp:component x AND B bp:component x THEN A rdf:type bp:Complex, B rdf:type bp:Complex AND A owl:sameAs B (i.e. the same complex)
strangeness!

SPARQL, how does this affect queries?
A Box Queries
RDF Graph

SPARQL - the basics
Triple pattern: like an RDF triple but with a variable
    <subject> <predicate> ?obj .
Variables
    ?subj rdf:type ?obj .
Each triple that matches the pattern will bind an actual value from the RDF data set
All possible bindings are considered
Graph patterns: multiple triple patterns

SPARQL - Query Structure
BASE - all relative URIs are resolved against it
PREFIX - SPARQL equivalent of an XML namespace
SELECT - data items to be returned by the query
FROM - data against which the query will run
WHERE - graph pattern
    BASE <http://www.biopax.org/release/>
    PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    PREFIX bp: <biopax-level3.owl#>
    SELECT ?x
    FROM <>
    WHERE { ?x rdf:type bp:Protein . }

SPARQL - Graph Pattern
Graph Patterns
within a graph pattern a variable must have the same value no matter where it is used, e.g. ?protein is always bound to the same resource
therefore a resource that does not contain all the triples won't satisfy the graph pattern
you cannot select a variable if it is not listed in the graph pattern
you can use shortcuts, e.g. SELECT *
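The "same variable, same value" rule is easiest to see in a toy matcher. The sketch below is plain Python and is not how a SPARQL engine is implemented; it only mimics basic graph pattern matching over a handful of hypothetical triples (the ex: resources and the names are invented).

```python
def match_pattern(pattern, triple, binding):
    """Extend binding so that pattern matches triple, or return None on a conflict."""
    result = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # a variable keeps the first value it was bound to
            if result.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return result

def match_graph_pattern(patterns, triples):
    """Return every binding that satisfies all triple patterns at once."""
    bindings = [{}]
    for pattern in patterns:
        bindings = [
            extended
            for binding in bindings
            for triple in triples
            if (extended := match_pattern(pattern, triple, binding)) is not None
        ]
    return bindings

triples = [  # hypothetical data
    ("ex:p1", "rdf:type", "bp:Pathway"),
    ("ex:p1", "bp:pathwayComponent", "ex:r1"),
    ("ex:p1", "bp:pathwayComponent", "ex:r2"),
    ("ex:r1", "rdf:type", "bp:BiochemicalReaction"),
    ("ex:r1", "bp:name", "reaction one"),
    ("ex:r2", "bp:name", "not a reaction"),
]
patterns = [
    ("?pathway", "rdf:type", "bp:Pathway"),
    ("?pathway", "bp:pathwayComponent", "?c"),
    ("?c", "bp:name", "?name"),
    ("?c", "rdf:type", "bp:BiochemicalReaction"),
]
names = [b["?name"] for b in match_graph_pattern(patterns, triples)]
print(names)
```

ex:r2 drops out because it does not carry every triple the pattern asks for, which is the point made on the slide.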

SPARQL - Graph Pattern 1
select all interactions in a pathway that are BiochemicalReactions:
    SELECT ?name
    WHERE {
      ?pathway rdf:type bp:Pathway .
      ?pathway bp:pathwayComponent ?c .
      ?c bp:name ?name .
      ?c rdf:type bp:BiochemicalReaction
    }

SPARQL - Graph Pattern 2
select all proteins that take part in a molecular interaction or are part of a complex:
    SELECT ?protein
    WHERE {
      { ?protein rdf:type bp:Protein .
        ?interaction rdf:type bp:ComplexAssembly .
        ?interaction bp:participant ?protein . }
      UNION
      { ?protein rdf:type bp:Protein .
        ?complex rdf:type bp:Complex .
        ?complex bp:component ?protein . }
    }

SPARQL - Graph Pattern 3
select all complexes that have components that are both complexes and proteins (a left outer join in SQL)
    SELECT *
    WHERE {
      ?x bp:component ?y .
      ?x rdf:type bp:Complex .
      ?y rdf:type bp:Complex .
      OPTIONAL { ?y rdf:type bp:Protein }
    }

SPARQL - Query forms
SELECT
ASK
DESCRIBE
CONSTRUCT

hands on...
start a terminal
create a directory jena_workspace, move into that directory
download jena.jar (http://tinyurl.com/3vlp7rw)
download biopax data (http://www.biopax.org/junk/homosapiens.nt) or a smaller file (http://www.biopax.org/junk/Escherichia_coli.nt)
Andrea on hand to help...

Semantics T Box Queries

but...not T box
What does this mean?
If you are combining a BioPAX graph with another BioPAX graph - nothing - still an A box query
BUT if you are combining a BioPAX graph with another RDF graph with OWL property characteristics - then you need to compute the T box, materialize the inferred triples, and then you can SPARQL!!
Can we compute a T Box? BioPAX++? A BioPAX extension with closure axioms?
and now for the substance.
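A minimal sketch of "materialize, then query", assuming the only property characteristic involved is owl:SymmetricProperty. It is plain Python over tuples; ex:interactsWith and the two proteins are hypothetical and stand in for the "other RDF graph".

```python
def materialize_symmetric(triples):
    """Add the reverse triple for every property declared owl:SymmetricProperty."""
    graph = set(triples)
    symmetric = {
        s for (s, p, o) in graph
        if p == "rdf:type" and o == "owl:SymmetricProperty"
    }
    graph |= {(o, p, s) for (s, p, o) in graph if p in symmetric}
    return graph

def objects_of(graph, subject, predicate):
    """A plain lookup, standing in for an A box query that knows nothing about OWL."""
    return sorted(o for (s, p, o) in graph if s == subject and p == predicate)

raw = {
    ("ex:interactsWith", "rdf:type", "owl:SymmetricProperty"),
    ("ex:proteinA", "ex:interactsWith", "ex:proteinB"),
}
before = objects_of(raw, "ex:proteinB", "ex:interactsWith")
after = objects_of(materialize_symmetric(raw), "ex:proteinB", "ex:interactsWith")
print(before, after)
```

The query itself is unchanged between the two calls; only the graph it runs over differs, which is why the inferred triples have to be written out first.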

BioPAX is very SPARQLy
For all that is wrong with BioPAX, there is a lot that you can do with a little RDFS, a little OWL and a lot of SPARQL.
Using Protege: BioPAX/SPARQL/SQWRL...
Using Jena: BioPAX/SPARQL...