Open PHACTS: An Introduction and Explanation. March 2012. Acknowledgements: Contains contributions from across the Open PHACTS partners.

1 Open PHACTS: An Introduction and Explanation. March 2012. Acknowledgements: Contains contributions from across the Open PHACTS partners.

2 Public Domain Drug Discovery Data: Pharma are accessing, processing, storing & re-processing. Why?
Diagram labels: Literature; Patents; PubChem; GenBank; Databases; Downloads; Data Integration; Data Analysis; Firewalled Databases (x each company)

3 The Project
The Innovative Medicines Initiative
- EC funded public-private partnership for pharmaceutical research
- Focus on key problems: Efficacy, Safety, Education & Training, Knowledge Management
The Open PHACTS Project
- Create a semantic integration hub ("Open Pharmacological Space")
- Delivering services to support on-going drug discovery programs in pharma and the public domain
- Not just another project: leading academics in semantics, pharmacology and informatics, driven by solid industry business requirements
- 23 academic partners, 8 pharmaceutical companies, 3 biotechs
- Work split into clusters: Technical Build (focus here), Scientific Drive, Community & Sustainability

4 Open PHACTS Project Partners
Pfizer Limited (Coordinator); Universität Wien (Managing entity); Technical University of Denmark; University of Hamburg, Center for Bioinformatics; BioSolveIT GmbH; Consorci Mar Parc de Salut de Barcelona; Leiden University Medical Centre; Royal Society of Chemistry; Vrije Universiteit Amsterdam; Spanish National Cancer Research Centre; University of Manchester; Maastricht University; Aqnowledge; University of Santiago de Compostela; Rheinische Friedrich-Wilhelms-Universität Bonn; AstraZeneca; GlaxoSmithKline; Esteve; Novartis; Merck Serono; H. Lundbeck A/S; Eli Lilly; Netherlands Bioinformatics Centre; Swiss Institute of Bioinformatics; ConnectedDiscovery; EMBL-European Bioinformatics Institute; Janssen Pharmaceutica; OpenLink; The Open PHACTS Foundation

5 What will users see?
A user-friendly, full-featured interface that allows scientists to explore and interrogate integrated biological and chemical data

6 OPS 3 Pillars
A Precompetitive Infrastructure
- Begin the task of creating an environment that can also power future collaborative efforts (public & industry)
- Expose Industry Experience: create drug-discovery focused tools outside of the firewall, influenced by decades of practical experience
A Pharmacology Use Case
- Showcase one application of this technology: a stable, responsive, user-orientated system for Pharmacology Analysis
A Data Publishing Methodology
- Develop standards and methodologies to promote good data sharing and interoperability
- An exemplar project for the use of the Nanopublication concept
- A technical approach that can be repeated in other areas

7 Major Work Streams
- Build: OPS service layer and resource integration
- Drive: Development of exemplar work packages & applications
- Sustain: Community engagement and long-term sustainability
Work Stream 1: Open Pharmacological Space (OPS) Service Layer. Standardised software layer to allow public DD resource integration
- Define standards and construct OPS service layer
- Develop interface (API) for data access, integration and analysis
- Develop secure access models
- Existing Drug Discovery (DD) Resource Integration
Work Stream 2: Exemplar Drug Discovery Informatics tools. Develop exemplar services to test OPS Service Layer
- Target Dossier (Data Integration)
- Pharmacological Network Navigator (Data Visualisation)
- Compound Dossier (Data Analysis)
Diagram labels: Consumer Firewall; Supplier Firewall; OPS Service Layer; Assertion & Meta Data Mgmt; Transform / Translate; Integrator; Corpus 1; Target Dossier; Db 2; Compound Dossier; Db 3; Db 4; Pharmacological Networks; Std Public Vocabularies; Business Rules; Corpus 5

8 Business Question Based Requirements
Questions (table columns: Number, sum, Nr of 1, Question):
- All oxido-reductase inhibitors active <100nM in both human and mouse
- Given compound X, what is its predicted secondary pharmacology? What are the on- and off-target safety concerns for a compound? What is the evidence and how reliable is that evidence (journal impact factor, KOL) for findings associated with a compound?
- Given a target find me all actives against that target. Find/predict polypharmacology of actives. Determine ADMET profile of actives
- For a given interaction profile, give me compounds similar to it
- The current Factor Xa lead series is characterised by substructure X. Retrieve all bioactivity data in serine protease assays for molecules that contain substructure X.
- Retrieve all experimental and clinical data for a given list of compounds defined by their chemical structure (with options to match stereochemistry or not)
- A project is considering Protein Kinase C Alpha (PRKCA) as a target. What are all the compounds known to modulate the target directly? What are the compounds that may modulate the target directly? i.e. return all compounds active in assays where the resolution is at least at the level of the target family (i.e. PKC), both from structured assay databases and the literature
- Give me all active compounds on a given target with the relevant assay data
- Give me the compound(s) which hit most specifically the multiple targets in a given pathway (disease)
- Identify all known protein-protein interaction inhibitors

9 A use case driven approach
- Main architecture, technical implementation and primary capabilities driven by a set of prioritised research questions
- Based on the main research questions, define prioritised data sources
- Three Exemplars will be developed to demonstrate the capabilities of the OPS system and to define interfaces and input/output standards
- Three Use cases have been defined to benchmark the OPS system against current standard workflows in data retrieval and mining

10 Example Research questions
- Give all compounds with IC50 < xxx for target Y in species W and Z plus assay data
- What substructures are associated with readout X (target, pathway, disease, ...)
- Give all experimental and clinical data for compound X
- Give all targets for compound X or a compound with a similarity > y%
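To make the first research question concrete, the following is a minimal Python sketch of the filter it implies: keep compounds whose IC50 is below a threshold for one target in every requested species, and return the supporting assay records alongside. The record layout, the field names and all data values are invented for illustration; the slides do not describe how OPS stores or queries activities.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Activity:
    """One measured activity (hypothetical record layout)."""
    compound: str
    target: str
    species: str
    ic50_nm: float
    assay: str


# Made-up example records, not real pharmacology data.
ACTIVITIES = [
    Activity("cmpd-A", "target-Y", "human", 40.0, "assay-1"),
    Activity("cmpd-A", "target-Y", "mouse", 75.0, "assay-2"),
    Activity("cmpd-B", "target-Y", "human", 20.0, "assay-3"),
    Activity("cmpd-B", "target-Y", "mouse", 900.0, "assay-4"),
    Activity("cmpd-C", "target-Q", "human", 5.0, "assay-5"),
]


def potent_in_all_species(records, target, species, max_ic50_nm):
    """Map each qualifying compound to its supporting records.

    A compound qualifies only if it has at least one record below the
    threshold for the target in every one of the requested species.
    """
    hits = {}
    for rec in records:
        if (rec.target == target and rec.species in species
                and rec.ic50_nm < max_ic50_nm):
            hits.setdefault(rec.compound, []).append(rec)
    wanted = set(species)
    return {
        compound: recs
        for compound, recs in hits.items()
        if {rec.species for rec in recs} == wanted
    }


result = potent_in_all_species(ACTIVITIES, "target-Y", ["human", "mouse"], 100.0)
for compound, recs in result.items():
    print(compound, [(rec.species, rec.ic50_nm, rec.assay) for rec in recs])
```

Here cmpd-B is excluded because its mouse measurement is above the threshold, which is the "in species W and Z" part of the question.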

11 Exemplar Services
- Chem-Bio Navigator: querying and visualization of sets of pharmacologically annotated small molecules, on the basis of chemical substructures, pharmacophores, biological activities
- Target Dossier: in silico dossiers about targets, incorporating related information on sequences, structures, pathways, diseases and small molecules
- Polypharmacology Browser: map coverage of the chemo-biological space, to facilitate the polypharmacological profiling of small molecules

12 The Semantic Web & Linked Data
Diagram labels: Annotated OPS with ontologies; Linked Knowledge; Harmonising data sets; Concept vocabulary services; Identity resolution services; Concept maps; Semantic Normalisation; Syntactic Normalisation; OPS in RDF; Linked Data; KEGG URI; ChEBI URI; BRENDA URI; Open Data
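The identity resolution services named on this slide can be pictured as a lookup that decides whether two URIs denote the same entity. The sketch below is one simple way to do that, a union-find structure over asserted equivalences. It is an illustration only: the class, its method names and the example.org URIs are hypothetical and are not taken from any Open PHACTS service.

```python
class IdentityMapper:
    """Groups URIs that have been asserted to denote the same entity."""

    def __init__(self):
        self._parent = {}

    def _find(self, uri):
        # Walk up to the representative URI of the group.
        self._parent.setdefault(uri, uri)
        root = uri
        while self._parent[root] != root:
            root = self._parent[root]
        # Path compression: point every visited URI straight at the root.
        while self._parent[uri] != root:
            nxt = self._parent[uri]
            self._parent[uri] = root
            uri = nxt
        return root

    def link(self, a, b):
        """Record that URIs a and b identify the same entity."""
        root_a, root_b = self._find(a), self._find(b)
        if root_a != root_b:
            self._parent[root_b] = root_a

    def same_entity(self, a, b):
        return self._find(a) == self._find(b)

    def equivalents(self, uri):
        """All known URIs for the entity identified by uri, sorted."""
        root = self._find(uri)
        return sorted(u for u in list(self._parent) if self._find(u) == root)


# Hypothetical URIs standing in for records in three source databases.
KEGG = "http://example.org/kegg/entry-1"
CHEBI = "http://example.org/chebi/entry-1"
BRENDA = "http://example.org/brenda/entry-1"

mapper = IdentityMapper()
mapper.link(KEGG, CHEBI)
mapper.link(CHEBI, BRENDA)
print(mapper.same_entity(KEGG, BRENDA))
print(mapper.equivalents(CHEBI))
```

Equivalence is transitive here: linking the first URI to the second and the second to the third is enough for the first and third to resolve to the same entity.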

13 Nanopublications & OPS (1)
Nanopublications: capturing scientific information in the triple store

14 Nanopublications & OPS (2)
Figure caption: Nanopublication facilitates the identification of individual assertions in RDF graphs. (A) represents a standard RDF conversion. In (B), colours represent corresponding data in row and graph representations.
- Nanopublications are an additional layer on top of RDF
- An implementation is the use of Named Graphs to identify specific scientific assertions
- This offers two major benefits:
  1. An ability to circle the triples which make up an individual scientific assertion (which makes viewing and understanding the central fact a lot easier for both humans and computers)
  2. The assignment of provenance and attribution to facts, allowing for data citability (see the value-of-data paper on the previous slide)
- OPS has produced a series of guidelines on producing Nanopublication-RDF for data contributors
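The named-graph idea can be illustrated without any RDF library by storing quads, that is triples plus the name of the graph they belong to. The sketch below is a deliberately simplified picture: one graph holds the assertion and a second graph holds statements about that assertion graph. The `ex:` names and predicates are invented, and the layout should not be read as the structure prescribed by the OPS guidelines.

```python
from collections import defaultdict

# Each quad is (subject, predicate, object, graph name). All names are made up.
QUADS = [
    ("ex:compoundX", "ex:inhibits", "ex:targetY", "ex:np1#assertion"),
    ("ex:compoundX", "ex:hasIC50nM", "40", "ex:np1#assertion"),
    ("ex:np1#assertion", "ex:derivedFrom", "ex:datasetA", "ex:np1#provenance"),
    ("ex:np1#assertion", "ex:attributedTo", "ex:labZ", "ex:np1#provenance"),
    ("ex:compoundW", "ex:inhibits", "ex:targetY", "ex:np2#assertion"),
    ("ex:np2#assertion", "ex:derivedFrom", "ex:datasetB", "ex:np2#provenance"),
]


def group_by_graph(quads):
    """Collect the triples of each named graph ("circling" them)."""
    graphs = defaultdict(list)
    for subject, predicate, obj, graph in quads:
        graphs[graph].append((subject, predicate, obj))
    return dict(graphs)


def describe(assertion_graph, quads):
    """Return the triples of one assertion plus statements made about it."""
    graphs = group_by_graph(quads)
    provenance = [
        (predicate, obj)
        for triples in graphs.values()
        for subject, predicate, obj in triples
        if subject == assertion_graph
    ]
    return {
        "assertion": graphs.get(assertion_graph, []),
        "provenance": provenance,
    }


nanopub = describe("ex:np1#assertion", QUADS)
print(nanopub["assertion"])
print(nanopub["provenance"])
```

Because the assertion graph has a name of its own, provenance and attribution can be attached to the whole assertion rather than to individual triples, which is the second benefit listed on the slide.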

15 Lash Up Demo Play

16 OPS Framework Architecture. Dec 2011
Diagram labels: OPS GUI; App Framework; Identity & Vocabulary Management; Web Service API; SPARQL; OPS Data Model; Semantic Data Workflow Engine; Web Services; RDF Data Cache; Chemistry Normalisation & Registration; Descriptor RDF 1; Descriptor RDF 2; Descriptor Nanopub RDF 3; Descriptor Nanopub RDF 4; Public Vocabularies; Data 1; Data 2; Data 3; Data 4

17 OPS Framework Implementation Summary
- A Semantic Data Publishing approach. Individual data sets exist across the web; each is converted to an RDF representation (ideally by the data owner themselves) which includes a data set descriptor (data set meta data, including last updated information).
- RDF is created according to commonly used principles, including good URIs from stable, open vocabularies from the NCBO.
- This data may also be published as Nanopublications, a mechanism for assigning provenance and providing citability and credit for data usage.
- The Harvester monitors the sources of interest (with the data owners' consent). When it identifies new data, via the data set descriptor, it loads this into the triple store.
- The triple store works as an application-specific cache. Only data specified by the project and required to serve specific use cases is loaded. The cache may be obliterated and rebuilt; no primary data is stored here.
- Chemicals are normalized and validated via the ChemSpider registration system at data loading time.
- Vocabulary management is provided by ConceptWiki. Resolution of multiple identifiers for the same entity and multiple URL forms for the same identifier is provided by a semantic identifier mapping service using BridgeDb. Mappings between vocabularies are provided by individual data sources and major systems such as the NCBO and ConceptWiki.
- The Large Knowledge Collider (LarKC) is a semantic workflow engine and sits on top of the triple store. It wraps the vocabulary and identity services as plugins, such that they can be accessed in-line in SPARQL queries. LarKC is compatible with many triple stores, allowing for different optimizations.
- Structure similarity searches are provided by ChemSpider web services, also wrapped as LarKC plugins and thus accessible via SPARQL. Other services are similarly integrated as required.
- User interfaces access the system via RESTful/JSON web services which provide simplified access to optimised, commonly requested queries.
- The OPS Graphical User Interface is built using a Ruby-on-Rails, ExtJS framework. The scaffold for this was seeded from the Lundbeck LSP4All system.
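The harvester and cache behaviour described on this slide can be sketched in a few lines of Python: a data set is reloaded only when its descriptor reports a newer last-updated date than the one already cached, and the cache can be emptied and rebuilt from the sources at any time. Everything named here (`Descriptor`, `TripleCache`, `harvest`, the data set name and dates) is a hypothetical stand-in; the slides do not show the actual harvester code or the descriptor format.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Descriptor:
    """Data set descriptor (assumed fields: a name and a last-updated date)."""
    dataset: str
    last_updated: str  # ISO 8601 date, so plain string comparison orders it


@dataclass
class TripleCache:
    """In-memory stand-in for the triple store used as a cache."""
    loaded: dict = field(default_factory=dict)   # dataset -> date loaded
    triples: dict = field(default_factory=dict)  # dataset -> list of triples

    def needs_reload(self, descriptor):
        # An unseen data set compares as older than any real date.
        return self.loaded.get(descriptor.dataset, "") < descriptor.last_updated

    def load(self, descriptor, triples):
        self.triples[descriptor.dataset] = list(triples)
        self.loaded[descriptor.dataset] = descriptor.last_updated

    def obliterate(self):
        # Safe because the primary data lives with the data owners.
        self.loaded.clear()
        self.triples.clear()


def harvest(cache, sources):
    """Reload every source whose descriptor is newer than the cached copy.

    sources is an iterable of (Descriptor, fetch) pairs, where fetch() returns
    the data set's triples. Returns the names of the data sets reloaded.
    """
    reloaded = []
    for descriptor, fetch in sources:
        if cache.needs_reload(descriptor):
            cache.load(descriptor, fetch())
            reloaded.append(descriptor.dataset)
    return reloaded


cache = TripleCache()
first = [(Descriptor("datasetA", "2012-01-10"), lambda: [("s", "p", "o1")])]
print(harvest(cache, first))   # first sighting of the data set
print(harvest(cache, first))   # descriptor unchanged
```

Keeping only the descriptor date in the cache is enough to decide what to reload, which is why the slide stresses that each RDF data set should carry last-updated information.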

18 Data Publishing Methodology
- RDF is our chosen format; it is well suited to describing complex data, open, and supported by a growing body of tools and scientists
- Critical data sources are published as RDF and include data set descriptors, which are an existing standard, promoted by Open PHACTS. This plays a major role in identifying and maintaining content in integration systems such as OPS
- Providing richer meta data for each dataset increases the value and scope for reuse of that data beyond Open PHACTS
- Producers of RDF can choose to enhance their RDF with the information required to create Nanopublications, and promote the citability of data
- We aim to contribute to standards around RDF publishing that promote interoperability and data reuse

19 Chemistry Representation
- Main chemical needs are the registration of molecules (including validation/sanitisation) and the ability to perform structure searches
- This is an example of an external service, provided by ChemSpider (RSC)
- Each database that contains molecules (SMILES, InChI, SDF, MOL formats) must be registered with ChemSpider
- This process validates structures, and assigns them a unique CSID
- CSIDs, preferred name, synonyms, InChI string & key and SMILES are published back into the cache
- Chemical structure standardisation rules are based upon the FDA (Food and Drug Administration) Substance Registration System Standard Operating Procedure. Some modifications for OPS have been made and these will be published shortly.
- ChemSpider also provides validated synonyms for many chemicals to ConceptWiki, to enable entry to the system via free text typing of a compound name
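The registration step can be pictured as a registry that hands out one identifier per distinct structure and collects the names under which that structure was submitted. The toy below shows only this bookkeeping. It is not the ChemSpider API and performs no chemistry: the structure key is assumed to be an already standardised string (such as an InChIKey) supplied by the caller, and the class, keys and identifiers are invented for illustration.

```python
class ToyRegistry:
    """Assigns one integer identifier per distinct structure key."""

    def __init__(self):
        self._ids = {}       # normalised structure key -> registry id
        self._synonyms = {}  # registry id -> set of submitted names

    def register(self, structure_key, name):
        """Register a structure under a name and return its identifier."""
        key = structure_key.strip().upper()
        if not key:
            raise ValueError("empty structure key")
        # A new key gets the next id; a known key keeps the id it already has.
        reg_id = self._ids.setdefault(key, len(self._ids) + 1)
        self._synonyms.setdefault(reg_id, set()).add(name.strip())
        return reg_id

    def synonyms(self, reg_id):
        """Names submitted for this identifier, sorted alphabetically."""
        return sorted(self._synonyms.get(reg_id, ()))


registry = ToyRegistry()
# Two sources submit the same (made-up) structure key under different names.
id_a = registry.register("KEY-ONE", "aspirin")
id_b = registry.register(" key-one ", "acetylsalicylic acid")
id_c = registry.register("KEY-TWO", "caffeine")
print(id_a, id_b, id_c)
print(registry.synonyms(id_a))
```

The point of the sketch is the outcome the slide describes: every source database ends up referring to the same identifier for the same molecule, and the synonyms gathered along the way can support free-text entry by compound name.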

20 The Need
Increased value to Pharma through:
- The ability to use commercial and private (internal) data
- The ability to switch off internal systems to achieve the same aims as OPS
Thus, providing a platform that allows for secure access to such data is essential
Subject of a pilot study, spring 2012

21 What Is Open?
- The core platform will be built on open source technology. This includes the data harvester, the semantic workflow engine/API code, the Open PHACTS GUI and associated widgets
- The standards for producing RDF/Nanopublications will all be open and available
- An open version of the system will be available at openphacts.org, fully functional with public data
- Interested parties will also be able to download the core platform and instantiate it on their own servers, having everything they need to run a local system should this be required
- Note: OPS is decoupled from any specific RDF database engine. It should be possible to run the platform on a range of free and commercial platforms that meet certain criteria (to be published)

22 Open PHACTS and the scientific community
Associated partners
- Organisations, most will join here
- Support, information
- Exchange of ideas, data, technology
- Opportunities to demo at community webinars
- Need MoU
Development partnerships
- Influence on API developments
- Opportunities to demo ideas & use cases to core team
- Need MoU and annexe
Consortium
- 22 current members
Diagram labels: MoU + Annexe; Associated partners; Development partnerships; Consortium

23 How to get involved
Open systems need an engaged community to grow, develop and sustain. Joining as an Associated Partner is the first step.
Associated Partners: we have a MoU ready to sign for mutual support and exchange of ideas, data or technology
- Most will fit here: regular contact, support and review
- Opportunity to present ideas and use cases to the core Open PHACTS team
Development Partnerships: when we want to do some more specific work together, develop APIs, new data, algorithms etc.
- Greater access to the core of the project through an agreed collaborative annexe to the MoU
- E.g. we are working actively on a certain business question

24 Open PHACTS Waiting Room
We actively manage our partners, and the wider community
- We term this the Open PHACTS Waiting Room, managed by a Gatekeeper
- Our relationships with all partners are visible at openphacts.org: what we are doing together and why
- Opportunities to engage and develop will be open and based on project needs
We hold regular community workshops and events
- Learn more about Open PHACTS and the Open Pharmacological Space
- Participate in new ideas and functions
- Engage in development of new use cases, help us answer new questions
- Contribute to development, and engage in plans for sustaining the Open Pharmacological Space
Please contact us to join the debate


SciENCV - Putting the Pieces Together VIVO

SciENCV - Putting the Pieces Together VIVO SciENCV - Putting the Pieces Together VIVO Jon Corson-Rikert August 27, 2012 1 What is VIVO? An open community with strong national and international participation Focusing primarily on research information

More information

Big Data in Translational Science

Big Data in Translational Science Big Data in Translational Science Albert Wang Associate Director, Translational R&D IT Bristol-Myers Squibb 2015 AAPS Annual Meeting Agenda Perspectives on Big Data Big Data in Translational R&D Selected

More information

Unlocking the full potential of location-based services: Linked Data driven Web APIs

Unlocking the full potential of location-based services: Linked Data driven Web APIs Unlocking the full potential of location-based services: Linked Data driven Web APIs Open Standards for Linked Organisations about Raf Buyle Ziggy Vanlishout www.vlaanderen.be/informatievlaanderen 6.4

More information

Semantic Web Company. PoolParty - Server. PoolParty - Technical White Paper.

Semantic Web Company. PoolParty - Server. PoolParty - Technical White Paper. Semantic Web Company PoolParty - Server PoolParty - Technical White Paper http://www.poolparty.biz Table of Contents Introduction... 3 PoolParty Technical Overview... 3 PoolParty Components Overview...

More information

re3data.org - Making research data repositories visible and discoverable

re3data.org - Making research data repositories visible and discoverable re3data.org - Making research data repositories visible and discoverable Robert Ulrich, Karlsruhe Institute of Technology Hans-Jürgen Goebelbecker, Karlsruhe Institute of Technology Frank Scholze, Karlsruhe

More information

ehealth and DSM, Digital Single Market

ehealth and DSM, Digital Single Market ehealth and DSM, Digital Single Market Dr. Christoph Klein Interoperable data, access and sharing ehealth, Wellbeing and Ageing DG Communications Networks, Content and Technology European Commission, Luxembourg,

More information

SETTING UP AN HCS DATA ANALYSIS SYSTEM

SETTING UP AN HCS DATA ANALYSIS SYSTEM A WHITE PAPER FROM GENEDATA JANUARY 2010 SETTING UP AN HCS DATA ANALYSIS SYSTEM WHY YOU NEED ONE HOW TO CREATE ONE HOW IT WILL HELP HCS MARKET AND DATA ANALYSIS CHALLENGES High Content Screening (HCS)

More information

D360: Unlock the value of your scientific data Solving Informatics Problems for Translational Research

D360: Unlock the value of your scientific data Solving Informatics Problems for Translational Research D360: Unlock the value of your scientific data Solving Informatics Problems for Translational Research Dr. Fabian Bös, Senior Application Scientist Certara Spain SL Martin-Kollar-Str. 17, 81829 Munich

More information

WP7: Patents Case Study

WP7: Patents Case Study MOLTO WP7: Patents Case Study Meritxell Gonzàlez Bermúdez 2nd Year Review Barcelona, March 20th, 2012 Objectives To create a prototype of MT and NL retrieval of patents in the bio- medical & pharmaceu;cal

More information

onem2m AND SMART M2M INTRODUCTION, RELEASE 2/3

onem2m AND SMART M2M INTRODUCTION, RELEASE 2/3 onem2m AND SMART M2M INTRODUCTION, RELEASE 2/3 Presenter: Omar Elloumi, onem2m TP Chair, Nokia Bell Labs and CTO group omar.elloumi@nokia.com onem2m www.onem2m.org 2016 onem2m Outline Introduction to onem2m

More information

University of Bath. Publication date: Document Version Publisher's PDF, also known as Version of record. Link to publication

University of Bath. Publication date: Document Version Publisher's PDF, also known as Version of record. Link to publication Citation for published version: Patel, M & Duke, M 2004, 'Knowledge Discovery in an Agents Environment' Paper presented at European Semantic Web Symposium 2004, Heraklion, Crete, UK United Kingdom, 9/05/04-11/05/04,.

More information

Guide to Database Curation and New Structure Deposition January 2010

Guide to Database Curation and New Structure Deposition January 2010 and New Structure Deposition January 2010 Copyright RSC Worldwide Ltd Table of Contents 1. Introduction Overview of content of ChemSpider What is Data Curation 2. How to Get Started Site Registration Logging

More information

1 Copyright 2011, Oracle and/or its affiliates. All rights reserved.

1 Copyright 2011, Oracle and/or its affiliates. All rights reserved. 1 Copyright 2011, Oracle and/or its affiliates. All rights reserved. Integrating Complex Financial Workflows in Oracle Database Xavier Lopez Seamus Hayes Oracle PolarLake, LTD 2 Copyright 2011, Oracle

More information

INSPIRE overview and possible applications for IED and E-PRTR e- Reporting Alexander Kotsev

INSPIRE overview and possible applications for IED and E-PRTR e- Reporting Alexander Kotsev INSPIRE overview and possible applications for IED and E-PRTR e- Reporting Alexander Kotsev www.jrc.ec.europa.eu Serving society Stimulating innovation Supporting legislation The European data puzzle 24

More information

Helix Nebula, the Science Cloud

Helix Nebula, the Science Cloud Helix Nebula, the Science Cloud A strategic Plan for a European Scientific Cloud Computing Infrastructure NORDUNet 2012, Oslo 18 th -20 th September Maryline Lengert, ESA Strategic Goal Helix Nebula, the

More information

Rapid Application Development using InforSense Open Workflow and Oracle Chemistry Cartridge Technologies

Rapid Application Development using InforSense Open Workflow and Oracle Chemistry Cartridge Technologies Rapid Application Development using InforSense Open Workflow and Oracle Chemistry Cartridge Technologies Anthony C. Arvanites Lead Discovery Informatics Company Introduction Founded: 1999 Platform: Combining

More information

CEF e-invoicing. Presentation to the European Multi- Stakeholder Forum on e-invoicing. DIGIT Directorate-General for Informatics.

CEF e-invoicing. Presentation to the European Multi- Stakeholder Forum on e-invoicing. DIGIT Directorate-General for Informatics. CEF e-invoicing Presentation to the European Multi- Stakeholder Forum on e-invoicing 20 October 2014 DIGIT Directorate-General for Informatics Connecting Europe Facility (CEF) Common financing instrument

More information

e-infrastructure: objectives and strategy in FP7

e-infrastructure: objectives and strategy in FP7 "The views expressed in this presentation are those of the author and do not necessarily reflect the views of the European Commission" e-infrastructure: objectives and strategy in FP7 National information

More information

Digital repositories as research infrastructure: a UK perspective

Digital repositories as research infrastructure: a UK perspective Digital repositories as research infrastructure: a UK perspective Dr Liz Lyon Director This work is licensed under a Creative Commons Licence Attribution-ShareAlike 2.0 UKOLN is supported by: Presentation

More information

Developing a Benchmark Suite for Semantic Web Data from Existing Workflows

Developing a Benchmark Suite for Semantic Web Data from Existing Workflows Developing a Benchmark Suite for Semantic Web Data from Existing Workflows Antonis Troumpoukis 1, Angelos Charalambidis 1, Giannis Mouchakis 1, Stasinos Konstantopoulos 1, Ronald Siebes 2, Victor de Boer

More information

Opus: University of Bath Online Publication Store

Opus: University of Bath Online Publication Store Patel, M. (2004) Semantic Interoperability in Digital Library Systems. In: WP5 Forum Workshop: Semantic Interoperability in Digital Library Systems, DELOS Network of Excellence in Digital Libraries, 2004-09-16-2004-09-16,

More information

INSPIRE & Environment Data in the EU

INSPIRE & Environment Data in the EU INSPIRE & Environment Data in the EU Andrea Perego Research Data infrastructures for Environmental related Societal Challenges Workshop @ pre-rda P6 Workshops, Paris 22 September 2015 INSPIRE in a nutshell

More information

Triple store databases and their role in high throughput, automated extensible data analysis

Triple store databases and their role in high throughput, automated extensible data analysis Triple store databases and their role in high throughput, automated extensible data analysis San Diego CINF Talk: Workflow! Introduction to the Combechem Project! Smart Dark Labs! Semantics & Databases!

More information

RDF Workshop. Building an RDF representation of the the ChEMBL Database. Mark Davies. ChEMBL Group, Technical Lead 30/04/2014

RDF Workshop. Building an RDF representation of the the ChEMBL Database. Mark Davies. ChEMBL Group, Technical Lead 30/04/2014 RDF Workshop Building an RDF representation of the the ChEMBL Database Mark Davies ChEMBL Group, Technical Lead 30/04/2014 Overview Brief introduction to ChEMBL database Approaches to mapping relational

More information

AmbitXT v2.1.0 Manual

AmbitXT v2.1.0 Manual AmbitXT v2.1.0 Manual June 2009 1 Table of Contents Introduction... 2 Functions of AMBIT XT v2.1.0... 2 Workflow of AMBIT XT v2.1.0... 3 Using Database Utilities... 4 General Information... 4 Prerequisite

More information

Groovy in Jenkins. Ioannis K. Moutsatsos. Repurposing Jenkins for Life Sciences Data Pipelining

Groovy in Jenkins. Ioannis K. Moutsatsos. Repurposing Jenkins for Life Sciences Data Pipelining Groovy in Jenkins Ioannis K. Moutsatsos Repurposing Jenkins for Life Sciences Data Pipelining Who Am I? Research scientist at local pharmaceutical company Software engineer Open Source advocate and contributor

More information

Indiana University Research Technology and the Research Data Alliance

Indiana University Research Technology and the Research Data Alliance Indiana University Research Technology and the Research Data Alliance Rob Quick Manager High Throughput Computing Operations Officer - OSG and SWAMP Board Member - RDA Organizational Assembly RDA Mission

More information

Global South-South Development Expo November 2017 Antalya, Turkey. Presented By

Global South-South Development Expo November 2017 Antalya, Turkey. Presented By Global South-South Development Expo 2017 27 November 2017 Antalya, Turkey Presented By Oswaldo REQUES Focal Point South-South Cooperation WORLD INTELLECTUAL PROPERTY ORGANIZATION (WIPO) GENEVA SWITZERLAND

More information

Guide to SciVal Experts

Guide to SciVal Experts Guide to SciVal Experts Contents What is SciVal Experts and How Can I Benefit From It?....... 3 How is My Profile Created?... 4 The SciVal Experts Interface.... 5-6 Organization Home Page Unit Individual

More information

SKOS. COMP62342 Sean Bechhofer

SKOS. COMP62342 Sean Bechhofer SKOS COMP62342 Sean Bechhofer sean.bechhofer@manchester.ac.uk Ontologies Metadata Resources marked-up with descriptions of their content. No good unless everyone speaks the same language; Terminologies

More information

Ontology-based annotation of multiscale imaging data: Utilizing and building the Neuroscience Information Framework. Maryann E.

Ontology-based annotation of multiscale imaging data: Utilizing and building the Neuroscience Information Framework. Maryann E. Ontology-based annotation of multiscale imaging data: Utilizing and building the Neuroscience Information Framework Maryann E. Martone University of California, San Diego What does this mean? 3D Volumes

More information

Services to Make Sense of Data. Patricia Cruse, Executive Director, DataCite Council of Science Editors San Diego May 2017

Services to Make Sense of Data. Patricia Cruse, Executive Director, DataCite Council of Science Editors San Diego May 2017 Services to Make Sense of Data Patricia Cruse, Executive Director, DataCite Council of Science Editors San Diego May 2017 How many journals make data sharing a requirement of publication? https://jordproject.wordpress.com/2013/07/05/going-back-to-basics-reusing-data/

More information

The ELIXIR of Linked Data

The ELIXIR of Linked Data The ELIXIR of Linked Data Professor Carole Goble (UK node) Barend Mons (NL node), Helen Parkinson (EMBL-EBI node) The Interoperability Services Backbone Team European Life Sciences Infrastructure for Biological

More information

Linked Data: Fast, low cost semantic interoperability for health care?

Linked Data: Fast, low cost semantic interoperability for health care? Linked Data: Fast, low cost semantic interoperability for health care? About the presentation Part I: Motivation Why we need semantic operability in health care Why enhancing existing systems to increase

More information

Data is the new Oil (Ann Winblad)

Data is the new Oil (Ann Winblad) Data is the new Oil (Ann Winblad) Keith G Jeffery keith.jeffery@keithgjefferyconsultants.co.uk 20140415-16 JRC Workshop Big Open Data Keith G Jeffery 1 Data is the New Oil Like oil has been, data is Abundant

More information

Semantic Knowledge Discovery OntoChem IT Solutions

Semantic Knowledge Discovery OntoChem IT Solutions Semantic Knowledge Discovery OntoChem IT Solutions OntoChem IT Solutions GmbH Blücherstr. 24 06120 Halle (Saale) Germany Tel. +49 345 4780472 Fax: +49 345 4780471 mail: info(at)ontochem.com Get the Gold!

More information

Ontologies SKOS. COMP62342 Sean Bechhofer

Ontologies SKOS. COMP62342 Sean Bechhofer Ontologies SKOS COMP62342 Sean Bechhofer sean.bechhofer@manchester.ac.uk Metadata Resources marked-up with descriptions of their content. No good unless everyone speaks the same language; Terminologies

More information

EUDAT - Open Data Services for Research

EUDAT - Open Data Services for Research EUDAT - Open Data Services for Research Johannes Reetz EUDAT operations Max Planck Computing & Data Centre Science Operations Workshop 2015 ESO, Garching 24-27th November 2015 EUDAT receives funding from

More information

Bioqueries: A Social Community Sharing Experiences while Querying Biological Linked Data (

Bioqueries: A Social Community Sharing Experiences while Querying Biological Linked Data ( Bioqueries: A Social Community Sharing Experiences while Querying Biological Linked Data (http://bioqueries.uma.es) María Jesús García-Godoy, Ismael Navas-Delgado, José Francisco Aldana Montes Computing

More information

The European Commission s science and knowledge service. Joint Research Centre

The European Commission s science and knowledge service. Joint Research Centre The European Commission s science and knowledge service Joint Research Centre GeoDCAT-AP The story so far Andrea Perego, Antonio Rotundo, Lieven Raes GeoDCAT-AP Webinar 6 June 2018 What is GeoDCAT-AP Geospatial

More information

Reducing Consumer Uncertainty

Reducing Consumer Uncertainty Spatial Analytics Reducing Consumer Uncertainty Towards an Ontology for Geospatial User-centric Metadata Introduction Cooperative Research Centre for Spatial Information (CRCSI) in Australia Communicate

More information

Database of historical places, persons, and lemmas

Database of historical places, persons, and lemmas Database of historical places, persons, and lemmas Natalia Korchagina Outline 1. Introduction 1.1 Swiss Law Sources Foundation as a Digital Humanities project 1.2 Data to be stored 1.3 Final goal: how

More information

Use of Semantic Technologies at Eli Lilly and Company. J Phil Brooks Information Consultant, SE Data Team Discover IT Eli Lilly and Company

Use of Semantic Technologies at Eli Lilly and Company. J Phil Brooks Information Consultant, SE Data Team Discover IT Eli Lilly and Company Use of Semantic Technologies at Eli Lilly and Company J Phil Brooks Information Consultant, SE Data Team Discover IT Eli Lilly and Company Notable Semantic Projects at Lilly Discovery Metadata Integration

More information

Financial Dataspaces: Challenges, Approaches and Trends

Financial Dataspaces: Challenges, Approaches and Trends Financial Dataspaces: Challenges, Approaches and Trends Finance and Economics on the Semantic Web (FEOSW), ESWC 27 th May, 2012 Seán O Riain ebusiness Copyright 2009. All rights reserved. Motivation Changing

More information

European Location Framework (ELF) acting as a facilitator implementing INSPIRE

European Location Framework (ELF) acting as a facilitator implementing INSPIRE www.eurogeographics.org European Location Framework (ELF) acting as a facilitator implementing INSPIRE Saulius Urbanas, Mick Cory (EuroGeographics) 29 October 2016 Copyright 2013 EuroGeographics EuroGeographics

More information

B2FIND and Metadata Quality

B2FIND and Metadata Quality B2FIND and Metadata Quality 3 rd EUDAT Conference 25 September 2014 Heinrich Widmann and B2FIND team 1 Outline B2FIND the EUDAT Metadata Service Semantic Mapping of Metadata Quality of Metadata Summary

More information

Maximizing the Value of STM Content through Semantic Enrichment. Frank Stumpf December 1, 2009

Maximizing the Value of STM Content through Semantic Enrichment. Frank Stumpf December 1, 2009 Maximizing the Value of STM Content through Semantic Enrichment Frank Stumpf December 1, 2009 What is Semantics and Semantic Processing? Content Knowledge Framework Technology Framework Search Text Images

More information