The CALBC RDF Triple store: retrieval over large literature content

1 The CALBC RDF Triple store: retrieval over large literature content
Samuel Croset, Christoph Grabmüller, Chen Li, Silverstras Kavaliauskas, Dietrich Rebholz-Schuhmann
10th December 2010, Berlin
EBI is an Outstation of the European Molecular Biology Laboratory.

2 Outline
- Motivation
- Integrating multiple resources: CALBC Corpus, LexEBI, Public databases
- Querying the Triple Store

4 Why represent scientific literature in RDF?
- Scientific literature: primary data resource reporting novel scientific findings
- Text mining: recognition of biological entities; population of biomedical databases through curators
- RDF representation: standardization of the extracted content; exploitation of the literature in the Semantic Web

6 CALBC Corpus
CALBC: Collaborative Annotation of a Large Biomedical Corpus
- 4 project partners annotate Medline abstracts related to immunology
- Harmonization produces Silver Standard Corpus I (SSCI)
- Challenge: participants annotate SSCI
- Harmonization with the best results produces Silver Standard Corpus II (SSCII)

7 CALBC Corpus
Advantages of the CALBC Corpus:
- Large-scale corpus
- 4 semantic types: Gene-Protein, Diseases, Chemicals and Species
- Generated in a purely automatic way
- Highly reproducible

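8 CALBC in RDF (diagram: a Medline document as an RDF resource)
- Properties shown: dc:date, dc:creator, dc:identifier, calbc:isin, calbc:hassentence, plus one dc: and one calbc: property whose names are cut off in the transcription
- Values shown: k/rebholz/core/corpus_calbc (URI truncated), 92, the creators "Seshadri, M S" and "Varkey, K", <urn:issn: > (value missing), and the title "Hepatitis B surface antigen (HBsAg) positive polyarteritis nodosa. A report of two cases and review of literature"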

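9 CALBC in RDF (diagram: an annotation as an RDF resource)
- Properties shown: calbc:ispartof, calbc:hasstartposition, calbc:hasendposition, calbc:isentitytype, calbc:haslabel, plus one calbc: property whose name is cut off in the transcription
- Values shown: entity type CHED, label "prostaglandins", the number 46; the remaining values are lost in the transcription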
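The annotation diagram can be read as a handful of triples sharing one subject. The sketch below is only an illustration in plain Python (no RDF library): the predicate names come from the slide, while the identifiers `annotation_1` and `sentence_1`, the end position 60, and the reading of 46 as a start position are assumptions made for the example.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]


def annotation_triples(annotation_id: str, sentence_id: str, start: int,
                       end: int, entity_type: str, label: str) -> List[Triple]:
    """Build the triples describing one CALBC-style annotation.

    Predicate names follow the slide; subject and object identifiers
    are hypothetical placeholders.
    """
    subject = f"calbc:{annotation_id}"
    return [
        (subject, "calbc:ispartof", f"calbc:{sentence_id}"),
        (subject, "calbc:hasstartposition", str(start)),
        (subject, "calbc:hasendposition", str(end)),
        (subject, "calbc:isentitytype", entity_type),
        (subject, "calbc:haslabel", label),
    ]


# Illustrative values only: the positions are assumed, not taken from the corpus.
triples = annotation_triples("annotation_1", "sentence_1", 46, 60,
                             "CHED", "prostaglandins")
for triple in triples:
    print(triple)
```

Keeping the character offsets on the annotation rather than on the sentence means one sentence can carry any number of annotations, possibly overlapping.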

11 LexEBI
- BioThesaurus: complete term repository for the biomedical domain
- LexEBI XML
Features:
- Frequency count for the occurrence of the term in the British National Corpus (BNC) or in MEDLINE
- Disambiguation
- Mapping to original resource (URI)
- Normalization

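12 LexEBI in RDF (diagram: a term and its variants)
- Properties shown: :isin, :species, :hasvariant, :istype, :SurfaceForm, :FrequencyInMedline
- Values shown: k/rebholz/core/corpus_lexebi (URI truncated), the surface forms "ftsj" and "Ribosomal large subunit methyltransferase E", the variant type Orthographic, the frequency 30, and True; the remaining values are lost in the transcription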
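One way to use this structure for normalization is to index surface forms back to the term they belong to. The following is a minimal sketch in plain Python, not the authors' implementation: the predicate names follow the slide, but the identifiers `cluster_1`, `variant_1`, `variant_2`, the `lexebi:` prefix on the properties, and the assignment of the frequency and type values to a particular variant are assumptions.

```python
from collections import defaultdict

# Hypothetical LexEBI-style triples; which variant carries which value
# cannot be recovered from the slide, so the assignment is assumed.
lexebi_triples = [
    ("lexebi:cluster_1", "lexebi:hasvariant", "lexebi:variant_1"),
    ("lexebi:cluster_1", "lexebi:hasvariant", "lexebi:variant_2"),
    ("lexebi:variant_1", "lexebi:SurfaceForm", "ftsj"),
    ("lexebi:variant_1", "lexebi:istype", "Orthographic"),
    ("lexebi:variant_1", "lexebi:FrequencyInMedline", "30"),
    ("lexebi:variant_2", "lexebi:SurfaceForm",
     "Ribosomal large subunit methyltransferase E"),
]


def surface_form_index(triples):
    """Map each lower-cased surface form to the terms that list it as a variant."""
    variant_to_term = {o: s for s, p, o in triples if p == "lexebi:hasvariant"}
    index = defaultdict(set)
    for s, p, o in triples:
        if p == "lexebi:SurfaceForm" and s in variant_to_term:
            index[o.lower()].add(variant_to_term[s])
    return index


index = surface_form_index(lexebi_triples)
print(sorted(index["ftsj"]))
```

A surface form that maps to more than one term is ambiguous; in that case the frequency counts are the kind of evidence that could help choose between candidates.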

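14 Uniprot Human RDF (diagram: a UniProt entry and its links to other resources)
- Properties shown: uniprot:unigene, uniprot:chebi, uniprot:hgnc, uniprot:ispartof, uniprot:classifiedwith, uniprot:genecards, uniprot:ensemble, uniprot:hasinteraction, uniprot:citations, plus further uniprot: properties elided on the slide
- Linked resources shown: Unigene, ChEBI, HGNC, Specie (name truncated), GO, GeneCards, UniProt, Medline, Ensemble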

15 Uniprot RDF
- Same diagram as the previous slide (properties uniprot:unigene, uniprot:chebi, uniprot:hgnc, uniprot:ispartof, uniprot:classifiedwith, uniprot:genecards, uniprot:ensemble, uniprot:hasinteraction, uniprot:citations)
- Added note: Array Express and Chebi... on the way

17 Querying the Triple Store (diagram: one query path across the integrated resources)
- UniProt: uniprot:q21ww8, with uniprot:hasinteraction (to a UniProt entry) and uniprot:classifiedwith (to a GO term)
- LexEBI: lexebi:hasvariant, lexebi:surfaceform (ftsj), lexebi:frequencyinmedline
- CALBC: calbc:hasannotation, calbc:ispartof, calbc:hasstartposition, calbc:hasendposition, calbc:haslabel, calbc:isentitytype (PRGE), together with calbcsentence: and pubmed: resources
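The query path on this slide amounts to a join over shared nodes: a CALBC label matches a LexEBI surface form, whose variant belongs to a UniProt entry, which is classified with a GO term. The sketch below imitates that join with a naive triple-pattern matcher in plain Python. It is not the SPARQL used against the actual store. The predicate names come from the slide; the identifiers `annotation_1`, `s1`, `variant_1` and `go:term_1`, and the choice to attach `lexebi:hasvariant` directly to the UniProt resource, are assumptions.

```python
def match(patterns, triples, binding=None):
    """Yield variable bindings satisfying every pattern (terms starting with '?' are variables)."""
    binding = binding or {}
    if not patterns:
        yield dict(binding)
        return
    first, rest = patterns[0], patterns[1:]
    for triple in triples:
        candidate = dict(binding)
        consistent = True
        for term, value in zip(first, triple):
            if term.startswith("?"):
                # Bind the variable, or check it against an earlier binding.
                if candidate.setdefault(term, value) != value:
                    consistent = False
                    break
            elif term != value:
                consistent = False
                break
        if consistent:
            yield from match(rest, triples, candidate)


# Hypothetical merged store holding CALBC, LexEBI and UniProt triples.
store = [
    ("calbc:annotation_1", "calbc:ispartof", "calbcsentence:s1"),
    ("calbc:annotation_1", "calbc:haslabel", "ftsj"),
    ("calbc:annotation_1", "calbc:isentitytype", "PRGE"),
    ("uniprot:q21ww8", "lexebi:hasvariant", "lexebi:variant_1"),
    ("lexebi:variant_1", "lexebi:surfaceform", "ftsj"),
    ("uniprot:q21ww8", "uniprot:classifiedwith", "go:term_1"),
]

query = [
    ("?ann", "calbc:haslabel", "?form"),
    ("?var", "lexebi:surfaceform", "?form"),
    ("?protein", "lexebi:hasvariant", "?var"),
    ("?protein", "uniprot:classifiedwith", "?go"),
    ("?ann", "calbc:ispartof", "?sentence"),
]

results = list(match(query, store))
for row in results:
    print(row["?sentence"], row["?protein"], row["?go"])
```

The join here is on the exact label string; matching that tolerates case or spelling variation would need the variants listed in LexEBI rather than string equality alone.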

18 Use cases
- Normalization of CALBC named entities
- Disambiguation of CALBC named entities
- Term collocation at the sentence level, e.g. evidence for Gene-Disease association
- Checking consistency of bioinformatics resources from literature
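Term collocation at the sentence level can be sketched as grouping annotations by sentence and pairing entities of two semantic types. The example below is an illustration in plain Python under stated assumptions: the type codes PRGE and CHED appear on the slides, whereas the code `DISO` for diseases, the sentence identifiers, and the assignment of these labels to these sentences are hypothetical.

```python
from collections import defaultdict
from itertools import product

# (sentence, entity type, label); illustrative rows, not corpus data.
annotations = [
    ("sentence_1", "PRGE", "HBsAg"),
    ("sentence_1", "DISO", "polyarteritis nodosa"),
    ("sentence_2", "CHED", "prostaglandins"),
]


def collocations(rows, type_a, type_b):
    """Return (sentence, label_a, label_b) for entities of both types in one sentence."""
    by_sentence = defaultdict(lambda: defaultdict(list))
    for sentence, entity_type, label in rows:
        by_sentence[sentence][entity_type].append(label)
    pairs = []
    for sentence, by_type in by_sentence.items():
        for a, b in product(by_type.get(type_a, []), by_type.get(type_b, [])):
            pairs.append((sentence, a, b))
    return pairs


pairs = collocations(annotations, "PRGE", "DISO")
print(pairs)
```

Co-occurrence in a sentence is only candidate evidence for an association; it says nothing about the kind of relation the sentence states.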

19 Thank you for your attention


National Centre for Text Mining NaCTeM. e-science and data mining workshop National Centre for Text Mining NaCTeM e-science and data mining workshop John Keane Co-Director, NaCTeM john.keane@manchester.ac.uk School of Informatics, University of Manchester What is text mining?

More information

Building an Allergens Ontology and Maintaining it using Machine Learning Techniques

Building an Allergens Ontology and Maintaining it using Machine Learning Techniques Building an Allergens Ontology and Maintaining it using Machine Learning Techniques Alexandros G. Valarakos et al. Software and Knowledge Engineering Laboratory, Institute of Informatics and Telecommunications,

More information

HsAgilentDesign db

HsAgilentDesign db HsAgilentDesign026652.db January 16, 2019 HsAgilentDesign026652ACCNUM Map Manufacturer identifiers to Accession Numbers HsAgilentDesign026652ACCNUM is an R object that contains mappings between a manufacturer

More information

Metasearch Process for Transcription Targets

Metasearch Process for Transcription Targets Step 1 Select 'Genes' This is the primary interface for the Metasearch add on to Thomson Reuters (GeneGO) platform. Metasearch allows one to make complex queries for information extraction. This document

More information

Flexible Integration of Molecular-Biological Annotation Data: The GenMapper Approach

Flexible Integration of Molecular-Biological Annotation Data: The GenMapper Approach Flexible Integration of Molecular-Biological Annotation Data: The GenMapper Approach Hong-Hai Do 1 and Erhard Rahm 2 1 Interdisciplinary Centre for Bioinformatics, 2 Department of Computer Science University

More information

Information Resources in Molecular Biology Marcela Davila-Lopez How many and where

Information Resources in Molecular Biology Marcela Davila-Lopez How many and where Information Resources in Molecular Biology Marcela Davila-Lopez (marcela.davila@medkem.gu.se) How many and where Data growth DB: What and Why A Database is a shared collection of logically related data,

More information

Tutorial:OverRepresentation - OpenTutorials

Tutorial:OverRepresentation - OpenTutorials Tutorial:OverRepresentation From OpenTutorials Slideshow OverRepresentation (about 12 minutes) (http://opentutorials.rbvi.ucsf.edu/index.php?title=tutorial:overrepresentation& ce_slide=true&ce_style=cytoscape)

More information

Semantic Technology. Opportunities

Semantic Technology. Opportunities Semantic Technology Opportunities Avinash Punekar Scientific Publishing Services April 2011 2 Semantic Technology April 2011 3 What is Semantic Technology? ² Semantic Web ² Web 3.0 ² Linked Open Data /

More information

ClinVar. Jennifer Lee, PhD, NCBI/NLM/NIH ClinVar

ClinVar. Jennifer Lee, PhD, NCBI/NLM/NIH ClinVar ClinVar What is ClinVar ClinVar is a freely available, central archive for associating observed variation with supporting clinical and experimental evidence for a wide range of disorders. The database

More information

hgu133plus2.db December 11, 2017

hgu133plus2.db December 11, 2017 hgu133plus2.db December 11, 2017 hgu133plus2accnum Map Manufacturer identifiers to Accession Numbers hgu133plus2accnum is an R object that contains mappings between a manufacturer s identifiers and manufacturers

More information

Large-Scale Semantic Indexing and Question Answering in Biomedicine

Large-Scale Semantic Indexing and Question Answering in Biomedicine Large-Scale Semantic Indexing and Question Answering in Biomedicine E. Papagiannopoulou *, Y. Papanikolaou *, D. Dimitriadis *, S. Lagopoulos *, G. Tsoumakas *, M. Laliotis **, N. Markantonatos ** and

More information

BioMinT: Biological Text Mining EU FP5 Quality of Life Project

BioMinT: Biological Text Mining EU FP5 Quality of Life Project BioMinT: Biological Text Mining EU FP5 Quality of Life Project Dr. Dipl.-Ing. Österreichisches Forschungsinstitut für Artificial Intelligence Motivation Economic and business pressures are forcing drug

More information

Visualization and text mining of patent and non-patent data

Visualization and text mining of patent and non-patent data of patent and non-patent data Anton Heijs Information Solutions Delft, The Netherlands http://www.treparel.com/ ICIC conference, Nice, France, 2008 Outline Introduction Applications on patent and non-patent

More information

Development of Text Mining Tools for Information Retrieval from Patents

Development of Text Mining Tools for Information Retrieval from Patents Development of Text Mining Tools for Information Retrieval from Patents Tiago Alves 1,2(B),Rúben Rodrigues 1, Hugo Costa 2, and Miguel Rocha 1 1 Centre Biological Engineering, University of Minho, 4710-057

More information

DOCUMENT RETRIEVAL USING A PROBABILISTIC KNOWLEDGE MODEL

DOCUMENT RETRIEVAL USING A PROBABILISTIC KNOWLEDGE MODEL DOCUMENT RETRIEVAL USING A PROBABILISTIC KNOWLEDGE MODEL Shuguang Wang Intelligent Systems Program, University of Pittsburgh, Pittsburgh, PA, USA swang@cs.pitt.edu Shyam Visweswaran Department of Biomedical

More information

mgu74a.db November 2, 2013 Map Manufacturer identifiers to Accession Numbers

mgu74a.db November 2, 2013 Map Manufacturer identifiers to Accession Numbers mgu74a.db November 2, 2013 mgu74aaccnum Map Manufacturer identifiers to Accession Numbers mgu74aaccnum is an R object that contains mappings between a manufacturer s identifiers and manufacturers accessions.

More information

PrEVIEw: Clustering and Visualising PubMed using Visual Interface

PrEVIEw: Clustering and Visualising PubMed using Visual Interface PrEVIEw: Clustering and Visualising PubMed using Visual Interface Syeda Sana e Zainab 1, Qaiser Mehmood 1, Durre Zehra 1, Dietrich Rebholz-Schuhmann 1, and Ali Hasnain 1 Insight Centre for Data Analytics,

More information