Overview of BioCreative VI Precision Medicine Track
1 Overview of BioCreative VI Precision Medicine Track
Mining scientific literature for protein interactions affected by mutations
Organizers: Rezarta Islamaj Dogan (NCBI), Andrew Chatr-aryamontri (BioGRID), Sun Kim (NCBI), Don Comeau (NCBI), Zhiyong Lu (NCBI)
Data Curators: Andrew Chatr-aryamontri (BioGRID), Jennifer Rust (BioGRID), Christie Chang (BioGRID), Rose W. Oughtred (BioGRID), Lorrie Boucher (BioGRID)
2 Precision Medicine
Prevention and treatment of disease, taking into account variability in the environment, lifestyle and genetic profile of each individual.
3 BioCreative Challenges Series
Task columns: GM, GN, GO, PPI, IAT, CTD / CDR, Curation Workflow, BioC, CHEMDNER, BEL
(The "x" marks are listed in transcript order; their column alignment was lost.)
Workshop, Location, Year, Tasks marked
BC I, Granada, Spain, 2004, x x x
BC II, Madrid, Spain, 2007, x x x
BC II.5, Madrid, Spain, 2009, x
BC III, Bethesda, USA, 2010, x x x
BC 2012, DC, USA, 2012, x x x
BC IV, Bethesda, USA, 2013, X x x x x
BC V, Sevilla, Spain, 2015, X X X x x x x x
Organization Committee of BioCreative 2017:
BioGRID: Andrew Chatr-aryamontri
CNIO: Martin Krallinger, Alfonso Valencia
Colorado: Kevin Cohen
MITRE: Lynette Hirschman
NCBI: Sun Kim, Rezarta Dogan, Don Comeau, Zhiyong Lu
PIR: Cecilia Arighi, Cathy Wu
SIB: Fabio Rinaldi, Julien Gobeill, Pascale Gaudet, Patrick Ruch
SCAI: Juliane Fluck, Sumit Madan
Reference: Chung-Chi Huang and Zhiyong Lu. Community challenges in biomedical text mining over 10 years: success, failure and the future. Briefings in Bioinformatics, 2015.
4 Objectives of the Precision Medicine Track
Input: Unstructured data in biomedical literature
- Identify precision medicine relevant information in scientific literature
- Support database curators in selecting articles describing molecular interactions that depend on genetic variability
- Foster development of tools that can triage scientific literature for relevant studies
- Foster development of tools that can extract specific PPI relations
Knowledgebase: Structured and normalized information
5 Precision Medicine Track in BioCreative VI
Task 1: Document Triage. Identifying relevant PubMed citations describing genetic mutations affecting protein-protein interactions
Task 2: Relation Extraction. Extracting experimentally verified PPI affected by the presence of a genetic mutation
9 What information do curators look for?
The goal of the Precision Medicine Task was to annotate mutations that affect the stability of protein-protein interactions. The PSI-MI community standard includes this type of information in the schema, but BioGRID doesn't routinely annotate such information.
10 Data: from the curators' point of view
Mutations:
- Naturally occurring mutations
- Synthetic mutations, routinely used in lab practice to study gene function
Protein-protein interactions:
- Physical interactions
- Biochemical reactions
- Self-interactions
- Aggregations
11 The Precision Medicine track training corpus was generated as a result of two data selection and validation methods:
Data Repurposing: 2,852 IntAct articles containing in-the-abstract information about binding interfaces and mutations influencing the interactions were reviewed
Text Mining Triage: All PubMed articles were scored with PIE the Search and filtered with tmVar, selecting 1,200 for manual review
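12 Triage Annotation
Sources: curated-database-selected articles (PPI set) and text-mining-tool-selected articles (TM set), combined into the Complete Training Set
Composition table (rows: Positives, Negatives, Total, each with a count and a percentage): only the fragment ",730 42%" for Positives survives in the transcript
Results table (columns: Methods, Avg. Prec., Precision, Recall, F1, Positive Negative Ratio; rows: 10-fold CV (PPI set), Validation (TM set), 10-fold CV (all data)): numeric values were not preserved in the transcript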
13 Training data: relation extraction task
597 PubMed abstracts with 760 in-abstract PPI relations affected by mutations. These relations were curated in IntAct and were reviewed and verified for the purposes of this task.
Number of unique genes: 1,053
Common species: human, house mouse, thale cress, yeast, Norway rat, E. coli
14 PM Track: Testing Data
- 1,500 PubMed articles were extracted via state-of-the-art PPI and mutation-detecting text mining methods
- These articles have not been previously curated for PPI, and are not in IntAct or other databases
- Each article was reviewed by at least two data curators, who met regularly and discussed discrepancies
- Each article is curated for triage as relevant for curation or not
- Articles relevant for curation are curated for PPI relations
15 Phases of curation
1. Five curators work on 20 PubMed articles: discuss all positive and negative selections; discuss the annotation tool and its functionality
2. Two sets of 100 articles are annotated by three curators each: discuss all positive selections and resolve all discrepancies; finalize annotation guidelines and agreements on relation extraction
3. All articles are annotated by a pair of curators: detailed reports are prepared, and all inconsistencies and discrepancies are resolved
16 Annotation tool (screenshot callouts)
- Bioconcepts of interest to curators for this task
- List of curated relations between two identifiable bioentities
- Save annotation
- Curation categories helping curators classify any given article
- Space for curators to enter optional comments regarding the article
- Title and abstract of selected articles with bioconcepts of interest highlighted
- List of identified bioconcepts that can be edited by curators. Related mentions of the same concept are grouped together.
17 Inter-annotator agreement
Annotator agreements and disagreements (chart categories: Curatable, NonCuratable, LabelReview, RelationReview)
Typically, for 100 articles:
- 41 are labelled positive
- 41 are labelled negative
- 18 are reviewed for label
- 23 are reviewed for relations
Total articles reviewed: 253 for label, 328 for relations
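As a worked illustration of the per-100-article figures above, the snippet below computes a raw observed agreement rate on the triage label. It assumes that the 41 positive and 41 negative articles are the ones on which the curators agreed and that the 18 articles sent to label review are the disagreements; the slide does not state this explicitly, and the function name is hypothetical.

```python
def observed_label_agreement(agreed_positive: int,
                             agreed_negative: int,
                             sent_to_label_review: int) -> float:
    """Fraction of articles whose triage label needed no review.

    Assumption: articles sent to label review are counted as disagreements.
    """
    total = agreed_positive + agreed_negative + sent_to_label_review
    if total == 0:
        raise ValueError("no articles to compare")
    return (agreed_positive + agreed_negative) / total


# Figures quoted on the slide for a typical batch of 100 articles.
rate = observed_label_agreement(41, 41, 18)
```

This is a plain proportion, not a chance-corrected statistic such as Cohen's kappa.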
18 Annotation Review Cases
- Gene organism assignment is difficult: it is not clear which organism the gene belongs to, or the gene mentioned could be linked to a family of genes
- Not all Curatable-labelled articles have explicit relations mentioned in the title or abstract, so full text curation is necessary
- Curators have annotated different relations, and more than one interaction is described in the article
- Curators had marked the article for further discussion
19 Complete dataset
Columns: Dataset, Articles, Positive, Negative, Articles with relations, Number of relations
Training: 4,082 articles, 1,729 positive; the remaining values are truncated in the transcript ("2,")
Testing: values are truncated in the transcript ("1,")
20 Precision Medicine Track: Timeline
January 2017: Sample annotation of 250 PubMed articles and proof of concept
March 2017: Training data annotation for triage complete
April 2017: Repurposing of IntAct PPI curations for the relation extraction task complete
May 2017: Training dataset formatted in BioC (XML/JSON) and made available online
27 text mining teams registered to participate in the challenge
June 2017: Phase 1 and 2 of test data annotation
August 2017: Test data annotation complete
September 2017: Test data available to challenge participants and evaluation
21 Evaluation
- Evaluation script was made available to all participants
- Dual purpose (evaluation + format check)
- Precision/recall/average precision
- For the Relation Extraction task: exact match and HomoloGene match
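The measures named on the evaluation slide can be sketched as follows. This is not the official evaluation script: it is a minimal illustration, assuming triage labels are booleans with a confidence score per article, that a relation is an unordered pair of gene identifiers, and that a "HomoloGene match" means two genes count as equal when they map to the same homology group. The `homolog_group` mapping and all identifiers are hypothetical.

```python
from typing import Dict, Iterable, List, Optional, Tuple


def prf(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Precision, recall and F1 from raw counts (0.0 when undefined)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


def triage_prf(gold: List[bool], predicted: List[bool]) -> Tuple[float, float, float]:
    """Score YES/NO triage decisions against gold labels."""
    tp = sum(1 for g, p in zip(gold, predicted) if g and p)
    fp = sum(1 for g, p in zip(gold, predicted) if not g and p)
    fn = sum(1 for g, p in zip(gold, predicted) if g and not p)
    return prf(tp, fp, fn)


def average_precision(gold: List[bool], confidence: List[float]) -> float:
    """Mean of precision@k over the ranks k at which a relevant article appears."""
    ranked = sorted(zip(confidence, gold), key=lambda pair: pair[0], reverse=True)
    hits, total = 0, 0.0
    for rank, (_, is_relevant) in enumerate(ranked, start=1):
        if is_relevant:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


def relation_prf(gold_pairs: Iterable[Tuple[str, str]],
                 predicted_pairs: Iterable[Tuple[str, str]],
                 homolog_group: Optional[Dict[str, str]] = None
                 ) -> Tuple[float, float, float]:
    """Score gene pairs; pass homolog_group for the relaxed homology match."""
    def key(pair: Tuple[str, str]) -> frozenset:
        if homolog_group is not None:
            pair = tuple(homolog_group.get(g, g) for g in pair)
        return frozenset(pair)  # unordered, so (a, b) equals (b, a)

    gold = {key(p) for p in gold_pairs}
    predicted = {key(p) for p in predicted_pairs}
    tp = len(gold & predicted)
    return prf(tp, len(predicted - gold), len(gold - predicted))
```

Calling `relation_prf` without `homolog_group` gives the exact-match score; supplying the mapping relaxes gene identity to group identity, which is one plausible reading of the second matching mode.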
22 Submission format
Triage:
<infon key="relevant">YES/NO</infon>
<infon key="confidence">Real value between 0 and 1</infon>
Relations:
<relation id="r1">
  <infon key="gene1">geneid-1</infon>
  <infon key="gene2">geneid-2</infon>
  <infon key="relation">ppim</infon>
  <infon key="confidence">0.XY</infon>
</relation>
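The elements in the submission format above could be generated with Python's standard `xml.etree.ElementTree` module, as in the sketch below. Only the element names, attribute keys and the `ppim` value come from the slide; the helper names, the two-decimal confidence formatting and the range check are assumptions, and the surrounding BioC document structure is omitted.

```python
import xml.etree.ElementTree as ET
from typing import List


def _infon(parent, key: str, value: str) -> ET.Element:
    """Create <infon key="...">value</infon>, attached to parent if given."""
    if parent is None:
        node = ET.Element("infon", {"key": key})
    else:
        node = ET.SubElement(parent, "infon", {"key": key})
    node.text = value
    return node


def _check_confidence(confidence: float) -> str:
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    return f"{confidence:.2f}"


def triage_infons(relevant: bool, confidence: float) -> List[ET.Element]:
    """Build the two triage infons for one document."""
    return [
        _infon(None, "relevant", "YES" if relevant else "NO"),
        _infon(None, "confidence", _check_confidence(confidence)),
    ]


def make_relation(relation_id: str, gene1: str, gene2: str,
                  confidence: float) -> ET.Element:
    """Build one <relation> element for a predicted gene pair."""
    relation = ET.Element("relation", {"id": relation_id})
    _infon(relation, "gene1", gene1)
    _infon(relation, "gene2", gene2)
    _infon(relation, "relation", "ppim")
    _infon(relation, "confidence", _check_confidence(confidence))
    return relation


# Placeholder identifiers taken from the slide's own example.
xml_text = ET.tostring(make_relation("r1", "geneid-1", "geneid-2", 0.87),
                       encoding="unicode")
```

Building the elements with a library rather than by string concatenation keeps attribute quoting and escaping correct.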
23 Baseline systems
Triage Task:
- SVM classifier using unigram and bigram features from titles and abstracts
Relations Task:
- Co-occurrence method
- Gene names were predicted and normalized via GeneNormPlus
- Mutation and sequence variation prediction were not used
- If two genes are predicted in the same sentence, a relation is predicted
24 Participation
Triage Task (total): 10 teams / 22 runs
Relation Task (total): 6 teams / 14 runs
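The two baselines described above can be illustrated roughly as follows. This is a sketch under stated assumptions, not the organizers' code: gene recognition and normalization are taken as already done (each sentence arrives as a list of gene identifiers), a gene co-occurring only with itself is skipped, tokenization is a simple lowercase whitespace split, and the SVM training step itself is left out. All function names and identifiers are hypothetical.

```python
from collections import Counter
from itertools import combinations
from typing import List, Tuple


def cooccurrence_relations(sentences: List[List[str]]) -> List[Tuple[str, str]]:
    """Predict a relation for every pair of distinct genes sharing a sentence.

    Each inner list holds the normalized gene identifiers found in one sentence.
    """
    relations = set()
    for gene_ids in sentences:
        # Sorting makes (a, b) and (b, a) collapse to one canonical pair.
        for a, b in combinations(sorted(set(gene_ids)), 2):
            relations.add((a, b))
    return sorted(relations)


def ngram_features(text: str) -> Counter:
    """Count unigram and bigram features of the kind a triage classifier could use."""
    tokens = text.lower().split()
    features = Counter(tokens)
    features.update(" ".join(pair) for pair in zip(tokens, tokens[1:]))
    return features


# Toy abstract of three sentences with made-up gene identifiers.
predicted = cooccurrence_relations([["672", "675"], ["672"], ["7157", "672", "675"]])
```

Because every same-sentence pair is emitted, a baseline of this kind ignores mutations entirely and cannot find relations that span sentences.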
25 Results table (columns: Team Number, Submission, Avg Prec, Precision, Recall, F1, Data Format). Teams listed: 374, 375, 379, 405, 414, 418, 419, 420, 421, 433, plus BASELINE. Submissions were in JSON or XML; the numeric scores were not preserved in the transcript.
26 Results table (columns: System, Submission, Precision, Recall, F1, Data Format). Teams listed: 375, 379, 391, 405, 420, 433, plus BASELINE. Submissions were in JSON or XML; the numeric scores were not preserved in the transcript.
27 Chart of runs ranked against the BASELINE (measures: F1, Avg Prec; first team label: 418). Values were not preserved in the transcript.
28 Chart of runs ranked against the BASELINE (measures: Micro F1, Macro F1; first team label: 420). Values were not preserved in the transcript.
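The difference between the Micro F1 and Macro F1 measures named on slide 28 can be shown with a small sketch. It assumes the usual definitions, with counts kept per document: micro-averaging pools true positives, false positives and false negatives over all documents before computing F1, while macro-averaging computes F1 per document and takes the mean. Whether the track's script averages over documents in exactly this way is an assumption, as is scoring an empty document as 0.

```python
from typing import List, Tuple

Counts = Tuple[int, int, int]  # (true positives, false positives, false negatives)


def f1_from_counts(tp: int, fp: int, fn: int) -> float:
    """F1 = 2*TP / (2*TP + FP + FN); 0.0 when there is nothing to score."""
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


def micro_f1(per_document: List[Counts]) -> float:
    """Pool the counts over all documents, then compute a single F1."""
    tp = sum(c[0] for c in per_document)
    fp = sum(c[1] for c in per_document)
    fn = sum(c[2] for c in per_document)
    return f1_from_counts(tp, fp, fn)


def macro_f1(per_document: List[Counts]) -> float:
    """Compute F1 for each document, then average the scores."""
    if not per_document:
        return 0.0
    return sum(f1_from_counts(*c) for c in per_document) / len(per_document)


# One document with three correct relations, one with a miss and a false alarm.
example = [(3, 0, 0), (0, 1, 1)]
scores = (micro_f1(example), macro_f1(example))
```

Micro-averaging favors systems that do well on relation-rich documents, whereas macro-averaging weights every document equally.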
29 Summary
- The Precision Medicine Track brought together 11 teams worldwide
- Produced a high-quality, manually curated corpus of 5,546 PubMed articles, containing 2,459 curatable articles for PPI affected by mutations
- 1,285 articles are curated for relations, with a total of 1,682 relations
- 22 text mining systems were submitted for the triage task, and 14 for relation extraction
- As curators are interested in capturing more specialized information, such as molecular interactions affected by genetic variations, they will benefit from this work
30 Summary
- For triage: 16 systems outperformed the baseline based on F1-score, 9 of which showed a statistically significant result
- For the relations task: 7 systems outperformed the baseline, and all of these results were statistically significant
- The relations defined in this task are not generally described in a single sentence
- The corpus is beneficial both for training systems that can extract information of practical value for the precision medicine initiative and for training systems that can extract abstract-level relations, which requires paragraph-level understanding
31 Thank you
More informationChristian Sánchez, Paloma Martínez. Computer Science Department, Universidad Carlos III of Madrid Avd. Universidad, 30, Leganés, 28911, Madrid, Spain
A proposed system to identify and extract abbreviation definitions in Spanish biomedical texts for the Biomedical Abbreviation Recognition and Resolution (BARR) 2017 Christian Sánchez, Paloma Martínez
More informationMDA Blast2GO Exercises
MDA 2011 - Blast2GO Exercises Ana Conesa and Stefan Götz March 2011 Bioinformatics and Genomics Department Prince Felipe Research Center Valencia, Spain Contents 1 Annotate 10 sequences with Blast2GO 2
More informationExtracting patient data from tables in clinical literature Case study on extraction of BMI, weight and number of patients
Extracting patient data from tables in clinical literature Case study on extraction of BMI, weight and number of patients Nikola Milosevic 1, Cassie Gregson 2, Robert Hernandez 2 and Goran Nenadic 1,3
More informationInteroperability and Semantics in Use- Application of UML, XMI and MDA to Precision Medicine and Cancer Research
Interoperability and Semantics in Use- Application of UML, XMI and MDA to Precision Medicine and Cancer Research Ian Fore, D.Phil. Associate Director, Biorepository and Pathology Informatics Senior Program
More informationCIS UDEL Working Notes on ImageCLEF 2015: Compound figure detection task
CIS UDEL Working Notes on ImageCLEF 2015: Compound figure detection task Xiaolong Wang, Xiangying Jiang, Abhishek Kolagunda, Hagit Shatkay and Chandra Kambhamettu Department of Computer and Information
More informationSupplementary Note 1: Considerations About Data Integration
Supplementary Note 1: Considerations About Data Integration Considerations about curated data integration and inferred data integration mentha integrates high confidence interaction information curated
More informationGIDMP: GOOD PROTEIN-PROTEIN INTERACTION DATA METAMINING PRACTICE
CELLULAR & MOLECULAR BIOLOGY LETTERS http://www.cmbl.org.pl Received: 06 October 2010 Volume 16 (2011) pp 258-263 Final form accepted: 28 February 2011 DOI: 10.2478/s11658-011-0004-1 Published online:
More informationMaximizing Public Data Sources for Sequencing and GWAS
Maximizing Public Data Sources for Sequencing and GWAS February 4, 2014 G Bryce Christensen Director of Services Questions during the presentation Use the Questions pane in your GoToWebinar window Agenda
More informationNERD workshop. Luca ALMAnaCH - Inria Paris. Berlin, 18/09/2017
NERD workshop Luca Foppiano @ ALMAnaCH - Inria Paris Berlin, 18/09/2017 Agenda Introducing the (N)ERD service NERD REST API Usages and use cases Entities Rigid textual expressions corresponding to certain
More informationHumboldt-University of Berlin
Humboldt-University of Berlin Exploiting Link Structure to Discover Meaningful Associations between Controlled Vocabulary Terms exposé of diploma thesis of Andrej Masula 13th October 2008 supervisor: Louiqa
More informationMinimal Metadata Standards and MIIDI Reports
Dryad-UK Workshop Wolfson College, Oxford 12 September 2011 Minimal Metadata Standards and MIIDI Reports David Shotton, Silvio Peroni and Tanya Gray Image BioInformatics Research Group Department of Zoology
More informationGenome Browsers - The UCSC Genome Browser
Genome Browsers - The UCSC Genome Browser Background The UCSC Genome Browser is a well-curated site that provides users with a view of gene or sequence information in genomic context for a specific species,
More informationG-TACT User Guide Ensuring Accurate Classification of BRCA Variants Module: RUN 1
G-TACT User Guide Ensuring Accurate Classification of BRCA Variants Module: RUN 1 Description The G-TACT User Guide provides assistance to users in all aspects of using the Ensuring Accurate Classification
More informationBioNav: An Ontology-Based Framework to Discover Semantic Links in the Cloud of Linked Data
BioNav: An Ontology-Based Framework to Discover Semantic Links in the Cloud of Linked Data María-Esther Vidal 1, Louiqa Raschid 2, Natalia Márquez 1, Jean Carlo Rivera 1, and Edna Ruckhaus 1 1 Universidad
More informationGenome Browsers Guide
Genome Browsers Guide Take a Class This guide supports the Galter Library class called Genome Browsers. See our Classes schedule for the next available offering. If this class is not on our upcoming schedule,
More informationCRFVoter: Chemical Entity Mention, Gene and Protein Related Object recognition using a conglomerate of CRF based tools
CRFVoter: Chemical Entity Mention, Gene and Protein Related Object recognition using a conglomerate of CRF based tools Wahed Hemati, Alexander Mehler, and Tolga Uslu Text Technology Lab, Goethe Universitt
More informationToward an interactive article: integrating journals and biological databases
BMC Bioinformatics This Provisional PDF corresponds to the article as it appeared upon acceptance. Fully formatted PDF and full text (HTML) versions will be made available soon. Toward an interactive article:
More information