The Text Analytics Challenge BioCreative V - Extraction of causal network information in BEL
1 The Text Analytics Challenge BioCreative V - Extraction of causal network information in BEL Fabio Rinaldi
2 Outline Biomedical text mining, motivation Competitive evaluations: BioNLP, BioCreative The BEL BioCreative V Outlook
3 Motivation The purpose of biomedical curation activities is to help the Life Sciences community to make sense of all the data that is accumulating. A. Bairoch, The future of annotation/biocuration. Nature Precedings 2009.
4 Growth of PubMed citations from 1986 to Lu Z Database 2011;2011:baq036 The Author(s) Published by Oxford University Press.
5 Motivation The purpose of biomedical curation activities is to help the Life Sciences community to make sense of all the data that is accumulating. Nobody will ever be able to manually annotate all the macromolecular biological entities that exist on this planet, and consequently automatization is the only solution. A. Bairoch, The future of annotation/biocuration. Nature Precedings 2009.
6 Why text mining? Massive amount of published material: complete human curation is impossible. Text mining can assist: database curation; targeted searches by scientists; identification of research targets by industry; building systemic networks. Text mining technologies are regularly evaluated through community assessments.
7 Goals of community assessments Determine state of the art. Monitor improvements. Investigation of different approaches. Evaluation of new strategies. Identification of positive / negative features. Scientific forum to stimulate progress in research.
8 Competitive evaluations BioCreative BioNLP BioASQ i2b2 (medical) CALBC CLEF-ER QA4MRE Semeval
9 BioASQ Three editions so far: 2013, 2014, 2015. Two tasks: a. Large-scale online biomedical semantic indexing: annotate PubMed abstracts with classes from the MeSH hierarchy. b. Introductory biomedical semantic QA: questions to be answered with relevant concepts (from designated terminologies and ontologies), relevant articles (in English, from designated article repositories), relevant snippets (from the relevant articles), and relevant RDF triples (from designated ontologies).
10 BioNLP shared task
11 BioNLP 2009 Task 1. Core event extraction (mandatory) Unary events: 70%; binding and regulation: 40% Task 2. Event enrichment (optional) phosphorylation of TRAF2 (Type:Phosphorylation, Theme:TRAF2) localization of beta-catenin into nucleus (Type:Localization, Theme:beta-catenin, ToLoc:nucleus) Task 3. Negation and speculation recognition (optional) TRADD did not interact with TES2 (Negation (Type:Binding, Theme:TRADD, Theme:TES2)) Total number of participants: 24
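The event annotations quoted on this slide follow a flat `Key:Value` pattern. As a rough illustration only (this is not the official BioNLP shared task file format), such a string could be parsed into a small dictionary. `parse_event` is a hypothetical helper name, and nested forms such as the negation example are not handled.

```python
from collections import defaultdict


def parse_event(text: str) -> dict[str, list[str]]:
    """Parse 'Type:Binding, Theme:TRADD, Theme:TES2' into a dict of lists.

    Values are kept in lists because a key such as Theme may occur twice
    (e.g. for binding events with two participants).
    """
    event = defaultdict(list)
    # Drop the surrounding parentheses, then split into Key:Value parts.
    for part in text.strip("() ").split(","):
        key, _, value = part.strip().partition(":")
        if key and value:
            event[key].append(value)
    return dict(event)


if __name__ == "__main__":
    print(parse_event("(Type:Localization, Theme:beta-catenin, ToLoc:nucleus)"))
```

Keeping every value in a list is a design choice made here for uniformity, so that unary events and binding events share one shape.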
12 BioNLP 2011 [GE] GENIA; p: 15, F: 53% (full text), F: 57% (abstracts) [EPI] Epigenetics and Post-translational Modifications; p: 7, F: 53% [ID] Infectious Diseases; p: 7, F: 56% Bacteria Track: [BB] Biotopes (p:3, F: 45%), [BI] Gene Interactions (p: 1, F: 77.0%), [REN] Bacteria Gene Renaming (p:3, F: 87.0%) [CO] Protein/Gene Coreference Task, p:6, F: 34.1% [REL] Entity Relations Supporting Task, p: 4, F: 57.7%
13 BioNLP 2013 [GE] Genia Event Extraction; p: 10, F: 51% [CG] Cancer Genetics; p: 2, F: 55.4% [PC] Pathway Curation; p: 2, F: 52.8% [GRO] Corpus Annotation with Gene Regulation Ontology; p: 1, F: 22% [GRN] Gene Regulation Network in Bacteria; p: 5, SER: 73% [BB] Bacteria Biotopes (semantic annotation by an ontology); P: 5, entities SER: 46%, relations: 40%, events: 14%
14 BioCreative
15 BioCreative I (2004) BioCreative I 27 Teams, Granada, Spain Hirschman et al. Overview of BioCreative: critical assessment of information extraction for biology. BMC Bioinformatics (2005), 6:S1 Tracks: identification of gene mentions in text and linking protein database entries to abstracts. extraction of human gene product annotations with GO terms
16 BioCreative II (2006) BioCreative II 44 teams, Madrid, Spain Krallinger et al. Evaluation of text-mining systems in Biology: overview of the Second BioCreative community challenge, Genome Biology (2008), 9:S1 Tracks Gene mention tagging [GM] Gene normalization [GN] Extraction of protein-protein interactions from text [PPI]: IAS (article), IPS (pair), IMS (methods), ISS (evidence)
17 BioCreative II: GM, GN
18 BioCreative II: PPI
19 Species
20 BioCreative II.5 (2009) Challenge run through web services Participants: 16 teams Corpus: FEBS Letter 2007 Goal: Reproduce the Structured Digital Abstract Subtasks: [ACT] Article classification; AUC: 67.8% [INT] Interactor Normalization; AUC: 43.5% [IPT] Interaction Pair; AUC: 22.2%
22 Structured Digital Abstracts
23 BioCreative III (2010) Tasks: gene mentions (GM); p: 14, TAP-10: 34.6% protein-protein interactions article classification (ACT); p: 10, AUC: 68% experimental Methods (IMT); p: 9, AUC: 53% interactive task (IAT); p: 6
24 PPI-IMT
25 BioCreative IV (2013) Task 1: BioC (PyBioC); p: 9 Task 2: CHEMDNER (chemicals); p: 27 Task 3: CTD web service; p: 7 Best F-score: 87.39% CEM, 88.20% CDI gene: 61%, chemical: 74%, disease: 51%, act: 54% Task 4: GO annotations; p: 8 Task A (evidence text), best F: 0.27 (exact) / 0.38 Task B (predict GO terms), best F: 0.13 / 0.34 (hier.) Task 5: Interactive task; p: 9
26 BioCreative V (2015) Collaborative Biocurator Assistant Task (BioC) CHEMDNER patents Chemical-disease relation (CDR) task Extraction of causal network information in Biological Expression Language (BEL) Interactive Curation (IAT)
27 The BEL BioCreative V
28 BEL Track: Timeline Oct 2014: preparation of proposal for Task Nov 2014: approval of proposal Dec 2014: administrative and contractual arrangements Jan 2015: official start of supported activity Feb 2015: release sample set Mar 2015: release training set Mar-May 2015: preparation of evaluation framework and supporting data Jun 15-18: release of test set and official evaluation
29 Datasets and supporting material Sample set: 295 BEL statements with evidence. Training set: BEL statements with evidence. Supporting material: BioC version; structural graphs; fragments (TSV representation of BEL statements); entities (list of entities contained in each BEL statement).
30 Tasks Task 1 Given textual evidence for a BEL statement, generate the corresponding BEL statement Data: 100 Sentences Accept 3 runs per participant Accept up to 10 BEL statements or fragments per sentence Task 2 Given a BEL statement, provide at most 10 additional evidence sentences Data: 100 BEL statements, verified to have evidence in PubMed Accept 1 run per participant, each with 10 sentences ranked by relevance
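The Task 1 limit of at most 10 BEL statements or fragments per sentence can be enforced before submission. The sketch below assumes a run is held as a mapping from a sentence identifier to a ranked list of statement strings. That layout, the identifiers, and the name `trim_run` are illustrative assumptions, not the official submission format.

```python
MAX_STATEMENTS_PER_SENTENCE = 10  # limit stated on the slide for Task 1


def trim_run(run: dict[str, list[str]],
             limit: int = MAX_STATEMENTS_PER_SENTENCE) -> dict[str, list[str]]:
    """Remove blank and duplicate statements, then keep the first `limit`."""
    trimmed = {}
    for sentence_id, statements in run.items():
        cleaned = (s.strip() for s in statements)
        # dict.fromkeys keeps the first occurrence and preserves ranking order.
        unique = list(dict.fromkeys(s for s in cleaned if s))
        trimmed[sentence_id] = unique[:limit]
    return trimmed


if __name__ == "__main__":
    # Hypothetical sentence id and gene symbols, for illustration only.
    run = {"sent-1": [f"p(HGNC:G{i}) increases p(HGNC:H{i})" for i in range(12)]}
    print(len(trim_run(run)["sent-1"]))
```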
31 Simplifications Selection of statements: non-nested. Selection of relationships: decreases/directlyDecreases, increases/directlyIncreases. Namespaces: six namespaces considered (HGNC, MGI, EGID, GOBP, MESHD, CHEBI); equivalence between HGNC, MGI and EGID. Simplification of abundance functions for gene/protein: p() can be used instead of g(), m() and r(). Restrictions and equivalences of functions. Simplification of abundance modifier functions: cellular locations are not requested; P is used as default argument to pmod() (pmod(P)). Simplification of functions: act() is used instead of cat(), tscript(), kin(), gtp(), sec(), surf() etc.
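Two of the function simplifications listed on this slide can be expressed as plain string rewrites. The sketch below is an assumption-laden illustration, not the organizers' tooling. It covers only the short function names listed on the slide, it assumes those names never appear inside quoted entity names, and `normalize_bel` is a hypothetical helper.

```python
import re

# Activity functions that the slide collapses into act().
ACTIVITY_RE = re.compile(r"\b(?:cat|tscript|kin|gtp|sec|surf)\(")
# Gene, microRNA and RNA abundance functions that may be replaced by p().
ABUNDANCE_RE = re.compile(r"\b(?:g|m|r)\(")


def normalize_bel(statement: str) -> str:
    """Apply the act() and p() simplifications to one BEL statement."""
    statement = ACTIVITY_RE.sub("act(", statement)
    return ABUNDANCE_RE.sub("p(", statement)


if __name__ == "__main__":
    print(normalize_bel("kin(p(HGNC:AKT1)) increases r(MGI:Foxo1)"))
```

A regular expression is enough here because only function names are rewritten. Anything that depends on argument structure would need a real BEL parser.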
32 Documentation BEL track initial pages at biocreative.org BEL track extended description at openbel.org: wiki.openbel.org/display/bioc/biocreative+home Setup of the task Sample and Training data Evaluation details
33 Information to participants Broadcast calls to several mailing lists Announcements on the BioCreative mailing list Set-up dedicated google group 13 registered users Used to deliver target information about the challenge setup Evaluation web SCAI
34 Evaluation metrics Primary: Term (T), Function (F), Relationship (R), Full BEL statement (F). Secondary: Function (Fs), Relationship (Rs) 2nd stage includes gold standard entities
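Scoring at one level, such as terms, comes down to comparing the set of gold items with the set of predicted items. The sketch below shows that idea for the term level only. It is a simplified illustration and not the official evaluation code: the regular expression, the helper names `extract_terms` and `prf`, and the exact-match criterion are all assumptions.

```python
import re

# Matches innermost terms such as p(HGNC:AKT1) or bp(GOBP:"cell proliferation").
TERM_RE = re.compile(r"[a-z]+\([A-Z]+:[^()]+\)")


def extract_terms(statement: str) -> set[str]:
    """Return the set of namespace-qualified terms in a BEL statement."""
    return set(TERM_RE.findall(statement))


def prf(gold: set[str], predicted: set[str]) -> tuple[float, float, float]:
    """Precision, recall and F-score for two sets under exact matching."""
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    f_score = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_score


if __name__ == "__main__":
    gold = extract_terms('act(p(HGNC:AKT1)) increases bp(GOBP:"cell proliferation")')
    pred = extract_terms('p(HGNC:AKT1) increases bp(GOBP:"apoptotic process")')
    print(prf(gold, pred))
```

The same `prf` function could be reused for the function and relationship levels once those items are extracted as sets.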
35 Format conversions Definition of BEL/BioC format (collaboration with NLM/NCBI) Parsing of BEL statements via ruby, conversion BEL into BioC (and RDF) Visualization of BEL structure via graph
37 Timeline: next steps Jul 2015: Feedback to participants on evaluation results; preparation of proceedings; evaluation kaggle. Aug 2015: Arrangements for workshop; possible revisions of overview paper and participants papers. Sep 2015: overview paper; collection of participants papers; Workshop, September 9-11, Sevilla, Spain. Sep-Dec: Writing of journal paper (DATABASE)
38 BEL task: challenges One sentence is a very limited context Disambiguation of named entities is context dependent One sentence often does not offer sufficient context, in particular for species identification No negative set Several levels of analysis: entities, functions, relations Multiple / large namespaces
39 Outlook A considerable investment in terms of time and resources Yet few participants expected, why? Novelty of the task (it takes time to adapt tools or create new ones) Short time available for development, due to late start and early evaluation Investment made so far will pay off in future editions Participants will become gradually more familiar with the nature of the task Evaluation framework can be reused Documentation can be partially reused (extended and adapted) Workshop raises attention to BEL in the text mining community
40 Bel Task kaggle.com? How well can a general purpose Machine Learning Community solve a biomedical Text Mining task? Challenges and Benefits Provide the evidence in a support data format which lowers the NLP requirements for participants Model the task as a multi-label, multi-class prediction problem NER output, Chunking, Dependency Parsing Representing structured outcomes in this way seems to be interesting for the ML community (e.g. Meka) Get more ML approaches tested on the Bel Task data set Kaggle has an active and innovative community
41 Summary Text mining in biology: essential for coping with the information deluge. Competitive evaluations: provide rigorous evaluation in a controlled environment. BEL challenge: a novel task with great potential. However: more time is needed to allow participants to become familiar with such a complex framework and develop useful systems.
42 Acknowledgments BEL task BioCreative Juliane Fluck Martin Krallinger,CNIO Sumit Madan Florian Leitner, CNIO Tilia Ellendorff Simon Clematide Alfonso Valencia, CNIO Lynette Hirschman, MITRE Adrian van der Lek Sam Ansari Julia Hoeng Manuel Peitsch
More informationGenescene: Biomedical Text and Data Mining
Claremont Colleges Scholarship @ Claremont CGU Faculty Publications and Research CGU Faculty Scholarship 5-1-2003 Genescene: Biomedical Text and Data Mining Gondy Leroy Claremont Graduate University Hsinchun
More informationCRFVoter: Chemical Entity Mention, Gene and Protein Related Object recognition using a conglomerate of CRF based tools
CRFVoter: Chemical Entity Mention, Gene and Protein Related Object recognition using a conglomerate of CRF based tools Wahed Hemati, Alexander Mehler, and Tolga Uslu Text Technology Lab, Goethe Universitt
More informationExtracting reproducible simulation studies from model repositories using the CombineArchive Toolkit
Extracting reproducible simulation studies from model repositories using the CombineArchive Toolkit Martin Scharm, Dagmar Waltemath Department of Systems Biology and Bioinformatics University of Rostock
More informationThis report is based on sampled data. Jun 1 Jul 6 Aug 10 Sep 14 Oct 19 Nov 23 Dec 28 Feb 1 Mar 8 Apr 12 May 17 Ju
0 - Total Traffic Content View Query This report is based on sampled data. Jun 1, 2009 - Jun 25, 2010 Comparing to: Site 300 Unique Pageviews 300 150 150 0 0 Jun 1 Jul 6 Aug 10 Sep 14 Oct 19 Nov 23 Dec
More informationPrecise Medication Extraction using Agile Text Mining
Precise Medication Extraction using Agile Text Mining Chaitanya Shivade *, James Cormack, David Milward * The Ohio State University, Columbus, Ohio, USA Linguamatics Ltd, Cambridge, UK shivade@cse.ohio-state.edu,
More informationComplex-to-Pairwise Mapping of Biological Relationships using a Semantic Network Representation
Complex-to-Pairwise Mapping of Biological Relationships using a Semantic Network Representation Juho Heimonen, 1 Sampo Pyysalo, 2 Filip Ginter 1 and Tapio Salakoski 1,2 1 Department of Information Technology,
More informationSemantic Annotation and Linking of Medical Educational Resources
5 th European IFMBE MBEC, Budapest, September 14-18, 2011 Semantic Annotation and Linking of Medical Educational Resources N. Dovrolis 1, T. Stefanut 2, S. Dietze 3, H.Q. Yu 3, C. Valentine 3 & E. Kaldoudi
More informationCharacterization and Modeling of Deleted Questions on Stack Overflow
Characterization and Modeling of Deleted Questions on Stack Overflow Denzil Correa, Ashish Sureka http://correa.in/ February 16, 2014 Denzil Correa, Ashish Sureka (http://correa.in/) ACM WWW-2014 February
More informationTURNING TEXT INTO INSIGHT: TEXT MINING IN THE LIFE SCIENCES
TURNING TEXT INTO INSIGHT: TEXT MINING IN THE LIFE SCIENCES According to The STM Report (2015), 2.5 million peer-reviewed articles are published in scholarly journals each year. 1 PubMed contains more
More informationDatabase of Curated Mutations (DoCM) ournal/v13/n10/full/nmeth.4000.
Database of Curated Mutations (DoCM) http://docm.genome.wustl.edu/ http://www.nature.com/nmeth/j ournal/v13/n10/full/nmeth.4000.h tml Home Page Information in DoCM DoCM uses many data sources to compile
More informationPMC text mining subset in BioC: 2.3 million full text articles and growing
PMC text mining subset in BioC: 2.3 million full text articles and growing Donald C. Comeau, Chih-Hsuan Wei, Rezarta Islamaj Doğan and Zhiyong Lu National Center for Biotechnology Information, U.S. Library
More informationAsks for clarification of whether a GOP must communicate to a TOP that a generator is in manual mode (no AVR) during start up or shut down.
# Name Duration 1 Project 2011-INT-02 Interpretation of VAR-002 for Constellation Power Gen 185 days Jan Feb Mar Apr May Jun Jul Aug Sep O 2012 2 Start Date for this Plan 0 days 3 A - ASSEMBLE SDT 6 days
More informationBlast2GO Teaching Exercises
Blast2GO Teaching Exercises Ana Conesa and Stefan Götz 2012 BioBam Bioinformatics S.L. Valencia, Spain Contents 1 Annotate 10 sequences with Blast2GO 2 2 Perform a complete annotation process with Blast2GO
More informationSteering Committee Meeting
Steering Committee Meeting To hear the meeting, you must call in Toll-free phone number: 1-866-740-1260 Access Code: 2201876 For international call in numbers, please visit: https://www.readytalk.com/account-administration/international-numbers
More informationBrat2BioC: conversion tool between brat and BioC
Brat2: conversion tool between and Antonio Jimeno Yepes 1,2, Mariana Neves 3,4, Karin Verspoor 1,2 1 NICTA Victoria Research Lab, Melbourne VIC 3010, Australia 2 Department of Computing and Information
More informationAn UIMA based Tool Suite for Semantic Text Processing
An UIMA based Tool Suite for Semantic Text Processing Katrin Tomanek, Ekaterina Buyko, Udo Hahn Jena University Language & Information Engineering Lab StemNet Knowledge Management for Immunology in life
More informationINAB Mandatory and Guidance Documents Policy and Index
INAB Mandatory and Guidance s Policy and Index This publication is aimed at assisting in determining what documents are relevant to various organisations and at providing contact points for accessing such
More informationNCI Thesaurus, managing towards an ontology
NCI Thesaurus, managing towards an ontology CENDI/NKOS Workshop October 22, 2009 Gilberto Fragoso Outline Background on EVS The NCI Thesaurus BiomedGT Editing Plug-in for Protege Semantic Media Wiki supports
More informationWebsite Redevelopment Content Information Session. Presentation by
Website Redevelopment Content Information Session Presentation by December 3, 2010 Agenda December 3rd & 10th, 2010 10:00 10:10 Welcome & Introductions 10:10 10:20 Project Status & Development Schedule
More informationNational Smart Metering Program Testing Framework Work Stream Actions Log
- ustomer section of the table Description / Progress Source/Origin Workstream/ owner Who By When Status Reporting 20090923 4 TFWG need to feed into the BPPWG the issues about service levels and performance
More informationMeSH : A Thesaurus for PubMed
Scuola di dottorato di ricerca in Scienze Molecolari Resources and tools for bibliographic research MeSH : A Thesaurus for PubMed What is MeSH? Who uses MeSH? Why use MeSH? Searching by using the MeSH
More information