Core Technology Development Team Meeting
- Maryann Cleopatra Morrison
Transcription
1 Core Technology Development Team Meeting To hear the meeting, you must call in Toll-free phone number: Access Code: For international call in numbers, please visit:
2 Agenda
- Updates on action items
- DataMed evaluation
- LinkOut update
- Inclusion of more repositories into DataMed: plan and course of action
- DataMed v1.5 release before BD2K AHM
- Updates from all team members
Supported by the NIH grant 1U24 AI to the University of California, San Diego
3 Updates: action items
- Generate a HELP and FAQ page: please start adding material here ASAP
- Robust server to host biocaddie to be set up: ongoing work by Jeff and Claudiu
- Feedback on video
- Pilot Project Integration
4 Evaluation on Benchmark Datasets Xiaoling Chen
5 Benchmark Datasets
- Datasets: (before V0.5)
- Repositories: 20
- Test queries: 15
- Index of benchmark dataset
6 Example of query
Query 4: Find all data types related to inflammation during oxidative stress in human hepatic cells across all databases
Keywords query: inflammation oxidative stress human hepatic cells
Expanded query: inflammation oxidative stress human hepatic cells Chronic inflammatory reaction morphologic abnormality cell infiltration arthritis disorder Oxidative Stresses Human Homo sapiens organism Man Tympanic cells set Cellulae tympanicae Mother Cell Stem cell Unit Colony Forming Progenitor
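The keyword-to-expanded-query step on this slide can be sketched roughly as follows. This is only an illustration: the function `expand_query` and the `SYNONYMS` table are hypothetical, the two synonym entries are copied from the expanded query above, and the actual expansion was presumably produced by a terminology or NLP service rather than a hand-written dictionary.

```python
from typing import Dict, List


def expand_query(keywords: List[str], synonyms: Dict[str, List[str]]) -> str:
    """Return the original keywords followed by their synonyms, without duplicates."""
    terms: List[str] = []
    seen = set()

    def add(term: str) -> None:
        key = term.lower()
        if key not in seen:
            seen.add(key)
            terms.append(term)

    # Original keywords come first, as in the expanded query on the slide.
    for keyword in keywords:
        add(keyword)
    # Synonyms are appended afterwards, in keyword order.
    for keyword in keywords:
        for synonym in synonyms.get(keyword, []):
            add(synonym)
    return " ".join(terms)


# Hypothetical synonym table (assumption); only two entries, for illustration.
SYNONYMS = {
    "oxidative stress": ["Oxidative Stresses"],
    "human": ["Homo sapiens"],
}

expanded = expand_query(
    ["inflammation", "oxidative stress", "human", "hepatic cells"], SYNONYMS
)
print(expanded)
```

Keeping the original keywords at the front means the keyword query is always a prefix of the expanded query, which makes the two versions easy to compare.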
7 Queries (natural query, then keywords)
1. Find protein sequencing data related to bacterial chemotaxis across all databases
   Keywords: Protein sequencing bacterial chemotaxis
2. Search for data of all types related to MIP-2 gene related to biliary atresia across all databases
   Keywords: Mip-2 biliary atresia
3. Search for all data types related to gene TP53INP1 in relation to p53 activation across all databases
   Keywords: TP53INP1 p53 activation
4. Find all data types related to inflammation during oxidative stress in human hepatic cells across all databases
   Keywords: inflammation oxidative stress human hepatic cells
5. Search for gene expression and genetic deletion data that mention CD69 in memory augmentation studies across all databases
   Keywords: CD69 memory augmentation
6. Search for data of all types related to the LDLR gene related to cardiovascular disease across all databases
   Keywords: LDLR gene cardiovascular disease
7. Search for gene expression datasets on photo transduction and regulation of calcium in blind D. melanogaster
   Keywords: photo transduction regulation calcium blind D melanogaster
8. Search for proteomic data related to regulation of calcium in blind D. melanogaster
   Keywords: proteomic data regulation calcium blind Drosophila melanogaster
8 Queries (natural query, then keywords)
9. Search for data of all types related to the ob gene in obese M. musculus across all databases
   Keywords: ob gene obese Mus musculus
10. Search for data of all types related to energy metabolism in obese M. musculus
   Keywords: energy metabolism obese Mus musculus
11. Search for all data for the HTT gene related to Huntington's disease across all databases
   Keywords: HTT gene Huntington disease
12. Search for data on neural brain tissue in transgenic mice related to Huntington's disease
   Keywords: neural brain tissue Huntington disease transgenic mice
13. Search for all data on the SNCA gene related to Parkinson's disease across all databases
   Keywords: SNCA gene Parkinson Disease
14. Search for data on nerve cells in the substantia nigra in mice across all
   Keywords: nerve cells substantia nigra mice
9 Which records are annotated?
- Each query is run in two versions (keywords and expanded query) on four search engines (Lucene, Indri, Terrier, SemanticVectors).
- The first 300 results from each search engine are combined and the duplicates are deleted.
- For each query, a maximum of 2400 (300*4*2) records are annotated.
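The pooling procedure described on this slide can be sketched as below. The engine callables are stand-ins (assumptions) that return made-up document ids; they do not call Lucene, Indri, Terrier or SemanticVectors, and `build_pool` and `fake_engine` are hypothetical names.

```python
from typing import Callable, Dict, List, Set

# A search function takes (query, depth) and returns a ranked list of record ids.
SearchFn = Callable[[str, int], List[str]]


def build_pool(query_versions: List[str], engines: Dict[str, SearchFn],
               depth: int = 300) -> Set[str]:
    """Union of the top `depth` results of every engine for every query version."""
    pool: Set[str] = set()
    for query in query_versions:
        for search in engines.values():
            # A set drops records returned by more than one engine or version.
            pool.update(search(query, depth)[:depth])
    return pool


def fake_engine(offset: int) -> SearchFn:
    """Stand-in engine (assumption): returns ids doc<offset> .. doc<offset+depth-1>."""
    def search(query: str, depth: int) -> List[str]:
        return [f"doc{offset + i}" for i in range(depth)]
    return search


engines = {
    "Lucene": fake_engine(0),
    "Indri": fake_engine(100),
    "Terrier": fake_engine(200),
    "SemanticVectors": fake_engine(300),
}
pool = build_pool(["keywords version", "expanded version"], engines)

# Upper bound from the slide: 300 results * 4 engines * 2 query versions.
max_pool_size = 300 * len(engines) * 2
print(len(pool), max_pool_size)
```

The pool only reaches the 2400 upper bound when no two runs share a record; overlapping runs, as in the stand-in engines here, produce a smaller set to annotate.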
10 Gold Standard (annotated results)
Columns: id, annotated, relevant, partial relevant, not relevant
11 Trec_eval tool
A standard tool used by the TREC community for evaluating an ad hoc retrieval run, given the results file and a standard set of judged results.
Metrics:
- infap: inferred Average Precision. A measure commonly used by the information retrieval community (based on random sampling).
- infndcg: inferred normalized discounted cumulated gain. A commonly used measure that incorporates graded relevance judgments.
- ip@n: precision after N retrieved.
- Prec@rec 11 points: interpolated recall-precision averages at recall n, n = [0.00, 0.10, 0.20, 0.30, 0.40, 0.50, 0.60, 0.70, 0.80, 0.90, 1.00].
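As a rough illustration of the two simpler metrics in this list, the sketch below computes precision after N retrieved and the 11-point interpolated precision for a single ranked list. It assumes binary relevance (the gold standard's "partial relevant" grade would have to be mapped to one side) and is not the trec_eval implementation; the sampling-based infAP and infNDCG are not covered, and the function names and toy data are hypothetical.

```python
from typing import List, Set


def precision_at(ranked: List[str], relevant: Set[str], n: int) -> float:
    """Fraction of the first n retrieved records that are relevant (divides by n)."""
    return sum(1 for doc in ranked[:n] if doc in relevant) / n


def interpolated_precision_11pt(ranked: List[str], relevant: Set[str]) -> List[float]:
    """Interpolated precision at recall 0.0, 0.1, ..., 1.0 for one query."""
    if not relevant:
        raise ValueError("at least one relevant record is required")
    hits = 0
    points = []  # (recall, precision) after each rank
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
        points.append((hits / len(relevant), hits / rank))
    levels = [i / 10 for i in range(11)]
    # Interpolated precision at a level is the best precision at any recall >= level.
    return [max((p for r, p in points if r >= level), default=0.0) for level in levels]


# Toy data (assumption): five retrieved records, three relevant, one never retrieved.
ranked = ["d1", "d2", "d3", "d4", "d5"]
relevant = {"d1", "d3", "d6"}

p_at_5 = precision_at(ranked, relevant, 5)
curve = interpolated_precision_11pt(ranked, relevant)
print(p_at_5, curve)
```

Averaging each of the 11 values over all queries gives the per-run recall-precision averages reported on the slides.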
12 Search in the _all field using the default TF-IDF algorithm in ES, retrieving the first 300 records
[Table: infap and infndcg per query and for all queries; reference: TREC 2014 CDS track, 30 topics (infap, infndcg, P@)]
13 Different search engines
[Table: infap and infndcg with the basic query for ElasticSearch, Lucene, Terrier, Indri, Semantic]
[Table: infap and infndcg with the expanded query for ElasticSearch, Lucene, Terrier, Indri, Semantic]
14 Fields
187 unique fields in the benchmark dataset. Run search on each field:
- 11 fields are in integer or date format and cannot be searched using terms (METADATA.FemaleNum, METADATA.dataItem.releaseDate, METADATA.dataItem.depositionDate, etc.)
- 68 fields do not return results (METADATA.organization.homePage, METADATA.dataset.dateAccession, METADATA.internal.rank, METADATA.datastandard.license, etc.)
- 108 fields return results
15 Search on important fields
Search fields compared (infap, infndcg):
- TITLE
- description
- Multi_fields (title and description): multi_match, cross_fields
- Multi_fields (title and description): multi_match, cross_fields, title^
- _all
- _all (concatenate only title and description)
- _all (concatenate the 108 fields that returned results)
Special _all field: combines the original values from each field as a string. The distinction between field lengths disappears in the _all field. The shorter the field, the more important.
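The search variants compared on this slide correspond roughly to the Elasticsearch request bodies sketched below. The bodies are built as plain dictionaries and are not sent anywhere. The field names `title` and `description`, the helper names, and the boost value used in the example are assumptions (the slide does not preserve the boost), and the `_all` field is only available in the older Elasticsearch versions of that period.

```python
from typing import Any, Dict, Optional


def all_field_query(text: str, size: int = 300) -> Dict[str, Any]:
    """Match query against the special _all field, keeping the first `size` hits."""
    return {"size": size, "query": {"match": {"_all": text}}}


def cross_fields_query(text: str, title_boost: Optional[float] = None,
                       size: int = 300) -> Dict[str, Any]:
    """multi_match over title and description, optionally boosting the title."""
    title = "title" if title_boost is None else f"title^{title_boost}"
    return {
        "size": size,
        "query": {
            "multi_match": {
                "query": text,
                "type": "cross_fields",
                "fields": [title, "description"],
            }
        },
    }


keywords = "inflammation oxidative stress human hepatic cells"
plain_body = cross_fields_query(keywords)
# 2.0 is a placeholder boost (assumption), not the value used in the evaluation.
boosted_body = cross_fields_query(keywords, title_boost=2.0)
all_body = all_field_query(keywords)
print(boosted_body)
```

With `cross_fields`, the query terms are matched as if title and description were one combined field, which is close in spirit to an `_all` field restricted to those two fields while still allowing a per-field boost.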
16 Next steps
- Try different similarity algorithms in ES (BM25, LMD, DFR, etc.)
- Explore different parameters in the query
- Try boosts in the query components and query fields
- Expand synonyms using the NLP server
- Try different relationships between synonyms
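For the first item, switching similarity algorithms in Elasticsearch is typically done through index settings plus a per-field mapping. The sketch below only builds the settings as dictionaries; the names `custom_bm25` and `custom_lmd` are hypothetical, and the parameter values shown are the commonly documented defaults, not values chosen by the team.

```python
from typing import Any, Dict


def similarity_settings(name: str, sim_type: str, **params: Any) -> Dict[str, Any]:
    """Index settings that register a named similarity with the given parameters."""
    return {"settings": {"index": {"similarity": {name: {"type": sim_type, **params}}}}}


# BM25 with its usual default parameters (k1 = 1.2, b = 0.75).
bm25 = similarity_settings("custom_bm25", "BM25", k1=1.2, b=0.75)

# Language model with Dirichlet smoothing (LMD); mu = 2000 is the usual default.
lmd = similarity_settings("custom_lmd", "LMDirichlet", mu=2000)

# A field opts in to a registered similarity by name in its mapping (fragment only).
title_mapping_fragment = {"similarity": "custom_bm25"}
print(bm25, lmd, title_mapping_fragment)
```

Because the similarity is fixed per field at index creation, comparing algorithms usually means building one index per configuration and rerunning the same benchmark queries against each.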
17 LinkOut Update - Databases available for linking
18 LinkOut update
- Meeting with Kathy Kwan
- PubMed to be the first try
- Kathy sent us provider ID & ftp site
- We will provide sample data
19 Inclusion of more repositories into DataMed: plan and course of action
DK3ParoPWewow5lmqlGJKQc0/edit#gid=0
0BsVdI_tddAXZrXS5eQc8K8/edit#gid=
- Each site should get at least 2 repos mapped/team per week
- Progress to be reported at every CDT meeting, with plans for the next set of repos for mapping
20 DATS 2.1 Mapping
Sites: UCSD - DBMI, UTHealth, UCSD CRBS
Recently Completed: LSDB, NDAR, HMP, American Gut Project (in EBI), IRD-JCVI, GeneNetwork Retina, EMDB, BMRB, TCGA, NBIA, Epigenomics, RGD, ClinVar, Vectorbase, IntAct (10) YPED, Uniprot - Swissprot, CIL, NURSA, ICPSR, Neuromorpho, openfmri, NIDDK CR, Physiobank, CIA-Datacite, ICPSR, PeptideAtlas, CVRG, Gemma, GEO, ArrayExpress, CTN, LINCS, PDB, NeuroVault Results, NeuroVault Atlas, NeuroVault Collections, MPD, ProteomeXChange, NITRC, ASCB Cell Image Library (25)
Currently Working On: Nature Scientific SRA, Dryad, Clinical Trials, dbgap, Phenogen Informatics, Human Proteinpedia (6) Uniprot Trembl, bioproject, Datacite Biomedical Repositories (30 selected based on content) (33)
Up Next: Waiting for new assignment EuPathDB, Diabetic Retinopathy Clinical Research Network, Diabetes Research in Children Network, Candida Genome Database (4) EU Clinical Trials Network, OmicsDI, (4)
Waiting on repository response: SimTK, EuPathDB Mendeley Data (OAuth connection), IMEx (waiting for source data feed), ImmPort (Contacted source for feed info), Cancer Nanotechnology Laboratory portal (Contacted source)
21 DataMed Ingestion
Sites: UCSD - DBMI, UTHealth, UCSD CRBS
Recently Completed: LSDB, GeneNetwork Retina, EMDB, Epigenomics, ClinVar, BMRB, TCGA (4) YPED, Uniprot - Swissprot, CIL, NURSA, ICPSR, Neuromorpho, openfmri, NIDDK CR, Physiobank, CIA-Datacite, ICPSR, PeptideAtlas, CVRG, Gemma, GEO, ArrayExpress, CTN, LINCS, PDB, NeuroVault Results, NeuroVault Atlas, NeuroVault Collections, MPD, ProteomeXChange, NITRC, ASCB Cell Image Library (25)
Currently Working On: AmericanGut, (EBI), HMP, NDAR NSRR, RGD, Vectorbase, IntAct (4) Uniprot Trembl, bioproject, Datacite Biomedical Repositories (30 selected based on content) (33)
Up Next: IRD (JCVI), NatureScientific Phenogen Informatics, Human Proteinpedia, Diabetic Retinopathy Clinical Research Network, Diabetes Research in Children Network, Candida Genome Database (5) EU Clinical Trials Network, OmicsDI (2)
Waiting on repository response: SimTK, EuPathDB Mendeley Data (OAuth connection), IMEx (waiting for source data feed), ImmPort (Contacted source for feed info), Cancer Nanotechnology Laboratory portal (Contacted source)
22 DataMed release
DataMed v1.5 release before BD2K AHM:
- Increased number of repositories mapped to DATS 2.1: target ~40 repos mapped to DATS 2.1
- Additional functions: sorting, visualization, user activity tracking, NLP at backend
UI functionality needs testing before release November 17th
Release on November 22nd
23 Github Issues
- Total issues: 158 (open 58, closed 100)
- Associated with v1.0: open 12, closed 8
- Usability issues: open 23, closed 10
- Associated with v0.5: open 23, closed 63
- Bugs: open 5, closed 12
- Enhancements: open 21, closed 28
- Questions: open 9, closed 11
- Help wanted: open 3, closed 0
24 Ongoing work (task: status)
1 Metadata Ingestion
1.1 Import repositories expansion: Ongoing
1.2 Data repository suggestion form at DataMed (George/Xiaoling/Sanda): Ongoing
1.3 Metadata mapping review/reconciliation between curators: Ongoing
1.4 Metadata management: Ongoing
1.5 Indexing: Ongoing
1.6 NLP-based indexing (Gene/protein, Disease, Drug/chemical, Biological process, Organism, Format, Access, Cell types): Implemented at backend
1.7 Bulk download of indices: Not Started
2 Terminology server
2.3 Integrate terminology server (Indexing): Ongoing
4 Interface Design
4.2 Design interface usability issues: Ongoing
4.5 Display most accessed datasets: Not Started
25 Ongoing work (task: status)
5 Personalized search
5.1 Improve the tracking system: Ongoing
6 Searching/Ranking algorithms
6.1 Similar datasets to be expanded: Ongoing
7 Display of results
7.1 Sort datasets (author, published date, repository, title): Ongoing
7.2 What fields should be displayed?: Ongoing
7.3 Additional filters: File type; Data Restrictions (data use agreement, restricted, unrestricted); Data Level (participant/aggregate); Population (mouse, human, etc.)
8 Link to external resources
1. Pubmed: click through to pubmed records of citing publications: copy citation to clipboard
2. Scholix Framework for Linking Data and Literature
3. Linkout
Not started, Not Started
26 Ongoing work (task: status)
10 Documentation
10.1 Source code: Ongoing
10.2 Tutorials: Not Started
10.3 Help menu: Ongoing
10.4 Video: Ongoing
11 Usability studies
11.2 User studies: Ongoing
Data duplication issue: create a plan for how to best display/represent the duplicates in the metadata records, and set up a meeting to discuss the workflow for displaying the duplicates in the metadata records (Jeff/Anu)
Additional field in index
Generation of benchmark for the dataset: Completed
14 Relationship Network Graph
15 Collaborative research support
27 Other issues
- Please deposit code in GitHub. Please contact me at if you need access hp
- Any other issues?
Thank You
More informationRetrieval of Highly Related Documents Containing Gene-Disease Association
Retrieval of Highly Related Documents Containing Gene-Disease Association K. Santhosh kumar 1, P. Sudhakar 2 Department of Computer Science & Engineering Annamalai University Annamalai Nagar, India. santhosh09539@gmail.com,
More informationwarwick.ac.uk/lib-publications
Original citation: Zhao, Lei, Lim Choi Keung, Sarah Niukyun and Arvanitis, Theodoros N. (2016) A BioPortalbased terminology service for health data interoperability. In: Unifying the Applications and Foundations
More informationNCI Thesaurus, managing towards an ontology
NCI Thesaurus, managing towards an ontology CENDI/NKOS Workshop October 22, 2009 Gilberto Fragoso Outline Background on EVS The NCI Thesaurus BiomedGT Editing Plug-in for Protege Semantic Media Wiki supports
More informationSusanna-Assunta Sansone, PhD. Metadata WG3 chair.
Susanna-Assunta Sansone, PhD Metadata WG3 chair 3-workgroup@biocaddie.org WG3 Metadata v v Full description: goals, synergies, phases, members & files Joint effort with BD2K Center for Expanded Data Annotation
More informationOmics Discovery Index Discovering and Linking Public Omics Datasets
Omics Discovery Index Discovering and Linking Public Omics Datasets Yasset Perez-Riverol a,,*, Mingze Bai a,b,, Felipe da Veiga Leprevost c, Silvano Squizzato a, Young Mi Park a, Kenneth Haug a, Adam J.
More informationThe ELIXIR of Linked Data
The ELIXIR of Linked Data Professor Carole Goble (UK node) Barend Mons (NL node), Helen Parkinson (EMBL-EBI node) The Interoperability Services Backbone Team European Life Sciences Infrastructure for Biological
More informationUpdate on Dataverse Dryad-Dataverse Community Meeting. Mercè Crosas, Elizabeth Quigley & Eleni Castro. Data Science > IQSS > Harvard University
Update on Dataverse Image credit: David Bygott (CC-BY-NC-SA) 2014 Dryad-Dataverse Community Meeting Mercè Crosas, Elizabeth Quigley & Eleni Castro Data Science > IQSS > Harvard University Introduction
More informationCDIS Biomedical Data Commons
CDIS Biomedical Data Commons Computational Life Science Seminar Series October 18, 2017 Michael Fitzsimons Center for Data Intensive Science Agenda What is a Data Commons? Data Commons at CDIS NCI GDC
More informationUC San Diego UC San Diego Electronic Theses and Dissertations
UC San Diego UC San Diego Electronic Theses and Dissertations Title Information Retrieval in Biomedical Research: From Articles to Datasets Permalink https://escholarship.org/uc/item/660390nr Author Wei,
More informationWSU-IR at TREC 2015 Clinical Decision Support Track: Joint Weighting of Explicit and Latent Medical Query Concepts from Diverse Sources
WSU-IR at TREC 2015 Clinical Decision Support Track: Joint Weighting of Explicit and Latent Medical Query Concepts from Diverse Sources Saeid Balaneshin-kordan, Alexander Kotov, and Railan Xisto Department
More informationRoy Lowry, Gwen Moncoiffe and Adam Leadbetter (BODC) Cathy Norton and Lisa Raymond (MBLWHOI Library) Ed Urban (SCOR) Peter Pissierssens (IODE Project
Roy Lowry, Gwen Moncoiffe and Adam Leadbetter (BODC) Cathy Norton and Lisa Raymond (MBLWHOI Library) Ed Urban (SCOR) Peter Pissierssens (IODE Project Office) Linda Pikula (IODE GEMIM/NOAA Library) Data
More informationBig Data in Translational Science
Big Data in Translational Science Albert Wang Associate Director, Translational R&D IT Bristol-Myers Squibb 2015 AAPS Annual Meeting Agenda Perspectives on Big Data Big Data in Translational R&D Selected
More informationMaximizing Public Data Sources for Sequencing and GWAS
Maximizing Public Data Sources for Sequencing and GWAS February 4, 2014 G Bryce Christensen Director of Services Questions during the presentation Use the Questions pane in your GoToWebinar window Agenda
More informationThis is the author s version of a work that was submitted/accepted for publication in the following source:
This is the author s version of a work that was submitted/accepted for publication in the following source: Koopman, Bevan, Bruza, Peter, Sitbon, Laurianne, & Lawley, Michael (2011) AEHRC & QUT at TREC
More informationTSRI, 400-S PubMed / MyNCBI
TSRI, 400-S helplib@scripps.edu 858-784-8705 PubMed / MyNCBI My NCBI is a free service available in PubMed (and all other NCBI databases) that allows you to save searches, set up email alerts for search
More informationBioqueries: A Social Community Sharing Experiences while Querying Biological Linked Data (
Bioqueries: A Social Community Sharing Experiences while Querying Biological Linked Data (http://bioqueries.uma.es) María Jesús García-Godoy, Ismael Navas-Delgado, José Francisco Aldana Montes Computing
More informationHuman Disease Models Tutorial
Mouse Genome Informatics www.informatics.jax.org The fundamental mission of the Mouse Genome Informatics resource is to facilitate the use of mouse as a model system for understanding human biology and
More informationIntegrated Access to Biological Data. A use case
Integrated Access to Biological Data. A use case Marta González Fundación ROBOTIKER, Parque Tecnológico Edif 202 48970 Zamudio, Vizcaya Spain marta@robotiker.es Abstract. This use case reflects the research
More informationClinVar. Jennifer Lee, PhD, NCBI/NLM/NIH ClinVar
ClinVar What is ClinVar ClinVar is a freely available, central archive for associating observed variation with supporting clinical and experimental evidence for a wide range of disorders. The database
More informationRLIMS-P Website Help Document
RLIMS-P Website Help Document Table of Contents Introduction... 1 RLIMS-P architecture... 2 RLIMS-P interface... 2 Login...2 Input page...3 Results Page...4 Text Evidence/Curation Page...9 URL: http://annotation.dbi.udel.edu/text_mining/rlimsp2/
More informationProbabilistic and machine learning-based retrieval approaches for biomedical dataset retrieval
robabilistic and machine learning-based retrieval approaches for biomedical dataset retrieval ayam Karisani, Emory University Zhaohui Qin, Emory University Eugene Agichtein, Emory University Journal Title:
More informationApplied Bioinformatics
Applied Bioinformatics Course Overview & Introduction to Linux Bing Zhang Department of Biomedical Informatics Vanderbilt University bing.zhang@vanderbilt.edu What is bioinformatics Bio Bioinformatics
More informationBioNav: An Ontology-Based Framework to Discover Semantic Links in the Cloud of Linked Data
BioNav: An Ontology-Based Framework to Discover Semantic Links in the Cloud of Linked Data María-Esther Vidal 1, Louiqa Raschid 2, Natalia Márquez 1, Jean Carlo Rivera 1, and Edna Ruckhaus 1 1 Universidad
More informationOpenAIRE. Fostering the social and technical links that enable Open Science in Europe and beyond
Alessia Bardi and Paolo Manghi, Institute of Information Science and Technologies CNR Katerina Iatropoulou, ATHENA, Iryna Kuchma and Gwen Franck, EIFL Pedro Príncipe, University of Minho OpenAIRE Fostering
More informationOntrez Project Report National Center for Biomedical Ontology November, 2007
Ontrez Project Report National Center for Biomedical Ontology November, 2007 Executive summary Currently, genomics data and data repositories in the public domain are expanding at an explosive pace. 1
More informationdr.ir. D. Hiemstra dr. P.E. van der Vet
dr.ir. D. Hiemstra dr. P.E. van der Vet Abstract Over the last 20 years genomics research has gained a lot of interest. Every year millions of articles are published and stored in databases. Researchers
More information