Developing DataMed: the current status
1 Developing DataMed: the current status
Hua Xu, Core Development Team (CDT)
bioCADDIE AHM, 8/8/17
Supported by the NIH grant 1U24 AI to the University of California, San Diego
2 Outline
- CDT roles and roadmap
- DataMed
- General architecture
- Search engine
- New features
- Next steps
3 Roles of CDT
- Develop a functional prototype of DDI: DataMed
- Implement guidelines/suggestions by WGs
- Integrate systems/modules developed by pilot projects and supplements
- Engage end users in DataMed development and evaluation
4 CDT Roadmap
[Figure: release timeline covering V0.1, V0.2, V0.5, V1.0, V2.0 and V3.0. Items shown on the timeline, in the order they appear in the transcription:]
- DDI architecture, data ingestion, search function, feedback collection, data identifier, data indexing, dataset result display
- Terminology server, search engine, interface design, usability needs analysis, ranking algorithm, search function, architecture
- Wrap-up of pilot projects; RFA for pilot on Harvester for DDI schema
- Usability study phase I, ranking, metadata management
- Personalized search, link dataset to external resources, search algorithm, metadata ingestion, import repositories, repository submission form, map metadata to DATS model 2.x, NLP-based indexing/searching
- Documentation, usability study phases II & III, data duplication issue, generation of benchmark datasets, terminology server (indexing), visualization, personalized search, improve the tracking system
- Search/ranking algorithms, similar datasets to be expanded, display of results, sort datasets, additional filters
- Pilot project integration, usability study, Web API
5 Accomplishments
DataMed releases up to the present: V1.5 (Nov 2016), V2.0 (Feb 2017), V3.0 (Jul 2017)
- Data ingestion
  - Ingestion of 74 data repositories
  - Implementation of DATS 2.2 metadata model
  - Enhancement modules to indexing pipeline
- Search engine
  - NLP service
  - Terminology service
  - ElasticSearch optimization
- Other functionalities
- Evaluation
  - Benchmark data set / Data Retrieval challenge
  - User survey
6 DataMed Architecture
[Figure: architecture diagram. Labels: Data Sources (Repositories, Online datasets), Ingestion, Indexing, Metadata Ingestion, ElasticSearch, User Interface, Funding Agencies, Publishers, Data producers, Terminology & NLP server]
7 Search engine architecture
- Model: responsible for constructing the ES query and retrieving search results.
- View: user interaction; responsible for rendering the model.
- Controller: responsible for responding to user input, e.g., generating the corresponding search keywords, search fields, and facet fields.
8 Search engine workflow
[Figure: workflow diagram. Labels: Query, Concept extraction (NLP server), Synonym expansion (Terminology server), Refined query, Search (ElasticSearch), Facets, Ranked results]
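The workflow on this slide (query, concept extraction, synonym expansion, refined query) can be sketched roughly as follows. This is a minimal illustration, not DataMed's code: the lexicon, the synonym table, the `description` field name, and all function names are hypothetical stand-ins for the NLP server and terminology server.

```python
from typing import Dict, List

# Hypothetical stand-ins for the NLP server and the terminology server.
CONCEPT_LEXICON = {"breast cancer": "disease", "brca1": "gene"}
SYNONYMS = {"breast cancer": ["breast carcinoma", "mammary cancer"]}


def extract_concepts(query: str) -> List[str]:
    """Return lexicon phrases that occur in the query, longest phrase first."""
    text = query.lower()
    phrases = sorted(CONCEPT_LEXICON, key=len, reverse=True)
    return [p for p in phrases if p in text]


def expand(concepts: List[str]) -> Dict[str, List[str]]:
    """Attach known synonyms to each extracted concept."""
    return {c: [c] + SYNONYMS.get(c, []) for c in concepts}


def refine_query(query: str) -> Dict:
    """Build an Elasticsearch-style query body from the expanded concepts."""
    expanded = expand(extract_concepts(query))
    should = [
        {"match_phrase": {"description": term}}  # field name is an assumption
        for terms in expanded.values()
        for term in terms
    ]
    if not should:
        # No concept recognized: fall back to a plain full-text match.
        return {"query": {"match": {"description": query}}}
    return {"query": {"bool": {"should": should, "minimum_should_match": 1}}}
```

Calling `refine_query("BRCA1 in breast cancer")` yields a `bool` query with one `match_phrase` clause per concept or synonym; a query with no recognized concept falls through to a plain `match`.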
9 What's new: NLP server
- Goal: leverage NLP approaches to extract biomedical concepts
  - General biomedical concepts → MeSH
  - Five specific types of entities: disease, chemical, gene, biological process, and cell line
- Uses
  - Processing user-entered queries
  - Indexing metadata textual fields (e.g., description)
- Implementation: a mixture of existing tools and locally developed systems
  - MetaMap Lite
  - Machine-learning-based NER systems (e.g., based on CRF)
  - Rule-based systems (e.g., dictionary lookup)
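As an illustration of the rule-based option mentioned above, a dictionary lookup tagger can be sketched as a greedy longest-match scan over tokens. The dictionary entries and the `tag` function are hypothetical; the slide does not describe how DataMed's lookup is implemented.

```python
import re
from typing import List, Tuple

# Toy dictionary keyed by token tuples; entries are illustrative only.
DICTIONARY = {
    ("breast", "cancer"): "disease",
    ("aspirin",): "chemical",
    ("hela",): "cell line",
}
MAX_LEN = max(len(key) for key in DICTIONARY)


def tag(text: str) -> List[Tuple[str, str]]:
    """Greedy longest-match dictionary lookup over lowercased tokens."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    found: List[Tuple[str, str]] = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate span first, then shorter ones.
        for n in range(min(MAX_LEN, len(tokens) - i), 0, -1):
            span = tuple(tokens[i:i + n])
            if span in DICTIONARY:
                found.append((" ".join(span), DICTIONARY[span]))
                i += n
                break
        else:
            i += 1  # no entry starts at this token
    return found
```

The same function could serve both uses listed on the slide: tagging a user query and tagging a metadata description before indexing.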
10 NLP evaluation
Category | Method | Precision / recall | Note
MeSH term | MetaMap Lite | |
Gene | CRF | 89.95% / 50.75% | Trained on PubMed corpus, tested on dataset description
Disease | CRF | 92.54% / 88.89% | Trained on PubMed corpus, tested on dataset description
Drug | CRF | 91.08% / 69.06% | Trained on PubMed corpus, tested on dataset description
Cell line | CRF | | Insufficient entities for evaluation
Biological process | Dictionary lookup | 93.96% / 76.8% | Trained on corpus provided in QuickGO, tested on dataset description
11 NLP's effect on search
[Table: No NLP vs. with NLP expansion; metrics infAP, infNDCG, P@]
12 What's new: Terminology server
- Goal: leverage existing terminologies in the biomedical domain to facilitate search
- Uses
  - Term expansion during indexing by the ingestion pipeline
  - Query expansion for search engine
  - Auto-completion for search engine
  - Spelling correction for search engine
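Two of the uses listed above, auto-completion and spelling correction, can be illustrated with the Python standard library alone. This is a sketch over a tiny hypothetical term list; the actual terminology server and its algorithms are not described at this level in the slides.

```python
import bisect
import difflib
from typing import List

# Hypothetical vocabulary; a real server would hold far more terms.
TERMS = sorted([
    "asthma",
    "aspirin",
    "diabetes mellitus",
    "diabetic retinopathy",
    "glioma",
])


def autocomplete(prefix: str, limit: int = 5) -> List[str]:
    """Return up to `limit` terms starting with `prefix`, using binary search."""
    prefix = prefix.lower()
    start = bisect.bisect_left(TERMS, prefix)
    matches: List[str] = []
    for term in TERMS[start:]:
        if not term.startswith(prefix):
            break  # sorted order: no later term can match
        matches.append(term)
        if len(matches) == limit:
            break
    return matches


def correct(word: str) -> List[str]:
    """Suggest the closest vocabulary term for a possibly misspelled word."""
    return difflib.get_close_matches(word.lower(), TERMS, n=1, cutoff=0.8)
```

Here `autocomplete("diab")` returns both diabetes-related terms, and `correct("astma")` suggests `asthma`. `difflib` uses a sequence-similarity ratio, which is only one of several reasonable choices for spelling correction.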
13 Terminology server implementation
Sources: Condition, Chemical, Gene, Procedure, Lab, Anatomy, Organism
Terminology source | Concepts | Relationships
NCBI taxonomy | 906,782 | 1,813,962
SNOMEDCT_US | 315,904 | 5,414,108
MESH | 254,883 | 2,947,524
FMA | 83,282 | 1,105,781
GO | 39,269 | 2,341,646
HGNC | 38, | ,996
TOTAL | 1,725,407 | 7,578,017
Techniques
- Neo4j: graph database
- SciGraph: interface (RESTful API) for Neo4j
- Response time < 0.1 sec
14 What's new: Ranking algorithm
- Current ranking algorithms
  - ElasticSearch's implementation of the probabilistic relevance model Okapi BM25
  - Ranking by citation count (e.g., for GEO) and by publication time
- Future implementation
  - New relevance ranking algorithms from the Dataset Retrieval Challenge: blind relevance feedback algorithm and learning-to-rank algorithm
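For readers unfamiliar with Okapi BM25, the scoring idea can be sketched in a few lines. This is a textbook-style formulation with the commonly used defaults k1 = 1.2 and b = 0.75, written for illustration; it is not Elasticsearch's exact implementation, and the function name and toy documents are assumptions.

```python
import math
from collections import Counter
from typing import List


def bm25_scores(query: List[str], docs: List[List[str]],
                k1: float = 1.2, b: float = 0.75) -> List[float]:
    """Score each tokenized document against the query terms with BM25."""
    n_docs = len(docs)
    avgdl = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if tf[term] == 0:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Length normalization: longer documents need more matches.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores
```

Rare query terms contribute more through the inverse document frequency, and repeated occurrences of a term give diminishing returns, which is why a document matching two distinct query terms can outrank one that repeats a single common term.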
15 Highlights of DataMed 3.0
- More repositories: 74
- Web API
- Fine-grained user tracking system
- Reporting of broken links
- Duplicated datasets
- Visualize data statistics (Diploid project)
- Schema.org markup
16 Dataset statistics
17 Web API: API for DataMed [Figure: REQUEST → API → DataMed → RESULT]
18 User tracking system: search term
19 User tracking system: list and rank order of results returned in the page
20 User tracking system: results and position in page clicked by user; scroll position
21 User tracking system: Uses
- Query logs for analyses
- Usability studies
- Information for most accessed data
- Information for ranking by user feedback
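To make the tracked signals concrete (search term, ranked result list, clicked result, scroll position), the following sketch shows one way such log records could be represented and summarized. The `SearchEvent` class, its field names, and both helper functions are hypothetical and are not taken from DataMed's tracking system.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class SearchEvent:
    """One logged search interaction (all field names are assumptions)."""
    search_term: str
    results: List[str]              # dataset IDs in the rank order shown
    clicked: Optional[str] = None   # dataset ID the user clicked, if any
    scroll_position: int = 0        # e.g. pixels scrolled on the results page


def most_accessed(events: List[SearchEvent], top: int = 3) -> List[Tuple[str, int]]:
    """Count clicks per dataset to find the most accessed datasets."""
    clicks = Counter(e.clicked for e in events if e.clicked is not None)
    return clicks.most_common(top)


def mean_click_rank(events: List[SearchEvent]) -> Optional[float]:
    """Average 1-based rank of clicked results, a simple ranking-quality signal."""
    ranks = [e.results.index(e.clicked) + 1
             for e in events
             if e.clicked is not None and e.clicked in e.results]
    return sum(ranks) / len(ranks) if ranks else None
```

A lower mean click rank suggests users find what they want near the top of the list, which is the kind of feedback that could inform ranking.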
22 Broken links
23 Duplicated datasets
24 User statistics
25 Publications (10/2016-present): Papers
- Dong X, Zhang Y, Xu H. Search Datasets in Literature: A Case Study of GWAS. AMIA CRI/TBI Symposium, San Francisco, 2017.
- Cohen T, Roberts K, Gururaj AE, Chen X, Pournejati S, Hersh WR, Demner-Fushman D, Ohno-Machado L, Xu H. A Publicly Available Benchmark for Biomedical Dataset Retrieval: The Reference Standard for the 2016 bioCADDIE Dataset Retrieval Challenge. Database, 2017 (Accepted).
- Roberts K, Gururaj AE, Chen X, Pournejati S, Hersh WR, Demner-Fushman D, Ohno-Machado L, Cohen T, Xu H. Information Retrieval for Biomedical Datasets: The 2016 bioCADDIE Dataset Retrieval Challenge. Database, 2017 (Under Review).
- Chen X, Liu R, Ozyurt B, Gururaj AE, Soysal E, Cohen T, Tiryaki F, Li Y, Zong N, Jiang M, Rogith D, Salimi M, Kim H, Rocca-Serra P, Gonzalez-Beltran A, Farcas C, Johnson T, Margolis R, Alter G, Sansone S-A, Fore I, Ohno-Machado L, Grethe J, Xu H, Bell E. Building a search engine for finding biomedical datasets across repositories: the DataMed system. JAMIA, 2017 (Under Review).
- Dixit R, Rogith D, Narayana V, Salimi M, Gururaj AE, Ohno-Machado L, Xu H, Johnson T. User Needs Analysis and Usability Assessment of DataMed, a Biomedical Data Discovery Index. JAMIA, 2017 (Under Review).
26 Publications (10/2016-present): Abstracts/Presentations
- Systems Demo, AMIA Annual Symposium, 2016
- Ingestion & Indexing Pipeline Abstract, AMIA Annual Symposium, 2016
- DataMed Abstract, ICBO 2016
- DataMed Abstract, BD2K AHM
- DataMed NLP Pipeline Abstract, AMIA Annual Symposium, Accepted
- bioCADDIE Dataset Retrieval Challenge Abstract, AMIA Annual Symposium, Accepted
27 Next Steps
- Data ingestion
  - Expand to 100+ data repositories, with DATS 2.2
  - Develop automated tools for data ingestion
- Search engine
  - Ranking algorithms: implementation of pilot projects (Emory and UIUC)
  - Deep search
  - Common data elements (CDE)
- Other new functionalities
  - Update Advanced Search
  - Display most accessed datasets
  - Word cloud (search results visualization)
  - Alerts for new results from saved searches
- Evaluation and promotion
  - User and performance evaluation
28 CDT members
- UCSD: Claudiu Farcas, Jeffrey Grethe, Hyeoneui Kim, Nansu Zong, Stephen Trac, Yueling Li, Larry Lui, Burak Ozyurt
- Oxford: Alejandra Gonzalez-Beltran, Philippe Rocca-Serra, Susanna-Assunta Sansone
- NIH: Ian Fore
- UTHealth: Xiaoling Chen, Firat Tiryaki, Trevor Cohen, Anupama Gururaj, Todd Johnson, Deevakar Rogith, Nina Salimi, Ergin Soysal, Cui Tao, Hua Xu, Ram Dixit
- Pilot projects team members
- Previous members: Pratik Chaudhary, Ruiling Liu, Ron Margolis, Saeid Pournejati, Stephanie Ngyuen
29 Thank you! For questions, please contact CDT at:
30 Preliminary work on integrating CDE with DataMed
Hua Xu, PhD
31 Dimensions of DATS
- Dimension: meant to be used to report what data points are about in a dataset, their nature, their units. Dimension must be typed.
- bioCADDIE DATS Dimension: as discussed with the Google schema.org team, who added the variableMeasured property in schema.org to cover the notion.
- Application: list all measurement types performed in a clinical trial
- Purpose: allow indexing on measurement types / variables / dimensions
32 Dimensions in DataMed
As of May 2017: 66 repositories, 1,375,801 datasets in DataMed.
Collected dimension information for 4 repositories:
- ImmPort (205/222)
- LINCS (286/287)
- MPD (376/376)
- NeuroMorpho (50356/50356)
33 Mapping between Dimension and CDE
- NIH CDE Repositories: 20,189 CDEs across all initiatives, 19,985 unique CDEs. Examples:
  - Smoking status
  - Polysomnography blood oxygen saturation distribution percent value
  - In the past 7 days I felt comfortable with others my age
  - Thinking about your child's life, My child thinks his/her life has purpose.
- Dimension in DataMed: 3083 unique dimension names from 4 repositories. Examples:
  - Blood Cell Count with Differential
  - Protein measurement
  - Skin lesion involved and skin lesion duration
  - total horizontal distance traveled (immediately to 15 min post-injection), baseline
- Overlap (exact matching): 6 only
  - Smoking status, Medical history, Body weight, Age, Heart rate, Vital capacity
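The exact-matching overlap reported on this slide amounts to a set intersection after some normalization. The sketch below assumes case-insensitive matching with collapsed whitespace; the slides do not say which normalization, if any, was actually applied, and the function names are hypothetical.

```python
from typing import Iterable, List


def normalize(name: str) -> str:
    """Lowercase and collapse whitespace so trivially different spellings match."""
    return " ".join(name.lower().split())


def exact_overlap(dimensions: Iterable[str], cdes: Iterable[str]) -> List[str]:
    """Return the normalized names that occur in both collections."""
    dim_names = {normalize(d) for d in dimensions}
    cde_names = {normalize(c) for c in cdes}
    return sorted(dim_names & cde_names)
```

For example, `exact_overlap(["Smoking status", "Protein measurement"], ["smoking status", "Heart rate"])` returns `["smoking status"]`. Exact matching cannot pair a dimension such as "hdl cholesterol" with a CDE such as "hdl cholesterol value", which helps explain why so few names overlap.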
34 Mapping Dimension to CDE
Dimension name | CDE
axial length, right eye | axial length right eye
cardiac output | cardiac output measurement
hdl cholesterol | hdl cholesterol value
left ventricle ejection fraction | echocardiogram left ventricle ejection fraction measurement
serum vitamin d concentration | person serum vitamin d level number
whole body bone mineral density (bmd) | dual x-ray absorptiometry whole body bone mineral density value
Observations
- Many datasets do not use CDE
- Some CDEs are represented as a combination of several concepts, expressed as sentences
- Dimension and CDE may be related concepts at different granularity levels
- Synonyms and abbreviations may be involved.
35 System architecture
[Figure: flowchart. Backend: Metadata (Dimension) → NLP-based CDE encoding service → Metadata enrichment with CDE identifiers. Query → NLP-based CDE encoding service → Has CDE identifier? Yes: boosts applied to search CDE identifier in CDE enrichment field. No: DataMed default search algorithm. Both paths lead to Search results.]
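The decision step in this flowchart (boost on the CDE enrichment field when the query maps to a CDE identifier, otherwise use the default search) could look roughly like the following query builder. It emits plain Elasticsearch query DSL as a Python dictionary; the field names `title`, `description`, and `cde_ids`, the boost value, and the sample identifier are all assumptions made for illustration.

```python
from typing import Dict, Optional


def build_query(text: str, cde_id: Optional[str] = None, boost: float = 2.0) -> Dict:
    """Build a search body, boosting datasets enriched with the given CDE identifier."""
    # Default full-text search over hypothetical metadata fields.
    base = {"multi_match": {"query": text, "fields": ["title", "description"]}}
    if cde_id is None:
        # No CDE identifier recognized in the query: default search only.
        return {"query": base}
    return {
        "query": {
            "bool": {
                "must": [base],
                # Optional clause: matching documents score higher.
                "should": [
                    {"term": {"cde_ids": {"value": cde_id, "boost": boost}}}
                ],
            }
        }
    }
```

Placing the CDE clause under `should` keeps recall unchanged: datasets without the identifier are still returned, but enriched datasets rank higher.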
36 Preliminary results
Initial similarity algorithm based on Elasticsearch. Examples:
Dimension name | Mapped to CDE
axial length, left eye | axial length left eye
total cholesterol | total cholesterol value
hdl cholesterol | hdl cholesterol value
brain weight | brain weight measurement
pulse rate | pulse rate measurement value
left ventricle diastolic volume | echocardiogram left ventricle end diastolic volume measurement
non-hdl cholesterol | hdl cholesterol value
body weight at dissection | body weight at birth
- A lot of room for improvement
- The NLP CDE encoding algorithm can be applied to other applications where CDE recognition is needed.
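The imperfect mappings in the table (for example, "non-hdl cholesterol" mapped to "hdl cholesterol value") are the kind of result a purely token-based similarity tends to produce. The sketch below uses Jaccard token overlap as a simple stand-in; it is not the Elasticsearch-based algorithm used in the work, and the function names are hypothetical.

```python
import re
from typing import List, Set


def tokens(name: str) -> Set[str]:
    """Split on non-alphanumeric characters, so 'non-hdl' becomes {'non', 'hdl'}."""
    return set(re.findall(r"[a-z0-9]+", name.lower()))


def jaccard(a: str, b: str) -> float:
    """Token-set overlap between two names, from 0.0 to 1.0."""
    ta, tb = tokens(a), tokens(b)
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0


def best_cde(dimension: str, cdes: List[str]) -> str:
    """Pick the CDE name with the highest token overlap with the dimension name."""
    return max(cdes, key=lambda cde: jaccard(dimension, cde))
```

With the candidate CDEs from the table, `best_cde("non-hdl cholesterol", ...)` selects "hdl cholesterol value" because the negating prefix is treated as just another token, which illustrates why concept-aware encoding leaves room for improvement over lexical similarity.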
37 Demo
CDE enrichment using the algorithm for 4 repositories. ex.php
"hdl cholesterol" will be mapped to "hdl cholesterol value" and suggested for a phrase search in the dimension field. anced.php
More informationNCBI News, November 2009
Peter Cooper, Ph.D. NCBI cooper@ncbi.nlm.nh.gov Dawn Lipshultz, M.S. NCBI lipshult@ncbi.nlm.nih.gov Featured Resource: New Discovery-oriented PubMed and NCBI Homepage The NCBI Site Guide A new and improved
More informationExploring the Query Expansion Methods for Concept Based Representation
Exploring the Query Expansion Methods for Concept Based Representation Yue Wang and Hui Fang Department of Electrical and Computer Engineering University of Delaware 140 Evans Hall, Newark, Delaware, 19716,
More informationA Technical Introduction to the Semantic Search Engine SeMedico
Talk in the Semesterprojekt Entwicklung einer Suchmaschine für Alternativmethoden zu Tierversuchen January 12, 2018 Humboldt-Universität zu Berlin A Technical Introduction to the Semantic Search Engine
More informationRenae Barger, Executive Director NN/LM Middle Atlantic Region
Renae Barger, Executive Director NN/LM Middle Atlantic Region rbarger@pitt.edu http://nnlm.gov/mar/ DANJ Meeting, November 4, 2011 Advanced PubMed (20 min) General Information PubMed Citation Types Automatic
More informationWhat is Text Mining? Sophia Ananiadou National Centre for Text Mining University of Manchester
National Centre for Text Mining www.nactem.ac.uk University of Manchester Outline Aims of text mining Text Mining steps Text Mining uses Applications 2 Aims Extract and discover knowledge hidden in text
More informationCTSA Program Common Metric for Informatics Solutions
CTSA Program Common Metric for Informatics Solutions KRISTI HOLMES, PHD DIRECTOR OF EVALUATION, NUCATS DIRECTOR, GALTER HEALTH SCIENCES LIBRARY & LEARNING CENTER NORTHWESTERN UNIVERSITY CTSA PROGRAM STEERING
More informationGuide to Managing Common Metadata
IBM InfoSphere Information Serer Version 11 Release 3 Guide to Managing Common Metadata SC19-4297-01 IBM InfoSphere Information Serer Version 11 Release 3 Guide to Managing Common Metadata SC19-4297-01
More informationIBM Agent Builder Version User's Guide IBM SC
IBM Agent Builder Version 6.3.5 User's Guide IBM SC32-1921-17 IBM Agent Builder Version 6.3.5 User's Guide IBM SC32-1921-17 Note Before you use this information and the product it supports, read the information
More informationExploring and Exploiting the Biological Maze. Presented By Vidyadhari Edupuganti Advisor Dr. Zoe Lacroix
Exploring and Exploiting the Biological Maze Presented By Vidyadhari Edupuganti Advisor Dr. Zoe Lacroix Motivation An abundance of biological data sources contain data about scientific entities, such as
More informationSheffield University and the TREC 2004 Genomics Track: Query Expansion Using Synonymous Terms
Sheffield University and the TREC 2004 Genomics Track: Query Expansion Using Synonymous Terms Yikun Guo, Henk Harkema, Rob Gaizauskas University of Sheffield, UK {guo, harkema, gaizauskas}@dcs.shef.ac.uk
More informationJanuary 16, Re: Request for Comment: Data Access and Data Sharing Policy. Dear Dr. Selby:
Dr. Joe V. Selby, MD, MPH Executive Director Patient-Centered Outcomes Research Institute 1828 L Street, NW, Suite 900 Washington, DC 20036 Submitted electronically at: http://www.pcori.org/webform/data-access-and-data-sharing-policypublic-comment
More informationMaking data publication a first class research output
Making data publication a first class research output Andrew L. Hufton Managing Editor, Scientific Data https://www.nature.com/sdata/ Helping Researchers Publish, University of Cambridge, Oct 2017 Launched
More informationMulti-Backpropagation Network In Medical Diagnosis
Multi-Bacpropagation Netor In Medical Diagnosis Wan Hussain Wan Isha, Fadilah Sira, Abu Talib Othman School of Information Technology, Uniersiti Utara Malaysia, 0600 Sinto, Kedah, MALAYSIA Email: {hussain;
More informationQuick Reference Guide. Biomedical Answers
Quick Reference Guide Biomedical Answers www.embase.com .... 3 - Homepage... 4.... 5 - Search Forms... 6 - Refine... 8 - Using Emtree... 9 3.... - Reviewing Records... - Preview Abstracts and Index Terms...
More informationMass Spec Data Post-Processing Software. ClinProTools. Wayne Xu, Ph.D. Supercomputing Institute Phone: Help:
Mass Spec Data Post-Processing Software ClinProTools Presenter: Wayne Xu, Ph.D Supercomputing Institute Email: Phone: Help: wxu@msi.umn.edu (612) 624-1447 help@msi.umn.edu (612) 626-0802 Aug. 24,Thur.
More informationManaging your metadata efficiently - a structured way to organise and frontload your analysis and submission data
Paper TS06 Managing your metadata efficiently - a structured way to organise and frontload your analysis and submission data Kirsten Walther Langendorf, Novo Nordisk A/S, Copenhagen, Denmark Mikkel Traun,
More informationWeb of Science. Platform Release Nina Chang Product Release Date: December 10, 2017 EXTERNAL RELEASE DOCUMENTATION
Web of Science EXTERNAL RELEASE DOCUMENTATION Platform Release 5.27 Nina Chang Product Release Date: December 10, 2017 Document Version: 1.0 Date of issue: December 7, 2017 RELEASE OVERVIEW The following
More informationTivoli IBM Tivoli Advanced Catalog Management for z/os
Tioli IBM Tioli Adanced Catalog Management for z/os Version 2.2.0 Monitoring Agent User s Guide SC23-9818-00 Tioli IBM Tioli Adanced Catalog Management for z/os Version 2.2.0 Monitoring Agent User s Guide
More informationFunding from the Robert Wood Johnson Foundation s Public Health Services & Systems Research Program (grant ID #71597 to Martin and Birkhead)
1 Funding from the Robert Wood Johnson Foundation s Public Health Services & Systems Research Program (grant ID #71597 to Martin and Birkhead) Coauthors: Gus Birkhead, Natalie Helbig, Jennie Law, Weijia
More informationUsing Relations for Identification and Normalization of Disorders: Team CLEAR in the ShARe/CLEF 2013 ehealth Evaluation Lab
Using Relations for Identification and Normalization of Disorders: Team CLEAR in the ShARe/CLEF 2013 ehealth Evaluation Lab James Gung University of Colorado, Department of Computer Science Boulder, CO
More informationIBM Tivoli Storage Manager Version Optimizing Performance IBM
IBM Tioli Storage Manager Version 7.1.6 Optimizing Performance IBM IBM Tioli Storage Manager Version 7.1.6 Optimizing Performance IBM Note: Before you use this information and the product it supports,
More informationIBM Netcool Operations Insight Version 1 Release 4.1. Integration Guide IBM SC
IBM Netcool Operations Insight Version 1 Release 4.1 Integration Guide IBM SC27-8601-08 Note Before using this information and the product it supports, read the information in Notices on page 403. This
More informationBiomedical literature mining for knowledge discovery
Biomedical literature mining for knowledge discovery REZARTA ISLAMAJ DOĞAN National Center for Biotechnology Information National Library of Medicine Outline Biomedical Literature Access Challenges in
More informationDOCUMENT RETRIEVAL USING A PROBABILISTIC KNOWLEDGE MODEL
DOCUMENT RETRIEVAL USING A PROBABILISTIC KNOWLEDGE MODEL Shuguang Wang Intelligent Systems Program, University of Pittsburgh, Pittsburgh, PA, USA swang@cs.pitt.edu Shyam Visweswaran Department of Biomedical
More informationTools and Infrastructure for Supporting Enterprise Knowledge Graphs
Tools and Infrastructure for Supporting Enterprise Knowledge Graphs Sumit Bhatia, Nidhi Rajshree, Anshu Jain, and Nitish Aggarwal IBM Research sumitbhatia@in.ibm.com, {nidhi.rajshree,anshu.n.jain}@us.ibm.com,nitish.aggarwal@ibm.com
More informationClinVar. Jennifer Lee, PhD, NCBI/NLM/NIH ClinVar
ClinVar What is ClinVar ClinVar is a freely available, central archive for associating observed variation with supporting clinical and experimental evidence for a wide range of disorders. The database
More informationRetrieval of Highly Related Documents Containing Gene-Disease Association
Retrieval of Highly Related Documents Containing Gene-Disease Association K. Santhosh kumar 1, P. Sudhakar 2 Department of Computer Science & Engineering Annamalai University Annamalai Nagar, India. santhosh09539@gmail.com,
More informationDB2 Universal Database for z/os
DB2 Uniersal Database for z/os Version 8 What s New? GC18-7428-02 DB2 Uniersal Database for z/os Version 8 What s New? GC18-7428-02 Note Before using this information and the product it supports, be sure
More informationiplanetwebserveruser sguide
IBM Tioli Monitoring for Web Infrastructure iplanetwebsereruser sguide Version 5.1.0 SH19-4574-00 IBM Tioli Monitoring for Web Infrastructure iplanetwebsereruser sguide Version 5.1.0 SH19-4574-00 Note
More information