Core Technology Development Team Meeting

To hear the meeting, you must call in.
Toll-free phone number: 1-866-740-1260
Access Code: 2201876
For international call-in numbers, please visit: https://www.readytalk.com/account-administration/international-numbers

Agenda
- Updates on action items
- User testing: preliminary results and recommendations
- Plans for additional user base expansion
- Features for DataMed v3.0
- Updates from all team members

Supported by the NIH grant 1U24 AI117966-01 to the University of California, San Diego

Updates: action items
- Complete re-indexing with updated NLP pipeline
- Data Type Ontology
- Schema.org annotations
- Demo of DataMed at the CDE workshop on May 9
- Have added dimensions to the advanced search/NLP pipeline to extract entities
- Documentation: writing and testing

DataMed Phase III Usability Objectives

Objectives
- Overall perceptions of the usefulness and usability of DataMed for biomedical data discovery.
    - What do participants think DataMed's purpose is?
    - How easy or difficult is it for participants to use DataMed to find relevant datasets?
    - What improvements or changes would participants make to DataMed to help them find datasets more easily?
- Assess the process of generating a search query and finding relevant datasets for 4 gold standard cases.
    - Are participants able to successfully find relevant datasets?
    - How do participants use the search and filtering options?
    - What information and features are useful in identifying relevant datasets?

Methods

Study Design
- Participants were recruited from the bioCADDIE and BD2K mailing lists and from TMC institutions.
- Tests were conducted remotely and in person.
- The study used a qualitative think-aloud approach: participants were asked about their research area and experience using online datasets, then used DataMed to search for relevant datasets, and finally completed a SUS survey at the end of the session.
- The test sessions took approximately 1 hour; participants will be compensated $25 for their time.

Data Collection
- User data: demographics, role, and experience with datasets
- Usage data: keystrokes, clicks, times, and navigation paths
- Interface feedback: think-aloud comments, open-ended reflection questions, suggestions for improvement, SUS survey

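For reference, a minimal sketch of how a single participant's SUS survey is conventionally scored, assuming the standard 10-item questionnaire with 1-5 responses. The slides do not say how the study's surveys were scored, and the function name `sus_score` is only illustrative.

```python
from typing import Sequence


def sus_score(responses: Sequence[int]) -> float:
    """Score one standard 10-item SUS questionnaire (responses on a 1-5 scale).

    Odd-numbered items are positively worded and contribute (response - 1).
    Even-numbered items are negatively worded and contribute (5 - response).
    The summed contributions (0-40) are scaled by 2.5 to a 0-100 range.
    """
    if len(responses) != 10:
        raise ValueError("SUS expects exactly 10 responses")
    if any(r < 1 or r > 5 for r in responses):
        raise ValueError("each response must be between 1 and 5")

    total = 0
    for item_number, response in enumerate(responses, start=1):
        if item_number % 2 == 1:
            total += response - 1
        else:
            total += 5 - response
    return total * 2.5


# Hypothetical responses from one participant, not study data.
example = [4, 2, 5, 1, 4, 2, 4, 2, 5, 1]
print(sus_score(example))  # 85.0
```

Per-participant scores would typically be averaged across the test sessions before being reported.
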
Initial Findings
- What do participants think DataMed's purpose is?
    - Most had a vague notion that DataMed involved searching for data, but it was not clear to many from the homepage what it was, what it contained, who it was intended for, and how or for what they could use it.
    - Some Public Health and Clinical researchers assumed the site would not be useful for them, as it appeared targeted towards hardcore bioinformatics audiences.
    - Almost all participants looked at the Data Types and Repositories pages to gain a better understanding of what the site contained.
    - Upon searching DataMed, some were still unclear whether the results were publications or datasets.

Initial Findings
- Are participants able to successfully find relevant datasets? What information and features are useful in identifying relevant datasets?
    - Some participants in systems biology and cancer genomics found datasets of potential use, though all mentioned they would need to investigate these datasets further.
    - Many participants concluded that DataMed either did not have data useful to them or that they couldn't tell whether DataMed had relevant data.
    - Inconsistent/incomplete metadata (unique needs by research area; common needs: sample, intervention, data characteristics)
    - Ambiguous filtering (data type distinctions, accessibility, logic)
    - Opaque search capabilities (search fields, advanced search, hidden words)

Recommendations: Homepage
- Provide a clearer message of the content and purpose of DataMed:
    1. Relocate the Repositories, New Features, and Pilot Projects from the homepage.
    2. Provide a clear description about what DataMed is, what it contains, and what it can be used for.
        - Improve the Help and About pages to provide further details about what DataMed is, how it works, and what users can use it for.
    3. Display Data Types (and Repositories) on the homepage.
    4. Provide sample queries in the search bar.
    5. Relocate search by dataset/repository to the advanced search.

Recommendations: Search Results
1. Expose the search process to inform users of how their query is executed in DataMed.
    - Displaying steps such as synonym detection and ontology matching will help users understand how the results they are seeing relate to their query.
    - Always surface and highlight relevant search term areas in results.
2. Include icons to clearly indicate, and link directly to, downloadable data.
3. Separate alignment of checkbox, title, and metadata fields.
4. Consider adding user control over metadata fields.
    1. Remove uninformative metadata fields such as ID.
5. Either remove relevance sort or add more sorting options (e.g., popularity, date added).
    - Relevance should be explained in documentation.

Recommendations: Dataset Page
1. Present key information about the dataset (Title, Repository, Date, Description, Access, Download Links) above the page fold.
    1. Other metadata can remain collapsed.
2. Group and organize information to standardize order and meet user expectations.
3. Provide explanations of metadata fields (hover-over).

Jina Huh, Giovanni Troiano, Jing Zhang CDT Meeting (4/11/2017)
(UCSD) Jina Huh, Giovanni Troiano, Jing Zhang 04/06/2017

UCSD Project Timeline
[Timeline chart. Date axis: 1-Apr, 1-May, 31-May, 30-Jun, 30-Jul. Bar labels, in order: 14, 60, 90, 164.]
Tasks shown:
- Heuristic Evaluation
- Interview Larger User Groups and Develop Large Scale Surveys
- Large Scale Online Survey Collection Through Marketing
- Monthly Summaries of Design Recommendation for Iterative

Research Plan
1) Heuristic Evaluation
    Duration: 2 weeks
    Deliverable: A formal report from a heuristic evaluation of the DataMed UI design and interface.
2) Interview Larger User Groups and Develop Large Scale Surveys
    Duration: ~2 months
    Participants: 40-50
    Deliverable: Direct user feedback on DataMed; generation of user workflows and personas.
3) Large Scale Online Survey Collection Through Marketing
    Duration: ~3 months
    Participants: 1000+
    Deliverable: Targeted marketing strategy to get DataMed accessed and used by the wider public.

Personas
[Chart. Series or categories: Finding Related Datasets, Finding Relevant Datasets, Finding New Datasets, DataMed Supports This Action. Axis scale: 0 to 10.]

Features for DataMed v3.0
- https://docs.google.com/spreadsheets/d/15dxpuoezwjb24d1ughkdw8nzzd0qpfwv0qdluyj6jsg/edit#gid=1768187641

Ongoing work

#    Task                                    Status
     Documentation
12   Source code                             Ongoing
13   Tutorials                               Not Started
14   Help menu                               Ongoing
15   Video link on both biocaddie/datamed    Completed
16   FAQ page                                Ongoing
     Usability studies
17   User studies                            Ongoing

Other issues
- Please deposit code in GitHub. Please contact me at Anupama.E.Gururaj@uth.tmc.edu if you need access.
- http://datamedbeta.biocaddie.org/index.php
- Any other issues?
- Thank You