Core Technology Development Team Meeting

1 Core Technology Development Team Meeting
To hear the meeting, you must call in.
Toll-free phone number:
Access Code:
For international call-in numbers, please visit:

2 Agenda
- Harvester PP presentation: Ramkiran Gouripeddi, Biomedical Informatics Research, University of Utah: Metadata Discovery and Integration to Support Repurposing of Heterogeneous Data using the OpenFurther Platform
- Updates of last meeting action items
- Discussion about BD2K AHM demo, meeting with LINCS
- Features/timeline discussion
- Brief updates from all
Supported by the NIH grant 1U24 AI to the University of California, San Diego

3 Metadata Discovery and Integration to Support Repurposing of Heterogeneous Data using the OpenFurther Platform
Meeting with bioCADDIE Core Technology Development Team
October 27th, 2015

4 Team: Peter, Bernie, Ram, Ryan, Phillip, Naresh, Randy, Julio

5 Overview: Proposal Presentation, Background, OpenFurther, Goals, Metadata, Aim 1, Metadata Repository, Aim 2, Evaluation, Future Steps

6 Biomedical Research
[Figure: data sources and research areas: clinical trials; observational; environmental (exposome) data; electronic health record (phenotypes); genomic annotations (genotypes); comparative effectiveness; health services data; population health; socio-economic; proteomics; environmental; genomic; biospecimen; metabolomics]

7 Embedded Syntactics & Semantics: Diagnosis? ICD9? Primary or Secondary Diagnosis?

8 Current State of the Art
- Relies on manual human curation
- Non-scalable
- Limits realization of the full potential of big data technologies in biomedical science.
Table: Final number of metadata (fields/columns) and terminology manual mappings stored in OpenFurther for initial onboarding of laboratory, microbiology and radiology results for 6 hospitals in the PHIS+ network. (Numbers do not include artifacts that were considered as potential matches in the preceding discovery phase.)

9 OpenFurther

10 Goals
- Prototype a computational infrastructure that supports discovery and mapping of metadata artifacts and terminologies to overcome current limitations.
- Automated/semi-automated depending on the confidence threshold chosen at specific implementations.
- Agnostic to specific algorithms or tools, as many of these are domain-specific and dependent on data.
- Choose the best available solution(s) based on performance, and for specific metadata artifacts.
- Leverage existing components of the OpenFurther framework to manage, integrate and share metadata in structured non-proprietary formats.
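The confidence-threshold goal can be illustrated with a minimal sketch. This is not OpenFurther code: the `Candidate` type, the `route` function and the 0.85 default threshold are hypothetical names and values chosen for illustration. Candidate mappings scoring at or above the threshold are accepted automatically; the rest are ranked and queued for expert review.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A proposed mapping between a source field and a target field (hypothetical type)."""
    source_field: str
    target_field: str
    score: float  # matcher confidence in [0, 1]


def route(candidates, threshold=0.85):
    """Split candidates into auto-accepted and needs-review lists.

    The threshold is an assumption; a real deployment would choose it
    per implementation, as the slide suggests.
    """
    accepted, review = [], []
    for c in candidates:
        (accepted if c.score >= threshold else review).append(c)
    # Rank the review queue so the expert sees the most likely matches first.
    review.sort(key=lambda c: c.score, reverse=True)
    return accepted, review


candidates = [
    Candidate("dx_code", "procedure_code", 0.40),
    Candidate("dob", "birth_date", 0.93),
    Candidate("dx_code", "diagnosis_code", 0.71),
]
accepted, review = route(candidates)
```

Raising the threshold moves the process toward semi-automated operation (more expert review); lowering it moves toward full automation.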

11 Aim 1
Catalogue, benchmark and evaluate existing metadata discovery and mapping methods and tools that are applicable to clinical (PHIS+), environmental (Environmental Protection Agency's Air Quality) and genomic (ClinVar) data.

12 Metadata
Data about data, including all physical data contained in software and other media, business and technical processes related to the data, and rules, constraints, structures and provenance of data.
Types of metadata
- Descriptive: describes content, including semantics, for purposes such as discovery and identification
- Structural: describes relationships between compound objects
- Administrative: includes the creation, file type, and access

13 Common Metadata Embedded Sources
- Databases: relational, NoSQL and EAV
  - In entities, attributes, inter-relations, data types, constraints, DDL, data dictionaries and sample data
- Spreadsheets
- XML messages: stored as schemas, attributes and markups
- Web services
- Software source code used to generate, process or store data
- Entity relationship specifications
- Data description language (DDL) defining data structures & database schemas
- Unstructured documents: PDFs, Word, text documents.
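For the relational-database case, structural metadata can often be read directly from the database catalog. The sketch below assumes SQLite (chosen only because it ships with Python; the slides do not name a database engine) and uses a hypothetical `discover_columns` helper and a made-up `patient` table.

```python
import sqlite3


def discover_columns(conn):
    """Return {table: [(column, declared_type, not_null, is_primary_key), ...]}."""
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    catalog = {}
    for table in tables:
        # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
        info = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        catalog[table] = [
            (name, ctype, bool(notnull), bool(pk))
            for _cid, name, ctype, notnull, _default, pk in info
        ]
    return catalog


# Illustrative in-memory source; the table and columns are invented.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE patient (id INTEGER PRIMARY KEY, birth_date TEXT NOT NULL, sex TEXT)")
catalog = discover_columns(conn)
```

The result covers attributes, data types and constraints; data dictionaries and sample data would need separate harvesting steps.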

14 Metadata Discovery and Mapping
Metadata Discovery & Mapping Service
- Human Expert: understands metadata artifacts and finds potential matches
- Automated Approach: discovers metadata and generates potential match(es)

15 Metadata Discovery and Harvesting Methods
- Classification:
  - Those primarily focusing on discovery and extraction of metadata
  - Those that combine discovery with mapping (majority)
- Some metadata discovery methods (literature)
  - Mercury
  - Grid-based metadata services
  - Library cataloguing
  - Software architectures

16 Metadata Mapping/Matching
- Next step in characterizing a source's data: harmonize and map the source's metadata with central metadata stores.
- Metadata matching requires considering
  - Descriptive artifacts (data types, rules, constraints and possible values)
  - Lexical (exact, synonym, patterns)
  - Semantic matching of metadata artifacts (e.g. date of birth vs. birth date)
  - Statistical matching
- Classification
  - Granularity
  - Elements vs. structure
  - Schema, instance, element, structure, language or constraints.
  - Schema matching vs. ontology alignment
  - Kinds of inputs they use.
- Well-known mappers use a hybrid approach by relying on both descriptive and structural metadata.
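The lexical step (exact, synonym, pattern) can be sketched as follows. This is an illustrative token-set comparison, not a method named in the slides; the stopword list, the synonym table and the function names are assumptions. It happens to resolve the "date of birth" vs. "birth date" pair because that pair differs only in word order; genuinely semantic matching would need a terminology resource.

```python
import re

STOPWORDS = {"of", "the"}
# Hypothetical synonym table mapping a whole label to a canonical phrase.
SYNONYMS = {"dob": "birth date", "sex": "gender"}


def normalize(label):
    """Lower-case, expand whole-label synonyms, tokenize, drop stopwords."""
    text = label.lower().replace("_", " ").strip()
    text = SYNONYMS.get(text, text)
    tokens = re.findall(r"[a-z0-9]+", text)
    return frozenset(t for t in tokens if t not in STOPWORDS)


def lexical_score(a, b):
    """Jaccard similarity of the normalized token sets (1.0 means same tokens)."""
    ta, tb = normalize(a), normalize(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)
```

For example, `lexical_score("date of birth", "birth date")` and `lexical_score("DOB", "Birth_Date")` both evaluate to 1.0, while `lexical_score("birth date", "admit date")` is 1/3 because only one of three distinct tokens is shared.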

17 Some Metadata Mapping/Matching Methods
- Similarity Flooding: uses similarity propagation
- Artemis (Analysis of Requirements: Tool Environment for Multiple Information): affinity-based analysis and hierarchical clustering of source schema elements
- Cupid: computes similarity coefficients with the help of a domain-specific dictionary
- COMA (COmbination of MAtching algorithms): multiple matching algorithms that can be evaluated against one another
- Protoplasm: industrial-strength matcher
- S-Match: compares structures (e.g., XML schemas or ontologies) and returns semantic relations
- Semantic matching methods and tools for analysis: MetaMap, MTERMS, cTAKES, LOINC mappers
- Machine learning approaches
- Combining indirect matches and direct matches
- Multifaceted approaches
- Distributed semantics and semantic indexing approaches
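The idea of combining several matchers, as in the hybrid tools listed above, can be sketched with a weighted average of per-matcher scores. This is not COMA's or any listed tool's actual algorithm; the two matchers, the weights and the element dictionaries are hypothetical. `difflib.SequenceMatcher` from the Python standard library provides the string similarity.

```python
from difflib import SequenceMatcher


def name_matcher(src, tgt):
    """String similarity of element names, in [0, 1]."""
    return SequenceMatcher(None, src["name"].lower(), tgt["name"].lower()).ratio()


def type_matcher(src, tgt):
    """1.0 when the declared data types agree, else 0.0."""
    return 1.0 if src["type"] == tgt["type"] else 0.0


# Hypothetical weights; a real system would tune or learn these.
MATCHERS = [(name_matcher, 0.7), (type_matcher, 0.3)]


def combined_score(src, tgt):
    """Weighted average of all matcher scores, in [0, 1]."""
    total = sum(weight for _, weight in MATCHERS)
    return sum(weight * matcher(src, tgt) for matcher, weight in MATCHERS) / total


def best_match(src, targets):
    """Return the target element with the highest combined score."""
    return max(targets, key=lambda tgt: combined_score(src, tgt))


source = {"name": "birth_dt", "type": "date"}
targets = [
    {"name": "birth_weight", "type": "number"},
    {"name": "birth_date", "type": "date"},
]
winner = best_match(source, targets)
```

Here the descriptive signal (name) and the structural signal (type) both favor `birth_date`; keeping matchers in a list makes it easy to evaluate them against one another or swap them out.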

18 Metadata Repository

19 Aim 2
Incorporate into OpenFurther a state-of-the-art prototype that integrates these metadata discovery and mapping tools, supports expert review, and provides measurements to inform on the uncertainty factor for the mappings.

20 Metadata Discovery and Mapping Service (MDMS)
- Framework to programmatically characterize metadata of a source.
- Will consist of an extensible library of metadata discovery and mapping methods and tools.
- Service-oriented architecture.
- MDMS will accelerate multiple data services.
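One way to picture an extensible library of methods is a registry that new discovery methods plug into by name. The sketch below is purely illustrative and does not describe the MDMS design: the `register` decorator, the `DISCOVERY_METHODS` table, the two toy methods and the source dictionaries are all hypothetical.

```python
from typing import Callable, Dict, List

# Hypothetical registry: method name -> function that takes a source
# description and returns the list of field names it discovered.
DISCOVERY_METHODS: Dict[str, Callable[[dict], List[str]]] = {}


def register(name):
    """Decorator that adds a discovery method to the registry under `name`."""
    def decorator(func):
        DISCOVERY_METHODS[name] = func
        return func
    return decorator


@register("csv_header")
def csv_header(source):
    """Treat the first line of a delimited file as its field list."""
    return source["header_line"].split(",")


@register("json_keys")
def json_keys(source):
    """Use the keys of a sample JSON record as its field list."""
    return sorted(source["sample_record"].keys())


def characterize(source, method):
    """Run one registered discovery method against a source description."""
    if method not in DISCOVERY_METHODS:
        raise KeyError(f"unknown discovery method: {method}")
    return DISCOVERY_METHODS[method](source)


fields = characterize({"header_line": "id,birth_date,sex"}, "csv_header")
```

Adding a method then requires only a new decorated function; callers select methods by name, which fits a service-oriented interface.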

21 Architectural Overview of MDMS

22 Evaluation
Expert-curated metadata
- Clinical (PHIS+)
- Environmental (EPA's Air Quality)
- Genomic (ClinVar)

23 Expected Results and Future Steps
- Selection of a suitable discovery and mapping methodology and providing the results for storage in the MDR and TS.
- Automated mapping for less tolerant applications, with the option of domain expert review of single or ranked matches.
- Future work
  - Biomedical metadata discovery and mapping platform that accelerates availability of data for repurposing.
  - Adding additional discovery and mapping methods. Authors can submit their methods for evaluation and use.
  - Learning from previous mappings stored in MDR and TS along with similarity measures using machine learning.

24 Timeline

25 Questions

26 Thank You

27 Updates: Action Items
- Visit datamed.biocaddie.org and provide feedback for the new version, v0.2, via the prototype_issues repository in GitHub
- Follow up with Chris Mungall to add him to the bioCADDIE GitHub organization (Anu)
- Create a diagram showing bioCADDIE with Aztec and other repositories and share with the group (Jeff/Ian)
- Send a draft of the Core Dev. Team poster abstract to the group for review (Anu)
- Share the list of planned prototype feature implementations with the bioCADDIE team (CDT)

28 BD2K AHM Demo
- Things to be accomplished before the demo:
  - Updated indices (following WG3 specifications) for the following repositories: PDB, GEO, LINCS, BioProject, ArrayExpress, GEMMA, dbGaP
  - Integrated PPs implemented on the datamed server
  - Implement new design completely, group metadata/facets for each dataset

29 BD2K AHM Demo
- Usability feedback to be collected
  - Survey?
  - Recruit interested users for formal user testing (both in person and remote)
- Details for the demo:
  - Demo in the lobby?
  - Demo in a separate room:
    - For how many people?
    - How many computers to be set up?

30 BD2K AHM
- LINCS face-to-face meeting:
  - Contact with LINCS program manager (Ron)
  - When do we meet?
  - Do we need a room?
  - Who will participate: all bioCADDIE members or just CDT?
  - Agenda: follow-up meeting to the earlier one

31 Timeline
[Gantt chart, months Sep. 2015 to Feb. 2016: Y2 Q1 (Sep. to Nov. 2015), Y2 Q2 (Dec. 2015 to Feb. 2016)]
- Interface design: Global statistics; New interface for prototype v0.2; Improving interface based on feedback
- Searching algorithms: Find similar datasets; Add Boolean search; Add advanced search; Add data repositories search
- Ranking algorithms: Refine search results based on user's selection; Report from WG 8
- Dataset result display: Sort datasets; Group metadata; Accessibility of dataset; Summarize all returned results (isee-DELVE); Allow users to select multiple repositories; Improve faceted browsing
- Personalized search: Search history; Share search results; User account; Save search results
- Link dataset to external resources: PubMed; Grants (via PubMed?)

32 Timeline
[Gantt chart, months Sep. 2015 to Feb. 2016: Y2 Q1 (Sep. to Nov. 2015), Y2 Q2 (Dec. 2015 to Feb. 2016)]
- Data ingestion: Data ingestion and indexing; Metadata mapping
- Terminology server: Import ontology; Create UI browser; Integrate to SciGraph API; Integrate autocomplete feature to core UI
- Integration of pilot projects: Integrate PP 1.1 GWAS Finder; Integrate PP 2.2 isee-DELVE for PDB; Integrate PP 2.1 DataRank for GEO; Explore gene expression data using PP 2.2; Integrate PP 3.2 for PDB; Integrate PP 2.1 for other dataset
- Feedback collection: GitHub; Feedback form
- Documentation: Source code on GitHub; Tutorials
- Usability study: UI analysis; User study; Track user's actions
- Data duplication problem
- Metadata management
- Architecture/Scalability: Code refactoring; Back up

33 Ongoing work (Task: Status)
1 Metadata Ingestion
  1.1 Import repositories: 1. PDB, GEO; 2. LINCS; 3. BioProject, ArrayExpress, GEMMA, dbGaP; 4. ICPSR. Status column: Stable; API details; Ongoing; Sample files
  1.2 Metadata mapping: Ongoing
  1.3 Metadata management: Ongoing
  1.4 Indexing: Ongoing
2 Terminology server
  2.1 Develop terminology server: 1) Imported terminologies (6) and validated them; 2) Created UI browser for TS; 3) Integration to SciGraph API; 4) Create autocomplete feature. Status column: 09/01; 10/09; Ongoing; 10/
  Integrate terminology server: Ongoing

34 Pilot project integration (Task 3)
Columns: PP; Presented to CDT; Integrated As; Completed On
Presented to CDT: / / / /01
Integrated As:
- Specialized advanced search for GWAS datasets
- Ranking function based on citation metrics for GEO series data
- a) isee similarity metric in ElasticSearch; b) DELVE implementation as exploratory search and visualization option, (i) for PDB, (ii) for gene expression data
- Ranking function based on citation metrics (dataset mentions) for PDB data
Completed On: 09/22; 10/21; 9/01; Ongoing (12/31); Ongoing (11/30)

35 Ongoing work (Task: Status)
4 Interface Design
  4.1 Global statistics: Implemented
  4.2 Design interface: Ongoing
  4.3 Implement new design: Ongoing
  4.4 Breadcrumb for website navigation: Not started
  4.5 Display most accessed datasets: Not started
5 Personalized search
  5.1 Search history: Implemented
  5.2 Save search results: Not started
  5.3 Share search results: Not started
  5.4 User account (discussion): Not started
6 Searching/Ranking algorithms
  6.1 Similar datasets: Implemented
  6.2 Data repositories search: Ongoing
  6.3 Boolean/advanced search: Not started
  6.4 Refine search results based on user's selection: Not started

36 Ongoing work (Task: Status)
7 Display of results
  7.1 Sort datasets: Ongoing
  7.2 What fields should be displayed?: Discussion
  7.3 Browsing (grouping facets/metadata): Not started
  7.4 Accessibility information: Not started
8 Link to external resources
  8.1 PubMed: Ongoing
  8.2 Grants: Ongoing
9 Feedback
  9.1 GitHub: Implemented
  9.2 Feedback form: Not started
10 Documentation
  10.1 Source code: Not started
  10.2 Tutorials: Not started
11 Usability studies
  11.1 UI analysis: Completed
  11.2 User studies: Not started
12 Data duplication issue

37 Other issues
- Please deposit code in GitHub. Please contact me at Anupama.E.Gururaj@uth.tmc.edu if you need access
- Any other issues?
- Thank You

Metadata Discovery and Integration to Support Repurposing of Heterogeneous Data using the OpenFurther Platform

Metadata Discovery and Integration to Support Repurposing of Heterogeneous Data using the OpenFurther Platform Metadata Discovery and Integration to Support Repurposing of Heterogeneous Data using the OpenFurther Platform biocaddie All Hands Meeting September 11 th, 2016 Ram Gouripeddi & Julio Facelli Department

More information

Core Technology Development Team Meeting

Core Technology Development Team Meeting Core Technology Development Team Meeting To hear the meeting, you must call in Toll-free phone number: 1-866-740-1260 Access Code: 2201876 For international call in numbers, please visit: https://www.readytalk.com/account-administration/international-numbers

More information

Core Technology Development Team Meeting

Core Technology Development Team Meeting Core Technology Development Team Meeting To hear the meeting, you must call in Toll-free phone number: 1-866-740-1260 Access Code: 2201876 For international call in numbers, please visit: https://www.readytalk.com/account-administration/international-numbers

More information

Core Technology Development Team Meeting

Core Technology Development Team Meeting Core Technology Development Team Meeting To hear the meeting, you must call in Toll-free phone number: 1-866-740-1260 Access Code: 2201876 For international call in numbers, please visit: https://www.readytalk.com/account-administration/international-numbers

More information

Core Technology Development Team Meeting

Core Technology Development Team Meeting Core Technology Development Team Meeting To hear the meeting, you must call in Toll-free phone number: 1-866-740-1260 Access Code: 2201876 For international call in numbers, please visit: https://www.readytalk.com/account-administration/international-numbers

More information

Core Technology Development Team Meeting

Core Technology Development Team Meeting Core Technology Development Team Meeting To hear the meeting, you must call in Toll-free phone number: 1-866-740-1260 Access Code: 2201876 For international call in numbers, please visit: https://www.readytalk.com/account-administration/international-numbers

More information

Core Technology Development Team Meeting

Core Technology Development Team Meeting Core Technology Development Team Meeting To hear the meeting, you must call in Toll-free phone number: 1-866-740-1260 Access Code: 2201876 For international call in numbers, please visit: https://www.readytalk.com/account-administration/international-numbers

More information

Core Technology Development Team Meeting

Core Technology Development Team Meeting Core Technology Development Team Meeting To hear the meeting, you must call in Toll-free phone number: 1-866-740-1260 Access Code: 2201876 For international call in numbers, please visit: https://www.readytalk.com/account-administration/international-numbers

More information

Core Technology Development Team Meeting

Core Technology Development Team Meeting Core Technology Development Team Meeting To hear the meeting, you must call in Toll-free phone number: 1-866-740-1260 Access Code: 2201876 For international call in numbers, please visit: https://www.readytalk.com/account-administration/international-numbers

More information

Core Technology Development Team Meeting

Core Technology Development Team Meeting Core Technology Development Team Meeting To hear the meeting, you must call in Toll-free phone number: 1-866-740-1260 Access Code: 2201876 For international call in numbers, please visit: https://www.readytalk.com/account-administration/international-numbers

More information

Executive Committee Meeting

Executive Committee Meeting Executive Committee Meeting To hear the meeting, you must call in Toll-free phone number: 1-866-740-1260 Access Code: 2201876 For international call in numbers, please visit: https://www.readytalk.com/account-administration/international-numbers

More information

Core Technology Development Team Meeting

Core Technology Development Team Meeting Core Technology Development Team Meeting To hear the meeting, you must call in Toll-free phone number: 1-866-740-1260 Access Code: 2201876 For international call in numbers, please visit: https://www.readytalk.com/account-administration/international-numbers

More information

Steering Committee Meeting

Steering Committee Meeting Steering Committee Meeting To hear the meeting, you must call in Toll-free phone number: 1-866-740-1260 Access Code: 2201876 For international call in numbers, please visit: https://www.readytalk.com/account-administration/international-numbers

More information

Executive Committee Meeting

Executive Committee Meeting Executive Committee Meeting To hear the meeting, you must call in Toll-free phone number: 1-866-740-1260 Access Code: 2201876 For international call in numbers, please visit: https://www.readytalk.com/account-administration/international-numbers

More information

Core Technology Development Team Meeting

Core Technology Development Team Meeting Core Technology Development Team Meeting To hear the meeting, you must call in Toll-free phone number: 1-866-740-1260 Access Code: 2201876 For international call in numbers, please visit: https://www.readytalk.com/account-administration/international-numbers

More information

Core Technology Development Team Meeting

Core Technology Development Team Meeting Core Technology Development Team Meeting To hear the meeting, you must call in Toll-free phone number: 1-866-740-1260 Access Code: 2201876 For international call in numbers, please visit: https://www.readytalk.com/account-administration/international-numbers

More information

Core Technology Development Team Meeting

Core Technology Development Team Meeting Core Technology Development Team Meeting To hear the meeting, you must call in Toll-free phone number: 1-866-740-1260 Access Code: 2201876 For international call in numbers, please visit: https://www.readytalk.com/account-administration/international-numbers

More information

Metadata Ingestion and Processinng

Metadata Ingestion and Processinng biomedical and healthcare Data Discovery Index Ecosystem Ingestion and Processinng Jeffrey S. Grethe, Ph.D. 2017 BioCADDIE All Hands Meeting prototype Ingestion Indexing Repositories Ingestion ElasticSearch

More information

Core Technology Development Team Meeting

Core Technology Development Team Meeting Core Technology Development Team Meeting To hear the meeting, you must call in Toll-free phone number: 1-866-740-1260 Access Code: 2201876 For international call in numbers, please visit: https://www.readytalk.com/account-administration/international-numbers

More information

Executive Committee Meeting

Executive Committee Meeting Executive Committee Meeting To hear the meeting, you must call in Toll-free phone number: 1-866-740-1260 Access Code: 2201876 For international call in numbers, please visit: https://www.readytalk.com/account-administration/international-numbers

More information

Core Technology Development Team Meeting

Core Technology Development Team Meeting Core Technology Development Team Meeting To hear the meeting, you must call in Toll-free phone number: 1-866-740-1260 Access Code: 2201876 For international call in numbers, please visit: https://www.readytalk.com/account-administration/international-numbers

More information

Core Technology Development Team Meeting

Core Technology Development Team Meeting Core Technology Development Team Meeting To hear the meeting, you must call in Toll-free phone number: 1-866-740-1260 Access Code: 2201876 For international call in numbers, please visit: https://www.readytalk.com/account-administration/international-numbers

More information

Steering Committee Meeting

Steering Committee Meeting Steering Committee Meeting To hear the meeting, you must call in Toll-free phone number: 1-866-740-1260 Access Code: 2201876 For international call in numbers, please visit: https://www.readytalk.com/account-administration/international-numbers

More information

Core Technology Development Team Meeting

Core Technology Development Team Meeting Core Technology Development Team Meeting To hear the meeting, you must call in Toll-free phone number: 1-866-740-1260 Access Code: 2201876 For international call in numbers, please visit: https://www.readytalk.com/account-administration/international-numbers

More information

Agenda. Clarification of issues Quarter definition Steering and Executive Committee composition Dissemination and community outreach activities

Agenda. Clarification of issues Quarter definition Steering and Executive Committee composition Dissemination and community outreach activities Agenda Clarification of issues Quarter definition Steering and Executive Committee composition Dissemination and community outreach activities Progress and updates Y1Q3 and plans for Y1Q4 Plan for the

More information

Core Technology Development Team Meeting

Core Technology Development Team Meeting Core Technology Development Team Meeting To hear the meeting, you must call in Toll-free phone number: 1-866-740-1260 Access Code: 2201876 For international call in numbers, please visit: https://www.readytalk.com/account-administration/international-numbers

More information

Steering Committee Meeting

Steering Committee Meeting Steering Committee Meeting To hear the meeting, you must call in Toll-free phone number: 1-866-740-1260 Access Code: 2201876 For international call in numbers, please isit: https://www.readytalk.com/account-administration/international-numbers

More information

The Final Updates. Philippe Rocca-Serra Alejandra Gonzalez-Beltran, Susanna-Assunta Sansone, Oxford e-research Centre, University of Oxford, UK

The Final Updates. Philippe Rocca-Serra Alejandra Gonzalez-Beltran, Susanna-Assunta Sansone, Oxford e-research Centre, University of Oxford, UK The Final Updates Supported by the NIH grant 1U24 AI117966-01 to UCSD PI, Co-Investigators at: Philippe Rocca-Serra Alejandra Gonzalez-Beltran, Susanna-Assunta Sansone, Oxford e-research Centre, University

More information

Steering Committee Meeting

Steering Committee Meeting Steering Committee Meeting To hear the meeting, you must call in Toll-free phone number: 1-866-740-1260 Access Code: 2201876 For international call in numbers, please visit: https://www.readytalk.com/account-administration/international-numbers

More information

Minutes. Date: Location: UCSD BRF2 5A03. Attendees Present

Minutes. Date: Location: UCSD BRF2 5A03. Attendees Present Executive Committee Meeting Location: UCSD BRF2 5A03 Date: 8-16-16 Start time: 10:00 am PDT End time: 11:30 am PDT Meeting Objective Attendees Present Minute Taker Executive Committee Meeting UCSD: Lucila

More information

Executive Committee Meeting

Executive Committee Meeting Executive Committee Meeting To hear the meeting, you must call in Toll-free phone number: 1-866-740-1260 Access Code: 2201876 For international call in numbers, please visit: https://www.readytalk.com/account-administration/international-numbers

More information

A Vision for Bigger Biomedical Data: Integration of REDCap with Other Data Sources

A Vision for Bigger Biomedical Data: Integration of REDCap with Other Data Sources A Vision for Bigger Biomedical Data: Integration of REDCap with Other Data Sources Ram Gouripeddi Assistant Professor, Department of Biomedical Informatics, University of Utah Senior Biomedical Informatics

More information

Core Technology Development Team Meeting

Core Technology Development Team Meeting Core Technology Development Team Meeting To hear the meeting, you must call in Toll-free phone number: 1-866-740-1260 Access Code: 2201876 For international call in numbers, please visit: https://www.readytalk.com/account-administration/international-numbers

More information

Susanna-Assunta Sansone, PhD. Metadata WG3 chair.

Susanna-Assunta Sansone, PhD. Metadata WG3 chair. Susanna-Assunta Sansone, PhD Metadata WG3 chair 3-workgroup@biocaddie.org WG3 Metadata v v Full description: goals, synergies, phases, members & files Joint effort with BD2K Center for Expanded Data Annotation

More information

Harmonizing biocaddie Metadata Schemas for Indexing Clinical Research Datasets Using Semantic Web Technologies

Harmonizing biocaddie Metadata Schemas for Indexing Clinical Research Datasets Using Semantic Web Technologies Harmonizing biocaddie Metadata Schemas for Indexing Clinical Research Datasets Using Semantic Web Technologies Harold R. Solbrig 1, Guoqian Jiang 1 1 Mayo Clinic College of Medicine, Rochester, MN [solbrig.harold,

More information

Visualizing Logical Dependencies in SWRL Rule Bases

Visualizing Logical Dependencies in SWRL Rule Bases Visualizing Logical Dependencies in SWRL Rule Bases Saeed Hassanpour, Mar:n J. O Connor and Amar K. Das Stanford Center for Biomedical Informa:cs Research MSOB X215, 251 Campus Drive, Stanford, California,

More information

Crea%ng and U%lizing Linked Open Sta%s%cal Data for the Development of Advanced Analy%cs Services E. Kalampokis, A. Karamanou, A. Nikolov, P.

Crea%ng and U%lizing Linked Open Sta%s%cal Data for the Development of Advanced Analy%cs Services E. Kalampokis, A. Karamanou, A. Nikolov, P. Crea%ng and U%lizing Linked Open Sta%s%cal Data for the Development of Advanced Analy%cs Services E. Kalampokis, A. Karamanou, A. Nikolov, P. Haase, R. Cyganiak, B. Roberts, P. Hermans, E. Tambouris, K.

More information

Core Technology Development Team Meeting

Core Technology Development Team Meeting Core Technology Development Team Meeting Agenda v Updates regarding last meeting action items v Presentation by Ergin about Ontology Services v Brief updates from others Supported by the NIH grant 1U24

More information

What were his cri+cisms? Classical Methodologies:

What were his cri+cisms? Classical Methodologies: 1 2 Classifica+on In this scheme there are several methodologies, such as Process- oriented, Blended, Object Oriented, Rapid development, People oriented and Organisa+onal oriented. According to David

More information

Preliminary ACTL-SLOW Design in the ACS and OPC-UA context. G. Tos? (19/04/2016)

Preliminary ACTL-SLOW Design in the ACS and OPC-UA context. G. Tos? (19/04/2016) Preliminary ACTL-SLOW Design in the ACS and OPC-UA context G. Tos? (19/04/2016) Summary General Introduc?on to ACS Preliminary ACTL-SLOW proposed design Hardware device integra?on in ACS and ACTL- SLOW

More information

Metadata Zoo Dataset Metadata Rebecca Koskela Execu4ve Director, DataONE

Metadata Zoo Dataset Metadata Rebecca Koskela Execu4ve Director, DataONE Metadata Zoo Dataset Metadata Rebecca Koskela Execu4ve Director, DataONE eurocris September 9, 2013 Outline Data Challenges Metadata Solu=on DataONE addressing the Data Challenge Enabling Scien=fic Discovery

More information

The Vitro Integrated Ontology Editor and Seman5c Web Applica5on

The Vitro Integrated Ontology Editor and Seman5c Web Applica5on The Vitro Integrated Ontology Editor and Seman5c Web Applica5on Brian Lowe, Brian Caruso, Nick Cappadona, Miles Worthington, Stella Mitchell, Jon Corson- Rikert, and the VIVO Collabora5on Interna5onal

More information

Making Research Data Public: Why, What, and How. Fall 2016

Making Research Data Public: Why, What, and How. Fall 2016 Making Research Data Public: Why, What, and How Fall 2016 Research Data Service (RDS) The Research Data Service provides the Illinois research community with exper:se, tools, and infrastructure to manage

More information

CS6200 Informa.on Retrieval. David Smith College of Computer and Informa.on Science Northeastern University

CS6200 Informa.on Retrieval. David Smith College of Computer and Informa.on Science Northeastern University CS6200 Informa.on Retrieval David Smith College of Computer and Informa.on Science Northeastern University Course Goals To help you to understand search engines, evaluate and compare them, and

More information

Ontology engineering. Valen.na Tamma. Based on slides by A. Gomez Perez, N. Noy, D. McGuinness, E. Kendal, A. Rector and O. Corcho

Ontology engineering. Valen.na Tamma. Based on slides by A. Gomez Perez, N. Noy, D. McGuinness, E. Kendal, A. Rector and O. Corcho Ontology engineering Valen.na Tamma Based on slides by A. Gomez Perez, N. Noy, D. McGuinness, E. Kendal, A. Rector and O. Corcho Summary Background on ontology; Ontology and ontological commitment; Logic

More information

Ag Data Commons: Harnessing the Power of Digital Agriculture Cynthia Parr USDA ARS National Agricultural Library

Ag Data Commons: Harnessing the Power of Digital Agriculture Cynthia Parr USDA ARS National Agricultural Library Ag Data Commons: Harnessing the Power of Digital Agriculture Cynthia Parr USDA ARS National Agricultural Library Live poll at: https://pollev.com/ cyndyparr196 Problems with Public Ag Data Government Website

More information

CORPORATE PRESENTATION

CORPORATE PRESENTATION CORPORATE PRESENTATION Background on device detec/on (1/2) Identifying the capabilities of a device accessing web contents has been an extensively explored issue in the past years, in particular in the

More information

Clinical Metadata A complete metadata and project management solu6on. October 2017 Andrew Ndikom and Liang Wang

Clinical Metadata A complete metadata and project management solu6on. October 2017 Andrew Ndikom and Liang Wang A complete metadata and project management solu6on. October 2017 Andrew Ndikom and Liang Wang 1 Agenda How is metadata currently managed within the industry? Five key problems with current approaches.

More information

Data publication and discovery with Globus

Data publication and discovery with Globus Data publication and discovery with Globus Questions and comments to outreach@globus.org The Globus data publication and discovery services make it easy for institutions and projects to establish collections,

More information

Leveraging Tools and Components from OODT and Apache within Climate Science and the Earth System Grid Federa9on

Leveraging Tools and Components from OODT and Apache within Climate Science and the Earth System Grid Federa9on Leveraging Tools and Components from OODT and Apache within Climate Science and the Earth System Grid Federa9on Luca Cinquini, Dan Crichton, Chris Ma2mann NASA Jet Propulsion Laboratory, California Ins9tute

More information

Welcome to the SIHO itransact portal.

Welcome to the SIHO itransact portal. Provider and Vendor Access Portal One stop access for your guide to utilizing SIHO s new itransact platform. Welcome to the SIHO itransact portal. Primary access codes will be given to key contacts at

More information

CS6200 Information Retrieval. David Smith College of Computer and Information Science Northeastern University

Course Goals To help you to understand search engines, evaluate and compare them, and

Integrating Selenium with Confluence and JIRA

Open Source Test Management within Confluence, Automation of Selenium, Reporting, and Traceability Andrew Lampitt, Co-Founder Sanjiva Nath, CEO and Founder

Feed the Future Innovation Lab for Peanut (Peanut Innovation Lab) Data Management Plan Version:

Feed the Future Innovation Lab for Peanut (Peanut Innovation Lab) Data Management Plan Version: 20180316 Peanut Innovation Lab Management Entity The University of Georgia, Athens, Georgia Feed the Future

Robust Identification of Fuzzy Duplicates

Authors: Surajit Chaudhuri (Microsoft Research) Venkatesh Ganti (Microsoft Research) Rajeev Motwani (Stanford University) Publication: 21st International Conference

Object Oriented Design (OOD): The Concept

Objectives To explain how a software design may be represented as a set of interacting objects that manage their own state and operations Topics covered Object Oriented

CoG: The NEW ESGF WEB USER INTERFACE

ESGF F2F Workshop, Livermore, CA, December 2014 Luca Cinquini [1], Cecelia DeLuca [2], Sylvia Murphy [2] [1] California Institute of Technology & NASA Jet Propulsion

Data Curation Profile Human Genomics

Profile Author Profile Author Institution Name Contact J. Carlson N. Brown Purdue University J. Carlson, jrcarlso@purdue.edu Date of Creation October 27, 2009 Date

Executive Summary for deliverable D6.1: Definition of the PFS services (requirements, initial design)

Electronic Health Records for Clinical Research Project acronym: EHR4CR Project full title: Electronic

DRS Update. HL Digital Preservation Services & Library Technology Services Created 2/2017, Updated 4/2017

AGENDA DRS DRS DRS Architecture DRS DRS DRS Work COLLABORATIVELY MANAGED DRS Business Owner Digital

System Modeling Environment

Requirements, Architecture and Implementa

Introduction to Knowledge Graphs and Rich Semantic Search. Peter Haase, metaphacts Barry Norton, British Museum Denny Vrandečić, Google / Wikimedia

Speaker Introduction A Knowledge Graph Perspective Outline

Committing to Data Quality

Ann Green Digital Lifecycle Research & Consulting NADDI Vancouver 2014 outline Data Quality Building the DDI Shifts Crisis of Quality & Loss of Data Committing to Data Quality Data

Optimizing the Use of Data Standards CSS Summary

PhUSE Webinar 26 April 2017 Co-Leads: Susan Kenny (Maximum Likelihood) Jane Lozano (Eli Lilly) Best Practices for Data Collection Instructions Project Lead:

Artificial Intelligence

Advanced Searching Prof Alexiei Dingli Genetic Algorithms Charles Darwin Genetic Algorithms are good at taking large, potentially huge search spaces and navigating them, looking for

Web Linked Data (RDF, Semantic Web, Web of Data)

Graham Klyne e-research Centre, University of Oxford http://annalist.net My background Involved in RDF/semantic web/linked data for many years (and through

OpenAIRE. Fostering the social and technical links that enable Open Science in Europe and beyond

Alessia Bardi and Paolo Manghi, Institute of Information Science and Technologies CNR Katerina Iatropoulou, ATHENA, Iryna Kuchma and Gwen Franck, EIFL Pedro Príncipe, University of Minho OpenAIRE Fostering

Provenance Manager: PROV-man an Implementation of the PROV Standard. Ammar Benabadelkader Provenance Taskforce Budapest, 24 March 2014

Outlines Motivation State-of-the-art PROV-man The Approach, the data

Assessing Medical Device Cyber Risks in a Healthcare Environment

Medical Devices Security Phil Englert Director Technology Operations Catholic Health Ini

Challenges on Developing Tools for Exploiting Linked Open Data Cubes. Kalampokis, Roberts, Karamanou, Tambouris, Tarabanis, Hermans

Bill Roberts @billroberts http://swirrl.com Pilot partners: government

Dataverse 4.0 & Beyond. Eleni Castro, Institute for Quantitative Social Science (IQSS), Harvard University

Data Science Team Data Curation & Stewardship Information Scientists Researchers Statistical Innovation

Prototyping a Biomedical Ontology Recommender Service

Clement Jonquet Nigam H. Shah Mark A. Musen jonquet@stanford.edu Ontologies & data & annotations (1/2) Hard for biomedical researchers to find the

Continuous Integration Development Environment. Kovács Gábor

kovacsg@tmit.bme.hu Before we start anything Select a language Set up conventions Select development tools Set up development environment Set up

WheatIS: Progress report

WheatIS Annual meeting, San Diego, 9 January 2015 WheatIS data submission DSpace Beta-version to test: http://urgi.versailles.inra.fr/xmlui/ At the moment, available submission

REDCap Best Practices. ITHS Biomedical Informatics Core Bas de Veer MS Research Consultant

iths_redcap_admin@uw.edu REDCap version: 6.4.0 Last updated February 10, 2015 Goals & Agenda Goals Understanding

Interoperability and Semantics in Use - Application of UML, XMI and MDA to Precision Medicine and Cancer Research

Ian Fore, D.Phil. Associate Director, Biorepository and Pathology Informatics Senior Program

Getting DCIM Right the First or Second Time Around. PRESENTED BY Chris James CEO, DCIMPro

Agenda: What are the Core Elements of DCIM? What is DCIM and why? The DCIM Maturation Model What is a Successful

The information model at Banco de Portugal: innovative and flexible data solutions

João Cadete de Matos Director, Statistics Department 15 May 2014 CEMLA Meeting on Financial Information Needs for Statistics, Macroprudential

Data Exchange and Conversion Utilities and Tools (DExT)

Louise Corti, Angad Bhat, Herve L'Hours UK Data Archive CAQDAS Conference, April 2007 An exchange format for qualitative data Data exchange models

Applying Data Visualization to Analyze Ebola Call Center Trends

eHealth Africa Sara Brown, MPH, CBIP Associate Crow Insight www.crowinsight.com Overview eHealth Africa & its role in fighting the Ebola

Digital repositories as research infrastructure: a UK perspective

Dr Liz Lyon Director This work is licensed under a Creative Commons Licence Attribution-ShareAlike 2.0 UKOLN is supported by: Presentation

representing the world in 1s and 0s CS 4100/5100 Foundations of AI

Announcements Assignment 2 clarifications Final projects: what's next? Feedback Project Proposal Midterm Exam: October 18th ASP CLARIFICATIONS

RTP Taxonomy & Relationships

draft-lennox-raiarea-rtp-grouping-taxonomy-03 IETF 88 @Authors Changes Since -02 Major re-write Section 2, Concepts, re-structured to a conceptual media chain with

Big Data, Big Compute, Big Interaction Machines for Future Biology. Rick Stevens. Argonne National Laboratory The University of Chicago

Assembly Annotation Modeling Design Rick Stevens stevens@anl.gov There are no solved

Chunking: An Empirical Evaluation of Software Architecture (?)

Rachana Koneru David M. Weiss Iowa State University weiss@iastate.edu rachana.koneru@gmail.com With participation by Audris Mockus, Jeff St.

Distributed Research Networks: Lessons from the Field

The Learning System Summit Sponsored by the Joseph H. Kanter Family Foundation The National Press Club Washington, DC May 17-18, 2012 Jeffrey Brown,

The IEEE Metadata Standard for Supporting Big Data Management

Alex MH Kuo 1,2 (Ph.D) 1 School of Health Information Science University of Victoria, BC, Canada. 2 CEDAR, School of Medicine University of

Developing DataMed the current status

Hua Xu Core Development Team (CDT) biocaddie AHM 2017 8/8/17 Supported by the NIH grant 1U24 AI117966-01 to the University of California, San Diego Outline CDT Roles

The DCGS-A ontology suite Standard operating procedures and ontology quality assurance Annotation vs. Explication How the DCGS-A ontologies are being

Ron Rudnicki November 12, 2013 The DCGS-A ontology suite Standard operating procedures and ontology quality assurance Annotation vs. Explication How the DCGS-A ontologies are being used for the explication

A Distributed Data-Parallel Execution Framework in the Kepler Scientific Workflow System

Ilkay Altintas and Daniel Crawl San Diego Supercomputer Center UC San Diego Jianwu Wang UMBC WorDS.sdsc.edu Computational

Powering Knowledge Discovery. Insights from big data with Linguamatics I2E

Gain actionable insights from unstructured data The world now generates an overwhelming amount of data, most of it written in natural

CONTENTdm Users Group Meeting, May 2014

University of South Carolina Scholar Commons CONTENTdm Update Christian Sarason OCLC

Decision Support Systems

2011/2012 Week 3. Lecture 6 Previous Class Dimensions & Measures Dimensions: Item Time Location Measures: Quantity Sales TransID ItemName ItemID Date Store Qty T0001 Computer I23

Call for Participation in AIP-6

GEOSS Architecture Implementation Pilot (AIP) Issue Date of CFP: 9 February 2013 Due Date for CFP Responses: 15 March 2013 Introduction GEOSS Architecture Implementation

Bioinformatics Resources

Lecture & Exercises Prof. B. Rost, Dr. L. Richter, J. Reeb Institut für Informatik I12 Bioinformatics Resources Organization Schedule Overview Organization Lecture: Friday 9-12, i.e.

CGMS WMO Task Force on Metadata Implementation Progress. Presented to IPET-SUP agenda item 12.2

Presented to IPET-SUP 17-03-2015 agenda item 12.2 Overview CGMS-WMO-Task Force on Metadata Implementation Refresher Why such Task Force Task Force

A formal design process, part 2

Principles of Software Construction: Objects, Design, and Concurrency Designing (sub-) systems Josh Bloch Charlie Garrod School of Computer Science Administrivia Midterm

NCBI News, November 2009

Peter Cooper, Ph.D. NCBI cooper@ncbi.nlm.nih.gov Dawn Lipshultz, M.S. NCBI lipshult@ncbi.nlm.nih.gov Featured Resource: New Discovery-oriented PubMed and NCBI Homepage The NCBI Site Guide A new and improved

Reducing Consumer Uncertainty

Spatial Analytics Towards an Ontology for Geospatial User-centric Metadata Introduction Cooperative Research Centre for Spatial Information (CRCSI) in Australia Communicate

The Linked Data Value Chain Model: A Methodology for Information Integration and Orchestration

National Research University Higher School of Economics Daniel Hladky Semantic Web Lab at HSE/W3C 28 November