Deliverable 8.2
Project ID:
Project Title: A comprehensive and standardised e-infrastructure for analysing medical metabolic phenotype data
Project Acronym: PhenoMeNal
Start Date of the Project: 1st September
Duration of the Project: Months
Work Package Number: 8
Work Package Title: Data provenance, compliance, and integrity
Deliverable Title: D8.2 Modularised ISA model and format: biospecimen centric schema, corresponding xml schemas, reference implementation guidelines and validation rules
Delivery Date: M24
Work Package leader: UOXF
Contributing Partners: UOXF, EMBL-EBI, ICL
Authors: Philippe Rocca-Serra, Susanna-Assunta Sansone, Reza Salek, Kenneth Haug, Namrata Kale, Jake Pearce, Noureddin Saddawi, David Johnson, Alejandra Gonzalez-Beltran

Abstract: The ISA representation is the data structure used by the EMBL-EBI MetaboLights repository for metabolomics study metadata. The format has also been adopted by data-focussed publishers to handle datasets, such as Oxford University Press's GigaScience and Springer Nature's Scientific Data. The initial format specifies the underlying model, and how its syntactic elements relate to each other, rather informally. The work presented here summarises how a set of JSON schemata, a supporting JSON-LD context file for full semantic representation, and a clinical data set-orientated ISA configuration have been developed to produce a machine-readable serialization of the ISA model. Furthermore, this deliverable presents the latest developments of the ISA-API, which implements the set of coding recommendations adopted by PhenoMeNal.

Contents
EXECUTIVE SUMMARY
DETAILED REPORT OF THE DELIVERABLE
1. Creation of a machine-readable ISA model
   Background
   Implementation
   Normative documentation in readthedoc format
   Reference implementation: the ISA-API
2. Declaration of study design related information
   Background
   Coding Patterns and Recommendations
   Implementation
3. Declaration of Ethics and Legal Information
   Background
   Coding Patterns and Recommendations
   Implementation
Declaration of Quality Control Elements
   Background
   Coding Patterns and Recommendation and Implementation
Declaration of instrument vendor format and preprocessed data
   Background
   Coding Patterns, Recommendations and Implementation
WORK PLAN
DELIVERY AND SCHEDULE
CONCLUSION
1 EXECUTIVE SUMMARY

The H2020 PhenoMeNal e-infrastructure project aims to deliver a scalable, robust and standards-compliant infrastructure for clinical phenotyping by means of metabolomics techniques. The main goal of this deliverable is to provide a formal, machine-readable specification of the ISA model, and so deliver a more prescriptive representation of experimental study metadata than that currently available from the initial ISA-Tab normative documents.

DETAILED REPORT OF THE DELIVERABLE
Creation of a machine-readable ISA model

Background

The ISA specifications as initially released did not provide machine-readable, formal representations. This made it difficult for developers and implementers to build compliant tools, because interpreting a textual description of a syntax specification carries a risk of divergent readings. The goal of the deliverable was to eliminate those shortcomings by producing a machine-readable serialization of the ISA model, and thereby establish the foundation for robust metadata management, tracking and quality assessment in the PhenoMeNal project.

Implementation: JSON Schema representation of the ISA model

WP8 has delivered an exhaustive and normative representation of the ISA syntax relying on JSON Schema technology. A set of 21 JSON schemata has been produced, each representing one of the core objects underlying the ISA tabular syntax (see Figure 1); the work is available from the ISA GitHub repository.
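To illustrate what a machine-readable schema offers an implementer, the following minimal sketch checks a JSON document against a drastically simplified schema fragment using only the Python standard library. The schema content, the property names (identifier, title, studies) and the helper check are assumptions made for illustration; they are not copied from the 21 ISA schemata, and a real tool would load the published schemata and rely on a complete JSON Schema validator.

```python
import json

# Hypothetical, heavily simplified schema fragment (NOT one of the ISA schemata).
INVESTIGATION_SCHEMA = {
    "type": "object",
    "required": ["identifier", "title", "studies"],
    "properties": {
        "identifier": {"type": "string"},
        "title": {"type": "string"},
        "studies": {"type": "array"},
    },
}

# Map JSON Schema type names to Python types (small subset only).
_TYPES = {"object": dict, "array": list, "string": str}


def check(document, schema):
    """Return a list of problems; an empty list means the document passed."""
    expected = _TYPES[schema["type"]]
    if not isinstance(document, expected):
        return [f"expected {schema['type']}, got {type(document).__name__}"]
    problems = []
    # Report every required property that is absent.
    for name in schema.get("required", []):
        if name not in document:
            problems.append(f"missing required property '{name}'")
    # Recurse into the properties that are present.
    for name, subschema in schema.get("properties", {}).items():
        if name in document:
            problems.extend(f"{name}: {p}" for p in check(document[name], subschema))
    return problems


good = json.loads('{"identifier": "inv-1", "title": "Demo", "studies": []}')
bad = json.loads('{"identifier": 42, "studies": []}')
print(check(good, INVESTIGATION_SCHEMA))
print(check(bad, INVESTIGATION_SCHEMA))
```

The point of the sketch is that the rules live in data rather than in prose, so two independent tools reading the same schema reach the same verdict on the same document.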
Figure 1. An overview of the modular, JSON schema-based formal representation of the ISA model.

Normative documentation in readthedoc format

To further support the JSON schema representation, a full set of documentation has been released in the form of a readthedoc microsite (see Figure 2).
Figure 2. A screenshot showing the dedicated microsite based on the readthedoc approach. The site provides an up-to-date online resource for guiding users through the ISA specifications and serves information about the different types of serialization (tabular or JSON).

Reference implementation: the ISA-API

Built on top of the JSON schema definition, the ISA-API provides a set of tools to manipulate ISA objects, parse ISA documents in tabular or JSON format, and build an object representation and a graph data structure, which allows fast traversal of information as well as validation against the syntax. Additional components have been added to further validate information supplied in ISA syntax. These additional validation steps are required in particular by the modules that convert to third-party formats, namely the formats used by public repositories for non-metabolomics omics data types such as transcriptomics and genomics. The ISA-API converters allow data to be processed for repositories such as EMBL-EBI ArrayExpress or the EMBL-EBI Short Read Archive, each of which has specific annotation needs that must be met. The ISA-API is available from its GitHub repository (see Figure 3).

Figure 3. A screenshot of the ISA-API GitHub repository, which provides documentation and assistance to developers who wish to use or contribute to the work.

Declaration of study design related information
Background

A systematic analysis of EMBL-EBI MetaboLights has been performed as a means to test the validity and efficiency of PhenoMeNal workflows. The results indicate that nearly 50% of the ISA archives served by the European repository contain a syntactic or structural error. A finer review shows that 25% fail a basic syntactic validation invoking the relevant function of the ISA-API. Another 24% reveal errors when tested for semantic content using additional functions of the ISA-API. The errors belong to two distinct classes, both of which affect automatic handling of the document by analysis workflows.

The first type of error is the declaration of spurious ISA Experimental Factors. Factors are meant to represent independent variables as declared by experimentalists, and are therefore meant to encompass a range of discrete values. A simple inspection of the ranges exposes the errors. In a number of cases, submitters and curators confuse independent variables and covariates; typically, experiments declaring more than 6 factors should attract suspicion (e.g. MTBLS124 or MTBLS93 with combinations).

The second type of error is structural and corresponds to a failure to properly represent the underlying relationships between subject-derived material and data acquisition events. This type of error is harder to detect because the ISA-Tab documents are syntactically valid even though the information representation is erroneous. It prevents sample size from being determined properly, so computation either cannot be automated or, if the document passes the checks, produces erroneous results.

Coding Patterns and Recommendations

As already pointed out, batch processing of public datasets with PhenoMeNal workflows highlighted problems in annotation consistency. The ISA-API function that pulls datasets from the NIH Metabolomics Workbench, the US counterpart of EMBL-EBI MetaboLights, also revealed metadata elements that are absent from MetaboLights metadata.
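The inspection of factor ranges mentioned in the Background can be sketched in a few lines. The example below is an illustration only and is not ISA-API code: the input layout (a mapping from factor name to the values recorded per sample), the function name suspicious_factors and the max_levels threshold are assumptions, as is the reading that a single-valued or near-continuous factor is suspect. Only the "more than 6 factors" rule of thumb comes from the text.

```python
MAX_FACTORS = 6  # studies declaring more factors "should attract suspicion"


def suspicious_factors(factor_values, max_levels=10):
    """Flag factor declarations that look like covariates or constants.

    factor_values maps a factor name to the list of values recorded per sample.
    max_levels is an illustrative threshold, not a value from the deliverable.
    """
    warnings = []
    if len(factor_values) > MAX_FACTORS:
        warnings.append(f"{len(factor_values)} factors declared (more than {MAX_FACTORS})")
    for name, values in factor_values.items():
        levels = set(values)
        if len(levels) < 2:
            # A factor that never varies cannot be an independent variable.
            warnings.append(f"'{name}' has a single level")
        elif len(levels) > max_levels:
            # Many distinct values suggest a measured covariate.
            warnings.append(f"'{name}' has {len(levels)} distinct values, possibly a covariate")
    return warnings


study = {
    "treatment": ["control", "drug", "control", "drug"],
    "gender": ["male", "male", "male", "male"],
    "body weight": [61.2, 70.4, 58.9, 83.0],
}
for line in suspicious_factors(study, max_levels=3):
    print(line)
```

A check of this kind cannot decide what the submitter intended; it only points a curator at the declarations worth a second look.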
To address these issues and converge towards a common set of descriptors, the PhenoMeNal ISA configuration now provides several new fields in the Study Design Descriptor section of the ISA document to report key summary information that can help data discovery. These are summarized below.

Augmented annotation for study designs
- Study Design Ontology Terms: {full factorial design, fractional factorial design}
- Comment[number of factor level]
- Comment[study subject count]
- Comment[number of treatment groups]

Implementation
The ISA-API now implements several curation functions to detect and, to some extent, correct errors of the types mentioned above. At the very least, the newly developed functions give curators additional information to direct their actions. The functions deliver simple yet effective means to significantly increase the quality of the ISA archive documents generated by submission tools and pipelines.

The ISA-API is now augmented with a create mode, which can be used to bootstrap the creation of ISA documents from study design information supplied by users. The main feature of the code behind this functionality is its reliance on patterns specific to a number of experimental designs (e.g. factorial design, balanced design, repeated measures design), which can be applied irrespective of the type of data acquisition used. While the initial function was developed primarily to support the reporting of MS and NMR based studies, we demonstrated the portability of the approach by implementing support for DNA microarray and next generation sequencing based signal acquisition. The code is available on GitHub.

To further demonstrate the benefits of the component, a series of Jupyter notebooks has been built and can be used as a basis for tutorials and training. A YouTube video is being prepared and will be released as an update to this deliverable. The notebooks are available from the ISA-tools GitHub repository (see also Figure 4).

Figure 4. A screenshot of the ISA-tools GitHub repository showing six IPython notebooks devised to showcase the ISA-API capability to support MS and NMR based metabolomics studies, as well as applications of molecular biology techniques for transcriptomics and genomics studies.
As indicated above, the ISA-API can support multi-omics datasets owing to its native support for an array of molecular biology techniques, a benefit of the modular approach allowed by the new ISA JSON schemata and of the reliance on ISA configurations (see Figure 5).

Figure 5. A detailed view of the IPython notebook showing the use of the ISA-API create mode to rapidly generate an ISA archive document from study design information that the toolkit prompts users for. This example shows how the approach is applied to the specific case of NMR based data acquisition.
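The design-pattern idea behind the create mode can be illustrated without the ISA-API itself. The sketch below expands a full factorial design into its treatment groups and derives the three summary values listed in the augmented annotation for study designs. The function name full_factorial, the dictionary layout and the reading of "number of factor level" as a per-factor count are assumptions made for illustration; the code does not call or reproduce the ISA-API.

```python
from itertools import product


def full_factorial(factors, subjects_per_group):
    """Expand factor levels into treatment groups plus summary annotations."""
    names = list(factors)
    # One treatment group per combination of factor levels.
    groups = [dict(zip(names, combo)) for combo in product(*(factors[n] for n in names))]
    summary = {
        "design": "full factorial design",
        "Comment[number of factor level]": {n: len(factors[n]) for n in names},
        "Comment[number of treatment groups]": len(groups),
        "Comment[study subject count]": len(groups) * subjects_per_group,
    }
    return groups, summary


factors = {"diet": ["standard", "high fat"], "dose": ["low", "medium", "high"]}
groups, summary = full_factorial(factors, subjects_per_group=5)
print(summary["Comment[number of treatment groups]"])  # 6
print(summary["Comment[study subject count]"])         # 30
```

Because the groups are generated from the declared design rather than typed in by hand, the sample size follows from the design and does not have to be inferred afterwards from the assay tables.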
Declaration of Ethics and Legal Information

Background

In order to comply with EU regulations on data protection, privacy and ethics, metadata descriptors covering terms of use, consent availability and additional ancillary information have to be provided as part of study archives. Such information, when present in the ISA-structured metabolomics study metadata, could be used in downstream workflows to check whether requesters have the relevant privileges against the dataset's actual terms of use. Having such information embedded in ISA documents would greatly enhance the possibilities of devising a set of safety checks in addition to those already in place.

Coding Patterns and Recommendations

ISA documents can be annotated with data use information by implementing a series of ISA Comment fields holding values selected from DUO, the Data Use Ontology (see notes 7 and 8). The practice is in agreement with efforts carried out in the context of the Global Alliance for Genomics and Health (GA4GH) (see note 9). An ISA configuration file for the clinical context has been amended and posted to GitHub. It is now the pattern to follow to report:
- terms of use
- ethical committee name
- ethical committee project identification
- URL to data access committee
- information about patient consent availability

Implementation

ISA configurations relevant to human patient based studies have been updated accordingly. The code is available from the GitHub repository.

7 "The Data Use Ontology - EMBL-EBI." 20 Feb. 2017. Accessed 31 Aug.
8 "The Data Use Ontology - The OBO Foundry." Accessed 31 Aug.
9 "Global Alliance for Genomics and Health." Accessed 31 Aug.
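The downstream check suggested in the Background of this section could look roughly like the sketch below. Everything in it is hypothetical: the comment field name Comment[terms of use] follows the list of items to report but is not taken from the published configuration, the data-use labels are illustrative rather than resolved against the Data Use Ontology, and the mapping from labels to permitted purposes is invented for the example.

```python
# Hypothetical mapping from a declared data-use term to the request purposes
# it permits. A real check would resolve identifiers against the ontology.
PERMITTED_PURPOSES = {
    "no restriction": {"any"},
    "general research use": {"research"},
    "disease specific research": {"research on declared disease"},
}


def may_access(study_comments, purpose):
    """Return True if the requested purpose is allowed by the declared terms of use."""
    term = study_comments.get("Comment[terms of use]")
    if term is None:
        return False  # no declared terms of use: refuse by default
    allowed = PERMITTED_PURPOSES.get(term, set())
    return "any" in allowed or purpose in allowed


study = {
    "Comment[terms of use]": "general research use",
    "Comment[ethical committee name]": "Example Committee",
}
print(may_access(study, "research"))    # True
print(may_access(study, "commercial"))  # False
print(may_access({}, "research"))       # False
```

Refusing access when no terms of use are declared is a design choice of the sketch, made so that an incomplete annotation never widens access.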
Figure 6, panel a. An overview of the PhenoMeNal-specific ISA configuration, documenting the implementation guidelines and patterns.

Figure 6, panel b. The investigation.xml file.

Figure 6, panel c. The studysample.xml file holds the ELSI-related annotation elements in the form of ISA Comment fields configured to hold Data Use Ontology (DUO) terms (shown highlighted in pale yellow).

Declaration of Quality Control Elements

Background
It has been pointed out that a metabolomics signal can only be properly analyzed if all the contextual information surrounding data acquisition events is provided. In particular, the reporting of all controls injected alongside test samples in mass spectrometry applications constitutes an essential quality assurance element. Until now, public repositories made no requirement on users to deposit such data, and there was therefore no guideline for submission. The work reported in this deliverable closes this gap.

Coding Patterns and Recommendation and Implementation

The terminology for reporting QC was established by working group 8 and reported earlier. That terminology is now available for use in the form of a new version of the ISA configuration for handling patient based datasets. The configuration is available from the ISA-tools GitHub repository. More specifically, the element to consider is the following:

header="material Type", required=true
<list-values>specimen,long-term reference,external long-term reference,study reference,dilution series reference,standard reference material,normal blank,reagent blank,sample preparation blank,batch terminus,negative control reference (blank),positive control reference (standard)</list-values>

In addition, the ISA-API create mode is being augmented with a function allowing the reporting of all control elements following the definition and declaration of the experimental plan. This feature has been developed based on a similar function present in Mastr-MS and following discussions with Dr Saravanan Dayalan (Queensland university, Australia). NOTE: This work is not complete at the time of the deliverable, but the implementation is ongoing.

Declaration of instrument vendor format and preprocessed data

Background

Instrument generated data most often come in vendor-specific formats, some of which are organized in a directory structure. Such is the case for Bruker NMR data.
While open standard formats exist to produce a vendor-neutral equivalent file (e.g. mzML in the mass spectrometry context, nmrML in the nuclear magnetic resonance context), there is a demand for keeping the native instrument output. This request by users and metabolomics practitioners highlighted the need for coding guidelines for representing both types of raw information. In the ISA format, a specific syntactic element (Raw Data File) exists but it is not repeatable. In other words, only one occurrence of the field may appear in any one ISA assay table. Users therefore need guidance to represent vendor-native data as well as the corresponding vendor-neutral information.

Coding Patterns, Recommendations and Implementation

Reporting vendor-specific formatted data only (no conversion to open standard):
- Use the ISA Raw Data File field to provide the vendor file as a tar.gz file, together with an ISA Comment[vendor] field.

Reporting both vendor-specific and open standard converted data:
- NMR: Bruker files
  - ISA Raw Data File: to provide the URI to the zipped, md5-checksummed vendor file
  - ISA Derived Data File: to provide the URI to the zipped, checksummed nmrML file
- MS: Waters files
  - ISA Raw Data File: to provide the URI to the zipped, md5-checksummed vendor file
  - ISA Derived Data File: to provide the URI to the zipped, checksummed mzML file

WORK PLAN

Consistent with the work so far and building on it, WP8's attention for deliverable D8.4 has been focused on collecting the needs of the community and practitioners, meeting with them regularly and reaching out for input in order to shape specifications attuned to use cases. The actual delivery of the standardization format and supporting documentation is meant to take place in the remaining tasks and deliverables (T8.4 and D8.4.1-D8.4.2, at month 24 and month 30 respectively).

Objectives
O8.1 Define metadata and data exchange standards, along with technical and user documentation.
O8.2 Implement and maintain PhenoMeNal reference implementations.

Tasks
T8.1: Use cases and state of the art of communication standards
T8.2: Standards for exchanging experimental and clinical metadata
T8.3: Data standards exchange formats
T8.4: Harmonization of data matrices and analytical results
T8.5: Maintain documentation and disseminate information

Deliverables
D8.1: Report on community standards for reporting, access and integrity supported in the PhenoMeNal grid; to be disseminated in a dedicated BioSharing page and via the project website. (M12)
D8.2: Modularized ISA model and format: biospecimen centric schema, corresponding xml schemas, reference implementation guidelines and validation rules. (M24)
D8.3: nmrML, mzML data exchange formats and associated terminologies for instrument raw data, with reference implementation guidelines and validation rules. (M18)
D8.4: Signal processing and analysis data exchange format
D8.4.1: Specifications for derived data matrices, specifications and terminology for description of analysis and statistical results (M24)
D8.4.2: Reference implementation guidelines and validation rules (M30)

DELIVERY AND SCHEDULE

The delivery is delayed: No

CONCLUSION

The delivery of the machine-readable specifications of the ISA syntax, along with the supporting ISA-API for read/write calls, syntactic validation, content checking and conversion to other formats, gives metabolomics users powerful tools for structuring experimental metadata to high levels of quality. The recent addition of a create mode to bootstrap the creation of ISA documents adds a unique capability, that of moving data management from a retrospective activity to a prospective one. Indeed, the ISA-API create function enables the creation of pre-populated metadata capture templates by placing the notion of study design at the core of the reporting. The coming months will see deeper validation and testing of the approach by deploying the API with several Phenome Centres.
More informationHow WhereScape Data Automation Ensures You Are GDPR Compliant
How WhereScape Data Automation Ensures You Are GDPR Compliant This white paper summarizes how WhereScape automation software can help your organization deliver key requirements of the General Data Protection