Managing Data in the long term. 11 Feb 2016



2 Outline: What is needed for managing our data? What is an archive?


4 Motivation: Researchers often have funds for data management during the project lifetime, but limited time to manage data once the project has completed. Essentially, it's not the researcher's job. But there is value in ensuring data are available beyond the end of the project: value to peers, and potential value to researchers in other areas (cross-discipline).

5 Motivation: Edmond Halley (18th century) used historical data to determine the trajectory of a comet and provide validation of Newton's theory of gravitation. Because the period of the comet is ~70 years, historical data were essential. He needed to use data collected for different purposes (e.g. propaganda, religious).

6 Motivation: Data taken from Google Scholar. [Figure not reproduced; recoverable labels: Download, View, Publication, Moser et al., Nobel Prize data.]

7 Motivation [figure-only slide; image not reproduced]

8 Motivation: Currently ~50 TB of data are archived: 44 datasets (49 TB) from the climate community, 4 datasets from biology and 1 from computer science. All with very little prompting! This demonstrates that researchers do see value in their data and in a service capable of managing their data.


10 An Important Distinction: An archive is a service that provides long-term access to data. Long-term usually means more than 5 years. An archive is not a backup. A backup is a snapshot of data that may change over time (e.g. last week's backup of file X may be different from this week's backup of file X). Once data reaches a mature state (i.e. it no longer changes), it can be considered for archiving.

11 Roles: The Norstore archive recognises 5 different types of user: Creator, Contributor, Data Manager, Rights Holder and Access User. All types can be a person or an organisation (although in the case of an organisation a contact person is needed). The Access User doesn't need to be defined (unless data access is restricted). It is possible that the different types resolve to the same person or organisation. It's important to assign these roles to the dataset in case of questions.

12 Creator and Contributor Roles: A person uploading data into the archive takes the role of the Contributor. There can be more than one Contributor for a dataset. The Contributor uploads the data and fills in the metadata for the dataset. The Contributor shares the responsibility of ensuring that the dataset is complete, that it abides by the Terms and Conditions, and that the metadata is accurate. The Creator is the person or group that created the data.

13 Data Manager: To address the problem of datasets being used in different situations than originally anticipated, we need to have an expert or contact person for the dataset. The Data Manager is responsible for fielding questions or comments regarding the dataset during its lifetime. The Contributor does not need to maintain a connection with the dataset (e.g. the Contributor could be a PostDoc or PhD student). The Data Manager doesn't have to be an expert on the dataset, but should know whom to contact. This is similar to what happens with publications (a contact person or corresponding author is mentioned).

14 Rights Holder: The Rights Holder is the person or group that controls or owns the rights to the dataset. This includes intellectual property and copyright. There may be more than one Rights Holder for a dataset. If access restrictions exist on the use of the dataset, the Rights Holder will need to be contacted for permission to use the dataset. In most cases (those abiding by the NLOD or CCv4 license) the role of the Rights Holder is less important (but it is still relevant). It is IMPORTANT that you check with your institution or funding agency as to who has rights on your dataset.

15 Access User: Any person querying the archive or using the data in the archive assumes the role of an Access User. Metadata for all published datasets is accessible by all Access Users. Datasets are accessible, requiring only an address to which the download link is sent. Using datasets assumes you abide by the access licence.

16 Archiving Data: Archiving is part of the data lifecycle. It requires information from previous phases (information on how the data was collected, processed, etc.). It needs to be taken into consideration at project proposal time. This motivates the need for a plan for data management.

17 Research Data Management Plan: The plan is founded on three criteria for the research project, applied throughout the data lifecycle: successful data collection, successful data use, and successful data sharing with the target audience. The plan will also help in provisioning resources. Norstore is working on a template to address these criteria that recognises researchers are not data management experts and uses best practices from the UK Digital Curation Centre, DANS and other agencies.

18 Research Data Management Plan [Diagram not reproduced; it relates the Data Management Plan to: Research Council of Norway; Researchers; Research Institutions; Service Providers (Norstore, Nortur).]

19 Research Data Management Plan: Norstore is currently drafting a template for the plan. The intention is to have pre-prepared text as much as possible. The template will be ready very soon. Next steps: review the template draft internally and seek feedback from stakeholders.

20 Datasets (archive): A dataset is a collection of related data. Usually the data is related in terms of use, e.g. cloud simulation data. It is up to the researcher to define the dataset. Datasets eligible for archiving should be 'closed' or 'complete': datasets such as those that resulted in publications, or datasets that are considered a natural conclusion to a project. All datasets should be considered of lasting value to the community.

21 Datasets (archive): Data: ideally needs to be in a standard or open format that makes it possible to migrate in case of obsolescence. Licensing: who can use the data, and under what restrictions. Contact: persons to reach in case of questions concerning the data. Integrity: checksums should be provided along with the data. Metadata: a description of the data.
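The integrity requirement on this slide can be illustrated with a minimal Python sketch. It assumes SHA-256 as the checksum algorithm and a "digest, two spaces, relative path" line format; neither choice is stated in the slides, and the function names are hypothetical.

```python
import hashlib
from pathlib import Path
from typing import List


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_checksums(dataset_dir: Path, checksum_file: Path) -> int:
    """Write one '<digest>  <relative path>' line per file; return the file count."""
    files = sorted(p for p in dataset_dir.rglob("*") if p.is_file())
    lines = [
        f"{sha256_of(p)}  {p.relative_to(dataset_dir).as_posix()}\n" for p in files
    ]
    checksum_file.write_text("".join(lines), encoding="utf-8")
    return len(files)


def verify_checksums(dataset_dir: Path, checksum_file: Path) -> List[str]:
    """Return the relative paths that are missing or whose digest has changed."""
    mismatches = []
    for line in checksum_file.read_text(encoding="utf-8").splitlines():
        if not line:
            continue
        expected, rel_path = line.split("  ", 1)
        target = dataset_dir / rel_path
        if not target.is_file() or sha256_of(target) != expected:
            mismatches.append(rel_path)
    return mismatches
```

Keeping the checksum file outside the dataset directory (as the sketch expects) avoids the file listing itself.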

22 Datasets (Archive): There are some popular approaches to arranging data. BagIt is an Internet Engineering Task Force proposal for structuring related data, used by a variety of institutions (e.g. Library of Congress). Essentially: [layout figure not reproduced]

23 Dataset (Archive): The BagIt data directory contains sub-structure. We suggest dividing it into: doc for documentation (including a table of contents of the layout); src for any source code needed to read the data (and possibly the code that generated the data); aux for auxiliary data files; <data type> for data files of that data type. Any other layout is also possible, but try to provide a doc directory containing documentation and a src directory containing source code. You can then zip or tar the BagIt hierarchy and upload it to the archive.
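As a rough illustration of packaging a dataset this way, the Python sketch below copies a directory into a bag's data directory, writes the bag declaration and a payload manifest, and tars the result. It assumes the BagIt 1.0 layout from RFC 8493 and SHA-256 manifests; the slides do not say which BagIt version or checksum algorithm the archive expects, and the function names are hypothetical.

```python
import hashlib
import shutil
import tarfile
from pathlib import Path


def make_bag(source_dir: Path, bag_dir: Path) -> Path:
    """Copy source_dir into bag_dir/data and write the BagIt tag files."""
    payload = bag_dir / "data"
    shutil.copytree(source_dir, payload)  # bag_dir/data must not exist yet
    # Bag declaration: version and tag-file encoding.
    (bag_dir / "bagit.txt").write_text(
        "BagIt-Version: 1.0\nTag-File-Character-Encoding: UTF-8\n",
        encoding="utf-8",
    )
    # Payload manifest: one "<checksum>  <path relative to bag root>" per file.
    lines = []
    for path in sorted(p for p in payload.rglob("*") if p.is_file()):
        # read_bytes loads the whole file; chunked reads suit very large files.
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        lines.append(f"{digest}  {path.relative_to(bag_dir).as_posix()}\n")
    (bag_dir / "manifest-sha256.txt").write_text("".join(lines), encoding="utf-8")
    return bag_dir


def tar_bag(bag_dir: Path, archive_path: Path) -> Path:
    """Pack the bag directory into a gzip-compressed tar file."""
    with tarfile.open(archive_path, "w:gz") as tar:
        tar.add(bag_dir, arcname=bag_dir.name)
    return archive_path
```

A source directory that already contains doc and src subdirectories ends up as data/doc and data/src inside the bag, matching the suggested layout.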

24 Metadata: What is this? What was it used for?

25 Metadata and Datasets: Metadata is essential to successfully use the dataset: it describes what the dataset is, where it came from, and how to use it. Metadata is created throughout the data lifecycle. Different phases of the lifecycle require different types of metadata; perhaps data are initially stored in a primitive format and then processed.

26 Metadata and Datasets: Metadata can be divided into 3 classes. Descriptive: what the data is, features, etc. Structural: how the data is arranged, formats, etc. Administrative: how to manage the data, checksums, rights, etc. Many domains have complex, detailed metadata...

27 Metadata [Figure: "Seeing Standards: A Visualisation of the Metadata Universe", J. Riley, D. Becker]

28 Metadata: Metadata schemes for many communities are at different stages of evolution, and are quite detailed. It is very difficult for Norstore to support all metadata schemes, so we look for the lowest common denominator.

29 Norstore Archive Metadata: Many metadata schemes have Dublin Core either as a basis or have a strong overlap with Dublin Core. Dublin Core is an ISO standard. The standard has 15 terms; extended Dublin Core has more terms. The Norstore Archive uses Dublin Core (DC) as a basis. Additional metadata terms have been added that are not covered by DC but are generic enough for all communities. OAI-PMH is based on DC, so the archive is automatically compliant. Metadata is a separate entity from the dataset.
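To make the Dublin Core and OAI-PMH connection concrete, here is a small Python sketch that serialises a record restricted to the 15 Dublin Core elements into the `oai_dc` XML form used by OAI-PMH. The record contents and the function name are hypothetical, and this is not a description of how the Norstore archive itself exports metadata.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List, Union

DC_NS = "http://purl.org/dc/elements/1.1/"
OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"

# The 15 elements of the Dublin Core Metadata Element Set.
DC_ELEMENTS = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)


def to_oai_dc(record: Dict[str, Union[str, List[str]]]) -> str:
    """Serialise a record (element name -> value or list of values) as oai_dc XML."""
    ET.register_namespace("dc", DC_NS)
    ET.register_namespace("oai_dc", OAI_DC_NS)
    root = ET.Element(f"{{{OAI_DC_NS}}}dc")
    for name, values in record.items():
        if name not in DC_ELEMENTS:
            raise ValueError(f"not a Dublin Core element: {name}")
        # Dublin Core elements are repeatable, so accept one value or several.
        for value in [values] if isinstance(values, str) else values:
            ET.SubElement(root, f"{{{DC_NS}}}{name}").text = value
    return ET.tostring(root, encoding="unicode")
```

Terms that fall outside the 15 elements (such as an archive-specific lifetime or preservation level) are rejected here, which mirrors why additional terms have to live alongside the Dublin Core basis.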

30 Norstore Archive Metadata
Descriptive Information: Category, Description, Identifier, Internal Identifier, Journal Article, Language, Phase, State, Subject, Title.
Administrative Information: Access Rights, Contributor, Created, Creator, Data Manager, License, Lifetime, Preservation Level, Published on, Publisher, Rights, Rights Holder, Submitted, Terms and Conditions for Deposit.
Structural Information: File Checksum, File Name, File Size, File Type.
Descriptive Information (optional): Bibliographic Citation, Conforms to, Comment, Geolocation, Label, Project, Provenance, Source, Temporal Coverage.
Notes: Bold terms are Dublin Core recommended terms. The top 3 boxes contain mandatory metadata. Terms in italics are automatically filled in by the archive. Only ~14 terms need to be defined by the user. (The bold and italic styling of the slide is not preserved in this transcription.)

31 Norstore Archive Metadata: Norstore metadata is intended to be as generic as possible: sufficient to locate data and understand how to use the data. More detailed information should be contained in domain-specific catalogues. The descriptive metadata can reference a domain-specific catalogue. In the future we could envisage the archive holding a reference to the domain catalogue, but we need to be aware that the archive lifetime may be longer than that of the domain catalogue. Domain-specific catalogues can use the DOI as a handle to the data (resolving the link will provide access to the data).

32 Norstore and Domain Metadata [Diagram not reproduced; recoverable labels: Data Service, Norstore Archive, Metadata Service, DOI Resolver, Domain Metadata Service.] The domain metadata catalogue can have the DOI registered. It could then invoke the DOI resolver to provide access to the archive metadata and data.

33 Norstore Metadata: Currently metadata must be supplied using the web interface. Metadata needs to be completed in two stages. Before data upload: the mandatory metadata that is needed by the archive for managing the data (e.g. contact information, title of the dataset, etc.). After data upload: the remaining mandatory descriptive information and the optional information. There is a 3-month time limit to fill in the metadata, and the user will be reminded during this period of the need to complete it. At the discretion of the Archive Manager, the dataset may be deleted if the metadata remains incomplete after this limit. Typically metadata is completed within 2 weeks.

34 Completing Norstore Metadata [screenshot slide; image not reproduced]

35 Tips for Norstore Metadata: Avoid duplication if information is contained in the publication or other referenced material. Consider what information is needed to reanalyse the data: libraries, operating systems, workflows, manuals, any other data. A good test is to ask a person new to the community to document what they need to make use of the data. Are there any features in the data worth mentioning? Is how the data was collected described? Is the environment the data was collected in, such as instrument settings, described?

36 Tips for Norstore Metadata: Use the description field to describe what the dataset is and how to use it. Use the journal metadata to provide a reference to the article that describes the dataset. If there is a lot of documentation, it could be included as part of the dataset, with the description saying where to find it; in this case the description can be more succinct. If the dataset has temporal or spatial information, consider using the optional metadata to capture that information; it provides a visual aid to the description of your dataset.

37 Tips for Norstore Metadata: Links to external references with more information are good, but beware of longevity: will the reference last the lifetime of the dataset? Beware of jargon or terminology; perhaps run the description by novice users to see if it's clear.

38 Norstore Metadata Plans: We recognise that many projects have metadata catalogues. The ability to extract a subset that matches some of the Norstore metadata terms would be useful. We are working on a REST API for the metadata catalogue; currently we are looking at search, but it can be extended to the ingest of metadata. This allows you to script the extraction and loading of some of the Norstore metadata automatically, which is useful for projects with many datasets. We also plan to implement metadata errata, allowing traceable corrections of metadata.

39 Archive [Architecture diagram not reproduced; recoverable labels: Oslo (disk, iRODS), Web, Norstore catalog, CLI, External User, tape, Project Area user, Tromso (iRODS, disk).]

40 The Archive: Components: User Interface (web and CLI); iRODS; Metadata Catalog; Storage (disk and tape, Oslo, Tromso). The archive is designed to allow replacement of any component with minimal impact.

41 User Interface: The primary user interface is web-based. A command line interface is used for large dataset interaction with the project area. The user interface talks to the Norstore metadata catalogue, which is also used for metadata search; the catalogue is a PostgreSQL database in which all metadata and state information are held. The user interface also talks to the data management system (iRODS).

42 iRODS Data Management System: iRODS is a rule-oriented data management system. It abstracts the details of distributed storage by providing a logical layer; the logical-to-physical mapping is held in the iRODS metadata catalogue, a PostgreSQL database. It provides access control and interfaces to authentication mechanisms such as GSI and Kerberos. Norstore makes use of just one archive user to manage the data. Users don't interact directly with iRODS, but through the web interface or command line tools.

43 iRODS Data Management System: iRODS allows policies to be placed on the data. Norstore has a policy to replicate data to 3 resources, a policy to remove data from one resource and replicate it to a new resource, and a policy to regularly checksum data.
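The replication and checksum policies can be pictured with a plain-Python sketch of a fixity audit. This is not iRODS rule code and does not use any iRODS API: replicas are modelled as ordinary file paths, SHA-256 is an assumed checksum algorithm, and the function name is hypothetical.

```python
import hashlib
from pathlib import Path
from typing import Dict, List

REQUIRED_REPLICAS = 3  # from the slide: data is replicated to 3 resources


def audit_replicas(replicas: List[Path], expected_sha256: str) -> Dict[str, object]:
    """Classify replicas as healthy or damaged and flag when re-replication is needed."""
    healthy = []
    damaged = []
    for path in replicas:
        # A replica is healthy if it exists and its digest matches the recorded one.
        if path.is_file() and hashlib.sha256(path.read_bytes()).hexdigest() == expected_sha256:
            healthy.append(path)
        else:
            damaged.append(path)
    return {
        "healthy": healthy,
        "damaged": damaged,
        "needs_replication": len(healthy) < REQUIRED_REPLICAS,
    }
```

Running such an audit on a schedule, and copying from a healthy replica whenever `needs_replication` is true, is one way to read the three policies together.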

44 Archiving a Dataset: Datasets can be archived from the researcher's local computer or from the Norstore project area. Local computer uploads are achieved via the Filesender service; datasets < 1 TB in size can be uploaded (this limit can be increased). The project area requires users to be registered with a valid project; data are uploaded via command-line scripts. Once the dataset is uploaded, the metadata needs to be filled in via the web interface.

45 Norstore Archive workflow: Identify data; Seek approval; Identify metadata; Fill in metadata; Upload data; Verify; Request publication; Verify metadata; Ensure approval; Assign DOI; Publish.

46 Project Area Upload: Select Project Area Upload (containing the dataset UUID). Create the Dataset Manifest File: a valid argument for find <dir> ! -type d <file pattern>, e.g. /projects/ns9999k -name *.tgz. Run ArchiveDataset UUID <manifest file>. The job is submitted to a queue and the user is notified when it has finished. Query status: ListArchiveDataset UUID.
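Before submitting a manifest, it can help to preview which files a find expression of this kind would select. The Python sketch below approximates `find <dir> ! -type d -name <pattern>`; the function name is hypothetical, and it is only an approximation (for example, `os.walk` does not descend into symbolic links to directories by default).

```python
import fnmatch
import os
from pathlib import Path
from typing import List


def preview_selection(directory: Path, name_pattern: str) -> List[Path]:
    """List non-directory entries under directory whose name matches name_pattern."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(directory):
        # filenames holds only non-directory entries, mirroring '! -type d'.
        for name in filenames:
            # fnmatchcase gives case-sensitive shell-style matching, like -name.
            if fnmatch.fnmatchcase(name, name_pattern):
                matches.append(Path(dirpath) / name)
    return sorted(matches)
```

For the slide's example this would be called as `preview_selection(Path("/projects/ns9999k"), "*.tgz")`.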

47 Publishing Data: Publishing is necessary in order to be able to cite datasets. Norstore currently uses the DataCite node in Denmark to issue Digital Object Identifiers (DOIs). A DOI is a standard, unique identifier that can be used to identify a resource. DOIs were originally developed for documents, but are now being used for data. Each DOI must point to metadata about the object and may contain a link to the dataset itself. Resolver services are used to resolve the DOI to a URI. The structure of a DOI is meaningful. In the example doi: / [digits partly lost in transcription], the leading element refers to the DOI registry, 1000 refers to the entity that registered the data, and 182 refers to the actual object. Once a dataset is published it cannot be modified, although some metadata may be updated.
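The DOI structure described here can be sketched in Python as follows. The sketch relies on two general facts about DOIs (names begin with the directory indicator `10`, and `https://doi.org/` is the public resolver); the function names and the DOI used in the usage note are hypothetical and are not Norstore identifiers.

```python
from typing import Tuple
from urllib.parse import quote


def split_doi(doi: str) -> Tuple[str, str, str]:
    """Split a DOI into (directory indicator, registrant code, object suffix)."""
    name = doi.strip()
    if name.lower().startswith("doi:"):
        name = name[4:].strip()
    prefix, slash, suffix = name.partition("/")
    directory, dot, registrant = prefix.partition(".")
    if not slash or not suffix or not dot or not registrant or directory != "10":
        raise ValueError(f"not a well-formed DOI: {doi!r}")
    return directory, registrant, suffix


def resolver_url(doi: str) -> str:
    """Build the URL that asks the public DOI resolver for the landing page."""
    directory, registrant, suffix = split_doi(doi)
    return f"https://doi.org/{directory}.{registrant}/{quote(suffix, safe='/')}"
```

For a made-up identifier, `split_doi("doi:10.5555/example-dataset.1")` separates the registry element, the registrant and the object, and `resolver_url` gives the address a citation would link to.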

48 Publishing Data [figure-only slide; image not reproduced]

49 Landing page: The landing page is the permanent metadata record for the dataset. All access via the DOI resolves to this page. The page contains links to additional metadata and data. A landing page also exists for terminated datasets; it is called a Tombstone record. The link to the data is removed, and the record contains additional metadata: when the data was removed and the reason for removal.

50 Planned Functionality: Imminent: a REST API for searching datasets. It provides command line access to metadata and allows harvesting of metadata (OpenSearch and OAI-PMH planned). Imminent: versioning of datasets. This accommodates cases where the researcher wishes to update data (the data has migrated to a different format, mistakes were made, or the metadata needs updating). A link back to the previous version will be visible from the landing page. The new version will have a new DOI. The previous version will remain accessible unless explicitly terminated.
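The planned versioning behaviour (new DOI per version, link back to the previous version, old versions kept unless terminated) could be modelled along the following lines. This is only an illustrative data structure with hypothetical names and made-up DOIs, not the archive's implementation.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DatasetVersion:
    doi: str
    previous: Optional["DatasetVersion"] = None  # link shown on the landing page
    terminated: bool = False  # old versions stay accessible unless terminated

    def new_version(self, new_doi: str) -> "DatasetVersion":
        """Create a successor; every version carries its own DOI."""
        if new_doi in self.history():
            raise ValueError("a new version needs a new DOI")
        return DatasetVersion(doi=new_doi, previous=self)

    def history(self) -> List[str]:
        """Return the DOIs from this version back to the first one."""
        chain = []
        node: Optional["DatasetVersion"] = self
        while node is not None:
            chain.append(node.doi)
            node = node.previous
        return chain
```

Because each version only points backwards, publishing a new version never modifies the record of an already published one.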

51 Future Functionality: Subsets of datasets: researchers may be interested in downloading only a subset of a dataset. Via the table of contents it's possible to identify the subset of interest and tag the files of interest for download. Collections of datasets: there may be a logical grouping of datasets (e.g. a series of datasets), and collections can make it easier to link related datasets.

52 Preserving Datasets: Digital preservation attempts to ensure digital data remains accessible and usable by future users. This is addressed by: ensuring bit-level integrity through data replication; ensuring data is understandable (may require adding or updating metadata on how to use and interpret the data); ensuring data is discoverable (equipped with the right and relevant metadata and description); ensuring data is in a usable format (may require migration from obsolete formats to new formats, or virtual environments).

53 Migration and Virtualisation: Things to be aware of for migration: What's the best format (most durable, popular, open)? What features in the data need to be maintained, and how can we check they are? Migration pros: it is easy to use new tools with old data, and easier to integrate data into new or current workflows. Migration cons: it is a one-way street; some features or functionality that only become relevant later may be lost in the migration; and it requires experts to assess what features need to be kept and whether they are indeed kept. Things to be aware of for virtualisation: What type of virtual machine to use (licensing, rendering, performance)? Are all the resources required by the application contained within the VM?

54 Migration and Virtualisation: Virtualisation pros: it preserves the original features and functionality (little risk of missing something). Virtualisation cons: it can be difficult to integrate with newer tools, and for a large volume of data it may not be a scalable option. The choice depends on your circumstances and needs.

55 Auditing: The aim is to pass the Data Seal of Approval. This ensures the archive conforms to best practice and allows users to assess how reliable the archive is.

Archive II. The archive. 26/May/15

Archive II. The archive. 26/May/15 Archive II The archive 26/May/15 What is an archive? Is a service that provides long-term storage and access of data. Long-term usually means ~5years or more. Archive is strictly not the same as a backup.

More information

Data Curation Handbook Steps

Data Curation Handbook Steps Data Curation Handbook Steps By Lisa R. Johnston Preliminary Step 0: Establish Your Data Curation Service: Repository data curation services should be sustained through appropriate staffing and business

More information

Metadata for Data Discovery: The NERC Data Catalogue Service. Steve Donegan

Metadata for Data Discovery: The NERC Data Catalogue Service. Steve Donegan Metadata for Data Discovery: The NERC Data Catalogue Service Steve Donegan Introduction NERC, Science and Data Centres NERC Discovery Metadata The Data Catalogue Service NERC Data Services Case study:

More information

Assessment of product against OAIS compliance requirements

Assessment of product against OAIS compliance requirements Assessment of product against OAIS compliance requirements Product name: Archivematica Date of assessment: 30/11/2013 Vendor Assessment performed by: Evelyn McLellan (President), Artefactual Systems Inc.

More information

DIGITAL STEWARDSHIP SUPPLEMENTARY INFORMATION FORM

DIGITAL STEWARDSHIP SUPPLEMENTARY INFORMATION FORM OMB No. 3137 0071, Exp. Date: 09/30/2015 DIGITAL STEWARDSHIP SUPPLEMENTARY INFORMATION FORM Introduction: IMLS is committed to expanding public access to IMLS-funded research, data and other digital products:

More information

Dryad Curation Manual, Summer 2009

Dryad Curation Manual, Summer 2009 Sarah Carrier July 30, 2009 Introduction Dryad Curation Manual, Summer 2009 Dryad is being designed as a "catch-all" repository for numerical tables and all other kinds of published data that do not currently

More information

Preservation Standards (& Specifications) (&& Best Practices)

Preservation Standards (& Specifications) (&& Best Practices) Standards (& Specifications) (&& Best Practices) Discoverable, Available, Accessible: Preserving Digital Content NISO Webinar By Amy Kirchhoff Archive Service Product Manager, Portico, JSTOR September

More information

Research Data Edinburgh: MANTRA & Edinburgh DataShare. Stuart Macdonald EDINA & Data Library University of Edinburgh

Research Data Edinburgh: MANTRA & Edinburgh DataShare. Stuart Macdonald EDINA & Data Library University of Edinburgh Research Data Services @ Edinburgh: MANTRA & Edinburgh DataShare Stuart Macdonald EDINA & Data Library University of Edinburgh NFAIS Open Data Seminar, 16 June 2016 Context EDINA and Data Library are a

More information

DRI: Dr Aileen O Carroll Policy Manager Digital Repository of Ireland Royal Irish Academy

DRI: Dr Aileen O Carroll Policy Manager Digital Repository of Ireland Royal Irish Academy DRI: Dr Aileen O Carroll Policy Manager Digital Repository of Ireland Royal Irish Academy Dr Kathryn Cassidy Software Engineer Digital Repository of Ireland Trinity College Dublin Development of a Preservation

More information

B2SAFE metadata management

B2SAFE metadata management B2SAFE metadata management version 1.2 by Claudio Cacciari, Robert Verkerk, Adil Hasan, Elena Erastova Introduction The B2SAFE service provides a set of functions for long term bit stream data preservation:

More information

Dataset Documentation Reference Guide for Pure Users

Dataset Documentation Reference Guide for Pure Users Dataset Documentation Reference Guide for Pure Users Pure is the University's Current Research Information System (CRIS). Information held in Pure relates to research staff and their datasets, publications,

More information

Implementation of the CoreTrustSeal

Implementation of the CoreTrustSeal Implementation of the CoreTrustSeal The CoreTrustSeal board hereby confirms that the Trusted Digital repository Mendeley Data complies with the guidelines version 2017-2019 set by the. The afore-mentioned

More information

The OAIS Reference Model: current implementations

The OAIS Reference Model: current implementations The OAIS Reference Model: current implementations Michael Day, UKOLN, University of Bath m.day@ukoln.ac.uk Chinese-European Workshop on Digital Preservation, Beijing, China, 14-16 July 2004 Presentation

More information

Data Management Checklist

Data Management Checklist Data Management Checklist Managing research data throughout its lifecycle ensures its long-term value and prevents data from falling into digital obsolescence. Proper data management is a key prerequisite

More information

Persistent identifiers, long-term access and the DiVA preservation strategy

Persistent identifiers, long-term access and the DiVA preservation strategy Persistent identifiers, long-term access and the DiVA preservation strategy Eva Müller Electronic Publishing Centre Uppsala University Library, http://publications.uu.se/epcentre/ 1 Outline DiVA project

More information

Assessment of product against OAIS compliance requirements

Assessment of product against OAIS compliance requirements Assessment of product against OAIS compliance requirements Product name: Archivematica Sources consulted: Archivematica Documentation Date of assessment: 19/09/2013 Assessment performed by: Christopher

More information

IJDC General Article

IJDC General Article Developing a Data Vault Stuart Lewis Lorraine Beard Edinburgh University Library The University of Manchester Library Mary McDerby Robin Taylor IT Services The University of Manchester Edinburgh University

More information

Institutional Repository using DSpace. Yatrik Patel Scientist D (CS)

Institutional Repository using DSpace. Yatrik Patel Scientist D (CS) Institutional Repository using DSpace Yatrik Patel Scientist D (CS) yatrik@inflibnet.ac.in What is Institutional Repository? Institutional repositories [are]... digital collections capturing and preserving

More information

Preservation. Policy number: PP th March Table of Contents

Preservation. Policy number: PP th March Table of Contents Preservation Policy number: PP 10 23 th March 2018 Table of Contents Outline 2 Background 2 Organisation 2 Funding 3 Roles and Responsibilities 4 Task Forces 4 External Auditing and Certification 6 Ingest

More information

Metadata Workshop 3 March 2006 Part 1

Metadata Workshop 3 March 2006 Part 1 Metadata Workshop 3 March 2006 Part 1 Metadata overview and guidelines Amelia Breytenbach Ria Groenewald What metadata is Overview Types of metadata and their importance How metadata is stored, what metadata

More information

GNU EPrints 2 Overview

GNU EPrints 2 Overview GNU EPrints 2 Overview Christopher Gutteridge 14th October 2002 Abstract An overview of GNU EPrints 2. EPrints is free software which creates a web based archive and database of scholarly output and is

More information

Introduction to Digital Preservation. Danielle Mericle University of Oregon

Introduction to Digital Preservation. Danielle Mericle University of Oregon Introduction to Digital Preservation Danielle Mericle dmericle@uoregon.edu University of Oregon What is Digital Preservation? the series of management policies and activities necessary to ensure the enduring

More information

Science Europe Consultation on Research Data Management

Science Europe Consultation on Research Data Management Science Europe Consultation on Research Data Management Consultation available until 30 April 2018 at http://scieur.org/rdm-consultation Introduction Science Europe and the Netherlands Organisation for

More information

ISO Self-Assessment at the British Library. Caylin Smith Repository

ISO Self-Assessment at the British Library. Caylin Smith Repository ISO 16363 Self-Assessment at the British Library Caylin Smith Repository Manager caylin.smith@bl.uk @caylinssmith Outline Digital Preservation at the British Library The Library s Digital Collections Achieving

More information

Slide 1 & 2 Technical issues Slide 3 Technical expertise (continued...)

Slide 1 & 2 Technical issues Slide 3 Technical expertise (continued...) Technical issues 1 Slide 1 & 2 Technical issues There are a wide variety of technical issues related to starting up an IR. I m not a technical expert, so I m going to cover most of these in a fairly superficial

More information

Digital repositories as research infrastructure: a UK perspective

Digital repositories as research infrastructure: a UK perspective Digital repositories as research infrastructure: a UK perspective Dr Liz Lyon Director This work is licensed under a Creative Commons Licence Attribution-ShareAlike 2.0 UKOLN is supported by: Presentation

More information

PERSISTENT IDENTIFIERS FOR THE UK: SOCIAL AND ECONOMIC DATA

PERSISTENT IDENTIFIERS FOR THE UK: SOCIAL AND ECONOMIC DATA PERSISTENT IDENTIFIERS FOR THE UK: SOCIAL AND ECONOMIC DATA MATTHEW WOOLLARD.. ECONOMIC AND SOCIAL DATA SERVICE UNIVERSITY OF ESSEX... METADATA AND PERSISTENT IDENTIFIERS FOR SOCIAL AND ECONOMIC DATA,

More information

Data Management Glossary

Data Management Glossary Data Management Glossary A Access path: The route through a system by which data is found, accessed and retrieved Agile methodology: An approach to software development which takes incremental, iterative

More information

Towards a joint service catalogue for e-infrastructure services

Towards a joint service catalogue for e-infrastructure services Towards a joint service catalogue for e-infrastructure services Dr British Library 1 DI4R 2016 Workshop Joint service catalogue for research 29 September 2016 15/09/15 Goal A framework for creating a Catalogue

More information

CCSDS STANDARDS A Reference Model for an Open Archival Information System (OAIS)

CCSDS STANDARDS A Reference Model for an Open Archival Information System (OAIS) CCSDS STANDARDS A Reference Model for an Open Archival System (OAIS) Mr. Nestor Peccia European Space Operations Centre, Robert-Bosch-Str. 5, D-64293 Darmstadt, Germany. Phone +49 6151 902431, Fax +49

More information

Adding Research Datasets to the UWA Research Repository

Adding Research Datasets to the UWA Research Repository University Library Adding Research Datasets to the UWA Research Repository Guide to Researchers What does UWA mean by Research Datasets? Research Data is defined as facts, observations or experiences on

More information

SHARING YOUR RESEARCH DATA VIA

SHARING YOUR RESEARCH DATA VIA SHARING YOUR RESEARCH DATA VIA SCHOLARBANK@NUS MEET OUR TEAM Gerrie Kow Head, Scholarly Communication NUS Libraries gerrie@nus.edu.sg Estella Ye Research Data Management Librarian NUS Libraries estella.ye@nus.edu.sg

More information

Digital Preservation at NARA

Digital Preservation at NARA Digital Preservation at NARA Policy, Records, Technology Leslie Johnston Director of Digital Preservation US National Archives and Records Administration (NARA) ARMA, April 18, 2018 Policy Managing Government

More information

ORCA-Registry v2.4.1 Documentation

ORCA-Registry v2.4.1 Documentation ORCA-Registry v2.4.1 Documentation Document History James Blanden 26 May 2008 Version 1.0 Initial document. James Blanden 19 June 2008 Version 1.1 Updates for ORCA-Registry v2.0. James Blanden 8 January

More information

RADAR. Establishing a generic Research Data Repository: RESEARCH DATA REPOSITORY. Dr. Angelina Kraft

RADAR. Establishing a generic Research Data Repository: RESEARCH DATA REPOSITORY. Dr. Angelina Kraft RESEARCH DATA REPOSITORY http://www.radar-projekt.org http://www.radar-service.eu Establishing a generic Research Data Repository: RADAR Digital Infrastructures for Research 2016 Conference 28 th - 30

More information
