Managing Data in the long term. 11 Feb 2016
- Sibyl Bennett
Transcription
1 Managing Data in the long term 11 Feb 2016
2 Outline: What is needed for managing our data? What is an archive?
4 Motivation: Researchers often have funds for data management during the project lifetime, but limited time to manage data once the project has completed. Essentially, it's not the researcher's job. But there is value in ensuring data are available beyond the end of the project: value to peers, and potential value to researchers in other areas (cross-discipline).
5 Motivation: Edmond Halley (18th century) used historical data to determine the trajectory of a comet and provide validation of Newton's theory of gravitation. Because the period of the comet is ~70 years, historical data were essential. He needed to use data collected for different purposes (e.g. propaganda, religious).
6 Motivation [Table: Download, View and Publication figures for the Moser et al Nobel Prize data; data taken from Google Scholar]
7 Motivation
8 Motivation: Currently ~50 TB of data are archived: 44 datasets (49 TB) from the Climate community, 4 datasets from Biology and 1 from Computer Science. With very little prompting! This demonstrates that researchers do see value in their data and in a service capable of managing their data.
10 An Important Distinction: An archive is a service that provides long-term access to data. Long-term usually means more than 5 years. An archive is not a backup. A backup is a snapshot of data that may change over time (e.g. last week's backup of file X may be different to this week's backup of file X). Once data reaches a mature state (i.e. it no longer changes), it can be considered for archiving.
11 Roles: The Norstore archive recognises 5 different types of user: Creator, Contributor, Data Manager, Rights Holder, Access User. All types can be a person or an organisation (although in the case of an organisation a contact person is needed). The Access User doesn't need to be defined (unless data access is restricted). It is possible that the different types resolve to the same person or organisation. It's important to assign these roles to the dataset in case of questions.
12 Creator and Contributor Roles: A person uploading data into the archive takes the role of the Contributor. There can be more than one Contributor for a dataset. The Contributor uploads the data and fills in the metadata for the dataset. The Contributor shares the responsibility of ensuring that the dataset is complete, that it abides by the Terms and Conditions, and that the metadata is accurate. The Creator is the person or group that created the data.
13 Data Manager: To address the problem of datasets being used in different situations than originally anticipated, we need an expert or contact person for the dataset. The Data Manager is responsible for fielding questions or comments regarding the dataset during its lifetime. The Contributor does not need to maintain a connection with the dataset (e.g. the Contributor could be a PostDoc or PhD student). The Data Manager doesn't have to be an expert on the dataset, but should know whom to contact. This is similar to what happens with publications (a contact person or corresponding author is mentioned).
14 Rights Holder: The Rights Holder is the person or group that controls or owns the rights to the dataset. This includes intellectual property and copyright. There may be more than one Rights Holder for a dataset. If access restrictions exist on the use of the dataset, the Rights Holder will need to be contacted for permission to use the dataset. In most cases (those abiding by the NLOD or CCv4 license) the role of the Rights Holder is less important (but is still relevant). It is IMPORTANT that you check with your institution and funding agency as to who has rights on your dataset.
15 Access User: Any person querying the archive or using the data in the archive assumes the role of an Access User. Metadata for all published datasets is accessible by all Access Users. Datasets are accessible, requiring only an email address: the download link needs to be sent to the user. Using datasets assumes you abide by the access licence.
16 Archiving Data: Is part of the data lifecycle. Requires information from previous phases: information on how the data was collected, processed, etc. Needs to be taken into consideration at project proposal time. Motivates the need for a plan for data management.
17 Research Data Management Plan: Founded on three criteria for the research project, throughout the data lifecycle: successful data collection; successful data use; successful data sharing with the target audience. The plan will also help in provisioning resources. Norstore is working on a template to address these criteria that: recognises researchers are not data management experts; uses best practices from the UK Digital Curation Centre, DANS and other agencies.
18 Research Data Management Plan [Diagram labels: Research Council of Norway; Researchers; Data Management Plan; Research Institutions; Service Providers: Norstore, Nortur]
19 Research Data Management Plan: Currently drafting a template for the plan. The intention is to have pre-prepared text as much as possible. Template ready very soon. Next steps: review the template draft internally, seek feedback from stakeholders.
20 Datasets (archive): A dataset is a collection of related data. Usually the data is related in terms of use, e.g. cloud simulation data. It is up to the researcher to define the dataset. Datasets eligible for archiving should be 'closed' or 'complete': datasets such as those that resulted in publications; datasets that are considered a natural conclusion to a project. All datasets should be considered of lasting value to the community.
21 Datasets (archive): Data ideally needs to be in a standard or open format that makes it possible to migrate in case of obsolescence. Licensing: who can use the data, under what restrictions. Contact persons: in case of questions concerning the data. Integrity: checksums should be provided along with the data. Metadata: description of the data.
22 Datasets (Archive): There are some popular approaches to arranging data. BagIt is an Internet Engineering Task Force proposal for structuring related data. It is used by a variety of institutions (e.g. the Library of Congress). Essentially: [figure]
23 Dataset (Archive): The BagIt data directory contains a sub-structure. We suggest dividing it into: doc for documentation (including a table of contents of the layout); src for any source code needed to read the data (and possibly the code that generated the data); aux for auxiliary data files; <data type> for data files of that data type. Or any other layout. But try to provide a doc directory containing documentation and a src directory containing source code. You can then zip or tar the BagIt hierarchy and upload it to the archive.
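A minimal sketch of the layout suggested above, using only the Python standard library. It is not Norstore tooling: the function names are hypothetical, the BagIt version string and the choice of SHA-256 for the manifest are assumptions, and the archive may expect something different. It creates the doc, src and aux payload directories under the BagIt data directory, writes the bag declaration and a checksum manifest, and tars the hierarchy.

```python
import hashlib
import tarfile
from pathlib import Path


def make_bag(bag_dir: Path, payload_subdirs=("doc", "src", "aux")) -> None:
    """Create a BagIt skeleton with the suggested payload sub-directories."""
    data_dir = bag_dir / "data"
    for name in payload_subdirs:
        (data_dir / name).mkdir(parents=True, exist_ok=True)
    # Bag declaration file; the version written here is an assumption.
    (bag_dir / "bagit.txt").write_text(
        "BagIt-Version: 0.97\nTag-File-Character-Encoding: UTF-8\n",
        encoding="utf-8",
    )


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Checksum a file in chunks so large data files are not read at once."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_manifest(bag_dir: Path) -> Path:
    """Write manifest-sha256.txt with one 'checksum  path' line per payload file."""
    lines = []
    for path in sorted((bag_dir / "data").rglob("*")):
        if path.is_file():
            relative = path.relative_to(bag_dir).as_posix()
            lines.append(f"{sha256_of(path)}  {relative}")
    manifest = bag_dir / "manifest-sha256.txt"
    manifest.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return manifest


def pack_bag(bag_dir: Path) -> Path:
    """Tar and gzip the whole bag hierarchy next to the bag directory."""
    archive = bag_dir.parent / (bag_dir.name + ".tgz")
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(bag_dir, arcname=bag_dir.name)
    return archive
```

Typical use would be to call make_bag, copy documentation, source code and data files into the sub-directories, then call write_manifest followed by pack_bag. Note that aux is a reserved file name on Windows, so the sketch assumes a Unix-like system.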
24 Metadata: What is this? What was it used for?
25 Metadata and Datasets: Metadata is essential to successfully use the dataset: it describes what the dataset is, where it came from, and how to use it. Metadata is created throughout the data lifecycle. Different phases of the lifecycle require different types of metadata. Perhaps data are initially stored in a primitive format and then processed.
26 Metadata and Datasets: Metadata can be divided into 3 classes. Descriptive: what the data is, features, etc. Structural: how the data is arranged, formats, etc. Administrative: how to manage the data, checksums, rights, etc. Many domains have complex, detailed metadata...
27 Metadata [Figure: Seeing Standards: A Visualisation of the Metadata Universe. J. Riley, D. Becker]
28 Metadata: There are metadata schemes for many communities, at different stages of evolution, and quite detailed. It is very difficult for Norstore to support all metadata schemes, so look for the lowest common denominator.
29 Norstore Archive Metadata: Many metadata schemes either have Dublin Core as a basis or have a strong overlap with Dublin Core. Dublin Core is an ISO standard. The standard has 15 terms; extended Dublin Core has more terms. The Norstore Archive uses Dublin Core as a basis. Additional metadata terms have been added that are not covered by DC, but are generic enough for all communities. OAI-PMH is based on DC, so the metadata is automatically compliant. Metadata is a separate entity from the dataset.
30 Norstore Archive Metadata
Descriptive Information: Category, Description, Identifier, Internal Identifier, Journal Article, Language, Phase, State, Subject, Title
Administrative Information: Access Rights, Contributor, Created, Creator, Data Manager, License, Lifetime, Preservation Level, Published on, Publisher, Rights, Rights Holder, Submitted, Terms and Conditions for Deposit
Structural Information: File Checksum, File Name, File Size, File Type
Descriptive Information (optional): Bibliographic Citation, Conforms to, Comment, Geolocation, Label, Project, Provenance, Source, Temporal Coverage
Bold terms are Dublin Core recommended terms. The top 3 boxes contain mandatory metadata. Terms in italics are automatically filled in by the archive. Only ~14 terms need to be defined by the user.
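As an illustration of how the mandatory terms could be checked before submission, here is a small sketch. The grouping of terms follows the slide above; the dictionary layout and the missing_terms helper are hypothetical and are not part of any Norstore interface. The slide does not say which terms the archive fills in automatically, so the sketch simply reports every mandatory term that has no value.

```python
# Mandatory metadata terms, grouped as on the slide (the grouping is taken
# from the slide; the data structure itself is an assumption).
MANDATORY_TERMS = {
    "descriptive": [
        "Category", "Description", "Identifier", "Internal Identifier",
        "Journal Article", "Language", "Phase", "State", "Subject", "Title",
    ],
    "administrative": [
        "Access Rights", "Contributor", "Created", "Creator", "Data Manager",
        "License", "Lifetime", "Preservation Level", "Published on",
        "Publisher", "Rights", "Rights Holder", "Submitted",
        "Terms and Conditions for Deposit",
    ],
    "structural": ["File Checksum", "File Name", "File Size", "File Type"],
}


def missing_terms(record: dict) -> list:
    """Return the mandatory terms that are absent or empty in a record."""
    missing = []
    for terms in MANDATORY_TERMS.values():
        for term in terms:
            if not record.get(term):
                missing.append(term)
    return missing


# Hypothetical, partially completed record.
record = {"Title": "Cloud simulation output", "Creator": "A. Researcher"}
still_needed = missing_terms(record)
```

A check like this could be run by a project before filling in the web form, to see at a glance which terms still need values.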
31 Norstore Archive Metadata: Norstore metadata is intended to be as generic as possible: sufficient to locate the data and understand how to use it. More detailed information should be contained in domain-specific catalogues. The descriptive metadata can reference a domain-specific catalogue. In the future we could envisage the archive holding a reference to the domain catalogue. We need to be aware that the archive lifetime may be longer than that of the domain catalogue. Domain-specific catalogues can use the DOI as a handle to the data (resolving the link will provide access to the data).
32 Norstore and Domain Metadata [Diagram labels: Norstore Archive (Data Service, Metadata Service); DOI Resolver; Domain Metadata Service] The domain metadata catalogue can have the DOI registered. It could then invoke the DOI resolver to provide access to the archive metadata and data.
33 Norstore Metadata: Currently metadata must be supplied using the web interface. Metadata needs to be completed in two stages. Before data upload: mandatory metadata that is needed by the archive for managing the data (e.g. contact information, title of the dataset, etc). After data upload: the remaining mandatory descriptive information and optional information. There is a 3 month time limit to fill in the metadata. The user will be reminded during this period of the need to complete the metadata. At the discretion of the Archive Manager, the dataset may be deleted if the metadata remains incomplete after this limit. Typically metadata is completed within 2 weeks.
34 Completing Norstore Metadata
35 Tips for Norstore Metadata: Avoid duplication if information is contained in the publication or other referenced material. Consider what information is needed to reanalyse the data: libraries, operating systems, workflows, manuals, any other data. A good test is to ask a person new to the community to document what they need to make use of the data. Are there any features in the data worth mentioning? Is how the data was collected described? Is the environment the data was collected in described, such as instrument settings?
36 Tips for Norstore Metadata: Use the description field to describe what the dataset is and how to use it. Use the journal metadata to provide a reference to the article that describes the dataset. If there is a lot of documentation, it could be included as part of the dataset, with the description saying where to find it. In this case the description can be more succinct. If the dataset has temporal or spatial information, consider using the optional metadata to capture that information. This provides a visual aid to the description of your dataset.
37 Tips for Norstore Metadata: Links to external references with more information are good. But beware of longevity: will the reference last the lifetime of the dataset? Beware of jargon or terminology. Perhaps run the description by novice users to see if it's clear.
38 Norstore Metadata Plans: We recognise that many projects have metadata catalogues. The ability to extract a subset that matches some of the Norstore metadata terms would be useful. We are working on a REST API for the metadata catalogue. We are currently looking at search, but this can be extended to the ingest of metadata. This allows you to script the extraction and loading of some of the Norstore metadata automatically, which is useful for projects with many datasets. Implement metadata errata: allow traceable corrections of metadata.
39 Archive [Diagram labels: External User; Project Area user; Web; CLI; Norstore catalog; irods; disk; tape; Oslo; Tromso]
40 The Archive: User Interface (web and CLI); IRODS; Metadata Catalog; Storage (disk and tape, Oslo, Tromso). The archive was designed to allow replacement of any component with minimal impact.
41 User Interface: The primary user interface is web-based. A command line interface is used for large dataset interaction with the project area. The user interface interfaces to the Norstore metadata catalogue, which is also used for metadata search. The catalogue is a PostgreSQL database; all metadata and state information are held there. The user interface also interfaces to the data management system (irods).
42 IRODS Data Management System: A rule-oriented data management system. It abstracts the details of distributed storage by providing a logical layer. The logical-physical mapping is held in the irods metadata catalogue (a PostgreSQL database). It provides access control and interfaces to authentication such as GSI and Kerberos. Norstore makes use of just one archive user to manage the data. Users don't interact directly with irods, but through the web interface or command line tools.
43 IRODS Data Management System: Allows policies to be placed on the data. Norstore has a policy to replicate data to 3 resources. There is also a policy to remove data from one resource and replicate it to a new resource, and a policy to regularly checksum data.
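The replication and checksum policies can be pictured with a small sketch. This is not iRODS rule code and not the Norstore implementation: plain file paths stand in for the storage resources, the function names are hypothetical, and treating the most common checksum as the reference value is an assumption made purely for illustration.

```python
import hashlib
from collections import Counter
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Checksum a file in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_replicas(replicas: list) -> dict:
    """Map each replica path to True if it matches the majority checksum.

    A replica that disagrees with the majority is a candidate for being
    removed and re-replicated from a good copy.
    """
    digests = {path: sha256_of(path) for path in replicas}
    majority, _count = Counter(digests.values()).most_common(1)[0]
    return {path: digest == majority for path, digest in digests.items()}
```

With three replicas, a single corrupted copy is outvoted by the two intact ones, which is one reason keeping more than two copies is useful.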
44 Archiving a Dataset: Datasets can be archived from a researcher's local computer, or from the Norstore project area. Local computer: uploads are achieved via the Filesender service; datasets < 1 TB in size can be uploaded (this can be increased). Project area: requires users to be registered with a valid project; data are uploaded via command-line scripts. Once the dataset is uploaded, the metadata needs to be filled in via the web interface.
45 Norstore Archive workflow [Diagram steps: Identify data; Seek approval; Identify metadata; Fill in metadata; Upload data; Verify; Request publication; Verify metadata; Ensure approval; Assign DOI; Publish]
46 Project Area Upload: Select Project Area Upload containing dataset UUID. Create a dataset manifest file: a valid argument for find <dir> ! -type d <file pattern>, e.g. /projects/ns9999k -name *.tgz. Run ArchiveDataset UUID <manifest file>. Job submitted to queue. when finished. Query status: ListArchiveDataset UUID.
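Before submitting a manifest it can help to preview which files a find-style expression would select. The sketch below approximates a search for non-directory entries under a directory whose names match a shell pattern. It rests on an interpretation of the slide (that the manifest holds a directory plus a name pattern) and the preview_manifest name is hypothetical; it does not call or emulate the archive's own scripts.

```python
import fnmatch
import os


def preview_manifest(root: str, pattern: str) -> list:
    """List files under root whose base name matches the shell pattern.

    Directories are skipped, and matching is case-sensitive, which is
    roughly what a find expression with a name test would do.
    """
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if fnmatch.fnmatchcase(name, pattern):
                matches.append(os.path.join(dirpath, name))
    return sorted(matches)
```

For example, preview_manifest("/projects/ns9999k", "*.tgz") would list the tarballs under that project area, so the selection can be checked before the archiving job is queued.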
47 Publishing Data: Publishing is necessary in order to be able to cite datasets. We currently use the DataCite node in Denmark to issue Digital Object Identifiers (DOIs). A DOI is a standard, unique identifier that can be used to identify a resource. DOIs were originally developed for documents, but are now being used for data. Each DOI must point to metadata about the object and may contain a link to the dataset itself. Resolver services are used to resolve the DOI to a URI. The structure of a DOI is meaningful: doi: / refers to the DOI registry, 1000 refers to the entity that registered the data, 182 refers to the actual object. Once a dataset is published it cannot be modified. Some metadata may be updated.
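The DOI structure described above can be illustrated with a short parsing sketch. The example identifier 10.1000/182 is an assumption chosen to match the numbers 1000 and 182 mentioned on the slide (the leading 10 is the directory indicator shared by all DOIs); the DoiParts, parse_doi and resolver_url names are hypothetical. The sketch only builds a resolver URL on the public doi.org resolver and does not contact it.

```python
from typing import NamedTuple
from urllib.parse import quote


class DoiParts(NamedTuple):
    directory: str   # identifies the DOI registry, e.g. "10"
    registrant: str  # the entity that registered the object
    suffix: str      # the object itself


def parse_doi(doi: str) -> DoiParts:
    """Split a DOI such as 'doi:10.1000/182' into its three parts."""
    text = doi.strip()
    if text.lower().startswith("doi:"):
        text = text[4:]
    prefix, slash, suffix = text.partition("/")
    directory, dot, registrant = prefix.partition(".")
    if not (slash and suffix and dot and directory and registrant):
        raise ValueError(f"not a well-formed DOI: {doi!r}")
    return DoiParts(directory, registrant, suffix)


def resolver_url(doi: str) -> str:
    """Build the URL a resolver service would be asked to redirect."""
    parts = parse_doi(doi)
    return (
        f"https://doi.org/{parts.directory}.{parts.registrant}/"
        f"{quote(parts.suffix, safe='/')}"
    )
```

Resolving such a URL in a browser would redirect to whatever the registrant has registered for the object, which for an archived dataset is its landing page.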
48 Publishing Data
49 Landing page: A permanent metadata record for the dataset. All access via the DOI resolves to this page. The page contains links to additional metadata and data. A landing page also exists for terminated datasets. This is called a Tombstone record: the link to the data is removed, and it contains additional metadata on when the data was removed and the reason for removal.
50 Planned Functionality: Imminent: REST API for searching datasets. Provides command line access to metadata. Allows harvesting of metadata (opensearch and OAI-PMH planned). Imminent: versioning of datasets. Accommodates cases where a researcher wishes to update data (the data has migrated to a different format, mistakes were made, or the metadata needs updating). Provides a link back to the previous version, visible from the landing page. The new version will have a new DOI. The previous version will remain accessible unless explicitly terminated.
51 Future Functionality: Subsets of datasets: researchers may be interested in downloading only a subset of a dataset. Via the table of contents it's possible to identify the subset of interest and tag the files of interest for download. Collections of datasets: there may be a logical grouping of datasets (e.g. a series of datasets). This can make it easier to link related datasets.
52 Preserving Datasets: Digital preservation attempts to ensure digital data remains accessible and usable by future users. This is addressed by: ensuring bit-level integrity through data replication; ensuring data is understandable (may require adding or updating metadata on how to use and interpret the data); ensuring data is discoverable (equipped with the right and relevant metadata and description); ensuring data is in a usable format (may require migration from obsolete formats to new formats, or virtual environments).
53 Migration and Virtualisation: Things to be aware of for migration: What's the best format (most durable, popular, open)? What features in the data need to be maintained, and how can we check they are? Migration pros/cons: Easy to use new tools with old data, and easier to integrate data into new/current workflows. One-way street: may lose some features/functionality in the migration that may only be relevant later. Requires experts to assess what features need to be kept and whether they are indeed kept. Things to be aware of for virtualisation: What type of virtual machine to use (licensing, rendering, performance)? Are all the resources required by the application contained within the VM?
54 Migration and Virtualisation: Virtualisation pros/cons: Preserves original features/functionality (little risk of missing something). Can be difficult to integrate with newer tools. With a large volume of data it may not be a scalable option. The choice depends on your circumstances and needs.
55 Auditing: Aim to pass the Data Seal of Approval. This ensures the archive conforms to best practice and allows users to assess how reliable the archive is.
EUDAT-B2FIND A FAIR and Interdisciplinary Discovery Portal for Research Data Heinrich Widmann, DKRZ Claudia Martens, DKRZ Open Science Days, Berlin, 17 October 2017 www.eudat.eu EUDAT receives funding
More informationPersonal Digital Information Project, Part 2: Hands-on Exercise
Drexel University From the SelectedWorks of James Gross May 14, 2012 Personal Digital Information Project, Part 2: Hands-on Exercise James Gross, Drexel University Available at: https://works.bepress.com/jamesgross/28/
More informationEUDAT. A European Collaborative Data Infrastructure. Daan Broeder The Language Archive MPI for Psycholinguistics CLARIN, DASISH, EUDAT
EUDAT A European Collaborative Data Infrastructure Daan Broeder The Language Archive MPI for Psycholinguistics CLARIN, DASISH, EUDAT OpenAire Interoperability Workshop Braga, Feb. 8, 2013 EUDAT Key facts
More informationWriting for the web and SEO. University of Manchester Humanities T4 Guides Writing for the web and SEO Page 1
Writing for the web and SEO University of Manchester Humanities T4 Guides Writing for the web and SEO Page 1 Writing for the web and SEO Writing for the web and SEO... 2 Writing for the web... 3 Change
More informationResearch Data Repository Interoperability Primer
Research Data Repository Interoperability Primer The Research Data Repository Interoperability Working Group will establish standards for interoperability between different research data repository platforms
More informationOpen Access & Open Data in H2020
Open Access & Open Data in H2020 Services & Support Hannelore Vanhaverbeke, PhD Research Coordination Office What are you in for? Mandatory Each beneficiary must ensure open access (free of charge, online
More informationThe Materials Data Facility
The Materials Data Facility Ben Blaiszik (blaiszik@uchicago.edu), Kyle Chard (chard@uchicago.edu) Ian Foster (foster@uchicago.edu) materialsdatafacility.org What is MDF? We aim to make it simple for materials
More informationPreservation Planning in the OAIS Model
Preservation Planning in the OAIS Model Stephan Strodl and Andreas Rauber Institute of Software Technology and Interactive Systems Vienna University of Technology {strodl, rauber}@ifs.tuwien.ac.at Abstract
More informationUnique Object Identifiers
TM dnf0011 Unique Object Identifiers Component 2.1: Identifier Management Technical Guide Version 1 Unique Object Identifiers Page 1 of 6 Document Content Jonathan Simmons is responsible for the content
More informationDifferent Aspects of Digital Preservation
Different Aspects of Digital Preservation DCH-RP and EUDAT Workshop in Stockholm 3rd of June 2014 Börje Justrell Table of Content Definitions Strategies The Digital Archive Lifecycle 2 Digital preservation
More informationData Replication: Automated move and copy of data. PRACE Advanced Training Course on Data Staging and Data Movement Helsinki, September 10 th 2013
Data Replication: Automated move and copy of data PRACE Advanced Training Course on Data Staging and Data Movement Helsinki, September 10 th 2013 Claudio Cacciari c.cacciari@cineca.it Outline The issue
More informationA service-oriented national e-thesis information system and repository
Title of the presentation Date 1 A service-oriented national e-thesis information system and repository Nikos Houssos Panagiotis Stathopoulos Ioanna Sarantopoulou Dimitris Zavaliadis Evi Sachini National
More informationContent Management for the Defense Intelligence Enterprise
Gilbane Beacon Guidance on Content Strategies, Practices and Technologies Content Management for the Defense Intelligence Enterprise How XML and the Digital Production Process Transform Information Sharing
More informationKM COLUMN. How to evaluate a content management system. Ask yourself: what are your business goals and needs? JANUARY What this article isn t
KM COLUMN JANUARY 2002 How to evaluate a content management system Selecting and implementing a content management system (CMS) will be one of the largest IT projects tackled by many organisations. With
More informationData Curation Profile Human Genomics
Data Curation Profile Human Genomics Profile Author Profile Author Institution Name Contact J. Carlson N. Brown Purdue University J. Carlson, jrcarlso@purdue.edu Date of Creation October 27, 2009 Date
More informationThe CEDA Archive: Data, Services and Infrastructure
The CEDA Archive: Data, Services and Infrastructure Kevin Marsh Centre for Environmental Data Archival (CEDA) www.ceda.ac.uk with thanks to V. Bennett, P. Kershaw, S. Donegan and the rest of the CEDA Team
More information1. Download and install the Firefox Web browser if needed. 2. Open Firefox, go to zotero.org and click the big red Download button.
Get Started with Zotero A free, open-source alternative to products such as RefWorks and EndNote, Zotero captures reference data from many sources, and lets you organize your citations and export bibliographies
More informationCoSA & Preservica Practical Digital Preservation 2015/16. Practical OAIS Digital Preservation Online Workshop Module 2
CoSA & Preservica Practical Digital Preservation 2015/16 Practical OAIS Digital Preservation Online Workshop Module 2 Practical Digital Preservation 2015/16 Welcome! PDP Online Workshops - with focus on
More informationThe Trustworthiness of Digital Records
The Trustworthiness of Digital Records International Congress on Digital Records Preservation Beijing, China 16 April 2010 1 The Concept of Record Record: any document made or received by a physical or
More informationOpen Access compliance:
Open Access compliance: How publishers can reach the recommended standards Jisc, April 2016. Draft Summary This document outlines what publishers might do to help authors and institutions globally implement
More informationArchival Information Package (AIP) E-ARK AIP version 1.0
Archival Information Package (AIP) E-ARK AIP version 1.0 January 27 th 2017 Page 1 of 50 Executive Summary This AIP format specification is based on E-ARK deliverable, D4.4 Final version of SIP-AIP conversion
More informationData management Backgrounds and steps to implementation; A pragmatic approach.
Data management Backgrounds and steps to implementation; A pragmatic approach. Research and data management through the years Find the differences 2 Research and data management through the years Find
More informationThe Canadian Information Network for Research in the Social Sciences and Humanities.
The Canadian Information Network for Research in the Social Sciences and Humanities http://www.synergiescanada.org Tim Au Yeung and Mary Westell Libraries and Cultural Resources University of Calgary March
More informationFor each use case, the business need, usage scenario and derived requirements are stated. 1.1 USE CASE 1: EXPLORE AND SEARCH FOR SEMANTIC ASSESTS
1 1. USE CASES For each use case, the business need, usage scenario and derived requirements are stated. 1.1 USE CASE 1: EXPLORE AND SEARCH FOR SEMANTIC ASSESTS Business need: Users need to be able to
More informationMicrosoft SharePoint Server 2013 Plan, Configure & Manage
Microsoft SharePoint Server 2013 Plan, Configure & Manage Course 20331-20332B 5 Days Instructor-led, Hands on Course Information This five day instructor-led course omits the overlap and redundancy that
More informationRoy Lowry, Gwen Moncoiffe and Adam Leadbetter (BODC) Cathy Norton and Lisa Raymond (MBLWHOI Library) Ed Urban (SCOR) Peter Pissierssens (IODE Project
Roy Lowry, Gwen Moncoiffe and Adam Leadbetter (BODC) Cathy Norton and Lisa Raymond (MBLWHOI Library) Ed Urban (SCOR) Peter Pissierssens (IODE Project Office) Linda Pikula (IODE GEMIM/NOAA Library) Data
More informationFor Attribution: Developing Data Attribution and Citation Practices and Standards
For Attribution: Developing Data Attribution and Citation Practices and Standards Board on Research Data and Information Policy and Global Affairs Division National Research Council in collaboration with
More informationDemos: DMP Assistant and Dataverse
Demos: DMP Assistant and Dataverse Alexandra Cooper, Data Services Coordinator, Queen s University Meghan Goodchild, RDM Systems Librarian, Queen s University/Scholars Portal Overview of session Research
More informationThe Dublin Core Metadata Element Set
ISSN: 1041-5635 The Dublin Core Metadata Element Set Abstract: Defines fifteen metadata elements for resource description in a crossdisciplinary information environment. A proposed American National Standard
More informationAuthor Guidelines for Endodontic Topics
1. Submission of Manuscripts Author Guidelines for Endodontic Topics Manuscripts should be submitted electronically via the online submission site http://mc.manuscriptcentral.com/endodontictopics. Complete
More informationDigital The Harold B. Lee Library
Digital Preservation @ The Harold B. Lee Library CIMA 23 May 2013 How we got here? 1. Understanding Digital Preservation 2. Search for Content 3. Maintain Optical Disc Storage 4. In House Preservation
More informationUsing DCAT-AP for research data
Using DCAT-AP for research data Andrea Perego SDSVoc 2016 Amsterdam, 30 November 2016 The Joint Research Centre (JRC) European Commission s science and knowledge service Support to EU policies with independent
More informationThe digital preservation technological context
The digital preservation technological context Michael Day, Digital Curation Centre UKOLN, University of Bath m.day@ukoln.ac.uk Preservation of Digital Heritage: Basic Concepts and Main Initiatives, Madrid,
More informationUser Stories : Digital Archiving of UNHCR EDRMS Content. Prepared for UNHCR Open Preservation Foundation, May 2017 Version 0.5
User Stories : Digital Archiving of UNHCR EDRMS Content Prepared for UNHCR Open Preservation Foundation, May 2017 Version 0.5 Introduction This document presents the user stories that describe key interactions
More informationThe Salesforce Migration Playbook
The Salesforce Migration Playbook By Capstorm Table of Contents Salesforce Migration Overview...1 Step 1: Extract Data Into A Staging Environment...3 Step 2: Transform Data Into the Target Salesforce Schema...5
More informationBuilding a Digital Repository on a Shoestring Budget
Building a Digital Repository on a Shoestring Budget Christinger Tomer University of Pittsburgh! PALA September 30, 2014 A version this presentation is available at http://www.pitt.edu/~ctomer/shoestring/
More informationISO Information and documentation Digital records conversion and migration process
INTERNATIONAL STANDARD ISO 13008 First edition 2012-06-15 Information and documentation Digital records conversion and migration process Information et documentation Processus de conversion et migration
More informationUSING DC FOR SERVICE DESCRIPTION
USING DC FOR SERVICE DESCRIPTION The Nature of Services...2 Content of a service...2 Aggregation/Boundary...3 Use of Elements to Describe Services...4 Resource content: Audience, Coverage, Description,
More informationUnique Identifiers Assessment: Results. R. Duerr
Unique Identifiers Assessment: Results 1 Outline Background Identifier schemes Assessment criteria Levels of data Use cases Assessment Results Preparing Data for Ingest, R. presented Duerr 10/27/09 by
More informationGlobus Platform Services for Data Publication. Greg Nawrocki University of Chicago & Argonne National Lab GeoDaRRS August 7, 2018
Globus Platform Services for Data Publication Greg Nawrocki greg@globus.org University of Chicago & Argonne National Lab GeoDaRRS August 7, 2018 Outline Globus Overview Globus Data Publication v1 Lessons
More informationThe SPECTRUM 4.0 Acquisition Procedure
The SPECTRUM 4.0 Acquisition Procedure Contents 1. What is the SPECTRUM 4.0 Acquisition Procedure? 2. The Acquisition Procedure and Accreditation 3. Why is the Acquisition Procedure important? 4. When
More informationDRI: Preservation Planning Case Study Getting Started in Digital Preservation Digital Preservation Coalition November 2013 Dublin, Ireland
DRI: Preservation Planning Case Study Getting Started in Digital Preservation Digital Preservation Coalition November 2013 Dublin, Ireland Dr Aileen O Carroll Policy Manager Digital Repository of Ireland
More informationRegistry Interchange Format: Collections and Services (RIF-CS) explained
ANDS Guide Registry Interchange Format: Collections and Services (RIF-CS) explained Level: Awareness Last updated: 10 January 2017 Web link: www.ands.org.au/guides/rif-cs-explained The RIF-CS schema is
More information