DMPonline / DaMaRO Workshop, Rewley House, Oxford, 28 June 2013. DataStage - a simple file management system. David Shotton


DMPonline / DaMaRO Workshop
Rewley House, Oxford, 28 June 2013

DataStage - a simple file management system

David Shotton
Oxford e-Research Centre and Department of Zoology, University of Oxford, UK
david.shotton@zoo.ox.ac.uk

David Shotton, 2013. Published under the Creative Commons Attribution-Noncommercial-Share Alike 3.0 Licence

Information management in biological research can be quite complex
[Figure: Information flow for an epidemiological study of zoonotic diseases]

The conventional research data lifecycle
[Diagram labels: Hypothesis formulation and project design; Research plan; Experimentation and data creation; Raw data in research notebooks and live PC files; Data selection and interpretation; Research results and conclusions; Publication activities; Scholarly publications: conference papers and journal articles; Institutional repositories; Research datasets abandoned on local hard drives or CD-ROMs]

The enhanced research data lifecycle
[Diagram labels: Hypothesis formulation and project design; Research plan; Experimentation and data creation; Raw data in research notebooks and live PC files; Management; Local filestore (private and sharable); Data selection and interpretation; Research results and conclusions; Publication activities; Scholarly publications: conference papers and journal articles; Preservation; Institutional repositories (papers and datasets); Dissemination; Open data on Web]

The need for easy-to-use data management tools
- To facilitate uptake by busy researchers, we need easy-to-use data management tools that provide immediate benefit
- We practice sheer curation (http://en.wikipedia.org/wiki/sheer_curation):
  - working with users rather than against them
  - enabling use of data file formats familiar to them
  - making contribution attribution activities sufficiently lightweight and transparent that they do not impose significant cognitive overhead
- The first task in data management is to ensure live data files are properly stored and reliably backed up
- For this we have developed DataStage

What is DataStage?
- DataStage is a simple file management system for use by research groups to permit organization of 'live' data files, including those that will not get archived
- One DataStage instance is used per research group
- Each group member has
  - a Private Folder: only (s)he can write, and only (s)he and the PI can read
  - a Shared Folder: only (s)he can write, and all group members can read
  - a Collaborative Folder: all group members can write and read
- The DataStage server can be deployed locally or in the cloud
- The DataStage file store appears to each user as a mapped drive
- It can also be accessed via a Web browser
- It is tailored to permit creation of data packages from selected data files, and their submission to the Oxford DataBank, our institutional data repository, where they can be archived and published with DataCite DOIs
- To encourage submissions, metadata requirements are minimal
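The three folder types amount to a small access-control table. The sketch below is illustrative only and is not taken from the DataStage code base: the names `FolderKind`, `Folder`, `Group`, `can_read` and `can_write` are hypothetical, and it assumes that the PI is listed as a group member and that people outside the group have no access at all.

```python
from dataclasses import dataclass
from enum import Enum


class FolderKind(Enum):
    PRIVATE = "private"
    SHARED = "shared"
    COLLABORATIVE = "collaborative"


@dataclass(frozen=True)
class Folder:
    kind: FolderKind
    owner: str  # the group member the folder belongs to


@dataclass(frozen=True)
class Group:
    pi: str
    members: frozenset  # assumed to include the PI


def can_read(user: str, folder: Folder, group: Group) -> bool:
    """Private: owner and PI only. Shared and collaborative: every member."""
    if user not in group.members:
        return False  # assumption: no access from outside the group
    if folder.kind is FolderKind.PRIVATE:
        return user in (folder.owner, group.pi)
    return True


def can_write(user: str, folder: Folder, group: Group) -> bool:
    """Collaborative: every member. Private and shared: owner only."""
    if user not in group.members:
        return False  # assumption: no access from outside the group
    if folder.kind is FolderKind.COLLABORATIVE:
        return True
    return user == folder.owner
```

For example, with a group whose PI is "pi" and whose members are "pi", "alice" and "bob", Bob cannot read Alice's private folder but can read her shared folder and write to her collaborative folder, while the PI can read all three.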

DataStage and DataBank
DataStage: http://datastage-demo.bodleian.ox.ac.uk
DataBank: http://databank-demo.bodleian.ox.ac.uk
[Diagram labels: Researchers; DataStage file system; SWORD deposit; DataBank repository; Researchers, other users]

The DataStage web interface front page

The DataStage log-in page
- Log in with a locally assigned username and password
- Because the same username and password are used both to create the mapped drive and for web access, which use quite different protocols, we have been unable to integrate with the University Single Sign-On System

My own private folder contains a file, Permissions.txt, defining the access permissions for this folder, an RDF manifest file, and a file I added earlier

Upload files to this folder, then submit them as a data package
- I previously added the file userstories.docx to my private folder
- Additional files can be uploaded to the DataStage server one by one
- All the files in this folder (or any other selected folder or sub-folder) can be submitted as a data package to DataBank, using the Submit button

Benefits of DataStage
- Simple and easy to use
- Both mapped drive and browser access
- The DataStage server can be hosted locally or in the cloud
- Can be used both for local file management and backup on a day-to-day basis, and also for data submission to a repository for archiving and publication
- Repository submission uses the SWORD2 protocol, and so can be directed to any SWORD2-compliant repository (e.g. DSpace), not just to DataBank
- Minimal metadata requirements to encourage usage
- Open source software
- Packaged as a Debian package for one-click installation on an open source Ubuntu Linux system
- Separate administrative interface for configuring group members, permissions, etc.
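To make the SWORD2 point concrete, here is a minimal sketch of what a binary deposit to a SWORD2 collection can look like. It is not DataStage's own submission code. The function names and the collection URL in the usage note are hypothetical, the SimpleZip packaging identifier and the `Packaging` and `In-Progress` headers follow the SWORD v2 profile as far as I recall it, and sending a hex-encoded digest in `Content-MD5` is an assumption to check against the target repository. Authentication is omitted, and the request is only built, never sent.

```python
import hashlib
import io
import urllib.request
import zipfile

# Packaging identifier for a plain zip of files (SWORD v2 "SimpleZip").
SIMPLE_ZIP = "http://purl.org/net/sword/package/SimpleZip"


def build_package(files: dict[str, bytes]) -> bytes:
    """Zip the selected files in memory and return the archive bytes."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for name in sorted(files):
            archive.writestr(name, files[name])
    return buffer.getvalue()


def build_deposit_request(
    collection_url: str,
    package: bytes,
    filename: str = "package.zip",
    in_progress: bool = False,
) -> urllib.request.Request:
    """Build (but do not send) an HTTP POST carrying the zip package."""
    headers = {
        "Content-Type": "application/zip",
        "Content-Disposition": f"attachment; filename={filename}",
        "Packaging": SIMPLE_ZIP,
        # Assumption: the repository expects a hex-encoded MD5 digest.
        "Content-MD5": hashlib.md5(package).hexdigest(),
        "In-Progress": "true" if in_progress else "false",
    }
    return urllib.request.Request(
        collection_url, data=package, headers=headers, method="POST"
    )
```

A caller would build the package from the files in the selected folder, for example `build_package({"userstories.docx": data})`, pass it to `build_deposit_request` together with the collection URL advertised by the repository (a placeholder such as `http://repository.example.org/sword/collection` here), and then send the request with `urllib.request.urlopen` once credentials have been added. Because only the URL changes, the same request shape could target any SWORD2-compliant repository.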

DataCite metadata entry tool for more complete metadata

Metadata can be viewed as HTML or output as RDF

Acknowledgements
- Graham Klyne and Bhavana Ananda have been primarily responsible for the development of DataStage, with Katherine Fletcher as our invaluable project manager
- Tanya Gray created the DataCite metadata input form
- Richard Jones of Cottage Labs, working with us and with the DataBank developer Anusha Ranganathan at the Bodleian Libraries, implemented the SWORD2 protocol to encode DataStage submissions to DataBank
- JISC funded the DataFlow Project that led to the development of DataStage and improvements to DataBank
- I am grateful to them all
