Research Data Management: Secure and Preserve FEBRUARY 1, 2018 CAROL PERRY, LIBRARY A. MICHELLE EDWARDS, OAC

Similar documents
Research Data Management: Secure and Preserve: Best Practices to Preserve your Research Data

Where to store research data during and after a project. Dr. Chris Emmerson Research Data Manager

Sharing Qualitative Data: Challenges and Opportunities

SHARING YOUR RESEARCH DATA VIA

Science Europe Consultation on Research Data Management

Data Management Checklist

Checklist and guidance for a Data Management Plan, v1.0

Collection Policy. Policy Number: PP1 April 2015

Queen s University Library. Research Data Management (RDM) Workflow

RPg procedures for Research Data Management (RDM)

DIGITAL STEWARDSHIP SUPPLEMENTARY INFORMATION FORM

AMERICAN JOURNAL OF POLITICAL SCIENCE GUIDELINES FOR PREPARING REPLICATION FILES Version 1.0, March 25, 2015 William G. Jacoby

Ethics in Research Publications

Data Curation Profile Human Genomics

Data Management Plan Generic Template Zach S. Henderson Library

Linda Strick Fraunhofer FOKUS. EOSC Summit - Rules of Participation Workshop, Brussels 11th June 2018

Guide to Supplemental Materials

Making Sense of Data: What You Need to know about Persistent Identifiers, Best Practices, and Funder Requirements

Developing a Research Data Policy

DigitalHub Getting started: Submitting items

Tools for Data Management. Research Data Management : Session 3 9 th June 2015

Research Data Management & Preservation: A Library Perspective

Data management Backgrounds and steps to implementation; A pragmatic approach.

Appendix: Data Availability Policies & Replication Policies Time of Evaluation: January 2012

End User Licence. PUBLIC 31 January 2017 Version: T +44 (0) E ukdataservice.ac.uk

DRI: Preservation Planning Case Study Getting Started in Digital Preservation Digital Preservation Coalition November 2013 Dublin, Ireland

Scientific Research Data Management Policy

Optional Thesis Deposit

DRS Policy Guide. Management of DRS operations is the responsibility of staff in Library Technology Services (LTS).

Electronic Thesis and Dissertation Tutorial: Submitting an ETD to SFA ScholarWorks

Data Curation Profile Plant Genetics / Corn Breeding

Adding Research Datasets to the UWA Research Repository

Google indexed 3,3 billion of pages. Google s index contains 8,1 billion of websites

UC Irvine LAUC-I and Library Staff Research

Brown University Libraries Technology Plan

The EOIS-CaMS User Guide for New Staff

Institutional Repository using DSpace. Yatrik Patel Scientist D (CS)

Feed the Future Innovation Lab for Peanut (Peanut Innovation Lab) Data Management Plan Version:

Basic Requirements for Research Infrastructures in Europe

ISMTE Best Practices Around Data for Journals, and How to Follow Them" Brooks Hanson Director, Publications, AGU

It applies to personal information for individuals that are external to us such as donors, clients and suppliers (you, your).

PROCESSING AND CATALOGUING DATA AND DOCUMENTATION: QUALITATIVE

Forthcoming Deposit Agreement of FSD ARJA KUULA CES2012: DATA ACQUISITION AND LICENSE AGREEMENTS 5TH OCTOBER 2012, TAMPERE, FINLAND

User guide. Created by Ilse A. Rasmussen & Allan Leck Jensen. 27 August You ll find Organic Eprints here:

Data Exchange and Conversion Utilities and Tools (DExT)

PROCESSING AND CATALOGUING DATA AND DOCUMENTATION - QUALITATIVE

NDSA Web Archiving Survey

For Attribution: Developing Data Attribution and Citation Practices and Standards

Electronic Thesis Submission Manual. Intended for Graduate Students

Basics in good research data management (RDM) for reviewing DMPs

ProQuest Dissertations and Theses Overview. Austin McLean and Marlene Coles CGS Summer Workshop, July 2017

The COUNTER Code of Practice for Articles

Data Management Plan

Description Cross-domain Task Force Research Design Statement

NSF Data Management Plan Template Duke University Libraries Data and GIS Services

Data Management Plans

Swedish National Data Service, SND Checklist Data Management Plan Checklist for Data Management Plan

Functional Requirements - DigiRepWiki

Horizon2020/EURO Coordination and Support Actions. SOcietal Needs analysis and Emerging Technologies in the public Sector

Open Data Policy City of Irving

Web of Science. Platform Release Nina Chang Product Release Date: December 10, 2017 EXTERNAL RELEASE DOCUMENTATION

> Semantic Web Use Cases and Case Studies

Survey of Research Data Management Practices at the University of Pretoria

Data Curation Profile Movement of Proteins

1. Accessing EBSCOhost. 2. Why Choose EBSCOhost? 3. EBSCOhost Databases

The DOI Identifier. Drexel University. From the SelectedWorks of James Gross. James Gross, Drexel University. June 4, 2012

Focus: Themes within Introduction and Context

INSTITUTIONAL REPOSITORY SERVICES

OPEN DATA. Dr Arthur Smith Dr Marta Teperek 15/06/2015. Slides: University of Cambridge

Data Curation Profile Food Technology and Processing / Food Preservation

Pathways. CIARD Consultation Moscow, December 2009

Demystifying Scopus APIs

Interim Report Technical Support for Integrated Library Systems Comparison of Open Source and Proprietary Software

MAGS Data Access Policy

Voter Contact Registry

Data Curation Handbook Steps

Research Data Management

Supporting the H2020 OA Mandate

The What, Why, Who and How of Where: Building a Portal for Geospatial Data. Alan Darnell Director, Scholars Portal

CISER Data Archive Collection Policy

Where to Get Training... 5 Where to Get Support... 5

Oracle Enterprise Manager. Description. Versions Supported. Prerequisites

E-Marefa User Guide. "Arab Theses and Dissertations"

Electronic Thesis Submission Manual

Oracle Agile Engineering Data Management. MCAD Connector for SolidWorks - Version User Manual

Academic Editor Tutorial

DATA PRESERVATION AND SHARING INITIATIVE. 1. Aims of the EORTC QLG Data Repository project

Internet Resources - Data Bases and Search Tools. Module 7

Introduction to IPUMS

haplo-services.com

OpenAIRE. Fostering the social and technical links that enable Open Science in Europe and beyond

Data Partnerships to Improve Health Frequently Asked Questions. Glossary...9

eirb Electronic Institutional Review Board User Guide For more information contact: Research Administration (803)

Fair data and open data: differences and consequences

How to deposit your accepted paper in ORA through Symplectic

Increasing access to OA material through metadata aggregation

Dataset Documentation Reference Guide for Pure Users

Ontario Community College Multi-site REB Application Form. Application to Involve Human Participants in Research

Opus: University of Bath Online Publication Store

Open Access & Open Data in H2020

Transcription:

Research Data Management: Secure and Preserve FEBRUARY 1, 2018 CAROL PERRY, LIBRARY A. MICHELLE EDWARDS, OAC

As you finish one project and move onto another, you want to ensure that the legacy of your project s research data is preserved. What best practices should you implement to encourage proper data preservation? In today s research environment, your research data may be requested by other research labs around the world to replicate your data. How should your data be documented and preserved to facilitate the exchange of research data? RDM - SECURE AND PRESERVE 2018 2

Understand the basics : Objectives How organized is your research project? How well documented is your data and your research project? Are you ready to share/preserve your data? RDM - SECURE AND PRESERVE 2018 3

Create Research life cycle Share/ Reuse Process RESEARCH PROJECT Preserve Analyse Report RDM - SECURE AND PRESERVE 2018 4

Research Data Management Plan Share/ Reuse Create Process RESEARCH PROJECT PLAN Preserve Analyse Report RDM - SECURE AND PRESERVE 2018 5

RDM Refresher A sound strategy and best practices used to Organize Document Store Analyze Secure Preserve/Share/Reuse Your data RDM RDM - LEGACY - SECURE OF AND YOUR PRESERVE RESEARCH 2018 DATA, 2017 6

Setting the stage RDM - SECURE AND PRESERVE 2018 7

The French Blast Research Group received an NSERC grant in 2016-2017 to continue an ongoing feeding trial. The research involves three horses from a number of different farms, where each horse is weighed(kg) at the beginning of the trial, placed on one of three feed regimens (hay, pasture, or silage), their total feed intake(kg) is measured for 2 consecutive weeks, and their weights (kg) are taken at the same time. Additional information that is gathered includes: the name of the horse s owner, the annual income of the owner, and feed costs of the owner for their horses. Workshop Scenario You have been hired by the French Blast Research Group to manage their data collections. They have 5 years of horse feeding trial data that was collected from a number of horse farms across Ontario since 2011. You have been hired to review, manage, and preserve their research trial data according to best practices and requirements of the NSERC grant. RDM - SECURE AND PRESERVE 2018 8

Why should I care? Can you find the data you need? Can you understand the data you have? Can others understand the data you have? Are you required to handle your data in a specific way? Are you required to share or preserve it? How will you share/preserve your data? Where will you share/preserve it? RDM - SECURE AND PRESERVE 2018 9

Is this what your directory structure looks like? Sample directory: desktop ERS folder RDM - SECURE AND PRESERVE 2018 10

Folder structure - refresher BCDS_2015 BCDS_2016 BCDS_2017 OR bcds_feed_suppl bcds_silage bcds_hay OR bcds_farm_a bcds_farm_b bcds_farm_c RDM - SECURE AND PRESERVE 2018 11

Readme file sample Beef Cattle Diet Survey 2016 BCDS = Beef Cattle Diet Survey Folders: bcds_feed_suppl = feed containing supplements data all farms bcds_hay = hay feed data - all farms Files: bcds_raw = rawdata files all farms Data collected daily stored in monthly files January 1 2016-December 2016. Date = ISO date yyyy/mm/dd UOM = units of measure Note: Farm 3 data not collected January 15 2016 and January 16 2016. Out of range values are shown as -6999 RDM - SECURE AND PRESERVE 2018 12

Access, sharing, reuse RDM - SECURE AND PRESERVE 2018 13

Advantages to sharing Increases impact of your research Helps others replicate your research Encourages further scientific enquiry Reduces research costs by reducing duplication Encourages transparency and accountability RDM - SECURE AND PRESERVE 2018 14

Access, sharing, reuse Review legal & ethical obligations/restrictions Research Ethics Board Funding agency obligations Federal/provincial regulations Animal Use Data to Animal Care Services - annual reporting to the Canadian Council on Animal Care Anonymize data where necessary Remove identifying information Determine sharing criteria understand your Intellectual Property rights RDM - SECURE AND PRESERVE 2018 15

Legal & Ethical issues REB form What did you agree to regarding sharing and preserving your data? Funding policies Are there sharing and preservation requirements? Regulations What federal/provincial policies & regulations govern your data? Contracts/partnerships Are there restrictions to sharing your data due to commercial interests? Journals Does your publisher require data be shared/deposited? RDM - SECURE AND PRESERVE 2018 16

Anonymization Shared data or data preserved on a repository platform should be anonymized. Determinents: Can data be anonymized without rendering it unusable by others? Different types of data may be highly sensitive (health) and might never be shared Sometimes, consent or access control is a better alternative to anonymization Methods of anonymization: Aggregation- cell suppression; inference control; perturbation; rounding; sampling Masking Pseudonmizination (de-identifying) RDM - SECURE AND PRESERVE 2018 17

Anonymizing research data Direct identifiers Remove direct identifiers (or replace with pseudonyms) e.g. names, address, institution, photo Indirect identifiers Reduce precision/detail through aggregation e.g. birth year vs. date of birth, occupational categories, area rather than village Generalize meaning of detailed text e.g. occupational expertise Restrict upper lower ranges to hide outliers e.g. income, age RDM - SECURE AND PRESERVE 2018 18

Anonymizing qualitative data Remove direct identifiers, or replace with pseudonyms or replacements Avoid blanking out Identify replacements with [brackets] Plan for anonymization at time of transcription Avoid over-anonymization avoid distorting data, making it unreliable or misleading Keep log of anonymization actions taken Keep separate from anonymized data files Do not share log file publicly RDM - SECURE AND PRESERVE 2018 19

Working in groups of 2-3, anonymize the data file from our study. Exercise 1 10 15 minutes RDM - SECURE AND PRESERVE 2018 20

Review Explain the measures undertaken to anonymize the file. Any questions? RDM - SECURE AND PRESERVE 2018 21

Preservation RDM - SECURE AND PRESERVE 2018 22

Preservation options Institutional repository Publish with results Deposit in major data repository Deposit in discipline-specific data repository RDM - SECURE AND PRESERVE 2018 23

University of Guelph-based options Atrium (institutional repository) - e-theses, articles, reports, videos, etc. https://atrium.lib.uoguelph.ca/ Agri-environmental Data Repository- research data https://dataverse.scholarsportal.info/dataverse/ugardr University of Guelph Data Repository research data https://dataverse.scholarsportal.info/dataverse/ugrdr RDM - SECURE AND PRESERVE 2018 24

Discipline Specific Repositories Re3data.org global registry of research data repositories www.re3data.org/ Stanford University Libraries Guide to Domain-specific Data Repositories https://library.stanford.edu/research/data-management-services/share-and-preserve-researchdata/domain-specific-data-repositories RDM - SECURE AND PRESERVE 2018 25

National preservation & discovery system Part of: Research Data Canada Federated Pilot Project Data repositories will be connected through a national network and portal for discovery https://www.frdr.ca/repo/ RDM - SECURE AND PRESERVE 2018 26

Metadata Metadata = documentation that describes the project and its contents Best practice for project management Repositories will require some documentation to accompany your files Project level File level Item level End user terms of use Citation information RDM - SECURE AND PRESERVE 2018 27

Metadata element Description Fill in information as applicable Title Author (s) Production Date Funding agency Time period covered Date of collection Keywords Description Full title of the dataset Person, corporate body, or agency responsible for the work s intellectual content. Include names, affiliations, and email addresses (e.g. [NAME], [DEPARTMENT/SCHOOL], [COLLEGE], [UNIVERSITY]. The date the final version of the dataset being deposited in the repository was created. This is the date that will be used in the dataset citation. The source(s) of funds for the production of the work Time period covered by the data The date(s) when the data were actually collected Words or phrases that describes the data collection s content (please provide at least 5 keywords) Summary describing the purpose and scope of the data collection, and what questions the investigators attempted to answer. RDM - SECURE AND PRESERVE 2018 28

Exercise Metadata Begin filling in the metadata template with information from the Project Scenario. RDM - SECURE AND PRESERVE 2018 29

Preparing files TRANSFER FILES TO NON-PROPRIETARY FORMAT ascii text csv Pdf or pdf/a tiff jpeg2000 shp,shx, Mpeg4 xml html PREPARE DOCUMENTATION TO ACCOMPANY FILES Codebooks Syntax files Readme files User guides Scripts etc. RDM - SECURE AND PRESERVE 2018 30

Sample accessible dataset Potential for northward expansion of the American dog tick (Dermacentor variabilis, Say) range under climate change in North America http://hdl.handle.net/10864/gssjx RDM - SECURE AND PRESERVE 2018 31

End user licensing Determine what access or actions you want end users to be able to make Provide clear instructions on citation requirements and contact information RDM - SECURE AND PRESERVE 2018 32

Sample licence: Restrictions Open access. Citation Requirements The publishing of analysis and results from research using any of the data products is permitted in research communication such as scholarly papers, journals and the like. The authors of these communications are required to cite the primary investigators Minigan J. N. et al. (2017), as the source of the data, and to indicate that the results or views expressed are those of the author/authorized user and are not those of primary investigators. Depositor Requirements To provide funding agencies with essential information about use of archival resources and to facilitate the exchange of information about Agri- Environmental Research Data Repository (AERDR) participants research activities, users of AERDR data are requested to send to AERDR bibliographic citations for, or copies of, each completed manuscript or thesis abstract. Please indicate in a cover letter which data were used. Conditions Original publication must be cited (see Related Publications); see publication for methods. Disclaimer The original creators of the data, J.N. Minigan, H.A. Hager, A.S. Peregrine, and J.A. Newman, and the funding agency, the Ontario Ministry of Agriculture, Food and Rural Affairs, bear no responsibility for uses of this data set or for interpretations or inferences upon such uses RDM - SECURE AND PRESERVE 2018 33

Resources MANTRA Research Data Management Training. University of Edinburgh - http://mantra.edina.ac.uk/protectionrightsandaccess/ Tri-Council Policy Statement: Ethical Conduct for Research Involving Humans - 2010 - http://www.pre.ethics.gc.ca/pdf/eng/tcps2/tcps_2_final_web.pdf TCPS 2 Tutorial Course on Research Ethics (CORE) - http://www.pre.ethics.gc.ca/eng/education/tutorial-didacticiel/ UK Anonymisation Network - Resources - http://ukanon.net/ukan-resources/ RDM - SECURE AND PRESERVE 2018 34

Contact information: Michelle Edwards OAC Statistics Consultant oacstats@uoguelph.ca http://www.uoguelph.ca/oac/stats Carol Perry Associate Librarian Research & Scholarship lib.research@uoguelph.ca RDM - SECURE AND PRESERVE 2018 35