Introduction to Data Management


Introduction to Data Management: What, why and how?
Gunn Inger Lyse, NSD Norwegian Centre for Research Data, www.nsd.no, 2018
For the NMBU Talent program 2016-2019, Meeting 5, Holmsbu Bad- og Fjordhotell, 7-8 June 2018, with focus on Teaching and Data Management (internal NSD case number: 201800211)

NSD Norwegian Centre for Research Data
NSD has made data available for researchers for almost 50 years. We work continually to improve our services.
- Access to data for researchers
- Archiving with a long-term perspective (requires curation to keep data alive)
- Open access balanced against personal data protection
- Overview (DBH and other services)


Outline
WHAT
- Basic concepts of data management
- Funder requirements
WHY
- Why is good data management good for you?
HOW
- How to establish good data management routines?
- How to share data?
- Data management tools at NSD

Example 1: AVOID DUPLICATION OF EFFORT
An example from research on sea ice by Marta Zygmuntowska, University of Bergen.
Image: by NASA, Public Domain, https://commons.wikimedia.org/w/index.php?curid=15837631
(Example from a presentation by Beate Krøvel Humberset, University of Bergen Library)

Example 2: SHARING DATA FOR RE-USE (1/3) Jei vill hjerne gjerne jernet hjemme hjemmet

Example 2: SHARING DATA FOR RE-USE (2/3)
Analysis
- Classification: sports, politics, bokmål/nynorsk
- Automatic analysis of words: gender, part of speech
- Frequencies: 700 million words, plus 1 million words per week
Research and development
- Neologisms
- Foreign words in Norwegian
- Media research
- Reading and writing aid
Reference: Newspaper corpus (http://hdl.handle.net/11495/d9b5-0349-4330-0)

Example 2: SHARING DATA FOR RE-USE (3/3)
Jeg vil hjerne gjerne
Word sequence (ord): frequency (frekvens)
- vil gjerne: 17942
- vil hjerne: 0

Basic concepts of research data
- What is research data?
- The problem: data loss
- FAIR data principles
- Data life cycle management
- Data Management Plan (DMP)
- Store vs. archive data
- Share data/make data available: what does it mean?
- Funder/institutional requirements

What is research data?
All data created by researchers in the course of their project.
- Source data (existing data): data that already exist, independently of your research
- Research data (project data): data resulting from your project, from raw data to processed data

What are your data?
- If your memory stick/laptop/hard drive is corrupted, how much can you afford to lose?
- In a project with multiple partners: who has (access to) which data?
- If somebody questions your findings, what would you use to back up your claims?
- Would you be able to reproduce your figures in 5 years, or understand the variable abbreviations effortlessly?

The problem: data loss/missing access
Mostly due to current methods capture and data malpractice, approximately 50% of all research data and experiments is considered not reproducible, and the vast majority (likely over 80%) of data never makes it to a trusted and sustainable repository.
Source: Realising the European Open Science Cloud. Commission High Level Expert Group on the European Open Science Cloud. 2016. p.9. DOI:10.2777/940154.

FAIR
Source: Realising the European Open Science Cloud. Commission High Level Expert Group on the European Open Science Cloud. 2016. p.9. DOI:10.2777/940154.

Data lifecycle
Generate data -> Process data -> Analyse data -> Archive data -> Give access to data -> Re-use data -> (back to Generate data)

Data management plan (DMP)
The DMP sits at the centre of the data life cycle: Generate data -> Process data -> Analyse data -> Archive data -> Give access to data -> Re-use data

Store data vs. archive data
During your project: store your data.
- Safe storage with automated backup and version control
- A user-friendly solution to work with the data, share them and manage access rights for project partners
When (parts of) the data are no longer in daily use: archive your data.
- Findable (searchable, PID, metadata)
- Accessible (authentication & authorization)
- Interoperable (open/persistent file formats, standardized metadata)
- Re-usable: documentation, maintenance/curation of data now and in the future
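One way to check that a backup or archive copy still matches the working data is to compare checksums. The sketch below is only an illustration under that assumption: the slides do not prescribe checksums, and the function names are made up for this example. It uses only the Python standard library.

```python
import hashlib
from pathlib import Path


def file_sha256(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file path (relative to root) to its checksum."""
    return {
        p.relative_to(root).as_posix(): file_sha256(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(original: dict[str, str], copy: dict[str, str]) -> list[str]:
    """List files that are missing from the copy or differ from the original."""
    return sorted(
        name for name, checksum in original.items() if copy.get(name) != checksum
    )
```

Building one manifest for the project folder and one for the backup, then calling changed_files on the two, gives the files that would need to be copied again.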

Sharing data
Several modes of sharing, e.g.:
- Downloadable (raw data, processed data)
- Searchable (e.g. online search interfaces)
Several degrees of access:
- Open
- Open with authentication
- Restricted (conditioned by users/use)
- Closed (e.g. for privacy reasons)
Plus, where needed: embargo

Requirements
- EU (Horizon 2020): data management plan + make data available, "as open as possible, as restricted as necessary" [link]
- Research Council of Norway (RCN) (from 2018): data management plan + open-by-default policy [link]
- Journals: increasing requirement to share data [link]
- Institutions: data management plan + open access to data (NMBU [link], University of Oslo [link], UiT The Arctic University of Norway [link], NTNU [link])
- GDPR (General Data Protection Regulation): 25 May 2018

Requirements in practice - NMBU

But...
- Should all data be shared openly? No. But the fact that they exist should be public information.
- When can data be shared? Typically in connection with publications, or at the end of the project (with an embargo if necessary).
- Under which conditions and with whom? As open as possible, as closed as necessary.
- Can others find, understand and use my data? Yes, with proper documentation and long-term curation.

Data Management Requirements
Cartoon source: https://www.cartoonstock.com/cartoonview.asp?catref=rman9976

WHY: What's in it for you?
- Visibility
- Documentation
- Security (long-term)
- Integrity and trust
- Network, collaboration

HOW - Research data management
Project start
- What data do you need? Data search?
- Identify possible legal & ethical issues
- Costs (e.g. for archiving data)
During project
- Safe storage (for working data)
- How/where to share efficiently with project partners?
- Backup, versioning
- Make conventions for naming of files/folders/code
- Document what you do -> readme.txt, log.txt, remember.txt, link to the DMP (e.g. how are files and code organized?)
End of project
- Select data for archiving (for DOI assignment etc.)
- Are the file formats OK?
- Final documentation and metadata! Time is saved if you documented from the start
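A naming convention is easier to keep if it can be checked automatically. The sketch below assumes a purely hypothetical convention, <project>_<YYYY-MM-DD>_<description>_v<NN>.<ext>; the slides do not recommend any specific pattern, and the function names are illustrative only.

```python
import re
from typing import Optional

# Hypothetical convention (an assumption, not from the slides):
#   <project>_<YYYY-MM-DD>_<description>_v<NN>.<ext>
FILENAME_PATTERN = re.compile(
    r"^(?P<project>[a-z0-9]+)_"
    r"(?P<date>\d{4}-\d{2}-\d{2})_"
    r"(?P<description>[a-z0-9-]+)_"
    r"v(?P<version>\d{2})"
    r"\.(?P<ext>[a-z0-9]+)$"
)


def parse_filename(name: str) -> Optional[dict[str, str]]:
    """Split a conforming file name into its parts, or return None."""
    match = FILENAME_PATTERN.match(name)
    return match.groupdict() if match else None


def nonconforming(names: list[str]) -> list[str]:
    """Return the file names that do not follow the convention."""
    return [name for name in names if parse_filename(name) is None]
```

For example, nonconforming(["seaice_2018-06-07_thickness-raw_v01.csv", "final_FINAL2.xlsx"]) would flag only the second name. The agreed pattern itself belongs in readme.txt or the DMP so that project partners can follow it.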

Make a data management plan (DMP)
- A document that covers the whole data life cycle
- A checklist with questions that make you aware of what you need to consider, and when
- A dynamic document for continuous updating
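The idea of a DMP as a checklist that is updated over time can be sketched as a small data structure. This is an illustration only, not how NSD's DMP tool works: the class names, the phase labels and the example project are assumptions, and the questions are paraphrased from the slides.

```python
from dataclasses import dataclass, field


@dataclass
class ChecklistItem:
    phase: str          # "start", "during" or "end" (labels are an assumption)
    question: str
    answer: str = ""    # stays empty until the project has decided

    @property
    def is_open(self) -> bool:
        return not self.answer.strip()


@dataclass
class DataManagementPlan:
    project: str
    items: list[ChecklistItem] = field(default_factory=list)

    def answer(self, question: str, text: str) -> None:
        """Record or update the answer to one checklist question."""
        for item in self.items:
            if item.question == question:
                item.answer = text
                return
        raise KeyError(question)

    def open_questions(self, phase: str) -> list[str]:
        """Questions in the given phase that still lack an answer."""
        return [i.question for i in self.items if i.phase == phase and i.is_open]


plan = DataManagementPlan(
    project="example-project",  # hypothetical project name
    items=[
        ChecklistItem("start", "What data do you need?"),
        ChecklistItem("start", "Are there legal or ethical issues?"),
        ChecklistItem("during", "Where are working data stored and backed up?"),
        ChecklistItem("end", "Which data are selected for archiving?"),
    ],
)
plan.answer("What data do you need?", "Survey responses and interview transcripts")
```

Calling plan.open_questions("start") at each project milestone shows what still needs a decision, which reflects the "dynamic document" point.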

NSD DMP solution

How to choose a data archive/repository?
- National? Community repository? International?
- Does it provide a persistent identifier (PID)? DOI, handle, etc.
- Certified?
- Standardised metadata (ensures the F in FAIR)
- Capacity, guidance and costs? E.g. how many GB or TB are you allowed to archive?
=> The largest and most comprehensive registry of data repositories available

FAIR research data at NSD
- Findable: archived datasets are given a persistent ID (DOI); metadata are indexed and searchable (e.g. via DataCite).
- Accessible: open access protocol; clear procedure for authentication and authorization.
- Interoperable: NSD metadata follow the DDI standard with controlled vocabulary; data are available in different formats.
- Re-usable: clear user conditions (decided by you, not NSD); NSD curates your data and can return them in formats relevant to your research community.
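Before depositing a dataset, it can help to check that the descriptive metadata are complete. The sketch below is a minimal illustration under stated assumptions: the field names and access-level labels are hypothetical (loosely inspired by the kind of properties a DOI registration needs), not NSD's or DataCite's actual schema, and the DOI shown is a made-up example.

```python
# Hypothetical field names and access levels, chosen for illustration only.
REQUIRED_FIELDS = (
    "identifier", "creators", "title", "publisher", "publication_year", "access",
)
ACCESS_LEVELS = ("open", "open-with-authentication", "restricted", "closed")


def metadata_problems(record: dict) -> list[str]:
    """Return human-readable problems; an empty list means the record passes."""
    problems = [
        f"missing field: {name}" for name in REQUIRED_FIELDS if not record.get(name)
    ]
    access = record.get("access")
    if access and access not in ACCESS_LEVELS:
        problems.append(f"unknown access level: {access}")
    return problems


example_record = {
    "identifier": "10.1234/example-doi",  # made-up DOI
    "creators": ["Example, Researcher"],
    "title": "Example survey dataset",
    "publisher": "Example Data Archive",
    "publication_year": 2018,
    "access": "restricted",
}
```

Here metadata_problems(example_record) returns an empty list, while a record without an identifier would be reported as not yet findable by a persistent ID.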

Support for research from NSD
Data management and planning:
- NSD webpages on data management and archiving (English): http://www.nsd.uib.no/arkivering/en/data-management-plan.html
- FAQs about data management: http://www.nsd.uib.no/arkivering/en/005_faq.html
- Find data? Search portal: www.search.nsd.no
- Archive data? Archiving portal: https://arkiveringsportalen.nsd.no/
- Data management plan: www.dmp.nsd.no (independent of where you will archive your data)
- For questions, contact: dataarkivering@nsd.no
Personal data:
- FAQs about personal data: http://www.nsd.uib.no/personvernombud/en/help/faq.html
- Unsure whether you have personal data? Try NSD's notification test: http://www.nsd.uib.no/personvernombud/en/notify/index.html
- For questions, contact: personvernombudet@nsd.no

Support for research from NSD
Coming: digital training (e-course, videos)

Other useful webpages
- International data search: www.datacite.org (all NSD data are also findable here)
- Search for repositories: www.re3data.org
- Online training on data management in Europe, developed by the infrastructure CESSDA: https://www.cessda.eu/research-infrastructure/training/expert-Tour-Guide-on-Data-Management