CODE AND DATA MANAGEMENT. Toni Rosati Lynn Yarmey

Data Management is Important!
Because:
- Reproducibility is the foundation of science
- Journals are starting to require data deposit
- You want to get credit for producing data (data citations)
- Others can use and build on your work (data reuse)
- Recreating a figure from a 2006 paper shouldn't be painful
- Funders tell us so (see NSF, NIH, NOAA, etc.)

Outline
- Back up often
- Sharing code
- File naming
- Metadata
- Sharing data
- A data search tool

Back up
Tips:
- 1 working copy on your computer
- 1 copy on infrastructure near you
- 1 copy on infrastructure far away
But why would you only back up when you can do so much more? SHARE!!
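The three-copy tip can be sketched as a small Python helper. This is only an illustration: the function names and the destination directories are hypothetical, and a real "far away" copy would usually live on a remote service rather than a local path.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def back_up(working_copy: Path, destinations: list[Path]) -> list[Path]:
    """Copy working_copy into each destination directory and verify each copy.

    The destinations stand in for "infrastructure near you" and
    "infrastructure far away"; both are assumptions for this sketch.
    """
    expected = sha256_of(working_copy)
    copies = []
    for directory in destinations:
        directory.mkdir(parents=True, exist_ok=True)
        target = directory / working_copy.name
        shutil.copy2(working_copy, target)  # copy2 also preserves timestamps
        if sha256_of(target) != expected:
            raise IOError(f"Checksum mismatch for {target}")
        copies.append(target)
    return copies
```

Comparing checksums after each copy is a cheap way to confirm that a backup is actually readable and identical, not just present.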

Why Share Code?
- Good backup
- Collaboration
- People don't have to contact you to get and understand the code
- Faster and easier than other options (emailing individuals or sharing on servers)

Why Share Code?
- Version control
- Commenting gives a brief, public history
- Work on multiple computers with the same code: flexibility in where you work (no USB drive necessary)
- Keep code with metadata/user instructions
- No bureaucracy
- FREE!

What is Git?
Git is a distributed revision control and source code management (SCM) system capable of dealing with nonlinear workflows. As with most other distributed revision control systems, and unlike most client-server systems, every Git working directory is a full-fledged repository with complete history and full version tracking capabilities, independent of network access or a central server. (Wikipedia)

GitHub

Sharing Code: GitHub.com
GitHub serves as the location of record for VIC at: https://github.com/uw-hydro/vic

File Naming
Make names unique and meaningful!
Include (as appropriate):
- Project name or acronym
- Study title
- Location
- Data type
- Researcher initials
- Date
- Data stage
- Version number
- File type
Think long-term
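One way to keep names consistent is to build them from their parts in code. The sketch below is an assumption about how the components might be ordered and separated; the function name, the underscore separator, and the example values are all hypothetical, not a convention from the slides.

```python
import re
from datetime import date


def build_filename(project, location, data_type, initials,
                   collected, stage, version, extension):
    """Join naming components into one file name.

    The date is written as YYYYMMDD so that names sort chronologically,
    and the version is zero-padded so that v02 sorts before v10.
    """
    parts = [
        project,
        location,
        data_type,
        initials,
        collected.strftime("%Y%m%d"),
        stage,
        f"v{version:02d}",
    ]
    # Replace spaces and punctuation inside a component with a hyphen,
    # keeping the underscore free to act as the component separator.
    cleaned = [re.sub(r"[^A-Za-z0-9]+", "-", part).strip("-") for part in parts]
    return "_".join(cleaned) + "." + extension.lstrip(".")


# Hypothetical example values:
name = build_filename("PROJ", "SiteA", "soil temp", "AB",
                      date(2013, 5, 1), "raw", 2, "csv")
print(name)  # PROJ_SiteA_soil-temp_AB_20130501_raw_v02.csv
```

Generating names from one function means every file in a project follows the same order, which helps with the "think long-term" goal.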

Metadata
What would someone unfamiliar with your data need in order to evaluate, understand, and reuse them? How about someone:
- who works in your lab?
- from a different lab in your field?
- who is in a related interdisciplinary field?
- who researches a completely different area?
- who works for a newspaper? Congress?

Metadata is the difference between:

Metadata is Data about Data
- Units?
- Resolution?
- What do the column names mean?
- Caveats?
- Known data issues or missing values?
- How were the data collected?
- Where did the forcing data come from?
- How many layers were used in this model?
Information that describes the content, quality, condition, origin, and other characteristics of data or other pieces of information. Metadata for spatial data may describe and document its subject matter; how, when, where, and by whom the data was collected; availability and distribution information; its projection, scale, resolution, and accuracy; and its reliability with regard to some standard. Metadata consists of properties and documentation. Properties are derived from the data source (for example, the coordinate system and projection of the data), while documentation is entered by a person (for example, keywords used to describe the data). (Esri)
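The questions above can be answered in a small metadata file stored next to the data. The following is a minimal sketch, not a metadata standard: the field names, the required-field list, and the ".metadata.json" suffix are assumptions made for illustration.

```python
import json
from pathlib import Path

# Hypothetical minimum set of fields, chosen to mirror the questions above.
REQUIRED_FIELDS = ("units", "resolution", "columns",
                   "known_issues", "collection_method")


def write_sidecar(data_file: Path, metadata: dict) -> Path:
    """Write metadata as JSON next to data_file and return the new path.

    Raises ValueError if any of the required fields is missing, so an
    undocumented data file is caught at the time it is saved.
    """
    missing = [field for field in REQUIRED_FIELDS if field not in metadata]
    if missing:
        raise ValueError("Metadata is missing: " + ", ".join(missing))
    sidecar = data_file.with_name(data_file.name + ".metadata.json")
    sidecar.write_text(json.dumps(metadata, indent=2, sort_keys=True),
                       encoding="utf-8")
    return sidecar
```

A call such as `write_sidecar(Path("run01.csv"), {...})` would produce `run01.csv.metadata.json` in the same directory; where a community standard exists for your field, that standard is the better choice.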

Metadata
What happens without good metadata?
- You have no idea what the data mean
- You think you understand the data, so you use them, but you use them totally wrong
- You waste hours (or days) trying to find out more about the data

Sharing Data
These days, Dr. Hodes said, the old model in which researchers jealously guarded their data is no longer applicable.
http://www.nytimes.com/2011/04/04/health/04alzheimer.html

Sharing/Finding Data
www.nsidc.org/acadis/search

Organize now. or. Thank you!

Data Reuse
Our team enables Arctic sciences by ensuring datasets are well documented and can be understood by re-users. The trick with data reuse is to find the dataset, then become familiar enough with it to combine it with other data and extract accurate results.

Data Curation
- Metadata
- Usability
- Documentation
- Training
- Re-use
- Tools
- A little marketing
- Partnering
- Consensus building
- Data management plans for grant proposals
- Integrating social and physical sciences
- Data quality checks
- Data analysis

DOIs and Citations
Digital Object Identifiers (DOIs) officially name a resource. A DOI is essentially a stable, permanent URL. Information about a digital object may change over time, including where to find it, but its DOI name will not change. The DOI System provides a framework for persistent identification, managing intellectual content, managing metadata, linking customers with content suppliers, facilitating electronic commerce, and enabling automated management of media. (DataCite.org)
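In practice, a DOI name becomes a citable link by appending it to the https://doi.org/ resolver. The sketch below shows that step; the function name is hypothetical, the pattern is only a loose sanity check and not the full DOI syntax, and the example DOI is made up.

```python
import re
from urllib.parse import quote

# Loose check: "10.", a numeric registrant prefix, a slash, then a suffix.
# This is an assumption for the sketch, not the complete DOI grammar.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")

KNOWN_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def doi_to_url(doi: str) -> str:
    """Turn a DOI name (with or without a common prefix) into a resolver URL."""
    name = doi.strip()
    for prefix in KNOWN_PREFIXES:
        if name.lower().startswith(prefix):
            name = name[len(prefix):]
            break
    if not DOI_PATTERN.match(name):
        raise ValueError(f"Does not look like a DOI name: {doi!r}")
    # Percent-encode characters that are unsafe in a URL path, keeping "/".
    return "https://doi.org/" + quote(name, safe="/")


# Hypothetical DOI used only for illustration:
print(doi_to_url("doi:10.1234/example"))  # https://doi.org/10.1234/example
```

Because the link points at the resolver rather than at the hosting site, it keeps working if the dataset moves to a new location.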

Beyond ACADIS: Other Resources
General info and help
- Earth Science Information Partners (ESIP): http://wiki.esipfed.org/
- UVA Libraries: http://www2.lib.virginia.edu/brown/data/
Data Management Plan and other tools
- DMP Tool: https://dmp.cdlib.org/
- DataOne: https://www.dataone.org/cattools/data%20and%20metadata%20Management
Metadata
- Excel plug-in tool (in development): http://www.cdlib.org/cdlinfo/2011/09/01/facilitating-data-management-dcxl/
- Lists of standards (not complete!) for bio, climate, ecology, oceanography: http://marinemetadata.org/conventions
- Stanford-based portal for medical/bio: http://bioportal.bioontology.org/resources