Advancing code and data publication and peer review. Erika Pastrana, PhD Executive Editor, Nature Journals ALPSP_Sept 2018

1.1 Code-related initiatives at Nature Research

Nature journals have been industry pioneers in making code accessible and robust upon publication
Currently, the Nature journals mandate that the availability and conditions of access of custom code that is central to the paper be explicitly stated.
A number of journals (Nature Biotechnology, Nature Methods, Nature Neuroscience) go a step further and ask authors to submit the code during peer review, so that reviewers can test it and check the reproducibility and robustness of the code and the claims.
We have recently taken steps to extend this practice to more Nature journals for relevant papers (see the editorial published in Nature, "Does your code stand up to scrutiny?"):
- We've developed new guidelines for authors on the peer review of code
- We've developed a new checklist for authors submitting new code for peer review

Publishing reproducible code is cumbersome with current tools
Journals today:
Peer review of code: authors are asked to provide a zip file (or GitHub link) with all the materials the reviewer needs to test the code. To run the code, the reviewer has to install it in the right environment and on the right hardware, install all dependencies, and do all the troubleshooting on their own.
Publishing of code: the code associated with the paper (i.e. the version of record) is provided as a Supplementary Software zip folder, which limits the wider impact of the code (readers have to go through the same process as reviewers to install and run it). Alternatively, code is provided via a link to a repository (such as GitHub). As publishers, we currently have no mechanism to ensure that the version of the code associated with the paper is maintained, and this can have implications for reproducibility.
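One way authors might reduce the troubleshooting burden described on this slide is to ship a machine-readable record of the environment together with the code. The following is a minimal sketch using only the Python standard library. It is not a Nature or Code Ocean requirement; the function name `environment_manifest` and the manifest layout are assumptions made for illustration.

```python
import hashlib
import json
import platform
import sys
from importlib import metadata


def environment_manifest(archive_bytes: bytes) -> dict:
    """Describe the interpreter, installed packages, and the code archive.

    `archive_bytes` stands for the contents of the submitted zip file
    (a hypothetical input; any bytes work).
    """
    # Collect "name==version" pins for every installed distribution,
    # skipping broken installs that have no name in their metadata.
    packages = sorted(
        {
            f"{name}=={dist.version}"
            for dist in metadata.distributions()
            if (name := dist.metadata.get("Name"))
        }
    )
    return {
        "python": platform.python_version(),
        "platform": sys.platform,
        "packages": packages,
        # A checksum lets readers confirm they hold the version of record.
        "archive_sha256": hashlib.sha256(archive_bytes).hexdigest(),
    }


if __name__ == "__main__":
    print(json.dumps(environment_manifest(b"example archive"), indent=2))
```

A reviewer could compare such a manifest with their own environment before running anything, and the checksum gives a simple way to tell whether a repository copy still matches the code that was reviewed.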

Trial using Code Ocean to facilitate code peer review and publication
A trial on three Nature journals (Nature Methods, Nature Biotechnology and Nature Machine Intelligence) with Code Ocean, intended to gauge the suitability of the Code Ocean platform to:
1) Facilitate and enhance compliance with the code peer review policy among authors, referees and editors
2) Provide a service to referees that enables easier peer review of code
3) Gauge post-publication reader engagement with code delivered on a dynamic (container-type) platform
http://blogs.nature.com/ofschemesandmemes/

Code Ocean's widget tool
The widget tool enables access to the custom code associated with the paper
https://f1000research.com/articles/4-121/v1

1.2 Data-related initiatives at Nature Research

Journal data policies are becoming more consistent
- Springer Nature launched a data policy standardisation initiative in 2016 [1]
- More than 1,400 (~60%) Springer Nature journals have adopted a standard research data policy
- The approach is practical and pragmatic, enabling all journals to adopt a policy even if they are new to data sharing
- All policies support community-specific policies, mandates and repositories
- All policies and journals promote data citation in their information for authors
- Similar initiatives have since been introduced at Elsevier [2], Wiley [3] and Taylor & Francis [4]
1. Standardising and harmonising research data policy in scholarly publishing. Iain Hrynaszkiewicz, Aliaksandr Birukou, Mathias Astell, Sowmya Swaminathan, Amye Kenall, Varsha Khodiyar. International Journal of Digital Curation; doi: https://doi.org/10.2218/ijdc.v12i1.531
2. https://www.elsevier.com/authors/author-services/research-data/data-guidelines
3. https://authorservices.wiley.com/author-resources/journal-authors/licensing-open-access/open-access/data-sharing.html
4. https://authorservices.taylorandfrancis.com/understanding-our-data-sharing-policies/

Data policy at Nature journals: data availability statements & data citations (2016)
All manuscripts reporting original research published in Nature journals must include a data availability statement: a statement about how data supporting the results reported in the article can be accessed.
- Builds on existing mandates on data deposition
- Report availability of the minimal data set needed to interpret, replicate and extend findings
- Details of publicly available datasets that are analysed/generated during the study
- Details of source data provided with the paper
- State restrictions on access, for example due to privacy limitations
Example from Nature Cell Biology (2017) doi:10.1038/ncb3527 (slide callouts highlight the accession numbers and the source data):
Data availability. The RNA-seq data sets for human HSPCs (A0 and F0), ProEs (A5, F5, shnt, shtfam and shphb2), E13.5 fetal liver CD71+Ter119+ erythroid cells (Tfam WT and KO), and H3K27ac ChIP-seq data sets have been deposited in the Gene Expression Omnibus (GEO) under the accession number GSE86912. Mass spectrometry data have been deposited in ProteomeXchange with the primary accession code PXD006170. Source data for RNA-seq analysis are provided as Supplementary Tables 2 and 4–6. Source data for proteomic analysis are provided as Supplementary Tables 1, 3 and 4. Statistics source data for Figs 2e, g, 3b–g, 4b, d, f, h–n, 5c, e–h, 6g–l, 7a, c, f–k, m–r and 8c, i and Supplementary Figs 3b, d, e, g, i, j, 4c, 6a, c, d, f, 7a–c, e, f, h–m, o–t and 8c, e–g, i–l are provided in Supplementary Table 8. All other data supporting the findings of this study are available from the corresponding author on request.
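Because the example statement names its deposits by accession code, such codes can be pulled out of a statement automatically, for instance to check that each one resolves. The sketch below is illustrative only: the two patterns cover just the identifier styles seen in the example (GEO series codes such as GSE86912 and ProteomeXchange codes such as PXD006170), and the names `ACCESSION_PATTERNS` and `find_accessions` are hypothetical.

```python
import re

# Patterns for the two identifier styles in the example statement.
# These are illustrative assumptions, not a complete registry.
ACCESSION_PATTERNS = {
    "GEO": re.compile(r"\bGSE\d+\b"),
    "ProteomeXchange": re.compile(r"\bPXD\d{6}\b"),
}


def find_accessions(statement: str) -> dict:
    """Return the accession codes found in a data availability statement."""
    found = {}
    for repository, pattern in ACCESSION_PATTERNS.items():
        # De-duplicate and sort so the result is stable.
        matches = sorted(set(pattern.findall(statement)))
        if matches:
            found[repository] = matches
    return found


statement = (
    "RNA-seq data sets have been deposited in the Gene Expression Omnibus "
    "(GEO) under the accession number GSE86912. Mass spectrometry data have "
    "been deposited in ProteomeXchange with the primary accession code "
    "PXD006170."
)
print(find_accessions(statement))
# {'GEO': ['GSE86912'], 'ProteomeXchange': ['PXD006170']}
```

A statement that yields no codes is not necessarily non-compliant, since data may be shared through source data files or on request, so a check like this can only flag statements for a human to look at.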

Helping researchers find the right repository
Recommended repositories list (>80 repositories): http://www.springernature.com/gp/group/data-policy/repositories
What makes a good data repository? Data repositories for data supporting peer-reviewed publications generally should:
I. Ensure long-term persistence and preservation of datasets
II. Be recognized by a research community or research institution
III. Provide deposited datasets with stable and persistent identifiers, such as Digital Object Identifiers (DOIs)
IV. Allow access to data without unnecessary restrictions
V. Provide a clear license or terms of use for deposited datasets
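The five criteria above can be treated as a simple checklist. The sketch below shows one possible encoding; the class `RepositoryProfile`, its field names, and the helper `unmet_criteria` are hypothetical and are not part of any Springer Nature tool.

```python
from dataclasses import dataclass, fields


@dataclass
class RepositoryProfile:
    """One boolean per criterion, in the order I to V (hypothetical names)."""

    long_term_preservation: bool
    community_recognized: bool
    persistent_identifiers: bool
    unrestricted_access: bool
    clear_license: bool


def unmet_criteria(profile: RepositoryProfile) -> list:
    """List the criteria a repository does not yet satisfy."""
    return [f.name for f in fields(profile) if not getattr(profile, f.name)]


# Example: a repository that meets everything except a clear license.
example = RepositoryProfile(True, True, True, True, False)
print(unmet_criteria(example))  # ['clear_license']
```

Whether a criterion is met remains a judgement made by a person; the code only records and reports that judgement.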

Example output of Research Data Support
Paper published in Nature (https://doi.org/10.1038/nature23654)
Dataset published in the Springer Nature figshare repository (https://doi.org/10.6084/m9.figshare.c.3814360)
Data availability statement included with the paper

Thank You
Erika Pastrana, PhD
Executive Editor, Nature Research Journals
e.pastrana@us.nature.com

Figshare widget on journal articles
Reader/user feedback (>320 respondents):
- Appearance: 68% rated it as good or very good
- Usage/engagement: 76% are more likely to look at additional files
- Submissions: 62% are more likely to submit a manuscript
http://bmcgenomics.biomedcentral.com/articles/10.1186/s12864-016-3369-8

From our data journal portfolio: Scientific Data
Part of the Nature Research Group, Scientific Data is an open-access, online-only journal for descriptions of scientifically valuable datasets. The articles, known as Data Descriptors, combine traditional narrative content with curated, structured descriptions (metadata) of the published data to provide a new framework for data sharing and reuse.
- Broad scope covering physical, life and quantitative social sciences
- In-house metadata curation
- Data-focused peer-review process
- Supports community data repositories
- Integrated submission of data to general repositories

The Data Descriptor article
Peer-reviewed articles enabling authors to provide comprehensive methodological detail about their data. Does not contain tests of new scientific hypotheses.
Sections: Title; Abstract; Background & Summary; Methods; Technical Validation; Data Records; Usage Notes; Figures & Tables; References; Data Citations

Press release
Wellcome trials research data support services with Springer Nature
Embargoed to Thursday 12 July 2018
Wellcome is partnering with Springer Nature on a pilot to support researchers in making their research datasets discoverable and available to the wider research community. The pilot, expected to run over six to twelve months, will see Wellcome support the costs associated with its funded researchers using Springer Nature's Research Data Support services for an initial fixed number of datasets.
The pilot is open to datasets underpinning research articles from Wellcome-funded studies and from researchers at Wellcome-funded institutes, and is offered on a first-come, first-served basis. Data can be associated with a submission or publication to any peer-reviewed journal, and is not limited to Springer Nature journals; authors will remain in control of their data, including the choice of Creative Commons license, which ensures datasets are open access and available for re-use [1]. As part of the pilot, Wellcome and Springer Nature will also work with Wellcome Open Research, Wellcome's publishing platform.
David Carr, Programme Manager for Open Research at Wellcome, said: "Wellcome is committed to working with our researchers to help them maximise the value of the data generated by the research we support, ensuring it is managed, described and shared in ways that make it easy to discover and re-use. Our goal with this pilot is to garner researchers' interest in and feedback on this kind of service, and its value in terms of enhancing visibility and use of the data."
Grace Baynes, Vice-President, Data & New Product Development in Springer Nature's Open Research Group, said: "We know that many researchers face challenges in organising and sharing their data [2], and we've developed our Research Data Support services in response to these challenges. We're very pleased to be working with Wellcome, who have pioneered open science over the past fifteen years. Coupling policy, funding and practical support will be key to making data sharing and good data practice the status quo. Partnering with the research community to build collaborative solutions is central to our approach to research data, and we expect to learn a lot through this pilot."
As a global charitable foundation and funder of research, Wellcome is strongly committed to maximising the value of research outputs. Wellcome's policy requires data and original software underpinning research publications to be made available to other researchers at the point of publication. The policy also emphasises that research outputs should be discoverable, have persistent identifiers, and be deposited in recognised community repositories.
Springer Nature is committed to helping researchers take open approaches to their data wherever possible. The publisher encourages researchers to ensure any data and other materials underpinning research articles are discoverable and useable through a suitable repository, rather than being hidden in supplementary files. This approach is supported by Springer Nature's standardised journal data policies, a publicly accessible recommended repositories list, and a free helpdesk providing advice to authors.

Appendix
