ISMTE
"Best Practices Around Data for Journals, and How to Follow Them"
Brooks Hanson, Director, Publications, AGU
bhanson@agu.org
Recent Alignment by Publishers, Repositories, and Funders Around Data
- TOP (Transparency and Openness Promotion guidelines): 538 journals
- COPDESS.org (Coalition on Publishing Data in the Earth and Space Sciences): Statement of Commitment endorsed by most publishers and repositories in the Earth and space sciences
- Joint Declaration of Data Citation Principles: endorsed by 109 organizations, including most major publishers
- Reproducibility conferences and outcomes (AAAS and other organizations)
- Certification standards for repositories
The challenge is practicing what you preach.
TOP: Transparency & Openness Promotion Guidelines for Journals
- Low barrier
- Modular
- Agnostic to discipline
https://cos.io/top/
TOP's 8 Standards
- Data citation
- Design transparency
- Research materials transparency
- Data transparency
- Analytic methods (code) transparency
- Preregistration of studies
- Preregistration of analysis plans
- Replication
3 Tiers:
- Disclose
- Require
- Verify
https://www.force11.org/group/joint-declaration-data-citation-principles-final
Certification standards for repositories From Helen Glaves and Gary Baker
Coalition on Publishing Data in the Earth and Space Sciences (COPDESS.org)
Connecting Earth science publishers and data facilities to help translate the aspirations of open, available, and useful data from policy into practice.
- Formed in October 2014
- Endorsed a Statement of Commitment, 2015
- Includes: joint best practices between journals and repositories; references; a directory of repositories; these slides
For Authors
- Clear open data policy statement, requiring that data be available for verification and reuse (see copdess.org and TOP for examples)
- Transparency in the paper about how and where researchers can access data (e.g., a statement in the acknowledgements)
- Use trusted domain repositories if they are available; get a data citation
- Allow for data access during peer review
- Ban or discourage "available from authors" statements
- Supplements should follow NISO guidelines
- All references should be in the main reference list (not in supplements)
- Key references and data should be available at time of publication (no unpublished or in-press references)
Samples: Use IGSNs
International Geo Sample Number: http://www.geosamples.org/
- Globally unique and persistent identifier for physical samples in the field sciences
- Guaranteed to be unique via a centralized control mechanism
- Resolves to virtual sample representations (sample metadata profiles) managed at federated IGSN Allocating Agents
- Can be generated and registered before or after field work
(COPDESS, 10/20/2015)
On software and computational methods
https://www.force11.org/software-citation-principles
V. Stodden, M. McNutt, D. H. Bailey, E. Deelman, Y. Gil, B. Hanson, M. A. Heroux, J. P. A. Ioannidis, M. Taufer, submitted
1. To facilitate reproducibility, share the data, software, workflows, and details of the computational environment in open repositories.
2. To enable discoverability, persistent links should appear in the published article and include a permanent identifier for data, code, and digital artifacts.
3. To enable credit, cite code and data used, including workflows and software tools.
4. To facilitate reuse, adequately document digital scholarly artifacts.
5. Journals should enact TOP at level 2 or 3 and conduct a Reproducibility Check as part of the publication process.
6. Use Open Licensing when publishing digital scholarly objects.
7. Instigate new research programs and pilot studies on reproducibility.
On Citing Data
- Endorse the Joint Declaration of Data Citation Principles (and state so in your instructions)
- Cite data as part of reference sections
  - Can be previously published data of others
  - Your data, if deposited and with a DOI
- Should be part of the main reference list and cited in text as for any other reference
Example: Doe, J. and R. Roe. 2001. The FOO Data Set. Version 2.3. The FOO Data Center. http://dx.doi.org/10.xxxx/notfoo.547983. Accessed 1 May 2011.
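The example citation above follows a fixed order: authors, year, title, version, publisher (data center), identifier, access date. A minimal Python sketch of assembling a reference in that order is shown here; the class and field names are hypothetical, and a journal's own reference style may order or punctuate these elements differently.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DataCitation:
    """Hypothetical container for the elements seen in the slide's example."""
    authors: str
    year: int
    title: str
    publisher: str
    identifier: str                 # DOI or other persistent identifier
    version: Optional[str] = None
    accessed: Optional[str] = None

    def format(self) -> str:
        # Assemble the elements in the same order as the slide's example.
        parts = [f"{self.authors}.", f"{self.year}.", f"{self.title}."]
        if self.version:
            parts.append(f"Version {self.version}.")
        parts.append(f"{self.publisher}.")
        parts.append(f"{self.identifier}.")
        if self.accessed:
            parts.append(f"Accessed {self.accessed}.")
        return " ".join(parts)


# The placeholder values below are copied from the slide's example.
example = DataCitation(
    authors="Doe, J. and R. Roe",
    year=2001,
    title="The FOO Data Set",
    version="2.3",
    publisher="The FOO Data Center",
    identifier="http://dx.doi.org/10.xxxx/notfoo.547983",
    accessed="1 May 2011",
)
print(example.format())
```

Keeping the elements as separate fields, rather than one free-text string, also makes it easier to pass the identifier along as reference metadata.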
In Editorial Workflows
- Provide instructions and messaging to authors
- Ask authors to declare data availability at submission
- Ask reviewers and editors to check data availability
- Ask copyeditors to resolve "data provided by" statements in acknowledgments by moving them to the references or querying authors
- Collect metadata about the data at submission (repository)
- Collect funding metadata and identifiers before publication and include them in the article metadata
- Ask authors to link ORCIDs to their accounts and include them in the metadata
- Use an institutional identifier system (GRID is open and free)
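As an illustration of the submission-time checks listed above, the following Python sketch flags a few common gaps. Everything in it is an assumption made for the example: the `check_submission` function, the dictionary keys, and the regular expression are hypothetical and do not reflect any particular submission system.

```python
import re
from typing import List

# Hypothetical pattern for the "available from authors" style of statement
# that the author guidance says to ban or discourage.
VAGUE_ACCESS = re.compile(
    r"available\s+(?:from\s+(?:the\s+)?authors?|(?:up)?on\s+request)",
    re.IGNORECASE,
)


def check_submission(submission: dict) -> List[str]:
    """Return a list of data-related problems for editorial staff to follow up."""
    problems = []
    statement = (submission.get("data_availability") or "").strip()
    if not statement:
        problems.append("No data availability statement was provided.")
    elif VAGUE_ACCESS.search(statement):
        problems.append('Statement relies on "available from authors" wording.')
    if not submission.get("repository"):
        problems.append("No repository metadata was collected.")
    if not submission.get("funder_ids"):
        problems.append("No funder identifiers were collected.")
    return problems


# Illustrative submissions (all values are made up for the example).
good = {
    "data_availability": "Data are archived in a domain repository and cited in the references.",
    "repository": "Example Domain Repository",
    "funder_ids": ["10.13039/100000001"],
}
bad = {"data_availability": "Data are available from the authors upon request."}

print(check_submission(good))
print(check_submission(bad))
```

A check like this can only prompt a query to the author; whether the data are actually accessible still has to be confirmed by reviewers and editors.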
On Identifiers
- Link ORCIDs to authors using the ORCID API (do NOT allow direct entry)
- Use the Crossref Funder Registry for grants
- Use CRediT for author roles
- Use an institutional identifier
- Include these as metadata with the article
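One way to picture these identifiers travelling together as article metadata is sketched below in Python. The record layout and field names are hypothetical, not a Crossref or JATS schema. The checksum function implements the ISO 7064 MOD 11-2 check digit that ORCID iDs carry; it is only a sanity check on an iD already obtained through the ORCID API, not a substitute for authenticated linking.

```python
from dataclasses import dataclass, field
from typing import List


def orcid_checksum_ok(orcid: str) -> bool:
    """Check the final character of an ORCID iD (ISO 7064 MOD 11-2)."""
    digits = orcid.replace("-", "")
    if len(digits) != 16 or not digits[:15].isdigit():
        return False
    total = 0
    for ch in digits[:15]:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    expected = "X" if result == 10 else str(result)
    return digits[15].upper() == expected


@dataclass
class Author:
    name: str
    orcid: str                      # obtained via the ORCID API, not typed in
    credit_roles: List[str] = field(default_factory=list)  # CRediT role names
    institution_id: str = ""        # institutional identifier, e.g. a GRID ID


@dataclass
class ArticleMetadata:
    title: str
    authors: List[Author]
    funder_ids: List[str] = field(default_factory=list)  # Funder Registry IDs


# Illustrative record; the iD is the sample iD used in ORCID documentation,
# and the institution identifier is a made-up placeholder.
record = ArticleMetadata(
    title="An Example Article",
    authors=[
        Author(
            name="J. Doe",
            orcid="0000-0002-1825-0097",
            credit_roles=["Conceptualization", "Data curation"],
            institution_id="grid.example.0",
        )
    ],
    funder_ids=["10.13039/100000001"],
)

print(all(orcid_checksum_ok(a.orcid) for a in record.authors))
```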
Suggested Author Instructions and Best Practices for Journals
- Data Policy Statement
- Data Citation
- Sample Citation and Identification
- Crossref Funder Registry
- ORCIDs
- References to further discussions on most of these topics
http://www.copdess.org/
The details about data
What data are required (do you really mean all of it)?
- Usually determined by the discipline and the data that are typically stored in domain repositories.
- More rarely the raw data, but certainly the tabulated data in support of reported results.
- Anonymized where appropriate.
Enforcement of policies
- We need to help authors (this starts at funding agencies and data collection), and be aware (acknowledgement statement).
- It is an ethical obligation and key for advancing science.
- Ask reviewers and editors about data availability.
- May require editorial statements of concern after publication if data are not provided.
- Hold or coordinate publications until data and references are available.
How about someone stealing my data or scooping me?
- Widely adopted citation standards and norms.
- Some communities have much experience in open data.
"I received data from someone else, or it is commercial or restricted by my government or laws, and I can't release it."
- Transparency in access.
- IP is OK if data, data products, or software are available and scholarly reuse is allowed.
- Alert authors to think about data and access up front, in negotiating transfer agreements.