Making Sense of Data: What You Need to Know about Persistent Identifiers, Best Practices, and Funder Requirements
Council of Science Editors, May 23, 2017
Shelley Stall, MBA, CDMP, EDME
AGU Assistant Director, Enterprise Data Management
sstall@agu.org, @ShelleyStall
About AGU
- Largest Earth and planetary science society: 60,000 members; 137 countries
- Much more than geophysics
- Humongous annual meeting (22,000 abstracts; 25,000 attendees)
- Largest society publisher in the Earth and space sciences (ESS): 20 journals, 4 gold open access; >6000 published papers in 2016
- Explaining science broadly: outreach to government leaders and the public; Eos.org; large press effort; sharing science; legislative affairs
AGU Data Position Statement: https://sciencepolicy.agu.org/files/2013/07/agu-data-position-statement-final-2015.pdf
AGU Data Engagement Activities
- Started Earth and Space Science in 2014: data and methods papers (along with research)
- Data Management Assessment Program for Repositories
- AGU Data Blog (part of GeoSpace)
- Coalition on Publishing Data in the Earth and Space Sciences (COPDESS.org) and many other community efforts aimed at elevating best practices
- Participating in elevating data best practices:
  - Several policy pieces in Eos.org and, with others, in Science
  - Data Fair at AGU Fall Meeting: information to scientists around data skills and resources
  - Many sessions around data and best practices at meetings
2017 SSP 39th Annual Meeting, Thursday, June 1, 2017
- Concurrent 1D: Research Data Policies: Craft Effective Policy and Improve Your Impact (CrossRef, Dryad, Dataverse, PLOS ONE, Springer Nature)
- Concurrent 3B: Implementing Best Practices around Open Data, Samples, and Code in Scholarly Publications (AGU, Center for Open Science, Research Data Alliance, Science/AAAS)
AGU's Data Policy states: "All data necessary to understand, evaluate, replicate, and build upon the reported research must be made available and accessible whenever possible."
Research Data Best Practices for a Researcher
- Deposit the data in support of your publication in a leading domain repository that curates such data.
- If a domain repository is not available for some or all of your data, deposit your data in a general repository such as Zenodo, Dryad, or Figshare (all can assign a DOI for deposited data), or use your institution's archive.
- Data should not be listed as "available from authors."
- For small datasets, tables, or images, you can also use supplemental material.
- Make sure that the data are available publicly at the time of publication and available to reviewers after submission.
- Cite data or code sets used in your study as part of the reference list. Citations should follow the Joint Declaration of Data Citation Principles.
Source: AGU FAQ and Best Practices (Authors) http://publications.agu.org/author-resource-center/publication-policies/data-policy/data-policy-faq/
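As one illustration of citing data in the reference list, the sketch below assembles a reference entry for a dataset from elements commonly used in data citations (creators, year, title, version, repository, persistent identifier). The `DataCitation` class, its field names, the element order, and the example dataset with its DOI are all hypothetical; the Joint Declaration sets out principles rather than a fixed format, so the target journal's style takes precedence.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DataCitation:
    """Hypothetical container for the elements of a dataset citation."""
    creators: List[str]
    year: int
    title: str
    repository: str
    doi: str
    version: Optional[str] = None

    def format(self) -> str:
        # Join creators as "A", "A and B", or "A, B, and C".
        names = self.creators
        if len(names) > 2:
            authors = ", ".join(names[:-1]) + ", and " + names[-1]
        else:
            authors = " and ".join(names)
        parts = [f"{authors} ({self.year}). {self.title}"]
        if self.version:
            parts.append(f"Version {self.version}")
        parts.append(self.repository)
        # Give the DOI as a resolvable link so the identifier is actionable.
        # No trailing period, so it cannot be mistaken for part of the link.
        return ". ".join(parts) + f". https://doi.org/{self.doi}"


# Made-up example values, for illustration only.
example = DataCitation(
    creators=["Doe, J.", "Roe, R."],
    year=2017,
    title="Example stream gauge records",
    repository="Example Repository",
    doi="10.5555/example.12345",
    version="1.0",
)
print(example.format())
```

Keeping the identifier as a full `https://doi.org/` link is a design choice here: it lets a reader or a machine follow the citation directly to the deposited data.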
Samples, Code and Identifiers Best Practices for a Researcher
- Develop and deposit software in GitHub and cite that, or include simple scripts in a supplement.
- Identify your samples with IGSNs.
- List all funding sources (including in-kind support) in the acknowledgments and indicate grant numbers and funders in your submission.
- Include ORCIDs for all authors.
- Use and follow these resources (http://www.copdess.org/resources-for-researchers/) to develop projects that support reproducibility and integrity in research.
Source: AGU FAQ and Best Practices (Authors) http://publications.agu.org/author-resource-center/publication-policies/data-policy/data-policy-faq/
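When collecting ORCIDs from authors, a submission system can catch typing errors before the identifier is stored. An ORCID iD is 16 characters in four hyphenated groups, and its last character is a check character computed with ISO 7064 MOD 11-2. The sketch below is a minimal offline check under those rules; the function names are illustrative, and passing the check only shows the iD is well formed, not that it is registered to the author.

```python
import re


def orcid_check_char(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the 15 base digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    # A result of 10 is written as the letter X.
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Return True if the string is well formed and its check character matches."""
    if not re.fullmatch(r"[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{3}[0-9X]", orcid):
        return False
    compact = orcid.replace("-", "")
    return orcid_check_char(compact[:15]) == compact[15]


# 0000-0002-1825-0097 is the sample iD used in ORCID's own documentation.
print(is_valid_orcid("0000-0002-1825-0097"))  # True
print(is_valid_orcid("0000-0002-1825-0098"))  # False: check character is wrong
```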
Publishers and Repositories Are Working Together
- TOP (Transparency and Openness Promotion) guidelines, signed by 2900 journals and organizations
- COPDESS.org (Coalition on Publishing Data in the Earth and Space Sciences) Statement of Commitment, endorsed by most publishers and repositories in the Earth and space sciences
- Joint Declaration of Data Citation Principles, endorsed by 114 organizations including most major publishers
- Reproducibility conferences and outcomes (AAAS and other organizations)
- Quality/certification standards for repositories are expanding
- The challenge is practicing what you preach
Coalition on Publishing Data in the Earth and Space Sciences (COPDESS.org)
- Formed in October 2014
- Connecting Earth science publishers and data facilities to help translate the aspirations of open, available, and useful data from policy into practice
- Endorsed a Statement of Commitment in 2015
- Includes joint best practices between journals and repositories, and references
What Are Scholarly Publishing Best Practices?
- Joint Declaration of Data Citation Principles
- Citing data in the references
- Separate data publications when appropriate (more journals now available)
- Transparency about how researchers can access data (e.g., statement in acknowledgements)
- Include ORCIDs and other persistent identifiers
  - Funders, samples, author-credit, institutions (still to come)
- Use trusted domain repositories if they are available
- Use repositories that allow for data access during peer review
- Supplements should follow NISO guidelines
- All references should be in the main reference list (not in supplements)
- Key references and data should be available at time of publication (no unpublished or in-press references)
Source: COPDESS Statement of Commitment http://www.copdess.org/statement-of-commitment/
Resources
- AGU Data Position Statement: https://sciencepolicy.agu.org/files/2013/07/agu-data-position-statement-final-2015.pdf
- AGU's Data Policy: http://publications.agu.org/author-resource-center/publication-policies/data-policy/
- AGU FAQ and Best Practices (Authors): http://publications.agu.org/author-resource-center/publication-policies/data-policy/data-policy-faq/
- COPDESS Statement of Commitment: http://www.copdess.org/statement-of-commitment/
- Transparency and Openness Promotion Guidelines (TOP): https://cos.io/top/
- Joint Declaration of Data Citation Principles: https://www.force11.org/group/joint-declaration-data-citation-principles-final
TOP Modular Standards
- Citation Standards: describes citation of data
- Analytical Methods Transparency: describes analytical code accessibility
- Design and Analysis Transparency: sets standards for research design disclosures
- Preregistration of Analysis Plans: specification of analytical details before data collection
- Data Transparency: describes availability and sharing of data
- Research Materials Transparency: describes research materials accessibility
- Preregistration of Studies: specification of study details before data collection
- Replication: encourages publication of replication studies
Source: https://cos.io/top/
TOP Implementation Levels

CITATION STANDARDS
- Level 0: Journal encourages citation of data, code, and materials, or says nothing.
- Level 1: Journal describes citation of data in guidelines to authors with clear rules and examples.
- Level 2: Article provides appropriate citation for data and materials used, consistent with journal's author guidelines.
- Level 3: Article is not published until providing appropriate citation for data and materials following journal's author guidelines.

DATA TRANSPARENCY
- Level 0: Journal encourages data sharing, or says nothing.
- Level 1: Article states whether data are available, and, if so, where to access them.
- Level 2: Data must be posted to a trusted repository. Exceptions must be identified at article submission.
- Level 3: Data must be posted to a trusted repository, and reported analyses will be reproduced independently prior to publication.

ANALYTIC METHODS (CODE) TRANSPARENCY
- Level 0: Journal encourages code sharing, or says nothing.
- Level 1: Article states whether code is available, and, if so, where to access it.
- Level 2: Code must be posted to a trusted repository. Exceptions must be identified at article submission.
- Level 3: Code must be posted to a trusted repository, and reported analyses will be reproduced independently prior to publication.

Are You a Level 0? Time to Get on the Move!!
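For a journal doing a quick self-assessment, the data transparency levels can be read as a decision ladder. The sketch below encodes that ladder as it appears on this slide; the function name and the three yes/no parameters are hypothetical simplifications, and the TOP Guidelines themselves remain the authority on what each level requires.

```python
def data_transparency_level(states_availability: bool,
                            requires_trusted_repository: bool,
                            reproduces_analyses: bool) -> int:
    """Map simplified yes/no policy features to a TOP data transparency level.

    states_availability: articles must state whether and where data are available.
    requires_trusted_repository: data must be posted to a trusted repository
        (exceptions identified at submission).
    reproduces_analyses: reported analyses are reproduced independently
        before publication.
    """
    if requires_trusted_repository and reproduces_analyses:
        return 3
    if requires_trusted_repository:
        return 2
    if states_availability:
        return 1
    # Journal only encourages data sharing, or says nothing.
    return 0


# A journal that requires a data availability statement but no deposit.
print(data_transparency_level(True, False, False))  # 1
```

The same ladder applies to the analytic methods (code) row if "data" is read as "code".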
Next Steps
- Review the Statement of Commitment on COPDESS.org
  - Best Practices for Data Publication
  - Links to the policies written by ESS publishers based on the statement are good examples.
- Review the TOP Guidelines for data-related standards, and the others.
  - Be a signatory if you are not currently.
  - Make plans to move beyond Level 0.
- Work with your community organizations and institutions on communicating the need for well-documented, citable data. Research Data Alliance (RDA) is a good resource.
- Work together on messaging to authors and in author workshops around data. Be consistent within your domain and across domains as much as possible. Leverage existing standards.
Contact Information:
Shelley Stall, sstall@agu.org
AGU Data Management Program: http://dataservices.agu.org/dmm/
The Details about Data

What data are required (do you really mean all of it)?
- Usually determined by the discipline and the data that are typically stored in domain repositories
- More rarely the raw data, but certainly the tabulated data in support of reported results
- Anonymized where appropriate

Enforcement of policies
- We need to help authors (this starts at funding agencies and data collection) and be aware (acknowledgement statement)
- It is an ethical obligation and key for advancing science
- Ask reviewers and editors about data availability
- May require editorial statements of concern after publication if data are not provided
- Hold or coordinate publications until data and references are available

How about someone stealing my data or scooping me?
- Widely adopted citation standards and norms
- Some communities have much experience in open data
- Some urban legends are used as excuses

"I received data from someone else, or it is commercial or restricted by my government or laws, and I can't release it."
- Transparency in access
- IP is OK if data, data products, or software are available and scholarly reuse is allowed
- Alert authors to think about data and access up front, in negotiating transfer agreements
Questions for coordination in policies: what would be helpful?
- ORCID
- IGSN
- Data availability statements
- Code availability statements
- Data citation
- Cite COPDESS statement?
- Joint statement to funders on data management plans?
- Joint statement on policies?