DEVELOPING, ENABLING, AND SUPPORTING DATA AND REPOSITORY CERTIFICATION
Plato Smith, Ph.D., Data Management Librarian
DataONE Member Node Special Topics Discussion
June 8, 2017, 2:00 pm - 2:30 pm
ASSESSING DATA AND REPOSITORY NEEDS
It will be difficult to improve your institutional infrastructure without an overall understanding of the data you currently hold and how researchers at your institution are managing their data.
CARDIO v.2 (Collaborative Assessment of Research Data Infrastructure and Objectives), http://www.dcc.ac.uk/resources/tools/cardio
CERTIFICATION BODIES/STANDARDS (EXAMPLES)
- Data Seal of Approval (DSA) data repository (2008)
- DIN Standards Committee for Information and Documentation (NID) (1927)
- International Organization for Standardization (ISO) (e.g., data, repository, services) (1947)
- Open Geospatial Consortium (OGC) data and interoperability
- World Data System (WDS) data services (2008)
DEFINITIONS
- Certification: The provision by an independent body of written assurance (a certificate) that the product, service or system in question meets specific requirements.
- Accreditation: The formal recognition by an independent body, generally known as an accreditation body, that a certification body operates according to international standards.
Source: International Organization for Standardization, https://www.iso.org/certification.html
TABLE OF CONTENTS
1. Research Data Management Support Services
2. Criteria, Certification, & Standards
3. Data
4. Repository
5. Tools
6. Use Case Examples
7. Acknowledgements
8. References
RESEARCH DATA MANAGEMENT SUPPORT SERVICES* (JONES, PRYOR, & WHYTE, 2013)
- RDM Policy and Strategy
- Business Plan and Sustainability
- Research Data Management
- Data access portals and publication
- Active data management and storage
- Data archives and repositories
- Data appraisal, selection, and transfer
- Education, guidance, support, and training
DATA (CLASSIFICATION, CERTIFICATION, CURATION)
1. Classification (UF Data Classification Guidelines)
   I. Restricted
   II. Sensitive
   III. Open
2. Trustworthiness of the data centers/repositories* (RDA/WDS)
   I. Assessment
   II. Certification
   III. Standards
3. Data accessibility and discoverability*
   I. Description, documentation, representation
   II. Harvestable, interoperable, readable
   III. Reproducible, reusable, sustainable
4. Level of curation*
   I. Provide metadata of research outputs (discoverable)
   II. Enhance research outputs with data curation (citable)
   III. Manage/preserve research data for long-term archiving
DATA (STAKEHOLDER RESPONSIBILITIES)
1. Funding Agencies/Senior Stakeholders
   I. Responsible for funding, guidelines, and policies.
   II. Responsible for capacity, infrastructure, and resources.
2. Data Owners (Data Producers)
   I. Responsible for appropriately classifying data.
   II. Responsible for the quality of the digital research data (DSA).
3. Data Custodians (Data Curators)
   I. Responsible for labeling data with appropriate classifications and applying required and suggested safeguards.
   II. Data repository is responsible for the quality of storage and availability of the data: data management (DSA).
4. Data Users (Data Consumers)
   I. Responsible for complying with data use requirements.
   II. Responsible for immediately referring requests for public records to the appropriate data governance authority.
REPOSITORY
- Institutional repository
- Publishing repository
- Dataset repository
- Trusted data repository
Research Data Management Policy Framework* (figure labels): Governance and Ownership; Integrity and Quality; Research Data Management; Concepts and Definitions; Guidelines and Security
CRITERIA, CERTIFICATION, & STANDARDS (EXAMPLES)
1. Data Seal of Approval (DSA) → Trustworthy Data Repository Requirements
2. World Data System (WDS) → Data and Services - Certification
3. DSA-WDS Partnership Working Group → Catalogue of Common Requirements
4. Criteria for trustworthy digital archives → DIN 31644
5. Trusted Digital Repository (TDR) → ISO 16363:2012 Audit and certification of trustworthy digital repositories
6. Trustworthy Repositories Audit & Certification: Criteria and Checklist → OAIS/ISO 14721:2012 Open archival information system (OAIS) Reference model
CRITERIA, CERTIFICATION, & STANDARDS (CONT.)
The 16 Data Seal of Approval requirements are based on the following criteria:
- The data can be found on the Internet
- The data are accessible (clear rights and licenses)
- The data are in a usable format
- The data are reliable
- The data are identified in a unique and persistent way so that they can be referred to
Source: Data Seal of Approval (2017), https://www.datasealofapproval.org/en/assessment/
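As an illustration only, the five criteria above could be tracked in a simple internal checklist before a formal assessment. The sketch below is not the Data Seal of Approval assessment tool; the `SelfAssessment` class, its field names, the shortened criterion labels, and the example repository are all hypothetical.

```python
from dataclasses import dataclass

# Shortened labels for the five criteria listed on the slide (labels are our own).
DSA_CRITERIA = (
    "findable on the Internet",
    "accessible (clear rights and licenses)",
    "usable format",
    "reliable",
    "uniquely and persistently identified",
)


@dataclass
class SelfAssessment:
    """Hypothetical record of which criteria a repository believes it meets."""
    repository: str
    met: frozenset

    def gaps(self):
        # Return unmet criteria in the same order as the slide.
        return [c for c in DSA_CRITERIA if c not in self.met]

    def is_complete(self):
        # True only when every criterion has been marked as met.
        return not self.gaps()


# Hypothetical example: three of the five criteria are marked as met.
example = SelfAssessment(
    repository="Example Repository",
    met=frozenset({"findable on the Internet", "usable format", "reliable"}),
)
print(example.gaps())
```

A list of gaps like this could feed a gap analysis, but actual certification depends on evidence reviewed by the certifying body, not on a self-reported checklist.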
CRITERIA, CERTIFICATION, & STANDARDS (CONT.)
- Key principles in international standards development (International Organization for Standardization (ISO)):
  - Standards respond to a need in the market
  - Standards are based on global expert opinion
  - Standards are developed through a multi-stakeholder process
  - Standards are based on consensus
  Source #1: https://www.iso.org/developing-standards.html
- Community standards (Open Geospatial Consortium (OGC)):
  - Serve to bring de facto standards from the larger geospatial community to be a stable reference point
  - Serve to bring new, but implemented, standards to the OGC to form the basis for further development
  Source #2: http://www.opengeospatial.org/standards
CRITERIA, CERTIFICATION, & STANDARDS (CONT.)
[Diagram labels: Assessment Working Group (e.g. OGC, RDA) Gap Analysis Data Asset Framework (DAF) Data Seal of Approval Requirements Criteria DSA-WDS Partnership Working Group Catalogue of Common Requirements OAIS TRAC Data & Repository Data Seal of Approval ISO 16363:2012 Audit and certification of TDR ISO 14721:2012 OAIS Reference Model]
TOOLS (EXAMPLES)
1. CARDIO → Collaborative Assessment of Research Data Infrastructure and Objectives
2. DAF → Data Asset Framework
3. DRAMBORA → Digital Repository Audit Method Based on Risk Assessment
   A. Benchmarks/Milestones
   B. Program Evaluation/Self-Assessment
   C. Reporting/Success Metrics
Source: http://www.dcc.ac.uk/resources/tools-and-applications
USE CASE EXAMPLE (USGS)
- What factors contributed to the need for trusted repositories?
- Can you tell me the benefits of trusted repositories?
- What is an accepted trusted repository?
- What certifications and standards do trusted repositories employ?
Source: https://www2.usgs.gov/fsp/acceptable_repositories_digital_assets.asp
USE CASE EXAMPLE (LP DAAC NASA ESDIS) http://www.re3data.org/repository/r3d100010376
ACKNOWLEDGEMENTS
1. Amber E. Budden, Ph.D., Director for Community Engagement and Outreach, DataONE, University of New Mexico
2. John Faundeen, Archivist, U.S. Geological Survey, EROS Center
3. Laura Moyers, DataONE Member Node Coordinator, Center for Information & Communication Studies, University of Tennessee-Knoxville
4. DataONE Users Group
REFERENCES
1. CCSDS. (2011). The Consultative Committee for Space Data Systems. Audit and Certification of Trustworthy Digital Repositories. Recommended Practice. CCSDS 652.0-M-1. Magenta Book. September 2011. https://public.ccsds.org/pubs/652x0m1.pdf
2. CCSDS. (2012). Reference Model for an Open Archival Information System (OAIS). Recommended Practice. CCSDS 650.0-M-2. Magenta Book. June 2012. https://public.ccsds.org/pubs/650x0m2.pdf
3. CRL. (n.d.). Digital Preservation Metrics. https://www.crl.edu/archivingpreservation/digital-archives/metrics
4. ISO 14721:2012. Space data and information transfer systems - Open archival information system (OAIS) - Reference model. https://www.iso.org/standard/57284.html
5. ISO 16363:2012. Space data and information transfer systems - Audit and certification of trustworthy digital repositories. https://www.iso.org/standard/56510.html
6. TRAC. (2007). Trustworthy Repositories Audit & Certification: Criteria and Checklist. https://www.crl.edu/sites/default/files/d6/attachments/pages/trac_0.pdf
7. USGS. (2017). Acceptable Digital Repositories for USGS Scientific Publications and Data. https://www2.usgs.gov/fsp/acceptable_repositories_digital_assets.asp
Thank you
Questions/Comments
Contact information: Plato Smith, plato.smith@ufl.edu