Data Citation Technical Issues - Identification

Data Citation Technical Issues - Identification. Los Alamos National Laboratory Research Library. http://public.lanl.gov/herbertv/ hvdsomp@gmail.com @hvdsomp 1

Identification of Data: as part of a citation. To support various use cases: credit, access, re-use. To support different agents: human, machine. 2

Identification of Data as Part of Citation
Bollen, J., Van de Sompel, H., Hagberg, A., Chute, R. A Principal Component Analysis of 39 Scientific Impact Measures. PLoS ONE, 4(6), pp. e6022, 2009. doi:10.1371/journal.pone.0006022
Bollen, J., Van de Sompel, H., Hagberg, A., Chute, R. A Principal Component Analysis of 39 Scientific Impact Measures. PLoS ONE, 4(6), pp. e6022, 2009. doi:10.1371/journal.pone.0006022 http://dx.doi.org/10.1371/journal.pone.0006022
Bollen, J., Van de Sompel, H., Hagberg, A., Chute, R. A Principal Component Analysis of 39 Scientific Impact Measures. PLoS ONE, 4(6), pp. e6022, 2009. http://dx.doi.org/10.1371/journal.pone.0006022 3
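The citation variants above differ only in how the identifier is written: as a `doi:` string for human readers, as an HTTP URI that a machine can dereference, or both. A minimal sketch of that mapping, assuming the `http://dx.doi.org/` resolver prefix shown on the slide (the helper name `doi_to_http_uri` is hypothetical):

```python
from urllib.parse import quote

# Resolver prefix as it appears on the slide; an assumption, not a requirement.
DOI_RESOLVER = "http://dx.doi.org/"


def doi_to_http_uri(doi: str) -> str:
    """Turn 'doi:10.1371/...' (or a bare DOI) into an actionable HTTP URI."""
    doi = doi.strip()
    if doi.lower().startswith("doi:"):
        doi = doi[4:]
    # Percent-encode characters that are unsafe in a URI path; keep '/'.
    return DOI_RESOLVER + quote(doi, safe="/")


print(doi_to_http_uri("doi:10.1371/journal.pone.0006022"))
```

The same function could be pointed at a different resolver for other persistent identifier schemes, but the details would differ per scheme.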

Disclaimer: Throughout, DOI is used as an example of a commonly used Persistent Identifier. Other commonly used Persistent Identifiers include ARK, HDL, and URN:NBN. I do not consider myself a dataset expert, but I have been running into identifiers for a long time. 4

3 Considerations that Merit More Attention: (1) Nature of identifiers for citation, access, re-use; (2) Catering to human and machine agents; (3) Granularity: spatial; temporal. 5

Data Publication, Access, Re-Use, Citation 6

Consideration 1: Identifiers for Citation, Access, Re-Use 7

Identifiers for Citation, Access, Re-Use 8

Identifiers for Citation, Access, Re-Use 9

Consideration 2: Catering to Human and Machine Agents 10

Catering to Human and Machine Agents: Open human-actionable splash page; open machine-actionable splash page. 11

Catering to Human and Machine Agents
Attention to machine processing (without human intervention) of data.
Machine-actionable metadata, e.g.:
- Clarification of the various URIs involved (data components and formats, splash page, data access points, etc.); cf. OAI-ORE
- Technical metadata about data
- API descriptions
- Authorship (URIs; ORCID)
- Rights & license details
- Provenance account 12
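To make the list of machine-actionable metadata concrete, here is a minimal sketch of a record that a splash page might expose to machine agents. The class, its field names, and all `example.org` URIs are hypothetical, and the ORCID iD is a placeholder. The slide does not prescribe any particular schema or serialization.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import List, Optional


@dataclass
class DatasetDescription:
    """Machine-actionable description of one dataset (hypothetical field names)."""
    dataset_uri: str                      # identifier used for citation
    splash_page_uri: str                  # human-oriented landing page
    component_uris: List[str] = field(default_factory=list)     # data files / formats
    access_point_uris: List[str] = field(default_factory=list)  # e.g. query endpoints
    technical_metadata: dict = field(default_factory=dict)      # media type, size, ...
    api_description_uri: Optional[str] = None
    author_uris: List[str] = field(default_factory=list)        # e.g. ORCID URIs
    license_uri: Optional[str] = None
    provenance: List[str] = field(default_factory=list)         # ordered processing notes


def to_json(description: DatasetDescription) -> str:
    """Serialize the record so a machine agent can consume it without a human."""
    return json.dumps(asdict(description), indent=2, sort_keys=True)


# All values below are illustrative placeholders.
example = DatasetDescription(
    dataset_uri="https://example.org/dataset/42",
    splash_page_uri="https://example.org/dataset/42/about",
    component_uris=["https://example.org/dataset/42/data.csv"],
    access_point_uris=["https://example.org/dataset/42/query"],
    technical_metadata={"media_type": "text/csv"},
    api_description_uri="https://example.org/dataset/42/api",
    author_uris=["https://orcid.org/0000-0002-1825-0097"],
    license_uri="https://creativecommons.org/licenses/by/4.0/",
    provenance=["collected", "cleaned", "published"],
)
print(to_json(example))
```

Keeping each kind of URI in its own field is one way to address the slide's point that the different URIs involved need to be told apart.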

Consideration 3: Granularity - Spatial. Simple segments (dimensions in hypercube); complex segments (arbitrary queries). 13

Granularity - Spatial: Similar to the requirement in emerging annotation frameworks to be able to describe segments of resources: Constraints in Open Annotation; Selectors in Annotation Ontology. 14
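The difference between simple and complex segments can be illustrated with a toy dataset. In this sketch, a simple segment is a range per dimension of the hypercube, and a complex segment is an arbitrary predicate over records. The dimension names (`lat`, `lon`, `year`) and the helper functions are assumptions made for illustration only.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Record = Dict[str, float]


def simple_segment(records: Iterable[Record],
                   bounds: Dict[str, Tuple[float, float]]) -> List[Record]:
    """Select by an inclusive (low, high) range on each named dimension."""
    return [r for r in records
            if all(lo <= r[dim] <= hi for dim, (lo, hi) in bounds.items())]


def complex_segment(records: Iterable[Record],
                    predicate: Callable[[Record], bool]) -> List[Record]:
    """Select by an arbitrary query expressed as a predicate."""
    return [r for r in records if predicate(r)]


# Toy hypercube: 3 latitudes x 2 longitudes x 2 years = 12 records.
records = [
    {"lat": lat, "lon": lon, "year": year, "value": lat + lon + year}
    for lat in (0, 10, 20)
    for lon in (0, 10)
    for year in (2000, 2001)
]

box = simple_segment(records, {"lat": (0, 10), "year": (2001, 2001)})
query = complex_segment(records, lambda r: r["value"] > 2020 and r["lon"] == 10)
print(len(box), len(query))
```

The bounds dictionary is easy to serialize as a constraint or selector, whereas an arbitrary predicate needs some query language to be written down and shared.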

Consideration 3: Granularity - Temporal. Different versions of a dataset over time; different states of a dataset over time (dynamic data). 15

Granularity - Temporal: Similar to the Memento protocol, which allows accessing prior versions/states of a resource given its original URI. 16
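In Memento (RFC 7089), a client asks for a prior state by sending an `Accept-Datetime` header with an HTTP-date value. The sketch below only builds such a request and sends nothing over the network. It assumes, for illustration, that the dataset URI itself performs the datetime negotiation; the function name and the `example.org` URI are hypothetical.

```python
from datetime import datetime, timezone
from email.utils import format_datetime
from typing import Dict, Tuple


def memento_request(original_uri: str, when: datetime) -> Tuple[str, Dict[str, str]]:
    """Return (URI, headers) asking for the state of a resource near `when`."""
    if when.tzinfo is None:
        raise ValueError("'when' must be timezone-aware")
    when_utc = when.astimezone(timezone.utc)
    # Accept-Datetime carries an HTTP-date expressed in GMT.
    headers = {"Accept-Datetime": format_datetime(when_utc, usegmt=True)}
    return original_uri, headers


uri, headers = memento_request(
    "https://example.org/dataset/42",  # hypothetical dataset URI
    datetime(2007, 5, 31, 20, 35, tzinfo=timezone.utc),
)
print(uri, headers)
```

A server that supports the protocol would answer with, or redirect to, the version of the dataset closest to the requested datetime.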

Granularity - Spatial & Temporal: Both data publisher and data re-user require granular addressing capability. There's something to be said in favor of citing at the level of the entire dataset (identifier of data) and adding granular specificity for access and re-use. 17

Granularity - Spatial & Temporal: Options
Option: For citation, access, and re-use: mint a new identifier for each segment.
Option: For citation, access, and re-use: use the (entire) dataset identifier with a query component.
Option: For citation: use the (entire) dataset identifier. For access and re-use: use an additional URI that refers to a constraint/selector on the dataset that describes the segment (both spatial and temporal). The citation becomes a tuple: [dataset URI; constraint URI]. 18
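The second and third options can be sketched side by side: one URI with the segment encoded as a query component, versus a citation tuple of dataset URI and constraint URI. All URIs, parameter names, and class or function names here are hypothetical; the slides do not fix a syntax for either option.

```python
from dataclasses import dataclass
from typing import Dict, Tuple
from urllib.parse import urlencode


def with_query_component(dataset_uri: str, segment: Dict[str, str]) -> str:
    """Option 2: one URI, the dataset identifier plus a query component."""
    return dataset_uri + "?" + urlencode(segment)


@dataclass(frozen=True)
class Citation:
    """Option 3: cite the whole dataset, address the segment separately."""
    dataset_uri: str      # used for citation and credit
    constraint_uri: str   # describes the spatial/temporal segment

    def as_tuple(self) -> Tuple[str, str]:
        return (self.dataset_uri, self.constraint_uri)


dataset = "https://example.org/dataset/42"  # hypothetical identifier

print(with_query_component(dataset, {"lat": "0,10", "year": "2001"}))
print(Citation(dataset, "https://example.org/constraints/7").as_tuple())
```

With the tuple form, every citation of any segment carries the same dataset URI, so credit accrues to the dataset as a whole. With the query form, a consumer has to strip the query component to get the same effect.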

3 Considerations that Merit More Attention: (1) Nature of identifiers for citation, access, re-use; (2) Catering to human and machine agents; (3) Granularity: spatial; temporal. 19