Pilot Implementation: Publication and Citation of Scientific Primary Data

Result of CODATA WG, supported by DFG
Jan Brase, Learning Lab Lower Saxony, Uni. Hannover
Michael Lautenschlager, WDC for Climate Model and Data / Max-Planck-Institute for Meteorology
ERPANET WS, Cork, Ireland, 17+18.06.04
IDF Member's Meeting, London, 22.06.04

Roots
CODATA (1) National Committee initiated the WG, grant-aided by DFG.
Working period: September 2001 to May 2002
Result: Final report "Konzept zur Zitierfähigkeit wissenschaftlicher Primärdaten" ("Conception of Citing Scientific Primary Data"), Hannover, 29.05.2002
Continuation: Two-year project for pilot implementation, funded by DFG, starting in October 2003
(1) CODATA: Committee on Data for Science and Technology
J.Brase (L3S) + M.Lautenschlager (WDCC) / 07.06.04 / 2

Problem and Solution
Shortcomings in data provision and interdisciplinary use:
- Rules of good scientific practice are not taken into account in all cases.
- Data sources are widely unknown.
- Data are archived without context.
Method of resolution: publication of primary data
- Persistent identifier with a global resolving mechanism for data archive and context referencing
- Integration into library catalogues in order to find data together with articles

Credits in Science
"Citation Index": Scientific efficiency is "measured" by publications.
Extra work for data publication is currently not acknowledged: data processing, context documentation, quality assurance.
Recommendation: Data publications should be included in the standard scientific "Citation Index".
- Motivation of the individual scientist
- Connection between person and primary dataset
Citable data publications
- support the rules of good scientific practice
- encourage interdisciplinary data utilisation
- make data searchable in library catalogues together with articles
- close the gap between scientific literature and related data sources

Metadata for primary data 1
Attribute: Example
1. DOI: 10.1594/WDCC/IPCC_EH4_OPYC_SRES_B2_MM
2. identifier: URN:TIB:10.1594/WDCC/IPCC_EH4_OPYC_SRES_B2_MM
3. creator: Monika Esch (Author)
4. publisher: WDCC, World Data Center for Climate
5. title: Climate Projection for the next Century calculated by the Global Climate Model ECHAM4-OPYC using the SRES B2 IPCC Scenario
6. language: en
7. StructuralType: Digital
8. mode: Abstract
9. resourcetype: Dataset

Metadata for primary data 2
Attribute: Example
10.-12. registration information: 10.1594 (RA) / 1 (issue no.) / 2004-07-18 (issue date)
13. creationdate: 2001-12-31
14. publicationdate: 2004-07-18
15. description: These data represent results from the ECHAM4/OPYC climate model running the SRES-B2 scenario. The data base tables contain monthly mean time series of ...
16. publicationplace: Hamburg
17. size: 614190228 Bytes
18. format: GRIB
19. edition: 1
20. relateddois: (none)

Criteria for Persistent Identifier Allocation
Critical points are securing data quality and a stable connection between identifier and data entity.
- Allocation is restricted to syntax control and completeness, i.e. expert data description and long-term archiving.
- Scientific quality assurance is expected from the author and will be reviewed during the allocation process.
- Published primary data cannot be changed, just like published articles.
- A stable connection between identifier reference and data entity, as well as long-term availability of the primary data, are essential and must be ensured (e.g. ICSU WDCs).
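
The "syntax control and completeness" check named in the allocation criteria can be sketched roughly as follows. This is only an illustration, not the TIB's actual procedure: the function name `check_registration`, the choice of mandatory fields, and the loose DOI pattern are all assumptions; the field names are taken from the metadata attribute slides.

```python
import re

# Assumption: this subset of the slide's metadata attributes is treated as
# mandatory; the list actually used by the registration agency may differ.
REQUIRED_FIELDS = ("DOI", "creator", "publisher", "title", "language",
                   "publicationdate", "publicationplace", "size", "format")

# Assumption: a loose syntax pattern of the form "10.<digits>/<suffix>".
DOI_PATTERN = re.compile(r"^10\.\d+/\S+$")


def check_registration(record):
    """Return a list of problems; an empty list means the record passes."""
    problems = []
    # Completeness: every mandatory field must be present and non-blank.
    for field in REQUIRED_FIELDS:
        if not str(record.get(field, "")).strip():
            problems.append("missing field: " + field)
    # Syntax control: the DOI must match the loose pattern above.
    doi = record.get("DOI", "")
    if doi and not DOI_PATTERN.match(doi):
        problems.append("malformed DOI: " + doi)
    return problems


# Example values copied from the two metadata slides.
record = {
    "DOI": "10.1594/WDCC/IPCC_EH4_OPYC_SRES_B2_MM",
    "creator": "Monika Esch",
    "publisher": "WDCC, World Data Center for Climate",
    "title": ("Climate Projection for the next Century calculated by the "
              "Global Climate Model ECHAM4-OPYC using the SRES B2 IPCC Scenario"),
    "language": "en",
    "publicationdate": "2004-07-18",
    "publicationplace": "Hamburg",
    "size": "614190228 Bytes",
    "format": "GRIB",
}
print(check_registration(record))
```

A record with a blank title or a DOI that lacks the "10." prefix would come back with one message per problem, which is roughly what a registration agency could return to the submitting data centre.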

DOI and URN
DOI (Digital Object Identifier)
- Non-profit, but membership fee
- Extended metadata support
- System of registration agencies infrastructure
- Global resolving mechanism
URN (Uniform Resource Name)
- Presently cost free
- Basic technical metadata
- Anybody can register URN namespaces
- Resolving at community level

DFG Project "Publication and Citation of Scientific Primary Data"
[Diagram. Labels: International DOI Foundation (Global Handle System); TIB Hannover (Registration Agency); DDB (URN-Knot); TIB-ORDER (Library Catalogue); GFZ (Geophysics), M&D/MPIM (Climate Models), Marum/AWI (Observations), each with Data Storage and Long-term Archiving, two of them marked "In WDC".]


More Details of Pilot Implementation
Application Example

Primary data publication
During her research for the World Data Center Climate (WDCC), the scientist Mrs. Weather collects primary data about the weather in Hannover in the year 2003.
As usual, the primary data is tested, evaluated, stored and administered at the WDCC.
In addition, Mrs. Weather registers the primary data at the TIB.

Registration of primary data
Mrs. Weather transmits to the TIB the URL where the data can be accessed, together with an XML file containing all relevant metadata, including all information obligatory for citing electronic media (ISO 690-2):
- author
- title
- size
- edition
- language
- publisher
- publishing date
- publishing place
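
A submission of this kind, a data URL plus an XML file carrying the obligatory citation fields, might be assembled as in the sketch below. The slides do not show the project's XML schema, so the element names (`dataset`, `url`, `publishingDate`, and so on) and the function `build_metadata_xml` are hypothetical, and the URL and several field values are placeholders.

```python
import xml.etree.ElementTree as ET

# The fields the slide lists as obligatory for citing electronic media.
# Assumption: these element names are hypothetical, not the project's schema.
CITATION_FIELDS = ("author", "title", "size", "edition", "language",
                   "publisher", "publishingDate", "publishingPlace")


def build_metadata_xml(data_url, fields):
    """Build an XML string holding the data URL and the citation fields."""
    missing = [name for name in CITATION_FIELDS if name not in fields]
    if missing:
        raise ValueError("missing citation fields: " + ", ".join(missing))
    root = ET.Element("dataset")
    ET.SubElement(root, "url").text = data_url
    for name in CITATION_FIELDS:
        ET.SubElement(root, name).text = str(fields[name])
    return ET.tostring(root, encoding="unicode")


xml_text = build_metadata_xml(
    "http://example.org/data/W_Han_2003_MMB_2",  # placeholder URL
    {
        "author": "Weather",
        "title": "Weather in Hannover for 2003",
        "size": "unknown",             # placeholder value
        "edition": "1",                # placeholder value
        "language": "en",              # placeholder value
        "publisher": "WDCC",
        "publishingDate": "2004",      # placeholder value
        "publishingPlace": "Hamburg",  # placeholder value
    },
)
print(xml_text)
```

Refusing to build the file when a field is absent mirrors the idea that the citation information is obligatory rather than optional.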

Identifier
The TIB stores this information about the primary data and assigns the primary data a unique identifier for registration: a DOI.
DOI (Digital Object Identifier) is a system for persistent and actionable identification and interoperable exchange of intellectual property on digital networks.
Coordinated by the International DOI Foundation (IDF).
Structure of a DOI: prefix/suffix, e.g. 10.1000/123456 (prefix 10.1000, suffix 123456).

Citing primary data
In her publications, Mrs. Weather now cites this primary data with its unique DOI, maintained by the TIB:
doi:10.1594/WDCC/W_Han_2003_MMB_2
- 10.1594 (prefix) stands for the TIB as the registration agency.
- WDCC stands for the respective research institute.
- W_Han_2003_MMB_2 is the internal name of the data.
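
The three-part layout of these data DOIs (agency prefix, institute, internal name) can be illustrated with a small parser. This is a sketch under the assumption that every DOI in the project follows the prefix/institute/name pattern of the example; the names `DataDoi` and `parse_data_doi` are made up for illustration, and DOIs in general do not have to follow this layout.

```python
from typing import NamedTuple


class DataDoi(NamedTuple):
    prefix: str         # registration agency prefix, e.g. "10.1594" for the TIB
    institute: str      # research institute, e.g. "WDCC"
    internal_name: str  # the institute's internal name of the data


def parse_data_doi(doi):
    """Split 'doi:<prefix>/<institute>/<internal name>' into its three parts."""
    text = doi.strip()
    if text.lower().startswith("doi:"):
        text = text[4:].strip()
    # The first slash separates prefix and suffix; the second one splits the
    # suffix into institute and internal name (project convention).
    prefix, _, suffix = text.partition("/")
    institute, _, internal_name = suffix.partition("/")
    if not (prefix.startswith("10.") and institute and internal_name):
        raise ValueError("not a DOI of the form prefix/institute/name: " + doi)
    return DataDoi(prefix, institute, internal_name)


print(parse_data_doi("doi:10.1594/WDCC/W_Han_2003_MMB_2"))
```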

Resolving the DOI
These DOIs can be resolved (and the data can be cited) in every browser worldwide in three ways:
- http://dx.doi.org/10.1594/wdcc/w_han_2003_mmb_2
- http://doi.tib-hannover.de:8000/10.1594/wdcc/w_han_2003_mmb_2
- doi://10.1594/WDCC/W_Han_2003_MMB_2 (after installing a browser plugin)
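
The three address forms differ only in what is placed in front of the DOI, which the following sketch makes explicit. The function name `resolver_urls` is an assumption, the resolver addresses are the ones given on the slide, and the code only builds strings without contacting any server.

```python
def resolver_urls(doi):
    """Return the three address forms from the slide for one DOI."""
    name = doi.strip()
    if name.lower().startswith("doi:"):
        name = name[4:].strip()
    return [
        "http://dx.doi.org/" + name,                # global DOI resolver
        "http://doi.tib-hannover.de:8000/" + name,  # TIB Handle server
        "doi://" + name,                            # needs a browser plugin
    ]


for url in resolver_urls("doi:10.1594/WDCC/W_Han_2003_MMB_2"):
    print(url)
```

The sketch leaves the letter case of the DOI as it was entered, whereas the slide writes the two HTTP forms in lower case.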


Usage scenario 1
Mr. Storm is reading publications from Mrs. Weather in a journal and would like to analyse her data from different perspectives.
In his publication "Comparison of the weather from Hannover and Miami", Mr. Storm cites Mrs. Weather's data using its DOI, referring to the uniqueness and own identity of the original data.
Citation example:
Weather, 2003: Weather in Hannover for 2003. [doi:10.1594/WDCC/W_Han_2003_MMB_2]
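
The citation example follows a simple pattern of author, year, title and DOI, which could be generated from the registered metadata. The helper `format_data_citation` below is hypothetical and only reproduces the pattern of this one example; it is not a citation standard.

```python
def format_data_citation(author, year, title, doi):
    """Format a data citation in the pattern of the slide's example."""
    return "{}, {}: {}. [doi:{}]".format(author, year, title, doi)


print(format_data_citation("Weather", 2003, "Weather in Hannover for 2003",
                           "10.1594/WDCC/W_Han_2003_MMB_2"))
```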

Usage scenario 2
Mr. Nice is writing a paper about the sales figures of ice cream in Hannover in 2003, but he has no information about the weather.
He uses the TIB as the central registration agency to start a metadata search over the registered primary data.
The result is doi:10.1594/wdcc/w_han_2003_mmb_2
He resolves the DOI and finds the data sufficient. The metadata refers him to the WDCC as publisher and data archive.
In his paper he cites the data, again using its DOI.

URN
In cooperation with the German Library (DDB) in Frankfurt, every dataset is also registered with a unique URN, having the same structure as the DOI:
DOI structure: 10.1594/WDCC/W_Han_2003_MMB_2
URN structure: URN:TIB:10.1594/WDCC/W_Han_2003_MMB_2
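
Because the URN reuses the DOI string behind a fixed namespace prefix, converting between the two is a matter of adding or removing that prefix. The sketch below assumes the prefix is exactly `URN:TIB:` as written on the slides; the function names `doi_to_urn` and `urn_to_doi` are made up for illustration.

```python
URN_PREFIX = "URN:TIB:"  # namespace prefix as written on the slides


def doi_to_urn(doi):
    """Derive the URN from a DOI by prepending the namespace prefix."""
    name = doi.strip()
    if name.lower().startswith("doi:"):
        name = name[4:].strip()
    return URN_PREFIX + name


def urn_to_doi(urn):
    """Recover the DOI from a URN; the prefix is matched case-insensitively."""
    if not urn.upper().startswith(URN_PREFIX):
        raise ValueError("not a TIB URN: " + urn)
    return urn[len(URN_PREFIX):]


urn = doi_to_urn("10.1594/WDCC/W_Han_2003_MMB_2")
print(urn)
print(urn_to_doi(urn))
```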

Current situation
In cooperation with
- World Data Center Climate (WDCC), Max-Planck-Institut für Meteorologie, Hamburg
- Geoforschungszentrum Potsdam
- World Data Center MARE, Uni. Bremen and Alfred Wegener Institute, Bremerhaven
- Learning Lab Lower Saxony, Uni. Hannover
the TIB Hannover is now the world's first registration agency for scientific and technical data (STD-DOI).

Technical
A Handle server is installed at the TIB Hannover, so the TIB is able to register and resolve DOIs.
The TIB officially received a DOI prefix (10.1594).
The first data sets have been stored at the TIB by hand.
The automatic registration process is under development.

Technical realization
[Diagram. Labels: Central library database Göttingen (metadata storage); International DOI Foundation (DOI registration); Cocoon web server (XML-based, XSL transformation); Handle server; DDB (URN registration); GFZ and WDCs (data URL with XML file).]

Outlook
2004: We expect about 10,000 datasets by the end of the year.
2005: The system shall be extended to other fields of science.
2006: The TIB Hannover shall become the central registration agency for scientific primary data.

Further information
Project webpage: http://www.std-doi.de
TIB Handle Server: http://doi.tib-hannover.de:8000
DOI Foundation: http://www.doi.org
URN registration of the DDB: http://www.persistent-identifier.de