DOI for Astronomical Data Centers: ESO. Hainaut, Bordelon, Grothkopf, Fourniol, Micol, Retzlaff, Sterzik, Stoehr [ESO] Enke, Riebe [AIP]

PIDs at ESO
  What is ESO?
  What do we do in terms of PIDs?
  How does that work with RDA's recommendations?
DOIs for Astronomical Data Centres: ESO / PID Systems 2016

ESO? European Southern Observatory
European Organisation for Astronomical Research in the Southern Hemisphere
Who?
  Austria, Belgium, Brazil, the Czech Republic, Denmark, Finland, France, Germany, Italy, the Netherlands, Poland, Portugal, Spain, Sweden, Switzerland and the United Kingdom + Chile
Why?
  Provide state-of-the-art research facilities to astronomers: big telescopes
  Foster cutting-edge astronomical research collaborations

ESO? What?
  La Silla: 3.6m + NTT
  Paranal: VLT (4x 8.2m), Surveys, VLTI
  Chajnantor:
    APEX: sub/millimeter radio telescope (12m)
    ALMA: sub/millimeter radio interferometer (25km)
  Armazones: ELT 39m, 2024

Astronomical Data Centre? What?
  All the raw data (La Silla + Paranal + APEX + WFCAM)
    > 20 years
    > 20 instruments
  A lot of processed data:
    Internal data products: standard processing into physical units
    Advanced data products: data processing to address specific questions
      Mosaic images, extracted + interpreted spectra
      Catalogues
Astronomical Data Centre? Why?
  Preservation
  Distribution
  Archive-based science

Astronomical Data Centre? How much?
  40 million files, each with a unique, permanent identifier (e.g. FORS1.2016-09-01T07:13:42.532)
  700 TB
  30 billion metadata rows
Astronomical Data Centre? How?
  NGAS, Linux, RAID5/6
  2d spinning copies
Astronomical Data Centre? Where?
  Science Archive Facility, ESO HQ
  Off-site copy at MPCDF, with off-off-site tape backup
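The file identifier quoted above (FORS1.2016-09-01T07:13:42.532) appears to be an instrument name, a dot, and a creation timestamp with millisecond precision. The following is a minimal Python sketch of how such an identifier could be split into its two parts. The format is only inferred from the examples on the slides, and the names `FileId` and `parse_file_id` are made up for illustration; this is not ESO's archive code.

```python
from datetime import datetime
from typing import NamedTuple


class FileId(NamedTuple):
    """Hypothetical container for the two parts of a file identifier."""
    instrument: str
    created: datetime


def parse_file_id(file_id: str) -> FileId:
    """Split 'INSTRUMENT.YYYY-MM-DDThh:mm:ss.sss' into instrument and timestamp.

    Assumption: the identifier format is inferred from the slide examples only.
    """
    # Split at the first dot only: the timestamp itself contains a dot
    # in front of the milliseconds.
    instrument, _, stamp = file_id.partition(".")
    if not instrument or not stamp:
        raise ValueError(f"not a recognised file identifier: {file_id!r}")
    created = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S.%f")
    return FileId(instrument, created)


example = parse_file_id("FORS1.2016-09-01T07:13:42.532")
print(example.instrument, example.created.isoformat(timespec="milliseconds"))
```

Because the timestamp is part of the name, sorting identifiers of one instrument alphabetically also sorts them by creation time.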

PIDs at ESO: DOI
  DOI = PID for digital objects
    DOI:10.18727/xxxxxx/yyyy/zzzz
  DOI registered at agency
  DOI resolves to a landing web page
    Metadata about the digital object
    Pointer to the actual digital object
  DOI is persistent
    At least 10 years
  ESO has an agreement with TIB / Leibniz (sub-agency of DataCite for Germany)
  DOIs minted free of charge
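As a small illustration of "DOI resolves to a landing web page", the sketch below splits a DOI into prefix and suffix and builds the corresponding link on the public doi.org resolver. The function names are hypothetical, and the DOI used is the placeholder pattern from the slide, so it is not expected to resolve.

```python
from urllib.parse import quote


def split_doi(doi: str) -> tuple[str, str]:
    """Return (prefix, suffix) of a DOI, accepting an optional 'DOI:' label."""
    bare = doi.strip()
    if bare.lower().startswith("doi:"):
        bare = bare[4:]
    # The prefix is everything before the first slash; the suffix may
    # itself contain further slashes.
    prefix, sep, suffix = bare.partition("/")
    if not sep or not prefix.startswith("10.") or not suffix:
        raise ValueError(f"not a DOI: {doi!r}")
    return prefix, suffix


def resolver_url(doi: str) -> str:
    """Build the doi.org link that redirects to the landing page."""
    prefix, suffix = split_doi(doi)
    return f"https://doi.org/{prefix}/{quote(suffix, safe='/')}"


# Placeholder DOI pattern taken from the slide, not a registered DOI.
print(resolver_url("DOI:10.18727/xxxxxx/yyyy/zzzz"))
```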

DOIs: WHY?
  Track (by ESO):
    Track observation programmes used in publication:
      Science output per programme
      Science output per instrument
  Cite:
    Programme/dataset as citable entity
  Reproduce science result:
    Unique, reproducible dataset to reproduce a result

DOIs: WHAT?
  All raw data from an observing programme / run
    DOI:10.18727/ESOprog/097.C-0123/B
  Finite and well defined
  Obtained with a science goal
  Continuity with previous system (citing the programme ID)
  Metadata: information from the programme
    Authors, telescope, instrument, programme abstract

DOIs: WHAT?
  Advanced Data Product package
    DOI:10.18727/ESOdata/0123546 (unique, semi-sequential number)
  Data releases from surveys, from large programmes
  Each release is:
    Finite, well defined
    Well documented
    Heavily used by the community

DOIs: WHAT?
  Arbitrary, user-generated data set
    DOI:10.18727/ESOdata/00123547
  List of unique data file identifiers
    Raw science, calibration, processed, or a combination
  The user uploads the list of IDs to ESO
  ESO generates the DOI
  Metadata: author, abstract with the purpose of the dataset
  Unique dataset
    The author can refer to their dataset in a citation
    Allows others to reproduce the exact same investigation
    Allows ESO to track:
      Programmes
      Instruments
      Type of data (raw, processed, advanced)
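To make the user-generated dataset more concrete, here is a minimal Python sketch of such a record: a list of file identifiers plus author and abstract, with a per-instrument count as one example of the tracking mentioned above. The class, its fields, the author and abstract strings, and one of the three identifiers are invented for illustration, and the instrument is assumed to be the part of the identifier before the first dot. This is not ESO's schema.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class UserDataset:
    """Hypothetical record for a user-assembled dataset (not ESO's schema)."""
    author: str
    abstract: str
    file_ids: list[str] = field(default_factory=list)

    def files_per_instrument(self) -> Counter:
        # Assumes identifiers look like 'INSTRUMENT.timestamp'.
        return Counter(fid.split(".", 1)[0] for fid in self.file_ids)


dataset = UserDataset(
    author="A. Astronomer",                        # placeholder
    abstract="Frames used for an example study.",  # placeholder
    file_ids=[
        "FORS1.2016-09-01T07:13:42.532",  # example from the slides
        "FORS1.2016-09-01T07:15:02.001",  # invented for this sketch
        "EMMI.2013-01-04T23:42:12.123",   # example from the slides
    ],
)
print(dataset.files_per_instrument())
```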

Recommendations
  14 recommendations on how to adapt a data source for providing identifiable subsets for the long term, elaborated by the RDA WG on Dynamic Data Citation.
  These recommendations are mostly for databases.
  How do they apply to our data?

Recommendations
  R1 - Data Versioning: Apply versioning to ensure earlier states of data sets can be retrieved.
    OK: Each of our assets (be it raw or a data product) has a unique identifier based on its creation time. Changes are always incremental.
  R2 - Timestamping: Ensure that operations on data are timestamped.
    OK: Each of our assets (be it raw or a data product) has a unique identifier based on its creation time, e.g. EMMI.2013-01-04T23:42:12.123. Changes are always incremental.

Recommendations
  R3 - Query Store Facilities: Provide means for storing queries and the associated metadata in order to re-execute them in the future.
    N/A: Only the results of queries that lead to a data download are preserved (and stored with a query ID). The actual query or selection process leading to that dataset is not preserved.
  R4 - Query Uniqueness: Re-write the query to a normalised form so that identical queries can be detected.
    NOT supported: Only the result of the query is stored, not the query itself.

Recommendations
  R5 - Stable Sorting: Ensure that the sorting of the records in the data set is unambiguous and reproducible.
    OK for the datasets discussed: a list of frames is unambiguous and reproducible.
  R6 - Result Set Verification: Compute fixity information (also referred to as checksum or hash key).
    OK: The individual frames have a checksum. The list of frames constituting the result of a query does not, but it is simple to compare (e.g. with diff).
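R5 and R6 can be illustrated together for a list of frames: put the list into one canonical order, then compute a checksum over it so that two lists can be compared without a line-by-line diff. The sketch below uses SHA-256 from the Python standard library. As noted above, the archive does not compute such a list-level checksum, so this is only one possible way to add it, and the function names are hypothetical.

```python
import hashlib


def canonical_list(file_ids):
    """Remove duplicates and sort, so the same set of frames always
    yields the same sequence regardless of the order it was given in."""
    return sorted(set(file_ids))


def list_checksum(file_ids):
    """SHA-256 over the canonical list, one identifier per line."""
    payload = "\n".join(canonical_list(file_ids)).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


# The two identifiers are the examples quoted on the slides.
first = ["FORS1.2016-09-01T07:13:42.532", "EMMI.2013-01-04T23:42:12.123"]
second = list(reversed(first))

# Same frames in a different order give the same checksum.
print(list_checksum(first) == list_checksum(second))
```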

Recommendations
  R7 - Query Timestamping: Assign a timestamp to the query based on the last update to the database (for privacy).
    OK / NO: The archive queries are timestamped. The DB is ~continuously updated; privacy is not an issue(?)
  R8 - Assigning Query PID: Assign a PID to the query if either the query or the result is new. Otherwise, return the existing PID of the earlier query.
    OK / NO: A PID is issued, but it is not checked for uniqueness. Not an issue in our case?
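For comparison, the behaviour R8 asks for (return the existing PID when the same result set comes back) could look like the sketch below. It keys an in-memory table on a checksum of the sorted result list. Everything here is hypothetical, including the `example-pid/...` strings; as stated above, the ESO system issues a PID without this uniqueness check.

```python
import hashlib


def result_checksum(file_ids):
    """Checksum of the result set, independent of the order of the frames."""
    canonical = "\n".join(sorted(set(file_ids)))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class PidRegistry:
    """In-memory stand-in for a PID store, keyed by result-set checksum."""

    def __init__(self):
        self._pid_by_checksum = {}

    def pid_for(self, file_ids):
        key = result_checksum(file_ids)
        if key not in self._pid_by_checksum:
            # New result set: issue a new (made-up) identifier.
            number = len(self._pid_by_checksum) + 1
            self._pid_by_checksum[key] = f"example-pid/{number:08d}"
        # Known result set: the earlier identifier is returned unchanged.
        return self._pid_by_checksum[key]


registry = PidRegistry()
frames = ["FORS1.2016-09-01T07:13:42.532", "EMMI.2013-01-04T23:42:12.123"]
print(registry.pid_for(frames))
print(registry.pid_for(list(reversed(frames))))  # same set, same PID
```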

Recommendations
  R9 - Store the Query: Store query and metadata (e.g. PID, original and normalised query, query and result set checksum, timestamp, superset PID, data set description, and other) in the query store.
    OK: The query itself is not relevant; the result list is stored with metadata.
  R10 - Automated Citation Text Generation: Generate citation texts in the format prevalent in the designated community. Include the PID in the citation text snippet.
    OK: This is the purpose of the DOI. The landing page includes a citable snippet in BibTeX.
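A citable BibTeX snippet of the kind mentioned under R10 can be generated from a handful of metadata fields. The sketch below is only an illustration: the choice of `@misc`, the field names, and all values (including the placeholder DOI pattern) are assumptions, and the actual snippet on an ESO landing page may look different.

```python
def bibtex_entry(key, authors, title, year, publisher, doi):
    """Format a minimal @misc entry that carries the DOI."""
    author_field = " and ".join(authors)
    lines = [
        f"@misc{{{key},",
        f"  author    = {{{author_field}}},",
        f"  title     = {{{title}}},",
        f"  year      = {{{year}}},",
        f"  publisher = {{{publisher}}},",
        f"  doi       = {{{doi}}}",
        "}",
    ]
    return "\n".join(lines)


# All values below are placeholders for illustration.
print(bibtex_entry(
    key="example2016",
    authors=["Astronomer, A.", "Observer, B."],
    title="Example user-assembled dataset",
    year=2016,
    publisher="European Southern Observatory",
    doi="10.18727/xxxxxx/yyyy/zzzz",
))
```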

Recommendations
  R11 - Landing Pages: Make the PIDs resolve to a human-readable landing page that provides the data and metadata.
    OK: This is a requirement of the DOI.
  R12 - Machine Actionability: Provide an API / machine-actionable landing page to access metadata and data via query re-execution.
    To be implemented: simple via the programmatic interface of the Archive Services.

Recommendations
  R13 - Technology Migration: When data is migrated to a new representation, migrate also the queries and associated fixity information.
    OK: Lists are resilient; migration is a core mission of the Science Archive Facility.
  R14 - Migration Verification: Verify successful data and query migration, ensuring that queries can be re-executed correctly.
    OK: Core mission of the Science Archive Facility.

Final thoughts
  We generate DOIs for:
    Noteworthy datasets
    User-assembled datasets
  to make the dataset:
    Trackable by us
    Citable in the literature
    Reproducible by other researchers
  Do you have comments / recommendations? ohainaut@eso.org