RADAR A Repository for Long Tail Data

RADAR: A Repository for Long Tail Data
Angelina Kraft, Janna Neumann
German National Library of Science and Technology (TIB)
36th Annual IATUL Conference, Hannover, July 6th, 2015

IN A NUTSHELL
RADAR = Research Data Repository
Goal: Establish an interdisciplinary research data repository
What kind of data? DIGITAL! Including: raw (primary, machine output); secondary (working data); negative; analyzed in scientific articles
Duration: September 2013 to August 2016

SCIENCE NOW & THEN
One thousand years ago, science was empirical: described natural phenomena
Over the last one hundred years, a theoretical branch developed: building on models, generalisations
$\left(\frac{\dot{a}}{a}\right)^2 = \frac{4\pi G\rho}{3} - K\frac{c^2}{a^2}$
In recent decades, an IT branch: simulation of complex phenomena
Today, science is data based (eScience): combination of theory, experiment & simulation

RESEARCH DATA: Long Tail
The majority of datasets produced through research are part of the Long Tail of Research Data.
Source: Humphrey C (2014): OpenAIRE-COAR Conference, Athens
Source: Ferguson et al. (2014): Big data from small data: data-sharing in the 'long tail' of neuroscience. DOI: 10.1038/nn.3838
Science Survey 2011:
48 % of respondents were working with datasets that were <1 GB in size
50 % stored data exclusively (!) in labs
Source: Science (2011): 331(6018), p. 692-693. DOI: 10.1126/science.331.6018.692

DATA LANDSCAPE: The Reality!
Current situation of data publishing:
Potential for data dumping; overburdened!
Articles
Supplements
Data centres / repositories
Few; lack of archives in many subject areas!
~75 % of research data (RD) is never published
Data on private hard disks / institutional servers
Modified based on STM / Smit, E: Avoiding a Digital Dark Age for Data: why data and publications belong together. ICSTI workshop "Delivering Data in Science", Paris, 5 March 2012

DATA LANDSCAPE: The Future?
Ideal case of data publishing:
Journals request and check data deposition
If no other data integration is possible
RD in articles
Supplements
Research data in data centres and repositories
Linking text & data: enhanced publications
Generic & discipline specific; interfaces for good connection!
Support enhanced publications; Persistent Identifiers
Data on private hard disks / institutional servers
Modified based on STM / Smit, E: Avoiding a Digital Dark Age for Data: why data and publications belong together. ICSTI workshop "Delivering Data in Science", Paris, 5 March 2012

RADAR: The Domain Model
1. Private domain: researcher's workplace
2. Collaborative domain: institutional infrastructure
3. Public domain: archive (RADAR; 2 services: 1. Archival, 2. Publication)
4. Dissemination domain: portals, researchers
Data selection; data documentation; data types / data formats
Business model; infrastructure; software; metadata standards; Persistent Identifiers; contracts; interfaces
DataCite, publishers; reuse
Based on: Treloar, A., Harboe-Ree, C. (2008) Data management and the curation continuum. How the Monash experience is informing repository relationships. VALA2008 14th Biennial Conference, Melbourne; and Klump, J. (2009) Managing the Data Continuum. Online: http://oa.helmholtz.de/fileadmin/user_upload/data_continuum/klump.pdf

PARTNERS
Software, framework & business model
Scientific specification & evaluation
Data management & preservation services
Data publication, metadata & contact to publishers

FOCUS OF RADAR
Archival of research data as a generic service
Trustworthy preservation & traceable publication
Long tail of research data
Services:
Basic service: interdisciplinary data preservation
Extended service: data publication

TARGET AUDIENCE
Researchers: archive and publish project based research data
Libraries and Research Institutions: integration with existing institutional portals
Cultural Heritage Organizations: long term preservation & web access of digitized materials
Publishers: infrastructure for providing access to research data linked to publications

SERVICE: Archival Storage
Aim: Trustworthy data preservation
For whom? Completed research projects; internal resources, not to be publicly available (yet)
Identifier: Handle
Properties:
Minimum metadata set (9 parameters)
Variable storage period: up to 15 years + extension
Bitstream preservation for storage period
Regular reports on data integrity
Access rights for selected groups/users

SERVICE: Data Publication
Aim: Trustworthy preservation & traceable publication
For whom? Projects: data basis for scientific papers; independent data publications (e.g. negative data); digital representations
Identifier: DOI (DOI API)
Properties:
Expanded metadata set for discipline specific data
Unlimited storage period
Regular reports on downstream use to data provider
Access management (embargo & publisher services)

Test System June 2015

DOI
Data citation: Creator (Publication Year): Title of the data set. Publisher. Resource Type. Identifier
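
The citation pattern on this slide can be sketched in a few lines of Python. This is only an illustration of the pattern "Creator (Publication Year): Title of the data set. Publisher. Resource Type. Identifier". The class name `DataCitation`, its field names, and the example values (including the DOI prefix 10.1234) are hypothetical and are not taken from RADAR.

```python
from dataclasses import dataclass


@dataclass
class DataCitation:
    """Hypothetical container for the six citation elements named on the slide."""
    creator: str
    publication_year: int
    title: str
    publisher: str
    resource_type: str
    identifier: str

    def format(self) -> str:
        # Creator (Publication Year): Title. Publisher. Resource Type. Identifier
        return (
            f"{self.creator} ({self.publication_year}): {self.title}. "
            f"{self.publisher}. {self.resource_type}. {self.identifier}"
        )


# Made-up example values, for illustration only.
example = DataCitation(
    creator="Doe, J.",
    publication_year=2015,
    title="Example measurements",
    publisher="Example Repository",
    resource_type="Dataset",
    identifier="https://doi.org/10.1234/example",
)
print(example.format())
```

Keeping the six elements as separate fields means the same record could also be rendered in a journal-specific citation style.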

WORKFLOW

RESEARCH DATA MANAGEMENT
Guidelines: how-tos, recommendations on formats, citations, licenses
General & discipline specific glossary
Step by step addition of examples
Business model & quotes: indicative price, e.g. for funding applications
Integration services for publishers/journals: data for peer review

RADAR: Reliable Storage Space
Management of storage quotas
Bitstream preservation: regular fixity checks
PID service (DOI & Handle) on data set or file level
Generic metadata schema
Managing license metadata & access rights
Access may be restricted to the institution providing the data (or another authorized party) and the service operator
But: no functional long term preservation!
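
A regular fixity check, as mentioned on this slide, usually means recomputing a checksum for each stored file and comparing it with the value recorded at ingest. The sketch below illustrates that general idea only. The slides do not say which algorithm or tooling RADAR uses, so SHA-256, the manifest layout, and the function names `sha256_of` and `fixity_report` are assumptions.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def fixity_report(manifest: dict[str, str], root: Path) -> dict[str, str]:
    """Compare stored checksums with current file contents.

    `manifest` maps relative file names to the digest recorded at ingest
    (a hypothetical layout). Each file is reported as "ok", "changed", or "missing".
    """
    report = {}
    for relative_name, expected in manifest.items():
        target = root / relative_name
        if not target.is_file():
            report[relative_name] = "missing"
        elif sha256_of(target) == expected:
            report[relative_name] = "ok"
        else:
            report[relative_name] = "changed"
    return report
```

A check like this only detects bit-level changes. In line with the slide's caveat about functional long term preservation, it does not tell whether a file format is still usable.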

RADAR Roadmap
Software: further development of services
1. Middleware infrastructure: realized
2. Archival service: realized
3. Publication service: in progress
DSA certification: in progress
Roll out to further disciplines: in progress
Workflows & interfaces to data providers: in progress

RESEARCH DATA REPOSITORY
www.radar-projekt.org
Thank you for your attention! Questions?
RADAR Test Account
Contact: angelina.kraft@tib.uni-hannover.de