Fundamentals of Data Infrastructures

Similar documents
Data management and discovery

Data Replication: Automated move and copy of data. PRACE Advanced Training Course on Data Staging and Data Movement Helsinki, September 10 th 2013

Inge Van Nieuwerburgh OpenAIRE NOAD Belgium. Tools&Services. OpenAIRE EUDAT. can be reused under the CC BY license

Data Discovery - Introduction

Data Staging and Data Movement with EUDAT. Course Introduction Helsinki 10 th -12 th September, Course Timetable TODAY

EUDAT - Open Data Services for Research

Using EUDAT services to replicate, store, share, and find cultural heritage data

EUDAT. Towards a pan-european Collaborative Data Infrastructure - A Nordic Perspective? -

EUDAT- Towards a Global Collaborative Data Infrastructure

EUDAT Data Services & Tools for Researchers and Communities. Dr. Per Öster Director, Research Infrastructures CSC IT Center for Science Ltd

Persistent Identifiers

Towards a joint service catalogue for e-infrastructure services

EUDAT. Towards a pan-european Collaborative Data Infrastructure. Damien Lecarpentier CSC-IT Center for Science, Finland EUDAT User Forum, Barcelona

EUDAT. Towards a pan-european Collaborative Data Infrastructure

EUDAT Training 2 nd EUDAT Conference, Rome October 28 th Introduction, Vision and Architecture. Giuseppe Fiameni CINECA Rob Baxter EPCC EUDAT members

EUDAT & AAI. Daan Broeder MPI for Psycholinguistics

NorStore. a national infrastructure for scientific data. Andreas O Jaunsen UNINETT Sigma as

EUDAT and Cloud Services

Towards FAIRness: some reflections from an Earth Science perspective

EUDAT. Towards a pan-european Collaborative Data Infrastructure

Persistent identifiers, long-term access and the DiVA preservation strategy

Data Management Checklist

EUDAT Common data infrastructure

EUDAT & SeaDataCloud

Scientific Data Curation and the Grid

I data set della ricerca ed il progetto EUDAT

EUDAT. Towards a pan-european Collaborative Data Infrastructure

USE CASES IN SEISMOLOGY. Alberto Michelini INGV

Storage Virtualization. Eric Yen Academia Sinica Grid Computing Centre (ASGC) Taiwan

The safe share project John Chapman, Deputy head, information security, Jisc

The EUDAT Collaborative Data Infrastructure

Key Elements of Global Data Infrastructures

Writing a Data Management Plan A guide for the perplexed

Webinar Annotate data in the EUDAT CDI

EOSC Services & Architecture: the EOSC-hub approach Tiziana Ferrari, Project Coordinator, EGI Founda?on

EUDAT. A European Collaborative Data Infrastructure. Daan Broeder The Language Archive MPI for Psycholinguistics CLARIN, DASISH, EUDAT

Accelerating CCUS: A Global Conference to Progress CCUS

Federation Operator Practice: Metadata Registration Practice Statement

INSPIRE & Environment Data in the EU

Giovanni Lamanna LAPP - Laboratoire d'annecy-le-vieux de Physique des Particules, Université de Savoie, CNRS/IN2P3, Annecy-le-Vieux, France

RENKU - Reproduce, Reuse, Recycle Research. Rok Roškar and the SDSC Renku team

Checklist and guidance for a Data Management Plan, v1.0

Importance of User Deprovisioning from Services

CLARIN s central infrastructure. Dieter Van Uytvanck CLARIN-PLUS Tools & Services Workshop 2 June 2016 Vienna

Deliverable 6.4. Initial Data Management Plan. RINGO (GA no ) PUBLIC; R. Readiness of ICOS for Necessities of integrated Global Observations

B2SAFE metadata management

Swedish National Data Service, SND Checklist Data Management Plan Checklist for Data Management Plan

Metadata Models for Experimental Science Data Management

Fair data and open data: differences and consequences

Legal Issues in Data Management: A Practical Approach

Cloud Security: Constant Innovation

PERSISTENT IDENTIFIERS FOR THE UK: SOCIAL AND ECONOMIC DATA

2. HDF AAI Meeting -- Demo Slides

Building a Materials Data Facility (MDF)

THE NATIONAL DATA SERVICE(S) & NDS CONSORTIUM A Call to Action for Accelerating Discovery Through Data Services we can Build Ed Seidel

Oracle Cloud IaaS: Compute and Storage Fundamentals

Microsoft SharePoint End User level 1 course content (3-day)

The Materials Data Facility

Fundamental Concepts and Models

Trust and Certification: the case for Trustworthy Digital Repositories. RDA Europe webinar, 14 February 2017 Ingrid Dillo, DANS, The Netherlands

Indiana University Research Technology and the Research Data Alliance

Towards FAIRness in research data. Per Öster, 3 October 2018

Research Data Edinburgh: MANTRA & Edinburgh DataShare. Stuart Macdonald EDINA & Data Library University of Edinburgh

Digital Preservation: How to Plan

Science Europe Consultation on Research Data Management

EUDAT. Towards a Collaborative Data Infrastructure. Ari Lukkarinen CSC-IT Center for Science, Finland NORDUnet 2012 Oslo, 18 August 2012

META-SHARE : the open exchange platform Overview-Current State-Towards v3.0

Electronic Records Archives: Philadelphia Federal Executive Board

Welcome to the Pure International Conference. Jill Lindmeier HR, Brand and Event Manager Oct 31, 2018

The Future of ESGF. in the context of ENES Strategy

European Cloud Initiative: implementation status. Augusto BURGUEÑO ARJONA European Commission DG CNECT Unit C1: e-infrastructure and Science Cloud

Research Data Management & Preservation: A Library Perspective

Security in Big and Open Data - WG. Alessandra Scicchitano GÉANT - Project Development Officer Ralph Niederberger Jülich Supercomputing Center (FZJ)

Persistent Identifier the data publishing perspective. Sünje Dallmeier-Tiessen, CERN 1

IJDC General Article

Managing stakeholders and (creating) their expectations

Digital repositories as research infrastructure: a UK perspective

Big Data infrastructure and tools in libraries

Developing a Research Data Policy

DataFlow and VIDaaS Workshop

EGI federated e-infrastructure, a building block for the Open Science Commons

Making research data repositories visible and discoverable. Robert Ulrich Karlsruhe Institute of Technology

Welcome! Presenters: STFC January 10, 2019

NEUROIMAGING RESEARCH DATA LIFE- CYCLE MANAGEMENT

FLAT: A CLARIN-compatible repository solution based on Fedora Commons

DCH-RP Trust-Building Report

Safeguarding Digital Heritage through Sustained Use of Legacy Software

Science-as-a-Service

Coupled Computing and Data Analytics to support Science EGI Viewpoint Yannick Legré, EGI.eu Director

irods workflows for the data management in the EUDAT pan-european infrastructure

Grid Computing with Voyager

Data publication and discovery with Globus

Federation Operator Practice: Metadata Registration Practice Statement

Data Citation and Scholarship

Striving for efficiency

Smart Data Catalog DATASHEET

Richard Marciano Alexandra Chassanoff David Pcolar Bing Zhu Chien-Yi Hu. March 24, 2010

EC Horizon 2020 Pilot on Open Research Data applicants

In this unit we are going to look at cloud computing. Cloud computing, also known as 'on-demand computing', is a kind of Internet-based computing,

Reducing Consumer Uncertainty

Transcription:

Fundamentals of Data Infrastructures Dublin, March 2014 Welcome & Introduction Adam Carter EPCC, The University of Edinburgh Training Coordinator, EUDAT

Timetable 09:00 Registration & Coffee 09:15 Welcome & Introduction - Adam Carter 09:30 Data Discovery - Introduction: Why (benefits of reusing data), How EUDAT's services help with this (in general) - Adam Carter 09:45 PIDs - Adam Carter 10:05 Metadata and Ontologies - Adam Carter 10:30 B2SHARE Nordic - An example of a service that facilitates Data Discovery and uses PIDs and Metadata - Teemu Kemppainen 11:00 Coffee 11:30 Legal Issues in Research Data Collection &Sharing - Pawel Kamocki 12:50 Wrap Up - Adam Carter 13:00 Close & Lunch courtesy of EUDAT

What do we mean by Data Infrastructures? The services, and underlying hardware that allow researchers and data managers to: Store Data Move Data Organise Data Find Data Share Data

Why Data Infrastructures? You re all working with data Yours, other people s Data has value Mechanisms which support data re-use add value to that data They can help with big data, and small They can support new ways of doing science Data Science / Fourth Paradigm / Datascopes

Fundamentals of Data Infrastructures Data Re- Use Distributed by Nature Why? Collabora/on & Interdisciplinarity Economies of Scale Specialisa/on Services Networks Compute & Processing Storage

Fundamentals of Data Infrastructures Data Re- Use Distributed by Nature Why? Collabora/on & Interdisciplinarity Economies of Scale Specialisa/on Services Networks Compute & Processing Storage

Fundamentals of Data Infrastructures PIDs Metadata Moving Data Processing Data Authen/ca/on & Authorisa/on Licensing & Data Ownership Data Management Planning Data Cura/on & Preserva/on EUDAT

Fundamentals of Data Infrastructures PIDs Metadata Moving Data Processing Data Authen/ca/on & Authorisa/on Licensing & Data Ownership Data Management Planning Data Cura/on & Preserva/on EUDAT

PIDs The idea is that it s useful to identify your data with some unique, persistent label (which, in general, is not its location, but which can be used to discover its location) e.g.: 10.1045/march99-bunker This allows the data to be referred to unambiguously in publications, by machines & services,

Metadata Data about data e.g. Description of what it relates to, where and when it was collected, how it is formatted, Helps with: Finding data (search on the metadata) Understanding data There are standard formats for metadata and standard mechanisms for sharing it

Moving Data Data Staging Getting data to where you need to use it (often a compute resource, but also, e.g., a data infrastructure) Replication of Data Automated mechanisms for copying data, usually between geographically remote sites for redundancy and/or data locality (faster access) Dataflows Automated workflows based on a flow of data

Processing Data Such a big area, that it s a subject in its own right We don t discuss computing, HPC, HTC, Grid, Cloud, etc. here But there are overlaps with data infrastructures, e.g., Bringing compute to the data Workflows Data-intensive compute architectures

Authentication & Authorisation Authentication: Who you are Authorisation: What you can do Necessary prerequisites for trust Possibly contrary to first impressions: They help with sharing of data Mechanisms to describe who should have access to what, and what they can do with it, and to implement these access restrictions

Licensing & Ownership Rights and responsibilities Copyright Ownership of data Who decides what can be done with the data, and by whom? Who has responsibility for looking after the data (and ensuring that it can be re-used in the future)

Data Curation & Preservation Also a huge field in its own right Overlaps with data infrastructures: Providing services and hardware to those with responsibility for looking after others data and allowing the data curator or data owner to do what they need to do to look after the data There s often a shared responsibility, e.g., a data centre will look after the bits on disk, and a library will look after making sure they can be interpreted by future generations

Data Management & Planning This is about good research practice It s the process you go through to ensure that your data can be re-used in the future This supports: Reproducible science Open, verifiable results Re-use of data by you, your colleagues, and the wider research community It s about defining things like ownership

Timetable 09:00 Registration & Coffee 09:15 Welcome & Introduction - Adam Carter 09:30 Data Discovery - Introduction: Why (benefits of reusing data), How EUDAT's services help with this (in general) - Adam Carter 09:45 PIDs - Adam Carter 10:05 Metadata and Ontologies - Adam Carter 10:30 B2SHARE Nordic - An example of a service that facilitates Data Discovery and uses PIDs and Metadata - Teemu Kemppainen 11:00 Coffee 11:30 Data access rights and mechanisms, open, sensitive and confidential data, and copyrights and licensing issues - Pawel% Kamocki 12:50 Wrap Up - Adam Carter 13:00 Close & Lunch courtesy of EUDAT

2014 The University of Edinburgh Licence: CC-BY 4.0