Security in Big and Open Data - WG. Alessandra Scicchitano GÉANT - Project Development Officer Ralph Niederberger Jülich Supercomputing Center (FZJ)

Similar documents
Data management and discovery

EUDAT- Towards a Global Collaborative Data Infrastructure

The EUDAT Collaborative Data Infrastructure

Sharing Best Security Practices with your Peers - on an International Level

Data Discovery - Introduction

Moving e-infrastructure into a new era the FP7 challenge

High Performance Computing Course Notes Grid Computing I

EUDAT - Open Data Services for Research

We invented the Web. 20 years later we got Drupal.

Worldwide Production Distributed Data Management at the LHC. Brian Bockelman MSST 2010, 4 May 2010

Data Intensive Science Impact on Networks

CSCS CERN videoconference CFD applications

High Performance Computing from an EU perspective

Deliverable 6.4. Initial Data Management Plan. RINGO (GA no ) PUBLIC; R. Readiness of ICOS for Necessities of integrated Global Observations

Inge Van Nieuwerburgh OpenAIRE NOAD Belgium. Tools&Services. OpenAIRE EUDAT. can be reused under the CC BY license

I data set della ricerca ed il progetto EUDAT

INSPIRE & Environment Data in the EU

Research Infrastructures and Horizon 2020

Cosylab Switzerland and SKA

CERN openlab II. CERN openlab and. Sverre Jarp CERN openlab CTO 16 September 2008

GÉANT Services Supporting International Networking and Collaboration

In Accountable IoT We Trust

Giovanni Lamanna LAPP - Laboratoire d'annecy-le-vieux de Physique des Particules, Université de Savoie, CNRS/IN2P3, Annecy-le-Vieux, France

DT-ICT : Big data solutions for energy

Graph-based Data Integration in EUDAT Data Infrastructure

Embedded Technosolutions

Centre of Excellence in PM 2 (CoEPM 2 )

ISTITUTO NAZIONALE DI FISICA NUCLEARE

HBP Status Update for the BoF Brussels, 17 March 2016 Chris Ebell, HBP Executive Director

Fundamentals of Data Infrastructures

Storage and I/O requirements of the LHC experiments

Audit & Certification: an auditors perspective. Barbara Sierman, KB National Library of the Netherlands Royal Irish Academy, Dublin 4 june 2013

Andrea Sciabà CERN, Switzerland

IEPSAS-Kosice: experiences in running LCG site

Cybersecurity Policy in the EU: Security Directive - Security for the data in the cloud

Research Infrastructures for All You could be Next! e-infrastructures - WP

EUDAT & SeaDataCloud

Digital Single Market Technologies and Public Service Modernisation Package -DSM. Grazyna Wojcieszko DG CONNECT

GRID 2008 International Conference on Distributed computing and Grid technologies in science and education 30 June 04 July, 2008, Dubna, Russia

Certification as a means of providing trust: the European framework. Ingrid Dillo Data Archiving and Networked Services

Helix Nebula The Science Cloud

European digital repository certification: the way forward

Introduction to Grid Infrastructures

Memorandum of Understanding

EUDAT and Cloud Services

European Code of Conduct on Data Centre Energy Efficiency

ENISA EU Threat Landscape

B2DROP The EUDAT Personal Cloud Storage

Towards FAIRness: some reflections from an Earth Science perspective

AARC Overview. Licia Florio, David Groep. 21 Jan presented by David Groep, Nikhef.

Οnline privacy tools for the general public. European Union Agency for Network and Information Security 1

USE CASES IN SEISMOLOGY. Alberto Michelini INGV

e-infrastructures in FP7 INFO DAY - Paris

Risk and Security Management for Distributed Supercomputing with Grids

Proposition to participate in the International non-for-profit Industry Association: Energy Efficient Buildings

EUDAT. Towards a pan-european Collaborative Data Infrastructure. Damien Lecarpentier CSC-IT Center for Science, Finland EUDAT User Forum, Barcelona

European Collaborative Data Infrastructure EUDAT - Training on EUDAT Principles -

EU Innovation Investments: The Challenges met by Innovation Infrastructures Today in Europe

Economic and Social Council

The LHC Computing Grid

Pilot Study on Big Data: Philippines. World Telecommunications/ICT Indicators Symposium (WTIS) November 2017 Hammamet, Tunisia

Tier-2 structure in Poland. R. Gokieli Institute for Nuclear Studies, Warsaw M. Witek Institute of Nuclear Physics, Cracow

EU Technology Platform SmartGrids WG2 Network Operations

Preparing for High-Luminosity LHC. Bob Jones CERN Bob.Jones <at> cern.ch

White Paper: Next generation disaster data infrastructure CODATA LODGD Task Group 2017

Federated Services and Data Management in PRACE

WP3: Policy and Best Practice Harmonisation

Challenges and Evolution of the LHC Production Grid. April 13, 2011 Ian Fisk

GN3plus External Advisory Committee. White Paper on the Structure of GÉANT Research & Development

Virtualizing a Batch. University Grid Center

High-Performance Computing in Europe: Looking ahead. Dr Panagiotis Tsarchopoulos Future and Emerging Technologies DG CONNECT European Commission

Early Evaluation of the "Infinite Memory Engine" Burst Buffer Solution

e-infrastructures in Horizon 2020 e-infrastructures for data and computing

EUDAT & AAI. Daan Broeder MPI for Psycholinguistics

Topic C. Communicating the Precision of Measured Numbers

On the employment of LCG GRID middleware

NATIONAL CYBER SECURITY STRATEGY. - Version 2.0 -

EGI-Engage: Impact & Results. Advanced Computing for Research

HORIZON 2020 WORK PROGRAMME I: INFORMATION AND COMMUNICATION TECHNOLOGIES

ETNO Reflection Document on the EC Proposal for a Directive on Network and Information Security (NIS Directive)

CONCLUSIONS OF THE WESTERN BALKANS DIGITAL SUMMIT APRIL, SKOPJE

Research Infrastructures and Horizon 2020

Some Big Data Challenges

Grid Computing: dealing with GB/s dataflows

Network and Information Security Directive

The LHC Computing Grid

EUDAT Training 2 nd EUDAT Conference, Rome October 28 th Introduction, Vision and Architecture. Giuseppe Fiameni CINECA Rob Baxter EPCC EUDAT members

EUDAT Towards a Collaborative Data Infrastructure

Building a Dutch National Research Infrastructure IRODS UGM 2017

New strategies of the LHC experiments to meet the computing requirements of the HL-LHC era

Best Practices for the use of e-infrastructures by large-scale research infrastructures

Experience of the WLCG data management system from the first two years of the LHC data taking

Status of activities Joint Working Group on standards for Smart Grids in Europe

Conferenza GARR Moving Forward to deliver an «EOSC in Practice» by 2018

Information and Communication Technologies (ICT) thematic area

EuroHPC: the European HPC Strategy

e-infrastructure: objectives and strategy in FP7

A national approach for storage scale-out scenarios based on irods

SKA Regional Centre Activities in Australasia

Trusted Digital Archives

Transcription:

Security in Big and Open Data - WG Alessandra Scicchitano GÉANT - Project Development Officer Ralph Niederberger Jülich Supercomputing Center (FZJ)

SBOD-WG Nowadays, Big data and Open data are often-heard buzzwords. Want to make your work interesting Want to get financing -> Work on -> Produce BIG Data OPEN Data But open data does not mean you don t need to take care of your data There are issues we have to take into account, like confidentiality, integrity, and availability

Some definitions Big data refers to large datasets. Public or available restricted only by a number of people, an org or a community. Open data refers to data that is available to everyone and can be republished without restrictions. Those may be not always large or big. There are big datasets which have to be accessible worldwide, by distinct people or working groups only. replicable for security reasons (damage) or accessible with high-speed at different sites to spread download capacity A clear example of overlap between big and open data are large datasets from scientific research sources.

Some examples of BIG and OPEN Data issues (1) Output from Large Hadron Collider: Data volume from all experiments: 150 Mio.sensors provide data 40 Mio. times / sec. ATLAS: 320 Megabyte per second CMS: 220 Megabyte per second LHCb: 50 Megabyte per second ALICE: 100 Megabyte per second That s about 15 Mio Gigabyte / year accessible to thousands of physicists around the world.

Some examples of BIG and OPEN Data issues (2) Output from Square Kilometre Array (SKA): Simulating a huge radio telescope (> 1000 antennas) Around 1 square kilometer area 10.000 times faster than current telescopes Assumed daily data volume: 960 Peta Byte Accessible to huge community worldwide.

Some examples of BIG and OPEN Data issues (3) Data handling: EUDAT project the collaborative Pan-European infrastructure providing research data services, training and consultancy Services: B2DROP B2SHARE B2SAFE B2STAGE B2FIND - Sync and Exchange Research Data Store and Share Research Data Replicate Research Data Safely Get Data to Computation Find Research Data Currently an EU project, but when in production similar problems and risks arise. Again, data should be accessible to huge communities worldwide.

Some examples of BIG and OPEN Data issues (4) Data handling: The Human Brain Flagship project The Human Brain Project (HBP) is a European Commission Future and Emerging Technologies Flagship. The HBP aims to put in place a cutting-edge, ICT-based scientific Research Infrastructure for brain research, cognitive neuroscience and brain-inspired computing. The Project promotes collaboration across the globe, and is committed to driving forward European industry. Similar activies are ongoing at least in the US, also. Here again huge amounts of data for simulating the brain are produced and analyzed. Also images of the brain are stored and analyzed or cataloged. Assuming that the data are anonymized, those data can be used as open data too at least for all medical scientists

Some examples of BIG and OPEN Data issues (5) Data handling: the problem of data reduction It does not make sense to transmit data from storage to computation, which is not handled at all. Data reduction on storage side should be done. But sometimes, administration of data and competence/knowledge of data structure differ. Data archives contain huge amaount of data for different communities. How can this be handled. Questions: Is this something SBOD should cope with? What about transfer protocols? Should we cope on those? Generally: Is there an issue or has anything already be solved, but not everyone knows about? Could this be an issue for SBOD? Making solutions available/ known?

The scope The SBOD-WG focuses on security issues that arise when dealing with big and open data especially within the e-infrastructures. Security issues in this context concentrate on (as stated above): Confidentiality regulates access to the information, Integrity assures that the information is trustworthy, i.e. has not been changed without authorisation Availability guarantees access to the information by authorised people at any time. SBOD intends to focus on high level security issues only. CSIRT issues are out of scope.

How we work Emails are the means of communication of the working group. SBOD mainly meets via teleconferences, but if needed faceto-face meetings will also be considered and organised. GET INVOLVED Subscribe to our mailing list

So far Published on the SBOD Wiki Case Statement Definition of Big and Open Data The WG is working right now on identifying possible use cases. If you have any, get in touch with the Chairs: alessandra.scicchitano@geant.org r.niederberger@fz-juelich.de