Patrick Fuhrmann (DESY)

Similar documents
EMI Data, the unified European Data Management Middleware

Dynamic Federations. Seamless aggregation of standard-protocol-based storage endpoints

EMI Deployment Planning. C. Aiftimiei D. Dongiovanni INFN

EUROPEAN MIDDLEWARE INITIATIVE

dcache, activities Patrick Fuhrmann 14 April 2010 Wuppertal, DE 4. dcache Workshop dcache.org

Data Storage. Paul Millar dcache

DSIT WP1 WP2. Federated AAI and Federated Storage Report for the Autumn 2014 All Hands Meeting

Bootstrapping a (New?) LHC Data Transfer Ecosystem

Computing / The DESY Grid Center

dcache Ceph Integration

PoS(EGICF12-EMITC2)106

ARC integration for CMS

The LHC Computing Grid

g-eclipse A Framework for Accessing Grid Infrastructures Nicholas Loulloudes Trainer, University of Cyprus (loulloudes.n_at_cs.ucy.ac.

EUROPEAN MIDDLEWARE INITIATIVE

Parallel Computing in EGI

dcache: sneaking up on NFS4.1

The glite middleware. Presented by John White EGEE-II JRA1 Dep. Manager On behalf of JRA1 Enabling Grids for E-sciencE

ARC NOX AND THE ROADMAP TO THE UNIFIED EUROPEAN MIDDLEWARE

Storage Management in INDIGO

Introduction Data Management Jan Just Keijser Nikhef Grid Tutorial, November 2008

NFS around the world Tigran Mkrtchyan for dcache Team dcache User Workshop, Umeå, Sweden

Promoting Open Standards for Digital Repository. case study examples and challenges

I Tier-3 di CMS-Italia: stato e prospettive. Hassen Riahi Claudio Grandi Workshop CCR GRID 2011

PoS(EGICF12-EMITC2)081

Scientific data management

Lessons Learned in the NorduGrid Federation

glite Grid Services Overview

Forschungszentrum Karlsruhe in der Helmholtz-Gemeinschaft. Presented by Manfred Alef Contributions of Jos van Wezel, Andreas Heiss

A scalable storage element and its usage in HEP

EGEE and Interoperation

First Experience with LCG. Board of Sponsors 3 rd April 2009

Grid and Cloud Activities in KISTI

LCG data management at IN2P3 CC FTS SRM dcache HPSS

Status of KISTI Tier2 Center for ALICE

EU Projects. Christoph Witzig

WLCG Transfers Dashboard: a Unified Monitoring Tool for Heterogeneous Data Transfers.

EUROPEAN MIDDLEWARE INITIATIVE

dcache Introduction Course

EUROPEAN MIDDLEWARE INITIATIVE

Understanding StoRM: from introduction to internals

Introduction to SRM. Riccardo Zappi 1

Chelonia. a lightweight self-healing distributed storage

Travelling securely on the Grid to the origin of the Universe

Metadaten Workshop 26./27. März 2007 Göttingen. Chimera. a new grid enabled name-space service. Martin Radicke. Tigran Mkrtchyan

Scientific data processing at global scale The LHC Computing Grid. fabio hernandez

Grid Interoperation and Regional Collaboration

Andrea Sciabà CERN, Switzerland

AARC Overview. Licia Florio, David Groep. 21 Jan presented by David Groep, Nikhef.

Outline 18/12/2014. Accessing GROMACS on a Science Gateway. GROMACS in a nutshell. GROMACS users in India. GROMACS on GARUDA

Data Access and Data Management

February 15, 2012 FAST 2012 San Jose NFSv4.1 pnfs product community

Operating the Distributed NDGF Tier-1

EUROPEAN MIDDLEWARE INITIATIVE

and the GridKa mass storage system Jos van Wezel / GridKa

dcache: challenges and opportunities when growing into new communities Paul Millar on behalf of the dcache team

ELFms industrialisation plans

Agenda: Alberto dimeglio (AdM) Balazs Konya (BK) Helmut Heller (HH) Steve Crouch (SC)

The Global Grid and the Local Analysis

Distributed production managers meeting. Armando Fella on behalf of Italian distributed computing group

The LCG 3D Project. Maria Girone, CERN. The 23rd Open Grid Forum - OGF23 4th June 2008, Barcelona. CERN IT Department CH-1211 Genève 23 Switzerland

WORK PROGRAMME

The INFN Tier1. 1. INFN-CNAF, Italy

EU Liaison Update. General Assembly. Matthew Scott & Edit Herczog. Reference : GA(18)021. Trondheim. 14 June 2018

Access the power of Grid with Eclipse

Storage Virtualization. Eric Yen Academia Sinica Grid Computing Centre (ASGC) Taiwan

First European Globus Community Forum Meeting

StoRM configuration. namespace.xml

dcache as open-source project showcase for education Tigran Mkrtchyan for dcache team CHEP2018, Sofia,

IBM Storage Software Strategy

On the EGI Operational Level Agreement Framework

Get more out of technology starting day one. ProDeploy Enterprise Suite

SCS Distributed File System Service Proposal

The Future of Solid State Lighting in Europe

High-Performance Computing Europe s place in a Global Race

Provisioning of Grid Middleware for EGI in the framework of EGI InSPIRE

WP3: Policy and Best Practice Harmonisation

NFS-Ganesha w/gpfs FSAL And Other Use Cases

Introduction to Grid Infrastructures

Challenges in Storage

SZDG, ecom4com technology, EDGeS-EDGI in large P. Kacsuk MTA SZTAKI

e-infrastructures in FP7 INFO DAY - Paris

Smart Grid Objective 6.1 in WP2013

HPC projects. Grischa Bolls

European Activities towards Cooperative Mobility

The ARC Information System

GRID COMPUTING APPLIED TO OFF-LINE AGATA DATA PROCESSING. 2nd EGAN School, December 2012, GSI Darmstadt, Germany

Storage Resource Manager Interface Specification V2.2 Implementations Experience Report

( PROPOSAL ) THE AGATA GRID COMPUTING MODEL FOR DATA MANAGEMENT AND DATA PROCESSING. version 0.6. July 2010 Revised January 2011

PoS(ACAT2010)039. First sights on a non-grid end-user analysis model on Grid Infrastructure. Roberto Santinelli. Fabrizio Furano.

Pan-European Grid einfrastructure for LHC Experiments at CERN - SCL's Activities in EGEE

Configuration changes such as conversion from a single instance to RAC, ASM, etc.

Storage Resource Manager Interface Specification V2.2 Implementations Experience Report

Bob Jones. EGEE and glite are registered trademarks. egee EGEE-III INFSO-RI

Next Generation CEA Computing Centres

YAIM Overview. Bruce Becker Meraka Institute. Co-ordination & Harmonisation of Advanced e-infrastructures for Research and Education Data Sharing

Evolution of the ATLAS PanDA Workload Management System for Exascale Computational Science

SPINOSO Vincenzo. Optimization of the job submission and data access in a LHC Tier2

e-infrastructures in FP7: Call 7 (WP 2010)

dcache integration into HDF

Transcription:

Patrick Fuhrmann (DESY) EMI Data Area lead (on behalf of many people and slides stolen from all over the place)

Credits Alejandro Alvarez Alex Sim Claudio Cacciari Christian Bernardt Christian Loeschen Elisabetta Ronchieri Fabrizio Furano Giuseppe Fiameni Giacinto Donvito Giuseppe Lo Presti Jon Kerr Nilsen Jan Schaefer Jean-Philippe Baud Dmitry Ozerov Yves Kemp Karsten Schwank Michele Carpene Michele Dibenedetto Michail Salichos Mischa Salle Oscar Koeroo Oliver Keeble Paul Millar Ralph Mueller-Pfefferkorn Ricardo Rocha Riccardo Zappi Tigran Mkrtchyan Zsolt Molnar Zsombor Nagy Our wiki : https://twiki.cern.ch/twiki/bin/view/emi/emijra1t3data July 6, 2011 EMI Data, superb, Ferrara, IT 2

Outline EMI, the facts EMI-Data, the components EMI-Data, the mission EMI-Data, selected Topics Topic 1 : Interoperability and reduction of components Client library consolidation Topic 2 : Fixing flaws and evolution Catalogue and SE synchronization Topic 3 : Standardization Example 1 : WebDAV Example 2 : NFSv4.1/pNFS July 6, 2011 EMI Data, superb, Ferrara, IT 3

EMI, the project EMI, the facts July 6, 2011 EMI Data, superb, Ferrara, IT 4

EMI Factsheet 16/09/2010 EMI Overview - EGI TF, Amsterdam 5 July 6, 2011 EMI Data, superb, Ferrara, IT 5

EMI Factsheet EMI Factsheet Budget : about 24 Million Euros Funding : about 50% by EU-FP7, rest by partners Covers : JRA, SA and NA Partners : 22 Middlewares: Arc, glite, UNICORE and dcache 16/09/2010 EMI Overview - EGI TF, Amsterdam 6 July 6, 2011 EMI Data, superb, Ferrara, IT 6

For more facts For more facts on EMI in general, see Francesco s presentation from this morning on EMI/EGI Issues July 6, 2011 EMI Data, superb, Ferrara, IT 7

Director s Mantra The European Middleware Initiative (EMI) project represents a close collaboration of the major European middleware providers - ARC, glite, UNICORE and dcache - to establish a sustainable model to support, harmonise and evolve distributed computing middleware for deployment in EGI, PRACE and other distributed e-infrastructures July 6, 2011 EMI Data, superb, Ferrara, IT 8

Translates to EMI data as Maintain and improve the currently existing production WLCG software stack! Allow new communities to utilize the EMI middle-wares by applying standards July 6, 2011 EMI Data, superb, Ferrara, IT 9

Components EMI, the components July 6, 2011 EMI Data, superb, Ferrara, IT 10

The EMI Pie 63 components and about 350 packages UAS-Compute A-REX MPI WMS UNICORE-SMS StoRM dcache Arc-libs FTS, LFC, DPM, GFAL Information system ARGUS VOMS accounting UNICORE-Gate bookkeeping gridsite July 6, 2011 EMI Data, superb, Ferrara, IT 11

The EMI Pie 63 components and about 350 packages UNICORE-SMS StoRM dcache Arc-libs FTS, LFC, DPM, GFAL July 6, 2011 EMI Data, superb, Ferrara, IT 12

What does EMI-Data provide? July 6, 2011 EMI Data, superb, Ferrara, IT 13

The EMI shopping cart Reliable File Transport Service File Location and meta data Service (LFC) Professional Storage Solutions Fits all size DPM dcache July 6, 2011 EMI Data, superb, Ferrara, IT 14

EMI, the storage elements SRM Disk Storage Layer HSM Interface HSM dcache ENSTORE Any Other GPFS TSM Any FS DPM July 6, 2011 EMI Data, superb, Ferrara, IT 15

EMI, the storage elements!!! Something for everyone SRM Disk Storage Layer HSM Interface HSM dcache ENSTORE Any Other GPFS TSM Any FS DPM July 6, 2011 EMI Data, superb, Ferrara, IT 16

The Mission EMI Data mission July 6, 2011 EMI Data, superb, Ferrara, IT 17

The Mission Fixing of issues based on the experience of operating the infrastructures for some years. Improving or creating interoperability between components and middle-wares. Reducing components by merging functionality or removing duplication. Applying standards where available Standardizing EMI-Data mechanisms with standardization bodies e.g. OGF EGI : Attracting resp. enabling new communities. Becoming competitive and attractive by : Standards Professional Support Strict quality monitoring July 6, 2011 EMI Data, superb, Ferrara, IT 18

Some more examples Defining (with OGF) and implementing an Storage Accounting Record Migrating the security of the Storage Resource Manager protocol from GSI (httpg) to standard SSL/X509. Migrating to next version of the information provider schema GLUE2.0 Improving the File Transfer Service by integrating the load of the network and the storage element backend. For the entre list, have a look at : hvps://twiki.cern.ch/twiki/bin/view/emi/emijra1t3data July 6, 2011 EMI Data, superb, Ferrara, IT 19

Selected Topics EMI, some selected topics July 6, 2011 EMI Data, superb, Ferrara, IT 20

Topic 1 Client library consolidation July 6, 2011 EMI Data, superb, Ferrara, IT 21

Component consolidation dcache July 6, 2011 EMI Data, superb, Ferrara, IT 22

Component consolidation dcache July 6, 2011 EMI Data, superb, Ferrara, IT 23

Topic 2 Catalogue / Storage element synchronization July 6, 2011 EMI Data, superb, Ferrara, IT 24

Catalogue and Storage Elements Problem Storage Elements are storing the actual data locally. Catalogues map logical filenames to actually locations URLs. Typical store operation : Store file in SE Add entry (URL) to the catalogue under the logical file name May possibilities for SE s and catalogues to get out of sync as the store and add entry operations are not transactional in WLCG. This results in Dark data (files in SE but not in catalogue) Dangling references (Pointer in catalogue, pointing into the void) Over time, catalogues and SE s diverge more and more. July 6, 2011 EMI Data, superb, Ferrara, IT 25

How to solve this issue Two different approaches July 6, 2011 EMI Data, superb, Ferrara, IT 26

Two possible solutions Synchronizing catalogues and SE s using message passing Talk to Fabrizio on details. DPM, StoRM or dcache SE or Catalogue specific plug-in LFC or experiment catalogue Messaging infrastructure July 6, 2011 EMI Data, superb, Ferrara, IT 27

Manual interaction Already in EMI-1 List of removed files LFC or experiment catalogue Messaging infrastructure July 6, 2011 EMI Data, superb, Ferrara, IT 28

Second approach (in discussion) Dynamic location catalogues The catalogue, on request, builds a list of URLs for logical filenames. The URLs automatically invalidate after some time (hours). Standard Clients Request for logical filename Dynamic catalogue Cache miss Storage elements Return up-to-date information Fill cache with query results July 6, 2011 EMI Data, superb, Ferrara, IT 29

Example on how EMI works Synchronizing catalogues and SE s guaranties continuity of production environment. Dynamic catalogues is part of the data management evolution. July 6, 2011 EMI Data, superb, Ferrara, IT 30

Selected topics Topic 3 Standards WebDAV NFS 4.1 / pnfs July 6, 2011 EMI Data, superb, Ferrara, IT 31

Standardization July 6, 2011 EMI Data, superb, Ferrara, IT 32

Selected Topics Example 1 WebDAV July 6, 2011 EMI Data, superb, Ferrara, IT 33

Standardization : WebDAV Very useful for new (non- LHC) communites. ITEF Standard Allows File system like access with Mac OS Linux Windows July 6, 2011 EMI Data, superb, Ferrara, IT 34

Standardization : WebDAV July 6, 2011 EMI Data, superb, Ferrara, IT 35

Selected Topics Example 2 NFS v4.1 / pnfs July 6, 2011 EMI Data, superb, Ferrara, IT 36

What s NFSv4.1/pNFS? CITI, at the University of Michigan, is funded by major storage providers to coordinate the pnfs effort and provide reference implementations. Group meets three times a year to check interoperability. July 6, 2011 EMI Data, superb, Ferrara, IT 37

What is NFS4.1/pNFS and how does it work? Stolen from : http://www.pnfs.com/ July 6, 2011 EMI Data, superb, Ferrara, IT 38

Why would one need it? Stolen from : http://www.pnfs.com/ Benefits of Parallel I/O Delivers Very High Application Performance Allows for Massive Scalability without diminished performance Benefits of NFS (or most any standard) Ensures Interoperability among vendor solutions Allows Choice of best-of-breed products Eliminates Risks of deploying proprietary technology July 6, 2011 EMI Data, superb, Ferrara, IT 39

Simplicity Why would we need it? Regular mount-point and real POSIX I/O Can be used by unmodified applications (e.g. Mathematica..) Data client provided by the OS vendor Smart caching (block caching) development done by OS vendors Performance pnfs : parallel NFS (first version of NFS which support multiple data servers) Clever protocols, e.g. Compound Requests July 6, 2011 EMI Data, superb, Ferrara, IT 40

NFS v4.1 / pnfs availability EMI server o dcache : production version with EMI 1 o DPM : prototype, ready for EMI-2 Linux Kernel o Completed in 2.6.39 o Back-port of pnfs into RH 6.2 Industry o NetApp OnTab 8.1 o o Blue Arc in August Other vendors : code ready but not officially available July 6, 2011 EMI Data, superb, Ferrara, IT 41

Testing and deployment NFS 4.1 / pnfs tested for more than 9 month at the DESY GridLab (50% Tier II). Stable and performing well. First production dcache NFS 4.1 / pnfs system at DESY for photon science. Other WLCG sites including Tier I s start with testing NFS 4.1 / pnfs. July 6, 2011 EMI Data, superb, Ferrara, IT 42

Competition NFS 4.1 / pnfs is a great opportunity for Open Source Projects (EMI) to compete with industry and of course the other way around. July 6, 2011 EMI Data, superb, Ferrara, IT 43

Conclusions July 6, 2011 EMI Data, superb, Ferrara, IT 44

Further reading https://twiki.cern.ch/twiki/bin/view/ EMI/EmiJra1T3Data EMI is par3ally funded by the European Commission under Grant Agreement INFSO- RI- 261611 July 6, 2011 EMI Data, superb, Ferrara, IT 45