EGEE - providing a production quality Grid for e-science

Fabrizio Gagliardi
EGEE Project Director, CERN
Fabrizio.Gagliardi@cern.ch

Marc-Elian Begin
CERN
Marc-Elian.Begin@cern.ch

On behalf of the EGEE Collaboration

Abstract

The aim of the EU-funded EGEE (Enabling Grids for E-SciencE) project is to build on recent advances in Grid technology and develop a Grid service for scientific applications. This paper provides an overview of the EGEE project goals and structure and explains the work being performed in the areas of Grid operations, middleware re-engineering and application deployment and support. We also give an overview of the collaborations EGEE is engaged in with other Grid-related organisations and projects.

1. Background

Computer and networking technology make the seamless sharing of computing resources on an international or even global scale feasible. The EGEE (Enabling Grids for E-SciencE) project is funded by the European Union to build a scientific computing Grid. It uses clusters or farms of PCs and associated data storage facilities in major research establishments around the world, which are connected via high-speed networks. EGEE focuses on applications requiring high-throughput computing, and a diverse scientific community already takes advantage of the opportunities that the production-quality Grid developed by EGEE provides. This article describes the current status of the project, illustrating the scale and complexity of the challenges involved in establishing a scientific infrastructure of this kind, and gives examples of some applications already running on the EGEE infrastructure.

2. The EGEE project

The EGEE project aims to provide a seamless and high-quality service to multiple scientific communities through the development of a production-quality Grid service.
More than 70 institutions in 27 countries, organised in twelve partner regions or "federations", work together to build a Grid infrastructure that provides not only simple, reliable, round-the-clock access to the underlying computing resources but also performance monitoring tools, user training programmes and other user support. The three main goals of EGEE are:

* Efficient delivery of a production-level Grid service, which needs to be manageable, robust and resilient to failure, and must include a consistent security model;
* Delivery and maintenance of professional-quality Grid middleware to power the production service. This includes support and continuous upgrade of the middleware, constantly in sync with the needs of the applications running on the Grid and the operations of the infrastructure. This dynamic response from the middleware team is required because the user communities exploiting the unique features of the Grid are growing and diversifying rapidly;
* Strong outreach and training efforts on Grid technology, from induction to advanced topics.

Figure 1: EGEE high-level structure and distribution of resources

0-7803-9228-0/05/$20.00 © 2005 IEEE

EGEE's structure, as seen in Figure 1, reflects these goals in its three main areas: services, joint research and networking.

The service activities deploy, support and manage an international production-quality Grid infrastructure, including resources from centres around the globe, which is made available to a broad range of user communities. Providing a continuous, stable Grid resource is the main objective of EGEE, as reflected in the fact that this activity receives nearly 50% of the EGEE budget. To date, most Grid projects have concentrated on demonstrating the feasibility of the Grid and its potential by establishing testbeds, as did the European DataGrid (EDG) project [1], the precursor to EGEE. In contrast, EGEE is focused on establishing a reliable Grid production service.

The joint research activities focus on re-engineering existing software to develop and improve Grid middleware (see next section).

With a view to expanding and diversifying EGEE's user community, the human networking activities disseminate appropriate information to new scientific fields, taking into account their emerging Grid infrastructure needs. EGEE also runs an extensive training programme to ensure that the different stakeholders can make the best use of the resources provided.

3. EGEE Middleware

The Joint Research Activities in EGEE focus primarily on delivering reliable and secure production-quality middleware. This is made possible by re-engineering existing middleware and incorporating the experience of past and present R&D projects. Since EGEE is focused on providing a production-quality Grid, this re-engineering process includes feedback from Grid user communities and lessons learnt in Grid operations. However, since standards are still emerging in the field of Grids, the project is also actively engaged in preparing for the next generation of Grid standards (e.g. through GGF [2] and OASIS [3]), while taking a prudent approach to adopting current emerging standards.
The EGEE middleware is branded under the name gLite (pronounced "gee-lite"; see www.glite.org). The gLite middleware builds on the best-practice experience and middleware produced by the best-known Grid middleware projects of this generation: Condor [4], Globus [5], VDT [6], AliEn [7], EDG [1], DataTAG [8], etc. In order to guide the re-engineering, EGEE has delivered Grid middleware Architecture and Design documents, targeting Web Services as the baseline for the implementation of its Service Oriented Architecture.

Security is another key domain where re-engineering of current Grid services is required. Retrofitting security into an existing implementation that was not designed with this requirement in mind can be difficult. This is an area where the right balance has to be struck between re-engineering current services through wrapping techniques and more invasive rework and refactoring. The gLite roadmap includes short-term security solutions based on Transport-Level Security, while longer-term solutions are likely to be based on Message-Level Security, using standards such as WS-Security [9].

Figure 2: The EGEE middleware gLite

As with all other large IT infrastructures (e.g. telecommunications, telephony, networking), the Grid requires a rich set of stable and well-adopted standards to guide service developers and providers, and to ensure interoperability between Grid services and deployment of Grid applications on different Grid infrastructures. Since we do not believe that the promising Grid standards have yet reached the right level of maturity, gLite tries, wherever possible, to be in line with the "spirit" of these standards and recommendations, while sometimes implementing a temporary custom solution. In order to accelerate the standardisation process, these solutions are then shared with the international Grid communities, along with the experience gained in deploying and using them.

4. Services

The need to support several applications and user communities meant that a new, scalable operation and user support structure had to be developed and deployed. For a large-scale, multi-disciplinary Grid such as EGEE, a federated approach was deemed the most appropriate, where regional support organisations can offer faster response times thanks to their local knowledge of the specifics of the resources. To reach this goal, the service activities deploy, operate, support and manage an international production-quality Grid infrastructure. While initially composed primarily of European resource centres, the infrastructure now extends around the globe.

Prior to the deployment of new Grid middleware releases into production, the service activities run rigorous tests on a pre-production service. Only once these tests are passed successfully can new releases be deployed. The pre-production service is also an ideal environment for application groups exploring novel use cases of Grid usage, without the risk of disturbing operations on the production-quality Grid service. Furthermore, the pre-production service serves as a halfway house between an incubator environment and the full production service, offering application developers and operators a representative multi-site service, reduced in scope, for testing diverse new applications and policies in order to minimise hurdles during full deployment.

Figure 3: Organisation of the EGEE Grid services

The structure of the Grid services is shown in Figure 3. The operations activity is coordinated by the Operations Management Centre (OMC) team and supported by the Core Infrastructure Centres (CICs), which provide operations support, operational and performance monitoring and troubleshooting, as well as general Grid services. The responsibility for middleware certification, deployment, day-to-day operations and user support within the regional federations rests with the Regional Operations Centres (ROCs), which are in close contact with the Resource Centres (RCs) hosting the Grid resources (i.e. storage and computing).

5. Applications

To guide the implementation and to certify the performance and functionality of the evolving Grid infrastructure, two pioneering application areas were selected at the start of the project:

* High-Energy Physics (HEP) with several partners, including a close collaboration with the Large Hadron Collider Computing Grid (LCG) [10], which will provide the Grid infrastructure to store and analyse petabytes of real and simulated data from the LHC accelerator experiments at CERN;
* Biomedicine, where several communities are facing equally daunting challenges to cope with the flood of bioinformatics and healthcare data.

The pilot applications are complemented by a more generic component that identifies new applications from a broad range of scientific disciplines and provides them with the support and tools needed to accelerate their "Gridification".
An important tool in this respect is a dedicated testbed called GILDA [11], the Grid INFN Laboratory for Dissemination Activities, which was developed as part of the Italian INFN Grid project [12] and the EGEE project. GILDA acts as a Grid applications incubator and is also used to host hands-on tutorials in many of the EGEE training events.

A look at the application domains currently running on the EGEE Grid infrastructure shows the growing richness of the communities using the power of the Grid. Originally, the applications deployed on the production service were predominantly from our flagship application domains, i.e. HEP and biomedicine, but now other domains such as Earth Observation, Geophysics and Computational Chemistry are producing better science thanks to the Grid, clearly showing the breadth of sciences that use it. The EGEE project has also attracted an application developed by an industrial partner, Compagnie Generale de Geophysique (CGG) in France, which is now deployed on the production service.

To expand its user community, the project has set up the EGEE Generic Applications Advisory Panel (EGAAP), which acts as the formal entry point for new applications wanting to take advantage of the EGEE infrastructure. To date, seven applications have been accepted, four of which are already running on the production service: Earth Science Research, EGEODE (Earth Science Industry), Computational Chemistry and MAGIC (Astronomy).
Keeping up with the rapid growth of the computing resources available to the Grid infrastructure, as well as of the number of scientific communities that use it, requires a constant "virtuous cycle":

* Making contact with new scientific communities through the many outreach events organised;
* Holding follow-up meetings with application specialists, which may lead to the identification of new requirements for the infrastructure and its middleware;
* Providing appropriate training to the new community in question, so that its users become established and autonomous;
* Through peer communication and dissemination events, having these new users spread the word and attract new communities, resources and ideas.

This cycle binds the different activities together and ensures the cohesive expansion of the Grid and its user communities.

6. Collaborations

EGEE aims to integrate current national, regional and thematic Grid efforts, computer centres supporting one specific application area, and general computer centres supporting all fields of science in a region. The EGEE infrastructure builds on the EU research network GEANT [13] and exploits Grid expertise that has its roots in projects such as EDG. Interoperability with other Grids around the globe, including the US National Science Foundation Cyberinfrastructure [14], and contributing to efforts to establish a worldwide Grid infrastructure are therefore of prime importance for EGEE. Collaborations with Russia and with Baltic, Asian, North American and Latin American states, as well as further links around the Mediterranean (EUMedConnect [15]), are being explored.

The project also collaborates with other European research infrastructure projects and actively participates in pan-project "concertation" meetings. The objective of these meetings is for the largest possible number of European Grid projects to form a common approach to vital issues such as authentication, authorisation and accounting, as well as business and cost models and applications. The First Concertation Meeting on e-infrastructures was organised in conjunction with the second EGEE project conference, in November 2004 in The Hague (The Netherlands).
The participating projects represented "service providers", "technology providers" and the potential consumers of infrastructure services:

* DEISA http://www.deisa.org
* SEEGRID http://www.see-grid.org
* DILIGENT http://www.diligentproject.org
* GEANT2 http://www.geant2.net
* COREGRID http://www.coregrid.net
* GRIDLAB http://www.gridlab.org
* GRIDCC http://www.gridcc.org
* SIMDAT http://www.scai.fraunhofer.de/simdat.html
* LOBSTER http://www.ist-lobster.org/
* GRIDSTART http://www.gridstart.org
* NEXTGRID http://www.nextgrid.org
* AKOGRIMO http://www.mobilegrids.org

As well as providing an opportunity for technical experts from different projects and policy makers to exchange information about the research questions in key areas, these types of meetings and the EGEE conferences (April 2004, November 2004, April 2005 and October 2005) help to create a tightly networked community of stakeholders. While strengthening its links with several of these projects through these important events, EGEE has seeded future collaborations through several new project proposals submitted to the EU earlier this year, which included a request for support from EGEE.

7. Future Plans

The first review of the EGEE project by the European Commission was held in February 2005. The feedback from the review confirmed the advanced status of the project in all its activities and the successful acceptance of all 42 deliverables from the first period. The future programme of work of the project was recently reviewed at the 3rd project conference, held in Athens in April 2005 and attended by more than 450 delegates. The planning for the 2nd phase of the project was also launched at the conference.

8. Conclusions

It is a great challenge to work with demanding communities such as HEP and to develop a production-quality Grid infrastructure in close collaboration with them. At the same time, this infrastructure has to be versatile and attractive to the largest possible range of application domains.
Open-source software is the best approach for publicly funded projects and is necessary for fast and wide adoption of the developed infrastructure. Nevertheless, intellectual property rights need to be developed in collaboration with the industrial partners to make commercial exploitation possible as well. One of the reasons Europe is leading in the field of Grid technology is the initial success of the EGEE project and its precursor EDG. However, by its global nature, the Grid is not confined to a geographic region. Therefore, it is of prime importance that EGEE establishes firm collaborations across national and international programmes and funding agencies, and secures long-term support for the Grid infrastructure EGEE is producing.

Acknowledgement

The authors would like to acknowledge the important contributions, inputs and corrections to this paper from Bob Jones, Anna Cook, Hannelore Hammerle, Owen Appleton and all EGEE activities, represented by their respective activity leaders. Thank you for your valuable help.

References

[1] The European DataGrid project: http://www.eudatagrid.org
[2] GGF: http://www.gridforum.org/documents/gfd/gfd-C.3.pdf
[3] OASIS, "Web Services Security (WS-Security)": http://www.oasis-open.org/committees/wss
[4] Condor: http://www.cs.wisc.edu/condor/
[5] Globus: http://www.globus.org/
[6] VDT (Virtual Data Toolkit): http://www.cs.wisc.edu/vdt//index.html
[7] AliEn: http://alien.cern.ch/
[8] DataTAG: http://datatag.web.cern.ch/datatag/
[9] WS-Security: http://www-106.ibm.com/developerworks/webservices/library/ws-secure/
[10] LCG: I. Bird et al., "Operating The LCG and EGEE Production Grids for HEP". In Proceedings of the CHEP'04 Conference, Interlaken, Switzerland, September 27th - October 1st, 2004. Published on InDiCo.
[11] GILDA: https://gilda.ct.infn.it/
[12] INFN Grid (Grid-it): http://grid-it.cnaf.infn.it/
[13] GEANT project: http://www.geant.net/
[14] NSF Cyberinfrastructure: http://www.cise.nsf.gov/sci/reports/toc.cfm
[15] EuMedConnect: http://www.eumedis.net/en/eumedconnect/

EGEE is a project funded by the European Union under contract INFSO-RI-508833.