The Grid: Processing the Data from the World's Largest Scientific Machine

The Grid: Processing the Data from the World's Largest Scientific Machine
10th Topical Seminar on Innovative Particle and Radiation Detectors, Siena, 1-5 October 2006
Patricia Méndez Lorenzo (IT-PSS/ED), CERN

Abstract: The world's largest scientific machine will enter production about one year from the time of this talk. In order to exploit its scientific potential, computing resources far beyond those needed for previous accelerators are required. Based on these requirements, a distributed solution built on Grid technologies has been developed. This talk describes the overall Grid infrastructure offered to the experiments, the state of deployment of the production services, and applications of the Grid beyond HEP.

Overview
First Part: The general concept of the Grid
Second Part: The WLCG Grid Computing
Third Part: The Infrastructure and the Services
Fourth Part: Current Status
Fifth Part: Applications beyond HEP

1st Part: Grid Concept
A genuinely new concept in distributed computing that could bring radical changes in the way people do computing:
Share the computing power of many countries for your needs
Non-centralized placement of the computing resources
Also a hype that many are willing to be involved in:
Many researchers and companies work on grids
More than 100 projects (some of them commercial)
Only a few large-scale deployments aimed at production
Very confusing (hundreds of projects named "Grid something")
Interesting links: the work of Ian Foster and Carl Kesselman, or have a look at http://globus.org (not the only Grid toolkit, but one of the first and most widely used), and the web pages of EGEE and WLCG

Just one idea: how to see the Grid
The Grid connects instruments, computer centres and scientists.
If the Web is able to share information, the Grid is intended to share computing power and storage.

2nd Part: The LHC Grid Computing
The LHC will be completed in 2007 and will run for the next 10-15 years.
The experiments will produce about 15 million gigabytes of data per year (about 20 million CDs!).
LHC data analysis requires a computing power equivalent to around 100,000 of today's fastest PC processors.
[Figure: scale comparison of a CD stack holding one year of LHC data (~20 km) with a balloon at 30 km, Concorde at 15 km and Mont Blanc at 4.8 km]
This requires many cooperating computer centres, as CERN can only provide about 20% of the capacity. Therefore we build a Computing Grid for the HEP community: the WLCG (Worldwide LHC Computing Grid).
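
To put the 15 million gigabytes per year into perspective, here is a quick back-of-the-envelope estimate of the average data rate this implies. It is only a sketch: the 15 PB/year figure is the one quoted above, spread evenly over a calendar year.

```python
# Back-of-the-envelope estimate of the average LHC data rate,
# based on the ~15 PB/year figure quoted in the talk.

PB = 10**15                      # bytes in a petabyte (decimal convention)
data_per_year_bytes = 15 * PB    # ~15 million gigabytes per year
seconds_per_year = 365 * 24 * 3600

avg_rate_bytes_per_s = data_per_year_bytes / seconds_per_year
print(f"Average rate: {avg_rate_bytes_per_s / 1e6:.0f} MB/s "
      f"({avg_rate_bytes_per_s * 8 / 1e9:.1f} Gbit/s sustained)")
# -> roughly 475 MB/s, i.e. a few Gbit/s sustained, before any replication
```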

Projects we use for the LHC Grid
WLCG depends on two major science Grid infrastructures:
EGEE: Enabling Grids for E-sciencE (about 160 communities)
OSG: the US Open Science Grid (also supporting communities beyond HEP)
Both infrastructures are federations of independent Grids; WLCG uses both OSG resources and many (but not all) EGEE resources.

EGEE Sites
EGEE has shown steady growth in the number of sites over the lifetime of the project.
[Charts: number of sites and CPU count over time for EGEE and OSG]
EGEE today: more than 180 sites in 40 countries, more than 24,000 processors, about 5 PB of storage.

The WLCG Distribution of Resources
Tier-0 (the accelerator centre): data acquisition and initial processing of the raw data; distribution of data to the different Tiers.
Tier-1 (11 centres, online to the data acquisition process, high availability): managed mass storage, grid-enabled data service, data-heavy analysis, national and regional support.
Canada: TRIUMF (Vancouver); France: IN2P3 (Lyon); Spain: PIC (Barcelona); Germany: Forschungszentrum Karlsruhe; Taiwan: Academia Sinica (Taipei); Italy: CNAF (Bologna); UK: CLRC (Oxford); Netherlands: NIKHEF/SARA (Amsterdam); US: FermiLab (Illinois) and Brookhaven (NY); Nordic countries: a distributed Tier-1.
Tier-2 (~100 centres in ~40 countries): simulation and end-user analysis, batch and interactive.

3rd Part: Who is who in the Grid
[Diagram: a User Interface submits jobs through the Resource Broker to Computing Elements, each fronting a batch system of Worker Nodes, with Storage Elements and the LCG File Catalogue providing data access]
UI: User Interface
RB: Resource Broker
BDII: Berkeley Database Info Index
CE: Computing Element
WN: Worker Node
SE: Storage Element
LFC: LCG File Catalogue
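
A minimal sketch of how a user typically drives these components from the UI. The command names are those of the gLite-era command-line suite and changed between middleware releases (earlier versions used e.g. edg-job-submit); the JDL below is only an illustrative hello-world job, not an experiment workload.

```python
# Sketch of a job's life cycle as seen from the User Interface (UI).
# Command names/options follow the gLite-era CLI and may differ per release.
import subprocess

JDL = """\
Executable    = "/bin/hostname";
StdOutput     = "std.out";
StdError      = "std.err";
OutputSandbox = {"std.out", "std.err"};
"""

def run(cmd):
    """Run a middleware command on the UI and stop on failure."""
    print("$", " ".join(cmd))
    subprocess.run(cmd, check=True)

with open("hello.jdl", "w") as f:
    f.write(JDL)

run(["voms-proxy-init", "-voms", "atlas"])                      # authenticate, get VO attributes
run(["glite-wms-job-submit", "-a", "-o", "jobid.txt", "hello.jdl"])  # RB/WMS matches the job to a CE
run(["glite-wms-job-status", "-i", "jobid.txt"])                # follow it through CE and WN
run(["glite-wms-job-output", "-i", "jobid.txt"])                # fetch the output sandbox back
```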

Middleware: what it is
All these services and their tools need to speak the same language for the whole mechanism to work; in Grid terms this is called MIDDLEWARE.
The middleware is the software and services that sit between the user application and the underlying computing and storage resources, providing uniform access to those resources.
The Grid middleware services should:
Find convenient places for the application to run
Optimize the use of the resources
Organize efficient access to the data
Deal with security
Run the job and monitor its progress
Recover from problems
Transfer the result back to the scientist
[Diagram: the middleware layers, from top to bottom: experiment-specific software (ALICE, ATLAS, CMS, LHCb); the LHC common layer (GFAL, POOL); high-level services (LFC, ...); low-level services (Globus); OS and site services (batch systems, databases, etc.)]
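
As an illustration of the first two responsibilities (finding a convenient place to run and optimizing resource use), here is a toy match-making sketch. The site list, attribute names and ranking rule are invented for the example and have nothing to do with the real Resource Broker internals.

```python
# Toy match-making: pick a site that satisfies the job's requirements
# and rank the candidates. Site data and ranking rule are illustrative only.

sites = [
    {"name": "CERN",  "vos": {"alice", "atlas", "cms", "lhcb"}, "free_cpus": 120},
    {"name": "SiteA", "vos": {"atlas", "cms"},                  "free_cpus": 400},
    {"name": "SiteB", "vos": {"lhcb"},                          "free_cpus": 900},
]

job = {"vo": "atlas", "min_free_cpus": 50}

# Requirements: the site must support the job's VO and have enough free CPUs.
candidates = [s for s in sites
              if job["vo"] in s["vos"] and s["free_cpus"] >= job["min_free_cpus"]]

# Rank: prefer the emptiest site (a stand-in for the real ranking expression).
best = max(candidates, key=lambda s: s["free_cpus"])
print("Job would be dispatched to:", best["name"])
```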

Fundamental: Security
The global security mechanism is GSI, the Globus Security Infrastructure, based on PKI (public/private key infrastructure).
Authentication is done via certificates, to ensure that you are who you claim to be. A certificate is a file containing the user's public key, the lifetime of the certificate and the Distinguished Name (DN) of the user, signed by the Certification Authority (CA) that issued it.
At the core is a set of Certification Authorities (CAs), which issue certificates for user and service authentication (valid for 1 year), revoke certificates if required, and publish a formal document describing how they operate the service. Each site can decide which CAs it accepts and which it denies.
Authorization then determines what you are allowed to do.
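
The fields mentioned above can be read directly from the certificate file. A minimal sketch, assuming the third-party `cryptography` package and a PEM-encoded user certificate; the path below is only the conventional location for grid user certificates and may differ on your system.

```python
# Print the subject DN, issuing CA and validity window of an X.509 certificate.
# Requires the 'cryptography' package; the path is illustrative.
from pathlib import Path
from cryptography import x509

pem = Path.home() / ".globus" / "usercert.pem"   # conventional grid cert location
cert = x509.load_pem_x509_certificate(pem.read_bytes())

print("Subject DN :", cert.subject.rfc4514_string())
print("Issued by  :", cert.issuer.rfc4514_string())
print("Valid from :", cert.not_valid_before)
print("Valid until:", cert.not_valid_after)      # typically ~1 year after issue
```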

Authorization
Virtual Organizations (VOs): homogeneous groups of people with common purposes and common Grid rights. The WLCG VO names are atlas, cms, alice, lhcb and dteam.
At this moment, by registering with a VO you become authorized to access the resources that the VO is entitled to use.
Sites providing resources decide which VOs may use them (negotiated with the experiments) and can block individual persons from using the site.
VOMS (VO Membership Service): a system that identifies users as members of a VO on the basis of attributes granted to them upon request. It allows rules to be defined in a very flexible way: production manager, software manager, plain user, etc.
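
A toy illustration of the kind of attribute-based authorization VOMS enables; the roles, actions and policy table below are made up for the example and are not the real WLCG policy.

```python
# Toy attribute-based authorization: what a user may do depends on the
# VO membership and role attached to their credentials. Policy is illustrative.

POLICY = {
    ("atlas", "production-manager"): {"submit", "bulk-submit", "delete-data"},
    ("atlas", "software-manager"):   {"submit", "install-software"},
    ("atlas", "user"):               {"submit"},
}

def allowed(vo: str, role: str, action: str) -> bool:
    """Return True if the (VO, role) pair is granted the requested action."""
    return action in POLICY.get((vo, role), set())

print(allowed("atlas", "user", "submit"))                      # True
print(allowed("atlas", "user", "delete-data"))                 # False
print(allowed("atlas", "production-manager", "bulk-submit"))   # True
```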

Monitoring the System
Operation of the Production Service: real-time display of Grid operations, plus accounting information.
A selection of monitoring tools:
GIIS Monitor and Monitor Graphs
Site Functional Tests
GOC Data Base
Scheduled Downtimes
Live Job Monitor
GridIce (VO and fabric view)
Certificate Lifetime Monitor

4th Part: Services, Support and Status
All experiments have already decided the role that each Tier will play in their production.
The offline group of each experiment has to provide its end users with a stable, feasible and user-friendly infrastructure to access the data, run their jobs and analyse the data.
The best way to get ready is to emulate real data taking by performing so-called Data Challenges: validation of the experiments' computing systems, and validation and testing of the WLCG infrastructure.
The Grid services are provided by the WLCG developers, always in communication with the experiments. The aim is a successful data taking, so we always try to provide the experiments with what they require.
A complete support infrastructure has been developed for each experiment: a WLCG-specific contact person, contact with the sites, and support for the services.

Grid Status: WLCG Services and Sites
The LCG service has been validated over the past 2 years via a series of dedicated Service Challenges, designed to test the readiness of the service infrastructure.
There has been a significant focus on data management, including data export from Tier-0 to Tier-1.
These are complementary to the experiments' own tests of their offline computing models; the Service Challenges have progressively ramped up the level of service in preparation for ever more detailed tests by the experiments.
The target: full, stable production services by the end of September 2006!
Some additional functionality is still to be added, and resource levels will continue to ramp up in 2007 and beyond; resource requirements are strongly coupled to the total volume of data acquired to date.

5th Part: Beyond HEP
Thanks to the computational power of EGEE, new communities are requesting services for different research fields.
Normally these communities do not need the complex structure required by the HEP communities: in many cases their productions are shorter and well delimited within the year, and the amounts of CPU and storage required are much lower.
Today there are about 20 applications from 7 domains: High Energy Physics, Biomedicine, Earth Sciences, Computational Chemistry, Astronomy, Geophysics and financial simulation.

Geant4
A generic toolkit for Monte Carlo simulation of particle interactions with matter.
New versions of the software are released twice per year, in June and December, with the models tested beforehand across a wide range of particles, detectors and energies.
The procedure is extremely CPU demanding: about 4 years of CPU time have to be executed in just 2 weeks, with more than 20,000 jobs testing up to 7 different release candidates.
Since December 2004 the validation tests have been executed on the Grid, and Geant4 has been an official VO since December 2005.
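
A quick arithmetic check of what "about 4 years of CPU time in just 2 weeks" implies for the parallelism the Grid has to provide; this is a rough estimate based only on the two numbers quoted above.

```python
# Rough parallelism estimate for the Geant4 validation campaign:
# ~4 years of CPU work squeezed into a ~2-week wall-clock window.
cpu_time_years = 4
wall_clock_weeks = 2

cpu_time_weeks = cpu_time_years * 52          # ~208 CPU-weeks of work
concurrent_cpus = cpu_time_weeks / wall_clock_weeks
print(f"Roughly {concurrent_cpus:.0f} CPUs kept busy in parallel for two weeks")
# -> on the order of a hundred processors running continuously
```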

UNOSAT
UNOSAT is a United Nations initiative. Its objectives are to provide the humanitarian community with access to satellite imagery and Geographic Information System services, and to ensure cost-effective and timely products.
Core services: humanitarian mapping and image processing.
[Images: example satellite data, from VEGETATION at 1 km resolution to IKONOS at 1 m resolution]

The UNOSAT + GRID Project
A collaboration between UNOSAT, the GRID teams and EnginFrame.
The fundamental actor is an application able to map metadata (coordinates) to the physical location of the files: AMGA, an application developed by the GRID (ARDA) group.
[Diagram: a query by coordinates (x, y, z) is resolved by the application into files stored in the Grid]
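
A toy sketch of the idea of a metadata catalogue mapping coordinates to Grid file names. It uses an in-memory SQLite table as a stand-in, and the schema, coordinates and logical file names are all invented for illustration; the real system uses AMGA, not SQLite.

```python
# Toy "metadata catalogue": map (lat, lon) tiles to logical file names (LFNs),
# mimicking the role AMGA plays for UNOSAT. Schema and data are illustrative.
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE tiles (lat REAL, lon REAL, lfn TEXT)")
db.executemany("INSERT INTO tiles VALUES (?, ?, ?)", [
    (46.2, 6.1, "lfn:/grid/unosat/tiles/geneva_ikonos_1m.tif"),
    (46.3, 6.2, "lfn:/grid/unosat/tiles/lake_geneva_veg_1km.tif"),
])

def lookup(lat: float, lon: float, tolerance: float = 0.1):
    """Return the LFNs of all tiles whose centre is within `tolerance` degrees."""
    rows = db.execute(
        "SELECT lfn FROM tiles WHERE ABS(lat - ?) <= ? AND ABS(lon - ?) <= ?",
        (lat, tolerance, lon, tolerance)).fetchall()
    return [r[0] for r in rows]

print(lookup(46.2, 6.1))   # -> the Geneva tile's logical file name
```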

Production with the ITU
In May 2006 the International Telecommunication Union organized a world conference to establish a new frequency plan for the introduction of digital broadcasting in Europe, Africa, the Arab States and the former Russian Federation states.
The software, developed by the European Broadcasting Union (EBU), performs a compatibility analysis between digital requirements and existing analogue broadcasting stations.
The peculiarity is that the CPU time of each job is not easily predictable, since it depends on the details of the input data set, although the total can be estimated; the duration of the jobs varies from a few seconds to several hours.
The Grid was used throughout the production to execute all the required jobs (several thousands per weekend).

Summary
WLCG was born to cover the high computing and storage demands of the experiments from 2007 onwards.
Today we count more than 200 sites in 34 countries.
The whole project is the result of a big collaboration between experiments, developers and sites.
It is a reality: the Grid is being used in production, covering a wide range of research fields.
Just as the Web enabled us to share information, the Grid will allow us to share computing and storage power all over the world.