CHIPP Phoenix Cluster Inauguration


The Computing Environment for LHC Data Analysis: The LHC Computing Grid
CHIPP Phoenix Cluster Inauguration, Manno, Switzerland, 30 May 2008
Les Robertson, IT Department, CERN, CH-1211 Genève 23, les.robertson@cern.ch

Data Handling and Computation for Physics Analysis
detector → event filter (selection & reconstruction) → raw data
reconstruction → event summary data / processed data (with event reprocessing)
event simulation → simulated data
batch physics analysis → analysis objects (extracted by physics topic) → interactive physics analysis
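The chain above maps naturally onto a few simple processing stages. A minimal Python sketch of the idea, with invented event content and selection criteria (none of this is any experiment's actual framework):

```python
# Minimal sketch of the data-processing chain on the slide above.
# Event content, cuts and function names are illustrative only.

def reconstruct(raw_event):
    """Turn raw detector readout into reconstructed quantities."""
    return {"n_hits": len(raw_event["hits"]), "energy": sum(raw_event["adc"])}

def summarize(reco_event):
    """Reduce a reconstructed event to compact event-summary data."""
    return {"n_hits": reco_event["n_hits"], "energy": reco_event["energy"]}

def batch_analysis(summaries, min_energy):
    """Select analysis objects by physics topic (here, a simple energy cut)."""
    return [s for s in summaries if s["energy"] > min_energy]

raw_data = [{"hits": [1, 2, 3], "adc": [10, 20]},   # fake raw events
            {"hits": [4],       "adc": [5]}]
summaries = [summarize(reconstruct(ev)) for ev in raw_data]
print(batch_analysis(summaries, min_energy=10))      # events passing the cut
```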

The Computing System for LHC Data Analysis: Capacity and Evolution
Computing requirements for all four experiments in the first full year (2009): ~70 PetaBytes of disk, ~100K processor cores, used by more than 5,000 scientists and engineers.
Growth driven by new data, accelerator enhancements and improving efficiency, new analysis techniques, expanding physics scope, ...
Disk storage growing at ~40 PB/year, BUT the evolution of access patterns is unclear.

The key characteristics of experimental HEP data analysis that dictate the design of the computing system:
- independent events, so easy parallelism
- codes have modest memory needs (~2 GB) and modest floating-point content, and perform well on PCs
BUT
- enormous data collections: PetaBytes of new data every year
- shared by very large user collaborations, many different groups, independent approaches to analysis
- unpredictable data access patterns
A simple distributed architecture (mass storage, data cache, application servers) developed around 1990 enabled experimental HEP to migrate from supercomputers and mainframes to clusters, with the flexibility for easy evolution to new technologies, and to benefit from the mass-market driven growth in the performance and capacity of PCs, disks, and local area networking.
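Because events are independent, the workload is embarrassingly parallel and runs well on farms of ordinary PCs. A minimal sketch of that point, using a pool of worker processes (the event content and per-event function are invented for illustration):

```python
# Each event can be processed on its own, with no shared state, so a pool of
# ordinary cores scales almost linearly with the number of workers.
from multiprocessing import Pool

def process_event(event):
    # Stand-in for the reconstruction/analysis of one event.
    return sum(event) / len(event)

if __name__ == "__main__":
    events = [[i, i + 1, i + 2] for i in range(1000)]   # fake raw events
    with Pool(processes=4) as pool:                     # 4 independent workers
        results = pool.map(process_event, events)
    print(len(results), "events processed independently")
```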

Why did we decide on a geographically distributed computing system for LHC?
CERN's budget for physics computing was insufficient.
Easy parallelism, the use of simple PCs, and the availability of high-bandwidth international networking make it possible to extend the distributed architecture to the wide area...
AND
The ~5,000 LHC collaborators are distributed across institutes all around the world, with access to local computing facilities... and funding agencies prefer to spend at home if they can.
This mitigates the risks inherent in the computing being controlled at CERN, subject to the lab's funding priorities and with access and usage policies set by central groups within the experiments.
ALSO
Active participation in the LHC computing service gives the institute (not just the physicist) a continuing and key role in the data analysis -- which is where the physics discovery happens.
It encourages novel approaches to analysis... and to the provision of computing resources.

What do we mean by a Computing Grid**?
- Collaborating computing centres, interconnected with good networking
- Interfaces and protocols that enable the centres to advertise their resources and exchange data and work units
- Layers of software that hide all the complexity from the user, so the end-user does not need to know where his data sits and where his jobs run
- The Grid does not itself impose a hierarchy or centralisation of services
- Application groups define Virtual Organisations that map users to subsets of the resources attached to the Grid
** There are many different variations on the term Grid; this is the HEP definition.
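One way to picture the Virtual Organisation idea is as a mapping from users, via their VO, to the subset of grid resources that the VO is entitled to use. A toy sketch, with purely illustrative site and user names:

```python
# Toy model of Virtual Organisations: the grid is a flat pool of site resources,
# and each VO maps its members onto a subset of them. All names are invented.
grid_sites = {
    "CERN": {"cores": 20000, "disk_tb": 5000},
    "FZK":  {"cores": 8000,  "disk_tb": 2000},
    "CSCS": {"cores": 1500,  "disk_tb": 300},
}

virtual_organisations = {
    "atlas":  {"members": {"alice_user", "bob_user"}, "sites": {"CERN", "FZK", "CSCS"}},
    "biomed": {"members": {"carol_user"},             "sites": {"CSCS"}},
}

def resources_for(user):
    """Return the sites a user may use, via the VOs they belong to."""
    sites = set()
    for vo in virtual_organisations.values():
        if user in vo["members"]:
            sites |= vo["sites"]
    return {s: grid_sites[s] for s in sites}

print(resources_for("carol_user"))   # biomed member -> only CSCS
```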

The advantage for the computer centre is that the basic services can be provided in a standard way for different application groups (e.g. user authentication, job submission, storage access, data transfer...) -- ATLAS, CMS, LHCb, DZERO, ..., BioMed, Fusion, ...
The advantage for the application group is that it can integrate resources from different centres and view them as a single service, without having to support all of the software layers, negotiate the installation of special software, register users on each site, etc. But they have the flexibility to pick and choose: replace software layers with their own products, decide which services are provided at which sites, ...

LCG depends on two major science grid infrastructures:
- EGEE, Enabling Grids for E-sciencE (with EU funding)
- OSG, the US Open Science Grid (with DoE and NSF funding)

The Middleware for the Baseline Services needed for the LHC Experiments
- Information system
- Security framework
- Storage Element: SRM interface to Mass Storage (dCache, DPM, CASTOR, StoRM)
- Basic data transfer tools: GridFTP, srmcopy
- Reliable file transfer service: FTS
- Catalogue services: LFC, Globus RLS
- Catalogue and data management tools: lcg-utils
- Compute element: Globus/Condor-G based CE, CREAM (web services)
- Reliable messaging service
- Virtual Organisation Management Services
- Database distribution services: ORACLE Streams, SQUID
- POSIX-I/O interfaces to storage
- Workload Management: EGEE Resource Broker, VO-specific schedulers
- Job monitoring tools
- Grid monitoring tools
- Application software installation
- GUIs for analysis, production: GANGA, CRAB, PANDA, ...
For LCG, grid interoperability is required at the level of the baseline services: same software, or standard interfaces, or compatible functionality.
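As an illustration of what "reliable file transfer" means in this list, here is a conceptual sketch of a transfer queue that retries failed copies, in the spirit of FTS; it is not the real FTS interface, and the SRM URLs and failure rate are invented:

```python
# Conceptual sketch of a "reliable file transfer" queue: transfers between
# storage elements are queued, attempted, and retried on failure.
import random

def attempt_transfer(source_url, dest_url):
    """Stand-in for a GridFTP/srmcopy call; fails randomly to mimic real life."""
    return random.random() > 0.3   # ~70% chance of success per attempt

def reliable_transfer(queue, max_retries=3):
    done, failed = [], []
    for src, dst in queue:
        for attempt in range(1, max_retries + 1):
            if attempt_transfer(src, dst):
                done.append((src, dst, attempt))
                break
        else:                       # all retries exhausted
            failed.append((src, dst))
    return done, failed

queue = [("srm://cern.ch/file1", "srm://gridka.de/file1"),
         ("srm://cern.ch/file2", "srm://cnaf.infn.it/file2")]
done, failed = reliable_transfer(queue)
print(f"{len(done)} transferred, {len(failed)} gave up after retries")
```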

The LHC Computing Grid: a Collaboration of 4 Experiments + ~130 Computer Centres
Tier-0, the accelerator centre: data acquisition and initial processing; long-term data curation; distribution of data.
11 Tier-1 Centres: Canada TRIUMF (Vancouver); France IN2P3 (Lyon); Germany Forschungszentrum Karlsruhe; Italy CNAF (Bologna); Netherlands NIKHEF/SARA (Amsterdam); Nordic countries distributed Tier-1; Spain PIC (Barcelona); Taiwan Academia Sinica (Taipei); UK CLRC (Oxford); US FermiLab (Illinois) and Brookhaven (NY). They are online to the data acquisition process, with high availability, managed mass storage, a grid-enabled data service, data-heavy analysis, and national and regional support.
Tier-2: ~120 centres in 60 federations in 35 countries. End-user (physicist, research group) analysis -- where the discoveries are made -- and simulation.

Distribution of Resources across Tiers
In 2009, the first full year of data taking, less than 20% of the resources are at CERN -- less than half of what is provided by the Tier-2s. The distributed system must work from Day 1.

Middleware & Software
From many sources: Globus, Condor, EGEE gLite, High Energy Physics common tools and packages, the experiments, open-source projects, proprietary packages, ...
Two fundamental middleware packages are integrated, tested and distributed by the infrastructure grids: gLite by EGEE, built on the Virtual Data Toolkit by OSG.
The mass storage systems, crucial but complicated components of the LCG service, are HEP developments.
And a thick layer of software is maintained and distributed by the experiments (data management, resource scheduling, analysis GUIs, ...).

Experiment computing models define specific data flows between CERN, Tier-1s and Tier-2s.
ATLAS: each Tier-1 acts as the data repository for a cloud of associated Tier-2s.
CMS: Tier-2s send simulated data to a specific Tier-1, but obtain data for analysis from any of the 7 Tier-1s: Taipei, Bologna, Fermilab, Karlsruhe, Lyon, Barcelona, Rutherford Lab.
[Diagrams: example Tier-1s (Bologna, Fermilab, Karlsruhe, Lyon, Vancouver) with associated Tier-2s such as CSCS, Freiburg, Prague and Hamburg.]
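The two policies can be contrasted in a toy sketch: an ATLAS-style cloud ties each Tier-2 to one Tier-1, while a CMS-style Tier-2 can fetch analysis data from any Tier-1 hosting the dataset. The site associations below are illustrative examples only, not the official experiment configurations:

```python
# ATLAS-style "cloud" model: each Tier-2 reads analysis data from its own Tier-1.
atlas_clouds = {
    "FZK":  ["CSCS", "Freiburg", "Prague"],   # example Karlsruhe cloud
    "CNAF": ["Bari", "Pisa"],                 # example Bologna cloud
}

def atlas_sources_for(tier2):
    for tier1, cloud in atlas_clouds.items():
        if tier2 in cloud:
            return [tier1]          # only the cloud's Tier-1
    return []

# CMS-style model: simulation goes to one designated Tier-1, but analysis data
# may be fetched from any Tier-1 that hosts the wanted dataset.
cms_tier1s = ["Taipei", "Bologna", "Fermilab", "Karlsruhe", "Lyon", "Barcelona", "RAL"]

def cms_sources_for(dataset_locations):
    return [t1 for t1 in cms_tier1s if t1 in dataset_locations]

print(atlas_sources_for("CSCS"))                  # -> ['FZK']
print(cms_sources_for({"Lyon", "Fermilab"}))      # -> every hosting Tier-1
```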

Wide Area Network
Tier-2s and Tier-1s are inter-connected by the general-purpose research networks. A dedicated 10 Gbit optical network links CERN with the Tier-1s (GridKa, IN2P3, TRIUMF, Brookhaven, ASCC, Nordic, CNAF, SARA, RAL, PIC, Fermilab).
Individual-site peak data rates: CERN ~2 GB/sec; Tier-1s 150-800 MB/sec; Tier-2s ~130 MB/sec.
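For a sense of scale, the quoted peak rates translate into daily volumes as follows (a quick back-of-the-envelope calculation, assuming decimal units):

```python
# Convert the peak rates on the slide into daily volumes (1 GB = 1e9 bytes,
# 1 TB = 1e12 bytes; figures are approximate).
def gb_per_sec_to_tb_per_day(rate_gb_s):
    return rate_gb_s * 1e9 * 86400 / 1e12

print(gb_per_sec_to_tb_per_day(2.0))    # CERN peak ~2 GB/s  -> ~173 TB/day
print(gb_per_sec_to_tb_per_day(0.15))   # Tier-1 low end     -> ~13 TB/day
print(gb_per_sec_to_tb_per_day(0.8))    # Tier-1 high end    -> ~69 TB/day
```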

European Research Network Backbone


Experiment Dashboards provide tools for monitoring and debugging.
CMS jobs, May 2008, sorted by activity: up to 200K jobs per day, of which ~35% end-user analysis; job status shown by site; the user can drill down to find details of the errors.
ATLAS data transfer status, 28 May 2008: throughput ~1100 MB/s.

Evolution of LHC grid CPU usage
150% growth since Jan 2007, more than half of it from Tier-2s.
~800K core-days per month, equivalent to ~25K cores at 100% utilisation, or ~55% of the committed capacity.
Experiment services are still in test mode, awaiting the real data.
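The slide's figures are consistent with a simple calculation: ~800K core-days over a month of roughly 31 days corresponds to ~25K cores running continuously, and at ~55% of committed capacity that implies a committed level of roughly 47K cores (a derived, approximate number, not one stated on the slide):

```python
# Arithmetic behind the usage figures quoted above (all values approximate).
core_days = 800_000
days_in_month = 31

cores_at_full_use = core_days / days_in_month
print(round(cores_at_full_use))                 # ~25,800 cores at 100% utilisation

utilisation_of_committed = 0.55
print(round(cores_at_full_use / utilisation_of_committed))   # ~47,000 committed cores
```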

Data Transfer CERN → Tier-1s
[Plot: daily transfer volume in TB/day (MB/sec), with the sustained rate required during data taking indicated.]

CMS Data Volume, TBytes/day, all sites to all sites

LHC Computing Grid Status Summary
The final pre-startup testing is now going on: all experiments simultaneously exercise their full computing chains, from the data acquisition system to end-user analysis at the Tier-2s, at the full 2008 scale.
No show-stoppers since the tests began in February; day-to-day issues are being rapidly resolved. Most performance and scaling metrics have been achieved.
BUT this is research. The actual load, access patterns and user behaviour are unpredictable, depending on how physicists react to what they find in the real data.
We can look forward to an exciting time when the beam starts!!

Summary: Grids are all about sharing
They are a means whereby groups distributed around the world can pool their computing resources: large centres and small centres can all contribute, and users everywhere can get equal access to data and computation without having to spend all of their time seeking out the resources.
Grids also allow the flexibility to place the computing facilities in the most effective and efficient places: exploiting funding wherever it is provided, piggy-backing on existing computing centres, or exploiting cheap and renewable energy sources.
The LHC provides a pilot application, with massive computing requirements and world-wide collaborations, that is already demonstrating that grids can deliver in production -- and the scientific success of the LHC will depend on the grid from day 1.

Computing at the Terra-Scale
Acknowledgements: Julia Andreeva, Ian Bird, David Colling, David Foster, Jürgen Knobloch, Faïrouz Malek, the LCG Collaboration, EGEE, OSG, the LHC experiments.