Considerations for a grid-based Physics Analysis Facility. Dietrich Liko


Introduction
- The aim of our grid activities is to enable physicists to do their work
- Latest GANGA developments
- PANDA
- Tier-3 Taskforce: a special workgroup in ATLAS, also concerned with Analysis Facilities
- It tries to bridge the gap between the grid and local facilities
- Very relevant for all considerations in Austria

The GANGA system
- User-friendly job submission tool
- Extensible thanks to its plugin system
- Support for several applications: Athena, AthenaMC (ATLAS); Gaudi, DaVinci (LHCb); others
- Support for several backends: LSF, PBS, SGE, etc.; gLite WMS, NorduGrid, Condor; DIRAC, PANDA (under development) (a minimal job sketch follows below)
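
As an illustration of the plugin idea, here is a minimal sketch of a job definition in the GANGA Python shell, using the core Executable application. The executable, job name and backend choice are just examples; an ATLAS user would typically configure an Athena application instead.

```python
# Run inside the GANGA shell ("ganga"), where Job, Executable, LCG, ...
# are predefined plugin classes; no imports are needed there.
j = Job(name='hello_grid')
j.application = Executable(exe='/bin/echo', args=['Hello from the grid'])
j.backend = LCG()        # swap for Local(), LSF(), Condor(), ... without
                         # touching the application or inputs
j.submit()

# For ATLAS analysis one would instead use the Athena application plugin,
# e.g. j.application = Athena(), configured with the user's job options.
```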

GANGA Usage

GANGA Plans
- GANGA currently supports 40+ sites
- The PANDA system is being deployed on LCG sites right now
- PANDA will also be an excellent backend for GANGA
- Some collaboration between GANGA and PANDA is ongoing right now

PANDA for Production
- The PANDA system has been chosen as the production system for ATLAS
- PANDA is a pilot-based system, which gives higher efficiency (a conceptual sketch of the pilot model follows below)
- PANDA incorporates better support for ATLAS workflows that include data movement
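
To illustrate why a pilot-based system is more efficient, here is a conceptual Python sketch of the pilot model only, not PANDA's actual code: the pilot occupies a worker node first, checks the environment, and only then pulls real payloads from a central task queue (late binding).

```python
# Conceptual sketch of the pilot model (not PANDA code): a pilot lands on
# a worker node, checks the environment, then pulls payloads from a
# central task queue until none remain, so CPU slots are never tied to a
# specific job in advance.
import subprocess

def run_pilot(task_queue):
    # A real pilot would first validate the node: software releases,
    # scratch space, outbound network, ...
    while task_queue:
        payload = task_queue.pop(0)          # ask the server for the next job
        subprocess.run(payload, shell=True)  # execute the analysis payload

# Toy payloads standing in for real transformations
run_pilot(["echo 'payload 1'", "echo 'payload 2'"])
```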

PANDA System

ATLAS Data Formats
- RAW: has to be recorded on permanent storage; only a small fraction can be analyzed directly
- Event Summary Data (ESD): output of the reconstruction; often large, difficult to analyze the full set
- Analysis Object Data (AOD): quantities particularly relevant for physics analysis; main input for analysis, distributed to many sites
- Analysis-specific data (DPD): thinned AODs and user-specific data

ATLAS Analysis Model
Basic principle: smaller data can be read faster
- Skimming: keep interesting events
- Thinning: keep interesting objects within events
- Slimming: keep interesting information within objects
- Reduction: build higher-level data
(A toy sketch of these reduction steps follows below.)
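
To make the four reduction steps concrete, here is a toy Python sketch on an invented event structure (plain dictionaries standing in for AOD contents). It is not ATLAS software, just an illustration of what each step removes.

```python
# Invented toy events; all fields and cut values are illustrative only.
events = [
    {"trigger_ok": True,
     "electrons": [{"pt": 45.0, "eta": 0.3, "shower_shape": 0.9},
                   {"pt": 8.0,  "eta": 2.1, "shower_shape": 0.7}],
     "jets": [{"pt": 120.0, "eta": 1.1, "constituents": [1, 2, 3]}]},
    {"trigger_ok": False, "electrons": [], "jets": []},
]

# Skimming: keep interesting events
skimmed = [ev for ev in events if ev["trigger_ok"]]

# Thinning: keep interesting objects within each event
for ev in skimmed:
    ev["electrons"] = [el for el in ev["electrons"] if el["pt"] > 20.0]

# Slimming: keep interesting information within each object
for ev in skimmed:
    ev["electrons"] = [{"pt": el["pt"], "eta": el["eta"]} for el in ev["electrons"]]

# Reduction: build higher-level data (here, a per-event summary)
dpd = [{"n_electrons": len(ev["electrons"]),
        "leading_jet_pt": max((j["pt"] for j in ev["jets"]), default=0.0)}
       for ev in skimmed]

print(dpd)   # [{'n_electrons': 1, 'leading_jet_pt': 120.0}]
```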

Physics Use Cases
- AODs reside at the T2s; too large to transfer a large fraction to a T3
- D1PDs are 10-20 times smaller; analysis at the T3 with ROOT or AthenaROOTAccess is possible, but would still take time and is probably better done at the T2
- The Tier-3 will hold D2PD/D3PD: 10 TB corresponds to 100% of 10^9 events at 10 KB/event in D3PD format (a quick check of this figure follows below)
- Rough size estimate: 25 cores and 25 TB per analysis (not per user); can be substantially more!
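
The 10 TB figure is simply event count times event size; a quick check, assuming decimal units (1 TB = 10^12 bytes):

```python
# Quick check of the slide's storage estimate (decimal units assumed).
n_events = 1e9            # 10^9 events (full sample)
event_size_kb = 10        # 10 KB per event in D3PD format
total_bytes = n_events * event_size_kb * 1e3
print(total_bytes / 1e12, "TB")   # -> 10.0 TB
```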

Analysis Scenarios

Hardware
- Difficult to name specific vendors and systems
- Consult your T1/T2/computing centre or a neighboring T3 about their hardware
- Combined hardware purchases might lower prices
- Also check large vendors: they often have R&D prices and are willing to negotiate
- In the end it comes down to local considerations

Operating System
- ATLAS software is best supported on SL(C)4
- SL4 might be too old for some modern hardware
- A different OS may be preferred by your administrator
- Some features need a specific OS (e.g. the Sun Thumper needs Solaris for ZFS)
- You can do analysis on platforms other than SLC4, but official support might not be available

Installation
- Depends on the size of your setup
- Very few machines: installation by hand is probably most efficient
- Larger clusters: you need installation and maintenance tools; many HEP and non-HEP examples exist
- Do not forget monitoring

Batch system
- Many are around. If you must choose one: Condor, PBS/Maui/Torque and SGE are freely available and scalable enough for a T3
- Choose the one that is best supported in your region
- An example Maui/Torque installation is on the Wiki; this combination is probably the best supported in the EGEE world (a toy submission sketch follows below)
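
For concreteness, here is a minimal sketch of driving a Torque/PBS submission from Python. The queue name, walltime and analysis script are placeholders, and it assumes qsub is available on the submit host.

```python
# Minimal Torque/PBS submission sketch; queue, walltime and the analysis
# driver script are hypothetical examples.
import subprocess, textwrap

script = textwrap.dedent("""\
    #!/bin/bash
    #PBS -N aod_analysis
    #PBS -q atlas
    #PBS -l walltime=02:00:00
    cd $PBS_O_WORKDIR
    ./run_my_analysis.sh          # hypothetical analysis driver
""")

with open("analysis.pbs", "w") as f:
    f.write(script)

subprocess.run(["qsub", "analysis.pbs"], check=True)
```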

PROOF
- Interesting technology, shows good scalability
- Many ATLAS groups are looking into it: Munich, Madrid, Valencia, BNL, Wisconsin; ALICE relies on it
- Future of PROOF development uncertain (status of the PROOF workshop)
- Summary from Johannes Elmsheuser (more in the Wiki; a minimal sketch follows below):
  - Dedicate a few nodes to PROOF analysis
  - Use a cluster or dCache/xrootd file system
  - Will scale to the 5-10 users typical of an institute
  - A sysadmin or computing physicist needs to maintain the cluster
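
As a concrete picture of the "few dedicated nodes" setup, here is a minimal PyROOT sketch of running a TSelector over a chain of ntuples with PROOF. The tree name, file pattern, selector and session URL are placeholders, not taken from the talk.

```python
# Minimal PyROOT/PROOF sketch; all names below are placeholders.
import ROOT

chain = ROOT.TChain("CollectionTree")           # assumed tree name
chain.Add("/data/d3pd/user.analysis.*.root")    # assumed file location

ROOT.TProof.Open("lite://")     # PROOF-Lite on the local cores; a cluster
                                # would use its master URL instead
chain.SetProof()                # route Process() through the PROOF session
chain.Process("MySelector.C+")  # user-supplied TSelector, compiled with ACLiC
```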

Storage System
- A first attempt at a comparison matrix
- HEPiX is also looking into these issues

Data import
- Probably done via DDM/DQ2; an SRM endpoint at the T3 is helpful
- FTS channels should be available; needs some coordination with the T1/T2s in your cloud!
- Users can also write files with lcg-cr directly to their SE from another T2 (see the sketch below); in the future it will be the only place for you to store your own data!
- Probably no export of data necessary, although it is interesting for regional/national collaborations
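
A sketch of the lcg-cr usage mentioned above, wrapped in Python for consistency with the other examples. The SE hostname, LFN path and local file are placeholders, not real endpoints.

```python
# Copy-and-register a user file onto the T3 storage element with lcg-cr;
# SE hostname, LFN and local path below are placeholders.
import subprocess

cmd = [
    "lcg-cr", "--vo", "atlas",
    "-d", "se.example-t3.at",                            # destination SE (placeholder)
    "-l", "lfn:/grid/atlas/users/someuser/ntuple.root",  # logical file name (placeholder)
    "file:///home/someuser/ntuple.root",                 # local source file
]
subprocess.run(cmd, check=True)
```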

Computing Element
- If the batch system runs fine, the Grid interface is an add-on
- Interesting for regional/national collaborations
- Weigh the additional work against the additional benefits
- Choose the middleware you know best / that your region supports

Software
- ATLAS software: automatic installation via the Grid is possible; manual installation is easy
- In any case: have a shared filesystem with ~100 GB of space; NFS should be OK; AFS also works, but Grid installations are more difficult
- Grid UI: use the one from CERN AFS, or maintain your own local copy

Summary
- GANGA is proving to be an analysis tool for a large community in ATLAS
- Support for users does not stop there: Tier-3 or Analysis Facility support is necessary
- The Wiki contains more details

Personal summary
- I had a lot of fun supporting ATLAS's move to enable the grid for user analysis
- I will move very soon now to CMS and Vienna
- I look forward to staying in contact with my friends in ATLAS, both at CERN and in Innsbruck