The ATLAS Distributed Analysis System

The ATLAS Distributed Analysis System
F. Legger (LMU), on behalf of the ATLAS collaboration
October 17th, 2013
20th International Conference on Computing in High Energy and Nuclear Physics (CHEP), Amsterdam

Outline
Distributed Analysis in ATLAS during LHC Run 1:
- Computing model: "job goes to data" paradigm
- Distributed analysis performance: statistics, efficiency, failures, data types
Future developments and their implications on distributed analysis for Run 2:
- new production system
- new analysis model
- remote file access

ATLAS Computing Model 2011-2013
[Diagram: ATLAS Grid infrastructure]
- Central production and Distributed Data Management (DDM): data processing at T0-T1, MC and group production at T1-T2; dataset volumes of order PB; DDM tool: dq2
- Derived formats: AOD (Analysis Object Data) and D3PD (flat ntuple)
- Distributed analysis (this talk): group analysis at T1-T2 (TB), user analysis at T1-T2-T3 (GB); user tools for grid job submission: PanDA, Ganga
- Paradigm: job goes to data
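
The "job goes to data" paradigm means that the brokerage sends a user job to a site that already hosts a replica of the requested input dataset, instead of moving data to the job. The following is a minimal sketch of that idea; the replica catalogue, site names and load figures are invented for the illustration, and the real PanDA brokerage uses many more criteria.

```python
# Toy illustration of the "job goes to data" brokerage paradigm.
# Replica catalogue, site names and load figures are invented for this sketch;
# the real PanDA brokerage uses many more criteria (site status, pilots, shares, ...).

# hypothetical replica catalogue: dataset -> sites holding a complete replica
replicas = {
    "data12_8TeV.periodB.AOD": ["CERN-PROD", "FZK-LCG2", "MWT2"],
    "mc12_8TeV.ttbar.D3PD":    ["BNL-ATLAS", "DESY-HH"],
}

# hypothetical number of queued analysis jobs per site
queued_jobs = {"CERN-PROD": 900, "FZK-LCG2": 120, "MWT2": 340,
               "BNL-ATLAS": 50, "DESY-HH": 400}

def broker(dataset):
    """Send the job where the data is: pick the least-loaded site holding a replica."""
    sites = replicas.get(dataset)
    if not sites:
        raise RuntimeError("no replica found for %s" % dataset)
    return min(sites, key=lambda s: queued_jobs.get(s, 0))

if __name__ == "__main__":
    for ds in replicas:
        print(ds, "->", broker(ds))
```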

Running Grid jobs 2011-2013
Analysis share of running jobs:
- T0: 0%
- T1: 20% (2011), 5-50% (2012-2013)
- T2: 50%
- T3: 100%
[Plot: running grid jobs split into simulation, analysis and reconstruction]

Pending Grid jobs 2011-2013
- Periodic peaks in pending analysis jobs around major conferences (Moriond, ICHEP, HCP, EPS)
- Would benefit from a more dynamic balance between production and analysis resources
[Plot: pending analysis and simulation jobs over time]

ATLAS Grid jobs 2011-2013
5% grid-related failure rate, mostly due to storage failures
- User analysis: 400M jobs, efficiency 80%
- Group analysis: 30M jobs, efficiency > 90%
- Group production: 40M jobs, efficiency > 90%
- MC production: 200M jobs, efficiency > 90%
- Data processing: 20M jobs, efficiency > 90%
(NB: cancelled jobs have walltime close to 0, so they have a negligible impact on the efficiency)
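
The efficiencies above are job-count based; since cancelled jobs have walltime close to zero, a walltime-weighted figure comes out very similar. A small sketch with invented job records illustrates both ways of counting:

```python
# Sketch of job-count vs walltime-weighted efficiency.
# The job records below are invented; real numbers come from the PanDA monitoring.

jobs = [
    # (status, walltime in hours)
    ("finished", 1.2), ("finished", 0.8), ("finished", 2.5), ("finished", 1.0),
    ("failed",   1.5),
    ("cancelled", 0.0),   # cancelled jobs have walltime close to 0
]

finished  = [w for s, w in jobs if s == "finished"]
failed    = [w for s, w in jobs if s == "failed"]
cancelled = [w for s, w in jobs if s == "cancelled"]

# job-count efficiency (cancelled jobs excluded, as in the numbers quoted above)
count_eff = len(finished) / float(len(finished) + len(failed))

# walltime-weighted efficiency: cancelled jobs enter with ~0 walltime,
# so the two definitions are numerically close
wall_eff = sum(finished) / (sum(finished) + sum(failed) + sum(cancelled))

print("count efficiency:    %.0f%%" % (100 * count_eff))
print("walltime efficiency: %.0f%%" % (100 * wall_eff))
```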

Analysis jobs 2011-2013
- Flat failure rate
- Peaks of user jobs at busy times (Moriond, ICHEP and the Higgs discovery, HCP, EPS)
[Plot: successful, failed and cancelled analysis jobs over time]

Efficiency of analysis jobs 2011-2013
- Average of 400k analysis jobs/day, efficiency 80%
- Wall clock consumption of failed jobs: 20%
- Most failures are related to user code (Athena crashes), the rest mainly to storage (Put/Get errors)
[Plots: completed, cancelled and failed jobs; wall clock consumption of failed jobs by error type: Put error, Get error, Athena crash]

Failed analysis jobs
What we do to prevent them:
- Automatic site testing and exclusion from brokerage with HammerCloud (see dedicated talk)
- DAST (Distributed Analysis Support Team): user support through a dedicated mailing list, with expert shifters helping users 16 hours/day (American and European time zones)
What we plan to do:
- Better error reporting, for automatic retries and easier debugging
- Automatically kill user jobs in case of massive failures (an illustrative heuristic is sketched below)
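
Automatically killing user jobs after massive failures is listed as a plan, and its ATLAS implementation is not described in this talk. Purely as an illustration, a simple heuristic could count failures per (user, task) and intervene once both an absolute threshold and a failure fraction are exceeded; the thresholds below are invented:

```python
# Illustrative heuristic for spotting massive user job failures.
# Thresholds and the decision to kill remaining jobs are invented for this
# sketch and are not the actual ATLAS policy.

from collections import defaultdict

FAILURE_THRESHOLD = 1000    # hypothetical: failed jobs before intervening
MIN_FAILURE_FRACTION = 0.9  # hypothetical: fraction of finished+failed jobs that failed

stats = defaultdict(lambda: {"finished": 0, "failed": 0})

def record(user, task, status):
    stats[(user, task)][status] += 1

def should_kill(user, task):
    s = stats[(user, task)]
    total = s["finished"] + s["failed"]
    if s["failed"] < FAILURE_THRESHOLD:
        return False
    return s["failed"] / float(total) >= MIN_FAILURE_FRACTION

# example: one task failing almost everywhere
for _ in range(1500):
    record("alice", "task_42", "failed")
for _ in range(20):
    record("alice", "task_42", "finished")

print(should_kill("alice", "task_42"))   # True -> stop brokering / kill remaining jobs
```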

Analysis job length, January 2013
[Plots: run time in hours for jobs running on AODs and for jobs running on D3PDs]
- Most analysis jobs last 1 hour or less and produce many small files; there is also a tail of longer jobs
- Grid resources could be used more efficiently if job parameters were estimated in advance

Future
Production system: ProdSys2 (DEFT/JEDI), DDM based on Rucio
- Based on tasks rather than individual jobs: more complex work-flows
- Simplification of client tools: brokering and task/job management moved server-side
- Scout jobs to estimate needed resources, for more efficient usage of grid resources (see the sketch below)
- Better retry mechanism for failed jobs
- Faster job submission times (see dedicated talks)
New analysis model
- CMS plans to use the same PanDA foundation for analysis
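
Scout jobs run a small fraction of a task first and use the measured resource consumption to set the parameters of the remaining jobs. The sketch below shows the idea with invented measurements and a simple safety margin; it is not the JEDI implementation.

```python
# Sketch of scout-job based resource estimation (the idea behind JEDI scouts).
# Measurements and the 20% safety margin are invented for this illustration.

scout_results = [
    # (events processed, walltime in s, peak memory in MB) from a few scout jobs
    (1000, 3600, 1800),
    (1000, 3900, 1750),
    (1000, 3300, 1900),
]

def estimate(scouts, events_per_job, margin=1.2):
    """Scale per-event walltime and peak memory from scouts to full-size jobs."""
    per_event_time = sum(t / float(n) for n, t, _ in scouts) / len(scouts)
    peak_memory = max(m for _, _, m in scouts)
    return {
        "walltime_s": int(per_event_time * events_per_job * margin),
        "memory_mb": int(peak_memory * margin),
    }

print(estimate(scout_results, events_per_job=20000))
```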

Input data, January-March 2013
AOD vs D3PD:
- D3PD is the main user format (used e.g. by the Higgs group); the ratio of AOD to D3PD users is about 1/4
- Full reconstruction info is only available in the AOD
- Size of a single D3PD relative to the AOD: up to 50%
- Almost 100 D3PD flavours! The total size of a full version of the main D3PDs is 3 times larger than that of the AODs
- D3PD production takes 5-10 s/event; producing all D3PD flavours takes 30 s/event

Future analysis model
- Centrally provided reduction framework for slimming/skimming (a toy sketch is given after this list)
- Merged AOD+D3PD format: AODx, readable with both ROOT and Athena
- New analysis framework supporting analysis of both reduced and original AODx, integrated with grid tools
- Standardized application of Combined Performance (CP) recommendations
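
Skimming keeps only selected events, while slimming keeps only selected variables within each event. The toy sketch below applies both steps to dictionary-like events; the event content and the selection cut are invented, and the centrally provided framework of course operates on ATLAS event data rather than plain dictionaries.

```python
# Toy illustration of skimming (drop events) and slimming (drop variables).
# Event content and the selection cut are invented for this sketch.

events = [
    {"run": 1, "evt": 1, "n_muons": 2, "met": 45.0, "jets": [30.0, 25.0], "cells": [0.1] * 500},
    {"run": 1, "evt": 2, "n_muons": 0, "met": 12.0, "jets": [80.0],        "cells": [0.2] * 500},
    {"run": 1, "evt": 3, "n_muons": 1, "met": 90.0, "jets": [],            "cells": [0.3] * 500},
]

KEEP_VARIABLES = ("run", "evt", "n_muons", "met", "jets")   # slimming: drop e.g. "cells"

def skim(event):
    """Skimming cut: keep events with at least one muon (invented example)."""
    return event["n_muons"] >= 1

def slim(event):
    """Slimming: keep only the variables needed by the analysis."""
    return {k: event[k] for k in KEEP_VARIABLES}

reduced = [slim(e) for e in events if skim(e)]
print(len(reduced), "of", len(events), "events kept")
```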

Disk space needed at the end of 2015
Current model (AOD+D3PD) vs new model (AODx):

  Sample       | events     | MB/evt | AOD (PB) | D3PD (PB) | AODx (PB)
  2012, data   | 3.9 x 10^9 | 0.24   | 0.9      | 2.8       | 1.2
  2012, MC     | 4.5 x 10^9 | 0.40   | 1.8      | 5.4       | 2.2
  2015, data   | 5 x 10^9   | 0.24   | 2.4      | 14.4      | 6
  2015, MC     | 6 x 10^9   | 0.40   | 4.8      | 28.8      | 12

Assumptions: AODx/AOD = 1.25, D3PD/AOD = 3; total size of Run 1 is 2 x the size of 2012; 1 copy of Run 1 data; 2 copies of 2015 data; 2 versions of D3PDs/AODx

               | AOD+D3PD (PB) | AODx (PB)
  Run 1        | 22            | 7
  2015         | 50            | 18
  Total        | 72            | 25
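
The table entries follow from the stated assumptions (AODx/AOD = 1.25, D3PD/AOD = 3, 2 copies and 2 versions of the 2015 derived data, Run 1 twice the size of 2012). The short sketch below redoes the arithmetic:

```python
# Reproduce the disk space table from the assumptions quoted above.
# Sizes in PB; 1 PB = 1e9 MB is used for the event-size conversion.

AODX_OVER_AOD = 1.25
D3PD_OVER_AOD = 3.0

samples = {
    # (events, MB/event, replica copies, versions of derived data)
    "2012, data": (3.9e9, 0.24, 1, 1),
    "2012, MC":   (4.5e9, 0.40, 1, 1),
    "2015, data": (5.0e9, 0.24, 2, 2),
    "2015, MC":   (6.0e9, 0.40, 2, 2),
}

def volumes(events, mb_per_evt, copies, versions):
    aod = events * mb_per_evt * copies / 1e9          # PB, including replicas
    d3pd = aod * D3PD_OVER_AOD * versions
    aodx = aod * AODX_OVER_AOD * versions
    return aod, d3pd, aodx

run1_old = run1_new = y2015_old = y2015_new = 0.0
for name, pars in samples.items():
    aod, d3pd, aodx = volumes(*pars)
    print("%-10s AOD %4.1f  D3PD %5.1f  AODx %5.1f PB" % (name, aod, d3pd, aodx))
    if name.startswith("2012"):
        run1_old += 2 * (aod + d3pd)   # Run 1 is assumed twice the 2012 size
        run1_new += 2 * aodx
    else:
        y2015_old += aod + d3pd
        y2015_new += aodx

print("Run 1: %.0f PB (AOD+D3PD) vs %.0f PB (AODx)" % (run1_old, run1_new))
print("2015 : %.0f PB (AOD+D3PD) vs %.0f PB (AODx)" % (y2015_old, y2015_new))
print("Total: %.0f PB vs %.0f PB" % (run1_old + y2015_old, run1_new + y2015_new))
```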

The Federated ATLAS XrootD system (FAX)
- Storage federation aiming to treat Tier-1, Tier-2 and Tier-3 storage space as a single distributed storage system, relaxing the data-CPU locality paradigm
- FAX routes the client to the nearest site with the requested data (the scope of "nearest" is yet to be defined)
- Data access through client software such as ROOT or xrdcp (other protocols such as HTTP are under evaluation)
- Requires an advanced caching mechanism such as TTreeCache; transparent for the user (a minimal sketch follows below)
- Different input access modes are currently under evaluation (see dedicated talk)
- Use cases: quick scans through large data samples, fetching missing files remotely instead of failing the job, use of opportunistic resources
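
Reading over FAX uses standard ROOT I/O against a root:// URL, with TTreeCache batching remote reads to reduce the number of round trips over the WAN. A minimal PyROOT sketch is given below; the redirector host, file path and tree name are placeholders, not real ATLAS endpoints.

```python
# Minimal PyROOT sketch of reading a file through an xrootd federation with TTreeCache.
# The redirector, file path and tree name below are placeholders.

import ROOT

url = "root://some-fax-redirector.example.org//atlas/rucio/user/somefile.root"  # placeholder
f = ROOT.TFile.Open(url)                 # the federation redirects to a site holding the file
tree = f.Get("physics")                  # placeholder tree name

tree.SetCacheSize(30 * 1024 * 1024)      # 30 MB TTreeCache to batch remote reads
tree.SetCacheLearnEntries(100)           # let ROOT learn which branches are actually used

for i in range(tree.GetEntries()):
    tree.GetEntry(i)
    # ... analysis code using the tree branches goes here ...

f.Close()
```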

Conclusions
Successful deployment of Distributed Analysis in ATLAS during Run 1:
- 400k user jobs/day, 80% efficiency, 5% grid-related failures (mostly due to storage failures)
- HammerCloud auto-exclusion to ensure smooth grid operation
- DAST to provide user support
Several challenges coming up: new analysis model, new production system, remote file access with FAX
Future developments:
- Balance of production vs analysis resources, use of opportunistic non-ATLAS resources
- Features to improve the user experience: estimates of time-to-completion, improved job monitoring, improved automatic job retries, job output to local storage
- Features to smooth grid operations: definition of user job limits, monitoring of massive user job failures

Backup

Assumptions for the disk space calculation

                               | 2012 (8 TeV)           | 2015 (14 TeV)
  Trigger rate [events/s]      | 400 + 150 (delayed)    | 1000
  Data events                  | (3 + 1 delayed) x 10^9 | 5 x 10^9
  MC events                    | 4.5 x 10^9             | 6 x 10^9
  Raw size [MB/event]          | 0.8                    | 1.1
  AOD size, data [kB/event]    | 250                    | 250
  AOD size, MC [kB/event]      | 400                    | 400
  D3PD size, data [kB/event]   | 50-130                 |
  D3PD size, MC [kB/event]     | 100-200                |