ATLAS Analysis Workshop Summary

Similar documents
Data Analysis in ATLAS. Graeme Stewart with thanks to Attila Krasznahorkay and Johannes Elmsheuser

The ATLAS Distributed Analysis System

Tests of PROOF-on-Demand with ATLAS Prodsys2 and first experience with HTTP federation

Early experience with the Run 2 ATLAS analysis model

UW-ATLAS Experiences with Condor

New data access with HTTP/WebDAV in the ATLAS experiment

ALICE ANALYSIS PRESERVATION. Mihaela Gheata DASPOS/DPHEP7 workshop

TAG Based Skimming In ATLAS

Monitoring of Computing Resource Use of Active Software Releases at ATLAS

The ATLAS EventIndex: Full chain deployment and first operation

Distributed Data Management on the Grid. Mario Lassnig

Computing. DOE Program Review SLAC. Rainer Bartoldus. Breakout Session 3 June BaBar Deputy Computing Coordinator

Big Data Tools as Applied to ATLAS Event Data

Conference The Data Challenges of the LHC. Reda Tafirout, TRIUMF

From raw data to new fundamental particles: The data management lifecycle at the Large Hadron Collider

Event cataloguing and other database applications in ATLAS

ATLAS Computing: the Run-2 experience

Big Data Analytics Tools. Applied to ATLAS Event Data

Analytics Platform for ATLAS Computing Services

The Run 2 ATLAS Analysis Event Data Model

Update of the Computing Models of the WLCG and the LHC Experiments

A new petabyte-scale data derivation framework for ATLAS

Prompt data reconstruction at the ATLAS experiment

Validation of Missing ET reconstruction in xaod

The ATLAS EventIndex: an event catalogue for experiments collecting large amounts of data

PROOF-Condor integration for ATLAS

Computing at Belle II

150 million sensors deliver data. 40 million times per second

Data handling and processing at the LHC experiments

LHCb Computing Status. Andrei Tsaregorodtsev CPPM

Evolution of the ATLAS PanDA Workload Management System for Exascale Computational Science

The ATLAS Trigger Simulation with Legacy Software

LHCb Computing Strategy

PARALLEL PROCESSING OF LARGE DATA SETS IN PARTICLE PHYSICS

Status of KISTI Tier2 Center for ALICE

Improved ATLAS HammerCloud Monitoring for Local Site Administration

ATLAS PILE-UP AND OVERLAY SIMULATION

Experiences with the new ATLAS Distributed Data Management System

Data and Analysis preservation in LHCb

Access to ATLAS Geometry and Conditions Databases

Introduction. creating job-definition files into structured directories etc.

Monitoring for IT Services and WLCG. Alberto AIMAR CERN-IT for the MONIT Team

ATLAS & Google "Data Ocean" R&D Project

The ATLAS Production System

The evolving role of Tier2s in ATLAS with the new Computing and Data Distribution model

Considerations for a grid-based Physics Analysis Facility. Dietrich Liko

Analysis of Σ 0 baryon, or other particles, or detector outputs from the grid data at ALICE

The XENON1T Computing Scheme

ATLAS distributed computing: experience and evolution

ATLAS Simulation Computing Performance and Pile-Up Simulation in ATLAS

Overview of ATLAS PanDA Workload Management

1. Introduction. Outline

C3PO - A Dynamic Data Placement Agent for ATLAS Distributed Data Management

DQ2 - Data distribution with DQ2 in Atlas

PRP. Frank Würthwein (SDSC/UCSD) Jason Nielsen (UCSC) Owen Long (UCR) Chris West (UCSB) Anyes Taffard (UCI) Maria Spiropolu (Caltech)

Belle & Belle II. Takanori Hara (KEK) 9 June, 2015 DPHEP Collaboration CERN

Unified System for Processing Real and Simulated Data in the ATLAS Experiment

Popularity Prediction Tool for ATLAS Distributed Data Management

ANSE: Advanced Network Services for [LHC] Experiments

AGIS: The ATLAS Grid Information System

Reliability Engineering Analysis of ATLAS Data Reprocessing Campaigns

Singularity tests at CC-IN2P3 for Atlas

A data handling system for modern and future Fermilab experiments

ALICE Grid/Analysis Tutorial Exercise-Solutions

Evolution of Cloud Computing in ATLAS

ATLAS Experiment and GCE

Analysis & Tier 3s. Amir Farbin University of Texas at Arlington

Reprocessing DØ data with SAMGrid

Data Transfers Between LHC Grid Sites Dorian Kcira

Challenges and Evolution of the LHC Production Grid. April 13, 2011 Ian Fisk

Evolution of ATLAS conditions data and its management for LHC Run-2

Federated Data Storage System Prototype based on dcache

LCG MCDB Knowledge Base of Monte Carlo Simulated Events

INDEXING OF ATLAS DATA MANAGEMENT AND ANALYSIS SYSTEM

Software and computing evolution: the HL-LHC challenge. Simone Campana, CERN

The CMS Computing Model

Storage Virtualization. Eric Yen Academia Sinica Grid Computing Centre (ASGC) Taiwan

Expressing Parallelism with ROOT

ATLAS NOTE. December 4, ATLAS offline reconstruction timing improvements for run-2. The ATLAS Collaboration. Abstract

Experience with ATLAS MySQL PanDA database service

The High-Level Dataset-based Data Transfer System in BESDIRAC

UK Tier-2 site evolution for ATLAS. Alastair Dewhurst

Simulation and Visualization in Hall-D (a seminar in two acts) Thomas Britton

Machine Learning in Data Quality Monitoring

Computing Model Tier-2 Plans for Germany Relations to GridKa/Tier-1

Evaluation of the computing resources required for a Nordic research exploitation of the LHC

Unix/Linux Basics. Cpt S 223, Fall 2007 Copyright: Washington State University

High Energy Physics data analysis

The Database Driven ATLAS Trigger Configuration System

ATLAS Offline Data Quality Monitoring

New Features in PanDA. Tadashi Maeno (BNL)

ATLAS Distributed Computing Experience and Performance During the LHC Run-2

Summary of the LHC Computing Review

The ATLAS EventIndex: an event catalogue for experiments collecting large amounts of data

The PanDA System in the ATLAS Experiment

LHCb Computing Resource usage in 2017

Scientific data processing at global scale The LHC Computing Grid. fabio hernandez

Data handling with SAM and art at the NOνA experiment

Monte Carlo Production on the Grid by the H1 Collaboration

Computing at the Large Hadron Collider. Frank Würthwein. Professor of Physics University of California San Diego November 15th, 2013

Transcription:

ATLAS Analysis Workshop Summary Matthew Feickert 1 1 Southern Methodist University March 29th, 2016 Matthew Feickert (SMU) ATLAS Analysis Workshop Summary March 29th, 2016 1

Outline 1 ATLAS Analysis with xaod 2 Browsing an xaod 3 Monte Carlo generation and TRUTH derivations 4 Distributed Data Analysis Tools Matthew Feickert (SMU) ATLAS Analysis Workshop Summary March 29th, 2016 2

UChicago/ASC-ANL software tutorial ATLAS analysis tutorial organized jointly by The University of Chicago and the Argonne National Laboratory Analysis Support Center (ANL-ASC) March 7th, 2016 through March 8th, 2016 Matthew Feickert (SMU) ATLAS Analysis Workshop Summary March 29th, 2016 3

ATLAS Analysis with xaod ATLAS data and simulation workflow MC Generator Event Level Detector Simulation Raw Data Object Data Raw Data Object RDO xaod DxAOD analysis group/user code physics Matthew Feickert (SMU) ATLAS Analysis Workshop Summary March 29th, 2016 4

ATLAS Analysis with xaod Run 2 ATLAS Analysis Model Summary xaod The reconstruction output is the same as analysis input ROOT and Athena readable Derivation Framework DxAOD: production of small user xaod Analysis Release Recommended system for finding and collecting analysis software tools CP Tools Common interface for all analysis tools Matthew Feickert (SMU) ATLAS Analysis Workshop Summary March 29th, 2016 5

Browsing an xaod How Can I View an xaod? Good news can quickly browse xaod with TBrowser $ root -l PATHTOXAOD root[0] new TBrowser Can use either interactive ROOT or PyROOT to explore and play with the xaod to learn more about it and its containers Matthew Feickert (SMU) ATLAS Analysis Workshop Summary March 29th, 2016 6

Monte Carlo generation and TRUTH derivations Monte Carlo generation in ATLAS How does ATLAS make Monte Carlo? Event generators are used to generate events Users control generation and physics parameters through Python job files Provides a uniform interface, and allows for dynamic parameter settings Matthew Feickert (SMU) ATLAS Analysis Workshop Summary March 29th, 2016 7

Monte Carlo generation and TRUTH derivations Monte Carlo generation in ATLAS MC Simulation Flow Matthew Feickert (SMU) ATLAS Analysis Workshop Summary March 29th, 2016 8

Monte Carlo generation and TRUTH derivations Monte Carlo generation in ATLAS MC Simulation Flow Most interesting is reconstruction phase Simulation is reconstructed in the same manner as data would be Trigger is fully simulated From the main reconstruction (RAW) Event Summary Data (ESD) is derived From ESD Analysis Object Data (xaod) is derived via fast slimming process xaod is then used by both Athena and ROOT Matthew Feickert (SMU) ATLAS Analysis Workshop Summary March 29th, 2016 9

Monte Carlo generation and TRUTH derivations MC TRUTH Derivations MC TRUTH Derivations Can lean how to generate EVNT files and convert to derivations with Sergei s tutorial Matthew Feickert (SMU) ATLAS Analysis Workshop Summary March 29th, 2016 10

Distributed Data Analysis Tools Tools for finding datasets Run Query: Everything about real data ATLAS website (https://atlas-runquerycernch/) that allows users to query all available run information Example Query find run 2012g1 and events 100000+ / show streams phy* Useful for detector and operation experts and for analysis Web tool and command line tool with identical functionality Matthew Feickert (SMU) ATLAS Analysis Workshop Summary March 29th, 2016 11

Distributed Data Analysis Tools Tools for finding datasets What would I do with Run Query? Examples access trigger menus for a particular run get integrated luminosity information for runs access specific machine conditions or bunch intensity profiles for particular run Can start to get information with nothing more then just the run number Matthew Feickert (SMU) ATLAS Analysis Workshop Summary March 29th, 2016 12

Distributed Data Analysis Tools Tools for finding datasets Conditions (and Configuration) Metadata for ATLAS (COMA) Metadata for all the run data Period Menu: access information on groups of runs (periods) Report Mneu: access metadata of specific conditions for runs Matthew Feickert (SMU) ATLAS Analysis Workshop Summary March 29th, 2016 13

Distributed Data Analysis Tools Tools for finding datasets ATLAS Metadata Interface (AMI) AMI is a generic framework for metadata catalogues Allows for finding names of valid datasets Detailed information on datasets size file counts, event counts software config parameters MC parameters lumi blocks Matthew Feickert (SMU) ATLAS Analysis Workshop Summary March 29th, 2016 14

Distributed Data Analysis Tools The Grid Rucio What does rucio do? Find datasets and where they are located in the world List the files of a dataset List the parent dataset for a file Retrieve datasets Create datasets Upload datasets Get data and MC with Rucio Replacing DQ2 from Run I Both a command line and web interface Matthew Feickert (SMU) ATLAS Analysis Workshop Summary March 29th, 2016 15

Distributed Data Analysis Tools Usage Federated ATLAS Xrootd (FAX) Allows direct access to most of ATLAS data from anywhere Read only Can get data from 64 sites (US, Germany, UK, France, CERN, ) Corresponds to 98% of all ATLAS data FAX allows data to be asked for from an endpoint and will deliver the closest (physically) located version Matthew Feickert (SMU) ATLAS Analysis Workshop Summary March 29th, 2016 16

Distributed Data Analysis Tools Usage Why use FAX? Increases redundancy If your job requires a file that is inaccessible, FAX will find another copy and your job still runs FAX finds files for you Your file names are independent of run site Can merge data from multiple physical locations to a central output location Remote skimming and slimming of datasets Matthew Feickert (SMU) ATLAS Analysis Workshop Summary March 29th, 2016 17

Distributed Data Analysis Tools Usage BigPanDA PanDA is the system that lets production jobs and user jobs run on the GRID Compared to running on a Tier3 (ManeFrame) Allows for more computing resources Can run on data stored offsite Can take longer for jobs in the queue to run (Global vs local) Percentage of jobs that die is higher When is running on the GRID the best? When size matters Production of ntuples from large DxAODs MC production When xaod/raw is required Matthew Feickert (SMU) ATLAS Analysis Workshop Summary March 29th, 2016 18

Distributed Data Analysis Tools Usage BigPanDA After setup PanDA: HelloWorld $ mkdir PanDA-helloworld $ cd PanDA-helloworld $ lsetup panda $ echo print "hello world" > helloworldpy $ prun --outds useryournametestinghelloworld_v1 --exec python helloworldpy INFO : succeeded new jeditaskid=1234567 Then can monitor tasks on the BigPanDA web interface Matthew Feickert (SMU) ATLAS Analysis Workshop Summary March 29th, 2016 19

Distributed Data Analysis Tools Usage BigPanDA Tutorials/Resources ATLAS Computing Software Tutorial Using the Grid prun Examples pathena Examples Matthew Feickert (SMU) ATLAS Analysis Workshop Summary March 29th, 2016 20

Backup Slides Backup Slides Backup Matthew Feickert (SMU) ATLAS Analysis Workshop Summary March 29th, 2016 21

Backup Slides Matthew Feickert (SMU) ATLAS Analysis Workshop Summary March 29th, 2016 22