Analysis & Tier 3s. Amir Farbin University of Texas at Arlington


1 Analysis & Tier 3s Amir Farbin University of Texas at Arlington

2 Introduction
Tug of war between analysis and computing requirements:
- Analysis Model: physics requirements / user preferences (whims); organization of data / steps of analysis; software technologies / design.
- Computing Model: disk, CPU, network, cluster architecture; production, DDM, DA.
Everyone would love to have the data everywhere, with instant processing... Simulated data (e.g. the FDR) is far short of the LHC data volume. Today users practice analysis models which won't scale to real data... We all systematically underestimate the LHC computing challenges... and this is reflected in our AM.
Important that this community presses back on the Analysis Model... it should be an iterative process.

3 Overview
I will not talk about production activity at tier 3s.
- Overview of the relevant parts of the computing model / analysis model.
- Talk specifically about the Analysis Model expectations of tier 3s. Tier 3s are still not defined.
- One idea about an easy way to build a tier 3. There are lots of different types of tier 3.
Apologies for being provocative... please point out my mistakes.

4 Analysis Model

5 ATLAS Computing
[Data-flow diagram: generation, full and fast simulation, and digitization feed the data store; the high-level trigger writes new data at 200 Hz; reconstruction, algorithmic analysis, interactive analysis, and statistical analysis follow, with rates ranging from mHz to MHz. Roughly 10^9 events/year. ATLAS will only simulate ~20% of the data.]

6 The Event Data Model
- RAW (Raw Data Objects): raw channels. 1.6 MB/event.
- ESD (Event Summary Data): reconstruction output, intended for calibration. 500 KB/event. Cells, hits, tracks, clusters, electrons, jets, ...
- AOD (Analysis Object Data): summary of the event, intended for analysis. 100 KB/event. Light-weight tracks, clusters, electrons, jets, electron cells, muon HitOnTrack, ...
- TAG: summary of the event, intended for selection. 1 KB/event. Trigger decision, pT of the 4 best electrons, jets, ...
- DPD (Derived Physics Data): intended for interactive analysis. ~10-20 KB/event. Whatever is necessary for a specific analysis/calibration/study.
Each step is a data refinement of the previous one.
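To put the per-event sizes in context, here is a back-of-the-envelope sketch in Python of the implied yearly volumes, assuming the ~10^9 events/year from slide 5 and decimal units; it is only a rough estimate, not an official accounting.

```python
# Rough yearly data volumes implied by the per-event sizes above,
# assuming ~1e9 events/year (slide 5) and decimal units (1 TB = 1e12 bytes).
events_per_year = 1e9

sizes_kb = {
    "RAW": 1600,   # 1.6 MB/event
    "ESD": 500,
    "AOD": 100,
    "DPD": 15,     # ~10-20 KB/event
    "TAG": 1,
}

for fmt, kb_per_event in sizes_kb.items():
    terabytes = events_per_year * kb_per_event * 1e3 / 1e12
    print(f"{fmt:>4}: ~{terabytes:6.0f} TB/year")
```

These round numbers (AOD ~100 TB/year, a DPD at 1/10-1/20 of that, i.e. ~5-10 TB/year) are the ones that reappear on slides 23 and 27.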

7 The Computing Model
Resources are spread around the GRID. Data flows RAW -> Tier 0 -> (RAW/ESD/AOD) Tier 1 -> (AOD) Tier 2 -> (DPD) Tier 3.
- Tier 0 (CERN): derive 1st-pass calibrations within 24 hours; reconstruct the rest of the data keeping up with data taking.
- CERN Analysis Facility: primary purpose is calibrations. A small subset of the collaboration will have access to the full ESD. Limited access to RAW data.
- Tier 1 (10 sites worldwide): reprocessing of the full data with improved calibrations 2 months after data taking. Managed tape access: RAW, ESD. Disk access: AOD, fraction of ESD.
- Tier 2 (30 sites worldwide): production of simulated events. User analysis: 12 CPU/analyzer. Disk store: AOD.
- Tier 3: interactive analysis. Plots, fits, toy MC, studies, ...

8 Analysis Activity
- Re-reconstruction/re-calibration: CPU intensive... often necessary.
- Algorithmic analysis: data manipulations ESD -> AOD -> DPD -> DPD.
  Skimming: keep interesting events. Thinning: keep interesting objects in events. Slimming: keep interesting info in objects (see the sketch below). Reduction: build higher-level data which encapsulates the results of algorithms.
  Basic principle: data optimization + CPU-intensive algorithms up front -> more portable input & less CPU in later stages.
- Interactive analysis: making plots / performing studies on highly reduced data.
- Statistical analysis: perform fits, produce toy Monte Carlos, calculate significance.
Tier 1/2 activity: framework (i.e. Athena) based, resource intensive, large scale (lots of data), organized, batch access only.
Tier 3 activity: often outside the framework, interactive (the primary difference).
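The three reductions can be pictured with a toy event loop. This is only a schematic Python illustration (the dict-based "events" and the cuts are invented for the example), not the actual Athena/ARA tools.

```python
# Schematic skimming / thinning / slimming on toy events.
# An "event" here is just a dict of object lists; all names and cuts are illustrative.

def skim(events, min_electron_pt=20.0):
    """Skimming: keep only interesting events."""
    return [ev for ev in events
            if any(el["pt"] > min_electron_pt for el in ev["electrons"])]

def thin(event, min_track_pt=1.0):
    """Thinning: keep only interesting objects within an event."""
    event["tracks"] = [trk for trk in event["tracks"] if trk["pt"] > min_track_pt]
    return event

def slim(event, keep=("pt", "eta", "phi")):
    """Slimming: keep only interesting info inside each object."""
    event["electrons"] = [{k: el[k] for k in keep if k in el}
                          for el in event["electrons"]]
    return event

events = [
    {"electrons": [{"pt": 35.0, "eta": 0.1, "phi": 1.2, "cluster_cells": [0.5, 1.2]}],
     "tracks": [{"pt": 0.4}, {"pt": 12.0}]},
    {"electrons": [{"pt": 8.0, "eta": -1.0, "phi": 0.3, "cluster_cells": [0.1]}],
     "tracks": [{"pt": 2.5}]},
]

reduced = [slim(thin(ev)) for ev in skim(events)]
print(reduced)  # one event survives, with thinned tracks and slimmed electrons
```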

9 Analysis Activity (continued)
Same slide as before, with an overlaid point: to the user, the defining difference between tier 2 and tier 3 is interactive access.
- Users need a place to log in, in order to: develop code and run test jobs; submit large-scale analysis to the GRID; gather results from the GRID and perform the final stages of analysis.
- But tier 2s cannot manage user accounts and interactive usage patterns.
- Some tier 2s allow access to local users... unfair?

10 Evolution of the EDM I
AOD/ESD merger: experience from other experiments indicates that calibration/reconstruction is often performed in the final steps of analysis.
- The structure of the AOD and ESD have been unified so that the AOD is essentially the ESD with less detail.
- Code (e.g. analysis or reconstruction) can run on either AOD or ESD.
- Re-reconstruct/re-calibrate on AOD, when sufficient information has been retained. For example: the AOD now contains calo cell / hit info for all leptons.
[Diagram: all containers in the ESD; an Electron links to its CaloCluster (and cells) and TrackParticle (and hits) via links to constituents; the subset of containers available in the AOD is indicated.]
Benefits: 1. Move data between ESD/AOD/DPD without schema change. 2. Read on demand.

11 Evolution of the EDM I (continued)
Same slide as before, overlaid with a point on enhancing the AOD with ESD-like capabilities:
- The AOD is largely within budget! The reason the AOD is larger than the model is not the inclusion of ESD quantities, but rather inefficiencies that will hopefully be resolved (trigger info, jets, etc...).
- The AOD is supposed to be accessible at all Tier 2s. If some ESD-like tasks can be performed with the AOD, fewer ESD replicas are needed.
- But the AOD/ESD merger is new, and no one has moved from using ESD to using AOD... and very few are seriously considering it.
- There is hope for less need for ESD, but this isn't reflected in the CM.

12 Evolution of the EDM II
- Size and speed are obviously very important, and there is a realization that these go hand-in-hand. ATLAS data structures are complex.
- Transient/persistent splitting: 2 different representations of each data object, one optimized for manipulation in memory and one for storage.
- We have an AOD which stores much more info than previously thought possible, i.e. more detail available at all tier 2s.
- Access times are near optimal, so there is less incentive for users to produce flat data structures in order to maximize speed.
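The idea behind the transient/persistent split can be sketched generically. The Python below is only an illustration of the pattern (a compact stored form, a rich in-memory form, and explicit converters), not the ATLAS C++ implementation.

```python
# Illustration of the transient/persistent split: a compact on-disk
# representation and a richer in-memory object, with converters between them.
from dataclasses import dataclass
import math

@dataclass
class TrackPersistent:
    # Compact storage form: only what must be written to disk.
    pt: float
    eta: float
    phi: float

class TrackTransient:
    # In-memory form: caches derived quantities, convenient for manipulation.
    def __init__(self, pt, eta, phi):
        self.pt, self.eta, self.phi = pt, eta, phi
        self.px = pt * math.cos(phi)          # derived, never stored
        self.py = pt * math.sin(phi)
        self.pz = pt * math.sinh(eta)

def to_persistent(trk: TrackTransient) -> TrackPersistent:
    return TrackPersistent(trk.pt, trk.eta, trk.phi)

def to_transient(p: TrackPersistent) -> TrackTransient:
    return TrackTransient(p.pt, p.eta, p.phi)

t = TrackTransient(40.0, 1.1, 0.3)
p = to_persistent(t)                 # small object written to the file
t2 = to_transient(p)                 # rebuilt, derived quantities recomputed
print(p, t2.px)
```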

13 Evolution of the EDM III
AthenaROOTAccess (ARA): reading framework data (i.e. POOL) without the framework (Athena).
- Many users have an aversion to the framework and prefer ROOT.
- Excellent means of quickly inspecting a file for validation.
- Some stages of analysis must be outside of the framework anyway. Specifically the DPD must be accessible outside the framework... now the DPD can be in the same format as framework data.
- Use POOL to write DPDs. Provides a single DPD format for everyone... i.e. the same format as AOD/ESD. The DPD can be analyzed in the framework.
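For orientation, here is a rough sketch of what an ARA session looked like in Python at the time. The module path, the makeTree entry point, the ElectronAODCollection key, and the file name are recalled assumptions rather than a verified recipe, and an Athena release must be set up for any of it to work.

```python
# Rough sketch of reading a POOL file (AOD/DPD) in ROOT via AthenaROOTAccess.
# Names below are assumptions from contemporary tutorials, not a verified API reference.
import ROOT
import AthenaROOTAccess.transientTree  # assumed module name

f = ROOT.TFile.Open("AOD.pool.root")             # hypothetical file name
tt = AthenaROOTAccess.transientTree.makeTree(f)  # assumed entry point

for i in range(tt.GetEntries()):
    tt.GetEntry(i)
    electrons = tt.ElectronAODCollection         # assumed container key
    for el in electrons:
        print(el.pt(), el.eta())
```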

14 Streams
- The online will split the data into O(10) streams based on triggers. Events are written into different RAW files... with different processing priority.
- Streams are inclusive: one event may end up in more than one file.
- The stream boundaries will be obeyed through ESD and AOD. DPDs may process multiple streams.
- Overlaps must be resolved if more than one stream is used in an analysis (see the sketch below).
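When more than one inclusive stream feeds an analysis, duplicates have to be removed. A minimal sketch of doing this by keying on (run, event) numbers follows; the key choice and the dict-based event records are assumptions for illustration only.

```python
# Minimal duplicate removal when reading several inclusive streams.
# Events are identified here by (run_number, event_number).

def merge_streams(*streams):
    seen = set()
    for stream in streams:
        for event in stream:
            key = (event["run_number"], event["event_number"])
            if key in seen:        # already processed via another stream
                continue
            seen.add(key)
            yield event

egamma = [{"run_number": 1, "event_number": 10},
          {"run_number": 1, "event_number": 11}]
muon   = [{"run_number": 1, "event_number": 11},   # overlap with egamma
          {"run_number": 1, "event_number": 12}]

unique = list(merge_streams(egamma, muon))
print(len(unique))   # 3 events, the overlap counted once
```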

15 AMF Report Analysis Model
Centrally produced Primary DPDs (aka D1PD) with total volume = AOD.
- D1PD: AOD volume reduction based on simple requirements. No strict decisions like electron ID or overlap removal.
- Slim: throw out containers (and info) which will not be used downstream (e.g. tracks, clusters, or cells).
- Thin: e.g. keep useful truth particles or tracks.
- Skim: keep events after a loose preselection. In practice, this is the main source of reduction.
- May be analyzed with AthenaROOTAccess in ROOT.
In response to: strict selections made before DPD making; inability to quickly remake & download DPDs; lack of familiarity with Athena-based analysis tools.

16 AMF Report Analysis Model
Why create the D1PD? If you are on a tier 2, the D1PD provides little added value:
- All AOD is present at tier 2s.
- Skimming can be achieved using TAG... don't read any events you don't want.
- No per-event speed improvements: Athena only reads the containers you request. But some argue for faster access at tier 2s because of smaller files. Is this really true?
But since the D1PD volume is 10-20 times smaller than the AOD, you can transfer it to a tier 3 and do all your analysis using ROOT + AthenaROOTAccess. This is a dream come true for a lot of people.
- The AOD is too big for analysis at a tier 3... but perhaps the D1PD is small enough.
- Essentially using the D1PD as a means of moving what we imagined was AOD analysis to tier 3s. Users imagine this allows them to escape using Athena and the GRID...
Huge implications for Tier 3.

17 AMF Report Analysis Model
Secondary DPDs (aka D2PD):
- POOL format, so it is re-readable into Athena for further processing.
- Output of analysis on AOD or D1PD. D1PD input is encouraged (why?).
- Perhaps D2PD output will include stricter object and event selection.
- Results of analysis can be saved into new containers, composite particles, UserData, EventView, ParticleView.
Tertiary DPDs (aka D3PD):
- Flat ntuple. For quickly making final plots.
- Output of analysis on AOD, D1PD, or D2PD. AOD input is discouraged (this would be like the CSC model).
- Very similar to D2PD in terms of potential content... but hopefully much reduced. If it's the 3rd step in an analysis, it is likely to be smaller than the D2PD.

18 AMF Report Analysis Model
The AMF Report presents a stringent analysis model: AOD -> D1PD -> other analysis steps, with D2PD and D3PD optional.
- It doesn't present other potential models (i.e. with no D1PD), though some are skeptical and plan to explore alternatives for the FDR.
- Very specific recommendations on the nature of the operations allowed at each step and the software used.
- The original recommendation prioritized D1PD disk at the expense of AOD disk at tier 2s... a change to the CM. This has been removed as requested by the TOB.
- Some recommendations of the TOB (Fabiola and Dave C.) have not been incorporated into the newest version (4) due to disagreement.
- Very little accounting of access speed, disk space, etc. requirements. These must be tested during the FDR.
The Tier 3 TF must assess the feasibility of the AMF Report model for tier 3s. Either the AMF Report sets the scale of tier 3s, or the AM must change to be consistent with what is practical at a tier 3.

19 Evolution of DPDs (AMF)
POOL-based (analyze in Athena or in ROOT via ARA), derived from ESD/AOD:
- D1PD (primary): subset of the AOD through skimming, thinning, slimming. Centrally produced. Defined by physics / combined-reco groups. Different D1PDs with total volume = AOD.
- D2PD (secondary): similar to D1PD, but more analysis specific. Stores UserData: results of complex analysis algorithms (CompositeParticles, EventViews, ParticleViews, etc).
Flat (analyze in plain ROOT):
- D3PD (tertiary): maybe similar in content to D1PD or D2PD, but most likely highly reduced. Just the few quantities necessary to quickly make the final plots for your analysis. Complex analysis on D3PD is discouraged. The format is proprietary.
Note: the AOD -> D1PD -> D2PD -> D3PD chain is not necessary. Steps may be skipped.
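For the "flat" case, analysis in plain ROOT really is just an ntuple loop. A minimal PyROOT sketch is shown below; the file name, tree name, and branch names (physics, el_n, el_pt) are invented for illustration and depend entirely on how a given D3PD is written.

```python
# Reading a flat D3PD-style ntuple in plain ROOT (PyROOT).
# File, tree, and branch names are hypothetical examples.
import ROOT

f = ROOT.TFile.Open("my_d3pd.root")        # hypothetical file
tree = f.Get("physics")                    # hypothetical tree name
hist = ROOT.TH1F("el_pt", "Leading electron p_{T};p_{T} [GeV];events",
                 50, 0.0, 200.0)

for event in tree:                          # PyROOT iterates over entries
    if event.el_n > 0:                      # hypothetical branches: el_n, el_pt
        hist.Fill(event.el_pt[0] / 1000.0)  # MeV -> GeV, assuming MeV storage

hist.Draw()
```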

20 AMF and Tier 3

21 Tier 3
The original analysis motivations for tier 3s are an interactive environment and instant access to data. Eventually users need to make plots from a prompt.
- It is not clear what sort of data (D1PD, D2PD, D3PD) is practical for analysis at a tier 3. There has been a lot of recent focus on the D1PD... so my primary goal here is to evaluate performing D1PD analysis at tier 3s.
- Some believe that tier 2 resources will be insufficient for analysis, so tier 3s will need to pick up the slack. This is a departure from the original motivation.
The questions are therefore:
- What are the required resources for D1PD analysis?
- Can we shift the tier 2 analysis (i.e. analysis on D1PDs) to tier 3?
- Can the analysis resources at tier 3 exceed those at tier 2?
We should not forget about tasks that cannot be performed at a tier 2 and so must be provided at a tier 3.

22 The Factors
Obvious, I suppose:
- Disk: does the data fit?
- CPU: is there enough CPU to process it within a reasonable time frame?
- DDM/Network: can the DnPD be brought to the tier 3 within an appropriate time frame? Can the DDM handle the traffic?

23 D1PD: Numbers (Size)
D1PD:
- AMF Report: 1/10-1/20 of the AOD... so 5-10 TB per year, per DPD.
- Studies by A. Shibata indicated 1/3 of the per-event AOD size ("at best"). Suggests skim rates of ~15-30%.
- Produced monthly... needs downloading monthly.
D2PD:
- D2PD may be larger per event than D1PD due to the addition of user data, but the D2PD per-event size may be compensated by more skimming. Guess: % of D1PD.
D3PD:
- Can be as large as D2PD... but hopefully will be highly reduced. Guess: 1-33% of D1PD.
D2PD/D3PD might be downloaded as frequently as daily... unless already produced at the tier 3.

24 Numbers (CPU)
D1PD will be produced at tier 1s (or 2s), so it is not really a tier 3 issue.
- Recent studies (Balint Radics) indicate that simple D1PD-making jobs run at 40 ms/event total (including IO; I calculate 15 ms without IO).
- If there is any re-reconstruction/re-calibration (likely), this can easily be 250 ms/event or more.
Lots of CPU is required for D2PD/D3PD production. This is where the analysis really occurs.
- Time consumers: object selection, overlap removal, combinatorics, observable calculation.
- TopView: 275 ms/event (including IO). My profiling: need at least 20 ms/event for a top analysis (without IO). A realistic number is probably in the 50-100 ms range (the table on the next slide assumes 100 ms/event for D2PD production and 50 ms/event for D3PD production).

25 D1PD Processing
Guesses of the IO/CPU needs for different analysis stages:
- D2PD production: in/out 33/40 KB/event, CPU 100 ms/event.
- D3PD production: in/out 33/10 KB/event, CPU 50 ms/event.
- Plotting: in 1 KB/event, CPU ~0.
Calculate the fraction of a year's worth of data processed (assuming perfect hardware/software):

->D2PD      Laptop   Tier 3   Tier 2   Tier 1/2
Cores
Hour        >0.01%   0.08%    0.34%    3.36%
Overnight   0.04%    1.01%    4.03%    40.26%
1 Week      0.56%    14.09%   56.37%   All
1 Month     2.42%    60.39%   All      All

->D3PD      Laptop   Tier 3   Tier 2   Tier 1/2
Cores
Hour        0.01%    0.17%    0.66%    6.63%
Overnight   0.08%    1.99%    7.96%    79.56%
1 Week      1.11%    27.85%   All      All
1 Month     4.77%    All      All      All

Plotting    Laptop   Tier 3   Tier 2   Tier 1/2
Cores
Hour        3.56%    89.02%   All      All
Overnight   42.73%   All      All      All
1 Week      All      All      All      All
1 Month     All      All      All      All
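The CPU-limited entries follow from a one-line estimate: fraction = cores x wall time / (CPU per event x 10^9 events). The core counts per column did not survive the transcription, so the example values below (a 25-core tier 3, as on the next slide, and a 12-hour "overnight") are assumptions that only roughly reproduce the quoted numbers; the plotting row is IO-limited and not modelled here.

```python
# How the "fraction of a year processed" entries arise:
#   fraction = n_cores * wall_seconds / (cpu_seconds_per_event * 1e9 events).
# Core counts and the length of "overnight" are illustrative assumptions.

EVENTS_PER_YEAR = 1e9

def fraction_processed(n_cores, wall_hours, cpu_ms_per_event):
    wall_s = wall_hours * 3600.0
    return n_cores * wall_s / (cpu_ms_per_event / 1e3) / EVENTS_PER_YEAR

# Example: a 25-core tier 3 running D1PD -> D2PD (100 ms/event) overnight (~12 h).
frac = fraction_processed(n_cores=25, wall_hours=12, cpu_ms_per_event=100.0)
print(f"{100 * frac:.2f}% of a year's data")   # ~1.1%, close to the ~1% in the table
```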

26 Tier 3 Disk for D1PD Processing
How much data do you need to feed the CPUs?
D1PD -> D2PD jobs at a tier 3 (25 cores):
- Overnight: ~1% of 1 year's data ~ 0.33 TB of D1PD, ~0.4 TB of D2PD.
- A week: ~15% of 1 year's data ~ 5 TB of D1PD, 6 TB of D2PD.
D1PD -> D3PD jobs at a tier 3 (25 cores):
- Overnight: ~2% of 1 year's data: 0.66 TB of D1PD, 0.1 TB of D3PD.
- A week: ~30% of 1 year's data: 10 TB of D1PD, 3.3 TB of D3PD.
Estimate the need at 3x more space for 2 concurrent versions + simulation.
If you are willing to wait a week for a single iteration: you can process 15-30% of a year's worth of D1PDs. Need ~30-40 TB per D1PD stream... you can process a whole D1PD stream.
If you are only willing to wait overnight for a single iteration: you can process 1-2% of a year's worth of D1PDs. Need < 3 TB per D1PD stream... but you need to go to a tier 2 to get the full D1PDs.
Unknown: how long will users wait on a tier 3 before giving up and moving to a tier 2?
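The disk figures are just the processed fraction times the per-event sizes guessed on slide 25; a quick check of the overnight D2PD case:

```python
# Reproducing the overnight tier-3 numbers from the slide-25 guesses:
# 1% of 1e9 events, 33 KB/event of D1PD read, 40 KB/event of D2PD written.
events = 0.01 * 1e9
d1pd_tb = events * 33e3 / 1e12    # input volume
d2pd_tb = events * 40e3 / 1e12    # output volume
print(f"D1PD in : {d1pd_tb:.2f} TB")   # ~0.33 TB
print(f"D2PD out: {d2pd_tb:.2f} TB")   # ~0.40 TB
```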

27 Data Transfer
The AMF calls for D1PDs produced at tier 1/2s, once a month. A full D1PD stream is 5-10 TB.
Assuming each site downloads 1 D1PD and there are O(30) tier 3s, can DDM/the network transfer 150-300 TB to tier 3s once a month?
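A quick feasibility check of that monthly transfer, under simple assumptions (decimal units, a perfectly efficient and always-on link, the 5-10 TB stream size from slide 23):

```python
# Sustained rate needed to deliver D1PDs monthly, per tier 3 and in aggregate.
stream_tb = 10.0          # one D1PD stream, upper end of the 5-10 TB estimate
n_tier3s = 30             # O(30) tier 3s, one stream each
days = 30.0

per_site_MBps = stream_tb * 1e6 / (days * 86400.0)
aggregate_MBps = per_site_MBps * n_tier3s

print(f"per tier 3 : {per_site_MBps:5.1f} MB/s (~{per_site_MBps * 8:.0f} Mb/s)")
print(f"aggregate  : {aggregate_MBps:5.1f} MB/s (~{aggregate_MBps * 8 / 1000:.1f} Gb/s)")
```

A few MB/s sustained per site sounds modest, but roughly 1 Gb/s of continuous aggregate traffic out of the tier 1/2s, plus bursts and retries, is exactly the load the question above is asking DDM to handle.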

28 Summary: D1PDs and Tier 3s
If you want to process D1PDs at tier 3s:
- You need ~30-40 TB of disk per year.
- 25 cores = 1 analysis iteration per week. Multiply for faster response.
- You will need to wait a long time for the D1PD to arrive at the tier 3.
We probably can't afford this! If you agree, we should make a strong statement that large-scale D1PD analysis cannot be supported at tier 3s. But then we should worry about having enough analysis resources at tier 2s.
This brings into question the whole motivation for the D1PDs. Remember the D1PDs are a mechanism for moving AOD analysis to tier 3s.
We should also evaluate the benefits of D1PDs at tier 2s. You can just use AOD + TAG instead and have access to more info for your analysis. Are the performance gains (if any) worth the extra disk space?
My opinion: we should make D1PDs as long as we can afford them...

29 D2PD/D3PD at Tier 3
D2PD or D3PD must be at tier 3s for the final stages of analysis... there is no other place to work interactively on these samples.
- If the CPU-intensive parts of the analysis have been done at a tier 2 when producing the D2PD/D3PD, analysis of the D2PD/D3PD is much faster... like the plotting example: IO limited.
- But you then need more iterations of D2PD/D3PD production at the tier 2... and copies to the tier 3 (e.g. once a week?).
The actual volume of D2PD or D3PD will be defined by what is practical for tier 3s. People will fine-tune D2PD and D3PD so they can analyze them at their tier 3.
So I believe the tier 3 parameters are defined by what you can afford and manage, not by the analysis model. In other words, tier 3s set the Analysis Model.

30 Path forward
If we want to support D1PD analysis at tier 3s, then I have provided the relevant numbers. If not:
- We should provide feedback to the Common Analysis meeting.
- We should evaluate what tier 3 size we can afford and manage.
- Set the D2PD/D3PD requirements.
- Adjust the Analysis Model.

31 Building a Tier 3

32 Tier 3 Hardware/Software
There isn't just one type of tier 3:
- A tier 3 may be a laptop... or a few desktops. Today I can buy 3 eight-core machines, with 2 GB/core and 12 TB of disk, for ~$12K.
- It may be a departmental cluster or another leveraged (shared) resource.
- There may or may not be a sysadmin, who may or may not be willing/capable of providing ATLAS-specific support.
Analysis is IO intensive; it will be difficult to be optimal. Lots of software components must be set up... this should be simple.

33 Tier 3 Software
On one or a few machines:
- Very Minimal: dq2 client tools (to get DPDs) and ROOT. Today this already means Linux for the dq2 client tools.
- Minimal: the above + an Athena kit for ARA analysis.
- Default: the above + data aggregation via xrootd (or NFS), batch queues, and/or PROOF.
On a cluster:
- Super: the above + OSG middleware + ATLAS services (perhaps via a tier 2).

34 Virtual Machines?
Intel Core Duo / Core 2 Duo natively support virtualization => virtual machines run nearly as fast as the host OS running on the machine. A CERN group is now dedicated to providing pre-configured VMs.
This allows running ATLAS software (Athena, ARA, etc.) on ANY machine (any OS) with minimal effort. No sysadmin required.
- Linux (especially SLC4) isn't always the ideal OS, but ATLAS/GRID software is easiest to set up on SLC4.
- Still, some software components are hard to set up... and once it is set up, I don't want to have to repeat this for every machine I want to use.
A virtual machine allows us to encapsulate all the software necessary for ATLAS into a file that a user can run on any machine with no additional effort.

35 Example Setup
Hardware: Intel Core Duo MacBook (2 GB/120 GB), Intel Core 2 Duo MacBook Pro (4 GB/160 GB), Intel 2x 4-core Core 2 (Penryn) Mac Pro (14 GB/3.5 TB). Should work the same (for the most part) on other OSs.
VM software: VMware Fusion ($80). VMware Player is free for Windows/Linux (it can't create a VM, but can run one).
VM: SLC4, with a recompiled kernel. 30 GB disk image. I've copied this VM many times between machines (or run multiple instances on the same host).
Environment: do not start X in the VM (no graphics performance hit). Simply ssh to the VM and use the X client in Mac OS (or Linux or Windows). Gives a unified environment (e.g. copy code from the VM to Mac Mail).
Data disks (kits/home/AOD/ntuples): usually I keep these on the host machine and mount them from the VM. This allows me to edit code and do ROOT analysis without running the VM.

36 User Perspective
- Download/install the VM software (e.g. the free VMware Player).
- Download/copy the VM from the CERN VM group, plus the ATLAS-specific VM additions. VM files total O(20-30 GB).
- Maybe some local setup for giving the VM access to data on the host.
- Double-click on the VM. ssh to the VM. Maybe download a kit into the VM or onto local disk.
- Immediately use all GRID and ATLAS software.

37 Analysis Workflow
Develop the analysis, test on a small sample, submit jobs to the GRID, collect results. Work cycle:
1. Install a recent ATLAS kit and download a small amount of data.
2. Set up a test release, check out packages, set up the environment.
3. Modify/develop analysis code (Athena, ARA, etc...).
4. Build the software.
5. Run a small test job, check results in ROOT (on the host machine, if you like). Iterate.
6. Submit jobs to the GRID... wait... download the DPDs.
7. Analyze DPDs in the VM (ARA, Athena, ROOT) or on the host (ROOT).
Ideal for analysis development on a laptop... again very nice because you can keep sessions open for weeks and take it with you anywhere. But a laptop disk is typically too small for all the DPDs, so step 7 is only good for small samples... the same model can be deployed on desktop(s).

38 VMs and Tier 3s
The easiest way to get ATLAS software running on your laptop or desktop.
- Currently investigating various issues: VM software, kernel optimization, data access, VM configuration, VM distribution, xrootd, batch, PROOF... and measuring performance. Many of these will be resolved in collaboration with the CERN VM group.
- Currently, running one machine (desktop/laptop) has been very successful. The plan is to provide a VM/recipe for ATLAS SW on any machine in the next few weeks.
- The next step is to scale this up to a cluster. Clearly an option worth exploring... lots of possibilities, e.g. VMs which deploy various ATLAS services.
Ask me more about this if you are interested.

39 Final Remarks
- The ATLAS Analysis Model is continuously evolving. The AMF recommendations should be evaluated, not just accepted... the computing facilities should provide feedback before deployment.
- The AMF recommendations have huge implications for Tier 3s... my worry is that it won't work.
- Tier 3s should focus on what the tier 2s don't provide.
- One critical issue for tier 3s is how to easily build one. VMs could be one way of making things easier. Should we provide a tier 3 support team?
