The CMS Computing Model
Dorian Kcira, California Institute of Technology
SuperComputing 2009, November 14-20, 2009, Portland, OR
CERN's Large Hadron Collider
27 km tunnel under Switzerland & France
5000+ physicists/engineers, 300+ institutes, 70+ countries
Experiments: CMS (pp, general purpose), ATLAS (pp, general purpose), ALICE (heavy ions), LHCb (B physics), TOTEM
Figure credit: Computing for Particle Physics, J.J. Bunn, JHU eScience, October 2006
Closing the CMS Detector
Central Silicon Strip Tracker
Detectors, Trigger, Data Acquisition
The trigger and DAQ reduce the rate from 40 MHz of bunch crossings to 100 kHz after the Level-1 Trigger and to 300 Hz written to storage after the High Level Trigger (HLT)
Two event selection systems: the Level-1 Trigger and the HLT
The HLT is a farm of commercial processors
Event selection is based on offline software using full event data
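As a rough illustration of what these rates mean, the minimal sketch below (Python, using only the three rates quoted on this slide) shows the rejection factor each stage must provide; the derived factors are estimates, not numbers from the slide.

```python
# Back-of-envelope rejection factors implied by the trigger rates quoted above.
# Only the three rates are taken from the slide; everything else is derived.
collision_rate_hz = 40e6    # bunch-crossing rate seen by the detector
level1_rate_hz    = 100e3   # Level-1 Trigger output rate
hlt_rate_hz       = 300.0   # High Level Trigger output rate, written to storage

l1_rejection  = collision_rate_hz / level1_rate_hz   # ~400
hlt_rejection = level1_rate_hz / hlt_rate_hz         # ~330
overall       = collision_rate_hz / hlt_rate_hz      # ~130,000

print(f"Level-1 keeps ~1 in {l1_rejection:.0f} crossings")
print(f"HLT keeps ~1 in {hlt_rejection:.0f} Level-1 accepts")
print(f"Overall, only ~1 collision in {overall:,.0f} is recorded")
```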
Physics Events
In the Large Hadron Collider, protons at very high energies collide with each other. The particles and energies produced in one of these collisions, together with their signatures in the detectors, constitute a data event. Large-scale Monte Carlo (MC) simulation of the physics models and of the detector response is also performed, resulting in MC events. Computing in HEP has to deal with the storage, processing and access of these events.
Complexity of LHC Events: simulated Higgs decay (figure credit: Computing for Particle Physics, J.J. Bunn, JHU eScience, October 2006)
Beam Splash Event, 07 Nov 2009, 20:28
Cosmic Ray Event
Events: Rates, Sizes, Processing
CMS data events will be recorded at a rate of 300 Hz
Event content:
o RAW: detector + trigger information: 1.5 MB/event (2 MB when SIM is included)
o RECO: reconstructed physics objects: 0.5 MB/event
o AOD: Analysis Object Data: 0.1 MB/event
Processing power needed per event:
o 1 data event reconstruction: 10 s on a 3 GHz core
o 1 MC event simulation and reconstruction: about 10 times more
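A rough check of what these per-event figures imply for throughput and for prompt processing; the inputs are the numbers quoted on this slide, while the derived bandwidths and core count are estimates.

```python
# What the per-event sizes and times imply for throughput and prompt processing.
# Inputs are the numbers quoted on this slide; the derived values are estimates.
rate_hz    = 300    # recorded events per second
raw_mb     = 1.5    # RAW size per event [MB]
reco_mb    = 0.5    # RECO size per event [MB]
reco_cpu_s = 10     # reconstruction time per data event on a 3 GHz core [s]

raw_stream_mb_s  = rate_hz * raw_mb      # ~450 MB/s of RAW
reco_stream_mb_s = rate_hz * reco_mb     # ~150 MB/s of RECO
cores_for_prompt = rate_hz * reco_cpu_s  # ~3000 cores to reconstruct in real time

print(f"RAW stream:  {raw_stream_mb_s:.0f} MB/s")
print(f"RECO stream: {reco_stream_mb_s:.0f} MB/s")
print(f"Cores needed to keep up with prompt reconstruction: ~{cores_for_prompt}")
```

The derived RAW figure sits inside the ~150-1500 MB/s Tier-0 input range quoted on the data grid hierarchy slide below.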
2009/2010 Data
Start of collisions in the LHC is expected again within days
Expected amounts of recorded data:
o 2009 run: November 2009 - March 2010 => 726M events
o 2010 run: April 2010 - September 2010 => 1555M events
In total this would result in 3.42 PB of RAW data + 1.14 PB of RECO (x 3 for re-reconstruction)
Just reconstructing these data once on a single 3 GHz core would take several hundred years.
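These totals can be reproduced from the per-event figures on the previous slide; the single-core estimate in the sketch below depends directly on the assumed 10 s/event reconstruction time.

```python
# Reproducing the totals from the per-event figures on the previous slide.
# The single-core estimate assumes the quoted 10 s/event reconstruction time.
events = 726e6 + 1555e6                    # 2009 run + 2010 run, ~2.3e9 events

raw_pb  = events * 1.5e6 / 1e15            # ~3.4 PB of RAW (1.5 MB/event)
reco_pb = events * 0.5e6 / 1e15            # ~1.1 PB of RECO per pass (0.5 MB/event)

seconds_per_year  = 3600 * 24 * 365
single_core_years = events * 10 / seconds_per_year

print(f"RAW: {raw_pb:.2f} PB, RECO: {reco_pb:.2f} PB per reconstruction pass")
print(f"One reconstruction pass on a single 3 GHz core: ~{single_core_years:.0f} years")
```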
LHC Data Grid Hierarchy (figure)
Online System (custom fast electronics, pipelined, buffered, FPGAs) -> Tier 0+1 CERN Center (PBs of disk, tape robot): ~PByte/s off the experiment, ~150-1500 MB/s into storage
Tier 0+1 -> Tier-1 centers (FNAL, IN2P3, RAL, INFN, ...): 10-40 Gbps links; >10 Tier-1 and ~100 Tier-2 centers
Tier-1 -> Tier-2 centers (e.g. the Caltech Tier2 Center): ~10 Gbps links
Tier-2 -> Tier-3 institute physics data caches and Tier-4 institute workstations/laptops: ~1-10 Gbps links
Physicists work on analysis channels; each institute has ~10 physicists working on one or more channels
A few petabytes by 2009, an exabyte ~5-7 years later; 100 Gbps+ data networks
Tier Architecture of Computing Resources
20% of resources at CERN, 40% at the Tier-1s and 40% at the Tier-2s
Moving from a strict hierarchy towards a mesh model
Tier-0 Center
On the CERN site
Repacking, prompt reconstruction, storage of RAW+RECO, data archiving and distribution
CERN Analysis Facility for latency-critical tasks: calibration, alignment, monitoring
Tier-1 Centers
Share the RAW data for custodial storage
Data reprocessing and selection
Extraction of AOD
Archiving of simulation samples
Tier-2 + Tier-3 Centers
Tier-2: Monte Carlo generation, physics group analysis, general user analysis
Tier-3: user analysis; only a small part of the resources is dedicated to CMS
CMS Data Flow and On(Off)line Computing (figure)
Detector Frontend -> Level-1 Trigger / Readout Systems -> Builder Networks, Event Manager -> Filter Systems (High Level Trigger, ~10 TeraFlops) -> Computing Services
Run Control and remote control rooms; controls traffic: 1 Gbit/s
Raw data into the filter farm: 2000 Gbit/s; event data out: 10 Gbit/s; to the Tier 0 (~50 TeraFlops) and on to regional centers: 10 Gbit/s
LHC Data Transfers
CMS Data Management
Discover, transfer and access data in a distributed computing environment
o Events are grouped into files: to avoid too many small files, they are merged up to a few GB each. Millions of files per year
o Fileblocks: 1-10 TB, thousands per year. The smallest unit of data management
o Datasets: correspond logically to data-taking periods/conditions and to simulation samples. From small up to ~10 TB
Software components:
o DBS (Dataset Bookkeeping System): provides the means to define, discover and use CMS event data
o DLS (Dataset Location Service): provides the means to locate replicas of data in the distributed system
o PhEDEx (Physics Experiment Data Export): movement of data, integration with OSG/EGEE transfer services
o Local file catalogue solutions: a trivial file catalogue as a baseline solution (see the sketch below)
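To illustrate the "trivial file catalogue" idea, the minimal sketch below maps logical file names to site-local physical paths with a single pattern rule; the rule, the site name and the paths are invented for this example, and the real CMS catalogue is a per-site rule set rather than this Python code.

```python
import re

# A minimal sketch of a "trivial file catalogue": logical file names (LFNs)
# are turned into site-local physical file names (PFNs) by simple pattern
# rules. The rule and the paths below are invented for illustration only.
SITE_RULES = [
    # (regular expression on the LFN, replacement producing the PFN)
    (r"^/store/(.*)", r"/storage/example-t2.org/cms/store/\1"),
]

def lfn_to_pfn(lfn: str) -> str:
    """Return the site-local physical path for a logical file name."""
    for pattern, replacement in SITE_RULES:
        if re.match(pattern, lfn):
            return re.sub(pattern, replacement, lfn)
    raise KeyError(f"no rule matches LFN {lfn!r}")

print(lfn_to_pfn("/store/data/Run2009/MinimumBias/RECO/file.root"))
```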
DBS
PhEDEx
CMS Remote Analysis Builder: CRAB
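CRAB takes a user's analysis configuration and a dataset and splits the work into many Grid jobs. The toy sketch below illustrates only the splitting step; the class and function names are made up for this example and are not CRAB's actual interfaces.

```python
from dataclasses import dataclass

# Toy illustration of the job splitting a tool like CRAB performs: the user
# asks for a number of events and a number of events per job, and the tool
# turns that into a list of Grid jobs. Names and numbers are invented.

@dataclass
class Job:
    index: int
    first_event: int
    n_events: int

def split_into_jobs(total_events, events_per_job):
    """Partition the requested events into consecutive, equally sized jobs."""
    jobs, first = [], 0
    while first < total_events:
        n = min(events_per_job, total_events - first)
        jobs.append(Job(index=len(jobs), first_event=first, n_events=n))
        first += n
    return jobs

jobs = split_into_jobs(total_events=1_000_000, events_per_job=50_000)
print(f"{len(jobs)} jobs; first: {jobs[0]}, last: {jobs[-1]}")
```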
Summary
The Compact Muon Solenoid experiment at the Large Hadron Collider has significant computing needs
Data-driven computing model
Distributed tier computing infrastructure
In 2009/2010 we expect about 6 petabytes of collision data
These data must be processed, accessed and analyzed
CMS computing is ready for the LHC collision data expected in the next weeks.