Trigger and Data Acquisition at the Large Hadron Collider
Acknowledgments (again)
This overview talk would not exist without the help of many colleagues and all the material available online. I wish to thank the colleagues from ATLAS, CMS, LHCb and ALICE, in particular R. Ferrari, P. Sphicas, C. Schwick, E. Pasqualucci, A. Nisati, F. Pastore, S. Marcellini, S. Cadeddu, M. Zanetti, A. Di Mattia and many others, for their excellent reports and presentations.
18-June-2006 A. Cardini / INFN Cagliari 2
Day 2 - Summary
- Data acquisition
  - Data flow scheme
  - Readout: from the front-end to the readout buffers
- Event building
  - How-to
  - Switching methods, limits and available technologies
  - A challenging example: CMS
- High-Level Trigger
  - Requirements
  - Implementation
  - Performance
- Final conclusions
L1 Trigger Summary
- Difficult experimental conditions at LHC: ~10^9 interactions per second at L = 10^34 cm^-2 s^-1, i.e. ~22 interactions per bunch crossing
- DAQ-limited trigger rate: 100 kHz for ATLAS and CMS, 1 MHz for LHCb
- Large uncertainties in estimating trigger rates
- L1 triggers on fast (calorimeter and muon) information only
- Covered earlier: the L1 architecture in the LHC experiments and its hardware implementation
- (Plot: physics cross sections lie more than 10 orders of magnitude below minimum bias)
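The interaction rate and pile-up quoted above follow from the luminosity and the inelastic cross section. A minimal back-of-the-envelope sketch in Python; the inelastic cross section (~70 mb), the number of colliding bunch pairs (2808) and the LHC revolution frequency (11.245 kHz) are assumptions not stated on the slide:

```python
# Back-of-the-envelope LHC interaction rate and pile-up.
# Assumed inputs (not from the slide): sigma_inel ~ 70 mb, 2808 bunch
# pairs, revolution frequency 11.245 kHz.

LUMI = 1e34            # cm^-2 s^-1, nominal LHC luminosity
SIGMA_INEL = 70e-27    # cm^2 (70 mb; 1 mb = 1e-27 cm^2)
N_BUNCHES = 2808       # colliding bunch pairs
F_REV = 11245.0        # Hz, LHC revolution frequency

interaction_rate = LUMI * SIGMA_INEL          # interactions per second
crossing_rate = N_BUNCHES * F_REV             # bunch crossings per second
pileup = interaction_rate / crossing_rate     # interactions per crossing

print(f"{interaction_rate:.1e} interactions/s")   # ~7e8, i.e. ~10^9
print(f"{pileup:.0f} interactions per crossing")  # ~22
```

With these inputs the sketch reproduces the slide's ~10^9 interactions/s and ~22 interactions per crossing.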
L1 Rate vs. Event Size (figure)
Need More Trigger Levels
- L1 trigger selection: 1 out of 1000-10000 (max. output rate ~100 kHz). This is NOT enough
- The typical ATLAS/CMS event size is 1 MB: 1 MB x 100 kHz = 100 GB/s (!!!)
- How much data could we reasonably store nowadays? 100 MB/s (ATLAS, CMS, LHCb), 1 GB/s (ALICE)
- More trigger levels are needed to further reduce the fraction of less interesting events in the data sample written to permanent storage
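The mismatch above is simple arithmetic, but it is the whole motivation for the HLT; a short sketch using only the numbers from the slide:

```python
# Why L1 alone is not enough: raw L1 output bandwidth vs. affordable storage.

EVENT_SIZE = 1e6        # bytes (~1 MB typical ATLAS/CMS event)
L1_RATE = 1e5           # Hz (100 kHz maximum L1 output rate)
STORAGE_BW = 1e8        # bytes/s (~100 MB/s to permanent storage)

l1_output_bw = EVENT_SIZE * L1_RATE            # bytes/s coming out of L1
extra_rejection = l1_output_bw / STORAGE_BW    # factor the HLT must supply

print(f"L1 output: {l1_output_bw / 1e9:.0f} GB/s")                # 100 GB/s
print(f"HLT must reject a further factor {extra_rejection:.0f}")  # 1000
```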
Trigger/DAQ at LHC (figure)
Data Readout at LHC
Data flow: summary
- Data wait for the L1 decision in pipelines (analog or digital)
- When an L1 accept arrives, data are read out from the front-end electronics
- Accepted event fragments are temporarily stored in readout buffers
- Local detector data (partially assembled) can be used to provide an intermediate trigger level
- Assemble the event
- Provide the high-level trigger(s)
- Write to permanent storage
ATLAS: the data flow
- RoI-based: Regions of Interest identified by L1 are used by the L2 trigger to investigate further (an additional O(100) background rejection)
- The readout buffers are not pipelines: the L2 farm accesses their data directly
- Only RoI data are moved at L2
CMS: the data flow! (figure)
LHCb: the data flow
- Find high impact-parameter tracks using silicon detector information
- 2 kHz output rate
The Data Readout
- Classical data acquisition systems are bus-based, like VME:
  - Parallel data transfer on a common bus
  - Only one source at a time can use the bus -> bottleneck
- At LHC: point-to-point links
  - Optical or electrical standards
  - Serialized data
  - All sources can send data together
- This is also a general trend in the market:
  - ISA, SCSI, IDE, VME in the 80s
  - PCI, USB, FireWire in the 90s
  - Today: USB2, FireWire 800, PCI-X, Gigabit Ethernet, ...
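The bottleneck argument can be made quantitative: a shared bus delivers its bandwidth to one source at a time, while point-to-point links add up. A sketch with illustrative numbers (the 320 MB/s bus figure and the 500-source count are my assumptions, not from the slide):

```python
# Shared bus vs. point-to-point aggregate bandwidth (illustrative numbers).

N_SOURCES = 500        # readout sources (assumption, CMS-like scale)
LINK_BW = 200          # MB/s per point-to-point link
BUS_BW = 320           # MB/s for a VME-like shared bus (assumption)

bus_aggregate = BUS_BW              # one source at a time owns the bus
p2p_aggregate = N_SOURCES * LINK_BW # all sources can send together

print(f"bus: {bus_aggregate} MB/s, p2p: {p2p_aggregate} MB/s")
print(f"gain: {p2p_aggregate / bus_aggregate:.0f}x")
```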
Readout from the Front-End (diagram: Trigger Primitive Generator, Global Trigger Processor, Trigger Timing Control)
Need a Standard Interface to the Front-End (diagram: CMS detector -> Front-End Driver (FED), the equivalent of the ROD in ATLAS -> DAQ)
The Experiment Choices
- ATLAS: S-LINK. Optical link at 160 MB/s (GOL) with flow control; ~1600 links needed; receiver cards (readout boards, ROB) in standard PCs
- CMS: SLINK-64. Electrical (LVDS) link at 200 MB/s (max. 15 m) with flow control, 400 MB/s peak throughput; ~500 links needed; receiver (Front-end Readout Link, FRL) in standard PCs
- LHCb: TELL-1 and GbE. Quadruple copper GbE, IPv4, no flow control; ~400 links needed; direct connection to a GbE switch
- ALICE: DDL. Optical link at 200 MB/s; ~400 links needed; receiver cards in standard PCs
Receivers: the Readout Units
- Basic tasks:
  - Merge data from N front-ends (usually in a hardwired way)
  - Send event (multi-)fragments to the processor farm via the Event Builder
  - Store data until no longer needed (data sent to processors, or event rejected)
- Issues:
  - Input and output interconnect (bus/p2p/switch)
  - Sustained bandwidth required (200-800 MB/s)
- Current status:
  - PCI-based boards everywhere (more or less...)
  - On-board DMA engines perform data transfers with low CPU load
  - Good performance and a good roadmap for the future, but limited by the bus architecture: a shared medium and a limited number of available slots on a PC motherboard
Event Building
Data flow: ATLAS vs. CMS
- ATLAS: the readout buffers are the challenging part (RoI generation at L1, a custom RoI Builder module, selective readout from the buffers to supply the L2 processors; implemented with custom PCI boards sitting in standard PCs); the Event Builder is commodity (1 kHz @ 1 MB = O(1) GB/s: easy)
- CMS: the readout buffers are commodity; the Event Builder is the challenging part (100 kHz @ 1 MB = O(100) GB/s: traffic shaping, specialized hardware)
Event Builder Scheme
- Event fragments are stored in independent physical memories
- Each full event must end up in one physical memory of a processing unit (a commodity PC)
- The EVB builds full events from event fragments: it must interconnect data sources to destinations -> a huge network switch
- How can this be implemented efficiently?
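The logic described above (fragments scattered across independent sources, one full event assembled at one destination) can be sketched as a toy event builder. This is an illustration of the principle only; the class and method names (`EventBuilder`, `add_fragment`) are invented, not any experiment's actual DAQ API:

```python
from collections import defaultdict

class EventBuilder:
    """Toy event builder: collects one fragment per source, emits full events.

    Hypothetical illustration of the EVB principle, not real DAQ software.
    """

    def __init__(self, n_sources):
        self.n_sources = n_sources
        self._partial = defaultdict(dict)  # event_id -> {source_id: fragment}

    def add_fragment(self, event_id, source_id, fragment):
        """Store one fragment; return the full event once all sources arrived."""
        self._partial[event_id][source_id] = fragment
        if len(self._partial[event_id]) == self.n_sources:
            return self._partial.pop(event_id)  # complete: free the buffer
        return None  # still waiting for other sources

# Usage: 3 readout buffers feeding one builder unit.
evb = EventBuilder(n_sources=3)
assert evb.add_fragment(42, 0, b"calo") is None
assert evb.add_fragment(42, 1, b"muon") is None
full = evb.add_fragment(42, 2, b"tracker")
print(sorted(full))  # all three sources present: [0, 1, 2]
```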
Event Building with a Switch
- A SWITCH sends data from a PC connected to one port (the input port) directly to a PC connected to another port (the output port), without duplicating the packet to all ports (as a hub does). The switch knows where the destination PC is connected and optimizes the data transfer
- This is a type of switch you should be familiar with
Event Building via a Switch
- N readout buffers (sources) -> network switch -> M builder units (destinations)
- EVB traffic: all sources send to the same destination concurrently -> congestion
Event Building via a Switch
- The event builder must not lead to a readout buffer overflow
- Input traffic: the average rate accepted by each switch input port (R_in) must be greater than or equal to the readout buffer data bandwidth (B_in)
- Output traffic: M builder units (outputs) with bandwidth B_out receive fragments from N inputs. To avoid blocking: M x B_out >= N x R_in
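The no-blocking condition above is a one-line check. A sketch with hypothetical port counts and rates (the parameter names are mine; only the inequality M x B_out >= N x R_in comes from the slide):

```python
def evb_feasible(n_sources, r_in, m_destinations, b_out):
    """Check the slide's no-blocking condition: M * B_out >= N * R_in.

    r_in:  rate each readout buffer pushes into its switch port (MB/s)
    b_out: bandwidth of each builder-unit output port (MB/s)
    """
    return m_destinations * b_out >= n_sources * r_in

# Hypothetical numbers for illustration: 500 sources at 200 MB/s need
# at least 100 GB/s of aggregate output bandwidth.
print(evb_feasible(n_sources=500, r_in=200, m_destinations=512, b_out=200))  # True
print(evb_feasible(n_sources=500, r_in=200, m_destinations=64, b_out=200))   # False
```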
Switch implementation: crossbar
- Simultaneous data transfer between any arbitrary number of inputs and outputs
- Self-routing or arbiter-based routing
- Output contention will reduce the effective bandwidth -> traffic shaping is needed!
- Adding (very fast) memory to the switching elements could in principle give a non-blocking switch, but the bandwidth of the memory used for the FIFOs becomes prohibitively large
EVB traffic shaping: barrel shifter
- The sequence of sends from each source to each destination follows the cyclic permutations of the destinations
- Allows reaching a throughput close to 100% of the input bandwidth
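The cyclic-permutation schedule can be made concrete: in time slot t, source i sends to destination (i + t) mod n, so every slot is a perfect matching and no output port is ever contended. A minimal sketch, assuming the simplest case of equal numbers of sources and destinations:

```python
def barrel_shifter_schedule(n, n_slots):
    """Slot -> list of destinations, one per source: dest = (src + slot) % n."""
    return [[(src + slot) % n for src in range(n)] for slot in range(n_slots)]

sched = barrel_shifter_schedule(n=4, n_slots=4)
for slot, dests in enumerate(sched):
    # In every slot each destination appears exactly once: no contention.
    assert sorted(dests) == list(range(4))
    print(f"slot {slot}: source i -> destination {dests}")
# Over n slots, every source has sent exactly once to every destination.
```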
Switching Technologies
- Myricom Myrinet 2000: 64 (of 128 possible) ports at 2.5 Gb/s; a Clos network (a network of smaller switches); custom firmware implements barrel shifting; transport with flow control at all stages (wormhole routing)
- Gigabit Ethernet (FastIron 8000 series): 64 ports at 1.2 Gb/s; a multi-port memory system; standard firmware; packets can be lost
EVB Example: CMS (scalable at the RU level)
EVB example: CMS (2) (diagram: the CMS 3D Event Builder, built from 8x8 and 64x64 stages, scalable from 1 to 8 slices)
Summary of EVB
- Event building is implemented with commercial network technologies, by means of huge network switches
- But EVB network traffic is particularly hard on switches, leading to congestion: the switch either blocks (packets wait at the input) or throws packets away (Ethernet switches)
- Possible solutions:
  - Buy very expensive switches ($$$) with a lot of high-speed internal memory
  - Over-dimension the system in terms of bandwidth
  - Use smart traffic-shaping techniques that let the switch exploit nearly 100% of its resources
High-Level Trigger
Introduction
- The High-Level Trigger performs the final data reduction: 1 / O(1000) events selected
- ATLAS and CMS take different approaches:
  - ATLAS has an additional L2 farm, working on the RoIs
  - CMS implements L2, L2.5 and L3 all as software trigger levels (running on the same processors)
- HLT algorithms perform the very first analysis, in real time
- There are constraints on the available time and on the maximum data size that can be analyzed
- Once an event is rejected it is rejected forever -> this can create biases
HLT Requirements
- Flexibility: the working conditions of the LHC and of the experiments in pp interactions at 14 TeV are difficult to evaluate
- Robustness: HLT algorithms should not depend in a critical way on alignment and calibration constants
- Fast event rejection: events not selected should be discarded as fast as possible
- Inclusive selection: the HLT should rely heavily (but not exclusively) on inclusive selections, to guarantee maximum efficiency for new physics
- Selection efficiency: it should be possible to evaluate it directly from the data
- Quasi-offline algorithms: these guarantee ease of maintenance (software can be easily updated) and bug-free code
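The "fast event rejection" requirement is usually met by running the cheapest selection steps first and stopping at the first failure. A toy sketch of such stepwise steering; the function name, the step names and the cut values are all invented for illustration:

```python
def hlt_steering(event, steps):
    """Run trigger steps in order; stop at the first rejection (fast rejection)."""
    for name, algorithm in steps:
        if not algorithm(event):
            return f"rejected by {name}"
    return "accepted"

# Hypothetical two-step chain: a cheap coarse cut first, an expensive
# refined one only for survivors (values are made up).
steps = [
    ("L2 muon pT cut", lambda ev: ev["pt"] > 20.0),      # cheap, coarse
    ("EF refined fit", lambda ev: ev["quality"] > 0.9),  # expensive, precise
]
print(hlt_steering({"pt": 25.0, "quality": 0.95}, steps))  # accepted
print(hlt_steering({"pt": 5.0, "quality": 0.99}, steps))   # rejected by L2 muon pT cut
```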
HLT Implementation
- High-level triggers (above level 1) are implemented as more or less advanced software trigger algorithms (almost offline-quality reconstruction) running on standard processor (PC) farms with Linux as the OS
- Very cost effective: Linux is free and very stable, and the interconnects exist on the market
- Control and monitoring of such a big cluster is an issue
ATLAS Implementation
- High-Level Triggers (HLT): software triggers
- LEVEL 2 TRIGGER: seeded by the Regions of Interest; full granularity for all subdetector systems; fast-rejection steering; O(10 ms) latency
- EVENT FILTER: seeded by the Level 2 result; potential full event access; offline-like algorithms; O(1 s) latency
CMS Implementation
- Maximum Level-1 trigger rate: 100 kHz
- Average event size: 1 MB
- Number of in-out units: 512
- Readout network bandwidth: 1 Tb/s
- Event filter computing power: 10^6 SI95
- Data production: ~TB/day
- Number of PC motherboards: O(1000)
- A purely software, multi-level high-level trigger
ATLAS Muon Reconstruction
- Level 2:
  - µfast: MDT-only track-segment fit and pT estimate through a LUT (~1 ms)
  - µcomb: extrapolation to the inner detector and a new pT estimate (~0.1 ms)
  - µisol: track isolation check in the calorimeter
- Event Filter (Level 3):
  - TrigMOORE: helix fit of the track segments in the muon detector, including the real magnetic field map (~1 s)
  - MUID: track extrapolation to the vertex by LUT (energy loss and multiple scattering included), then a helix fit (~0.1 s)
- The muon is now ready for the final trigger-menu selection
CMS e/γ Reconstruction
- Level 2 (calorimeter information only):
  - Confirm the L1 candidates
  - Super-cluster algorithm to recover bremsstrahlung
  - Cluster reconstruction and Et threshold cut
- Level 2.5 (pixel information):
  - Calorimeter particles are traced back to the vertex detector
  - Electron and photon stream separation and Et cut
- Level 3 (electrons):
  - Track reconstruction in the tracker with the L2.5 seed
  - Track-cluster quality cuts
  - E/p cut
- Level 3 (photons):
  - High Et cut
  - Asymmetric Et cuts on the γγ event, as in the H -> γγ offline analysis
- Electrons and photons are now ready for the final trigger-menu selection
The Trigger Table
- The issue: what to save permanently on mass storage
  - Which trigger streams have to be created?
  - What bandwidth should be allocated to each stream?
- Selection criteria:
  - Inclusive triggers: to cover the major known (and unknown) physics channels
  - Exclusive triggers: to extend the physics potential to specific studies (as for b physics)
  - Prescaled, calibration and detector-monitoring triggers
- For every trigger stream, the allocated bandwidth depends on the status of the collider and of the experiment
- As a general rule, the trigger table should be flexible, extensible and non-biasing, and should allow the discovery of unexpected physics
The Trigger Table @ L = 2x10^33 cm^-2 s^-1
(Table: per-stream thresholds in GeV and rates in Hz for ATLAS and CMS. Streams: isolated muon, double muon, isolated electron, double isolated electron, isolated photon, double isolated photon, single/3-/4-jet, jet + missing energy, tau + missing energy, inclusive tau jet, di-tau jet, electron + jet, inclusive b jets, B-physics topology, and others (prescales, calibration, ...). Totals: ~200 Hz for ATLAS, ~105 Hz for CMS.)
Warning: this is an extract from the respective TDRs; the comparison is not straightforward!
HLT Performance in CMS
- Evaluated with the selection cuts presented on the previous slide
CMS: how large should the HLT farm be?
- All numbers are for a 1 GHz Intel Pentium III CPU (2003 estimate)
- The table above gives ~270 ms/event on average
- Therefore a 100 kHz-capable system would require 30,000 CPUs (PIII @ 1 GHz)
- According to Moore's law this translates into ~40 ms/event in 2007, requiring O(1000) dual-CPU boxes
- A single-farm architecture is feasible
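The farm-sizing arithmetic on this slide can be reproduced directly; the only assumption I add is a Moore's-law doubling time of 18 months, which is what it takes to recover the slide's ~40 ms figure for 2007:

```python
# CMS HLT farm sizing, following the slide's numbers.

MS_PER_EVENT_2003 = 270.0   # ms/event on a 1 GHz Pentium III (2003 estimate)
L1_RATE = 1e5               # Hz (100 kHz L1 output rate)

cpus_2003 = (MS_PER_EVENT_2003 / 1e3) * L1_RATE   # CPUs needed at 100 kHz
print(f"{cpus_2003:.0f} CPUs in 2003")            # 27000 (slide rounds to 30,000)

# Project to 2007 assuming performance doubles every 18 months (assumption).
years = 4.0
speedup = 2 ** (years / 1.5)
ms_per_event_2007 = MS_PER_EVENT_2003 / speedup
cpus_2007 = (ms_per_event_2007 / 1e3) * L1_RATE
print(f"~{ms_per_event_2007:.0f} ms/event in 2007, ~{cpus_2007:.0f} CPUs")
# ~43 ms/event and ~4000 CPUs, i.e. O(1000) dual-CPU boxes as on the slide.
```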
HLT Summary
- The CMS example shows that the single-farm design works
- If at startup the L1 trigger rate is below 100 kHz, we can lower the thresholds in the L1 selection criteria and/or add triggers in order to fully use the available bandwidth
- If at startup the rate is higher, the L1 trigger can be reprogrammed to stay within the available bandwidth
- The HLT trigger streams shown here are only indicative; we will see what really happens on day 1
Final Conclusions
- The L1 trigger takes the LHC experiments from the 25 ns timescale (40 MHz) down to the microsecond timescale (0.1-1 MHz)
  - Custom hardware, huge fan-in/fan-out problems, fast algorithms on coarse-grained, low-resolution data
- Depending on the experiment, the HLT is organized in one or more steps, which usually occur after event building
  - Commercial hardware, large networks, Gb/s links
  - The need for challenging custom hardware vs. commodity hardware also depends on the trigger architecture (for example on the existence of an RoI-based L2 trigger à la ATLAS)
- The HLT runs algorithms as close as possible to the offline ones
  - A large PC processor farm ("easy" nowadays)
  - Monitoring issues
- BUT all of this has to be very well understood, because it is done online and rejected events cannot be recovered